Presented by Victor Engmark
Copyright © 2018 Catalyst IT, 2020 Victor Engmark
Huge thanks to Catalyst IT and Toitū Te Whenua Land Information New Zealand for letting me release this with an open license
Made with reveal.js
find, grep, sort, etc.)
ssh 127.0.0.1 works
PS1='\$ ' PS2='> '
Count commands with quotes in $username’s history.
Syntactic double quotes | count=""
Command substitution | count="$()"
Word splitting | count="$(grep --count)"
Syntactic single quotes | count="$(grep --count '')"
Quoted string | count="$(grep --count '"')"
Tilde expansion | count="$(grep --count '"' ~)"
Count commands with double quotes in $username’s Bash history.
Syntactic double quotes | count="$(grep --count '"' ~"")" ☹
Variable expansion | count="$(grep --count '"' ~"${username}")"
Double quoted literal | count="$(grep --count '"' ~"${username}/.bash_history")"
👍, right?
$ username="$USER"
$ count="$(grep --count '"' ~"${username}/.bash_history")"
grep: ~victor/.bash_history: No such file or directory
$ ls ~victor/.bash_history
/home/victor/.bash_history
$ wtf
bash: wtf: command not found
The order of expansions is: […] tilde expansion, […] variable expansion, […]
In other words, the username needs to be a literal.
Solution:
eval
getent
+cut
+obscure content.
$ username="$USER"
$ user_home="$(getent passwd "$username" | cut --delimiter=':' --fields='6')"
$ count="$(grep --count '"' "${user_home}/.bash_history")"
$ echo "$count"
169
Even POSIX mode does not lead to portable code:
$ type -a [[
[[ is a shell keyword
$ bash --posix
$ [[ 1 ]]
$ echo $?
0
Readability often suffers: compare sort -V with sort --version-sort
Bashisms are helpful:
read -a
done < <(my_script)
$'\n'
Speed:
$ wget --quiet https://norvig.com/big.txt
$ wc big.txt
128457 1095695 6488666 big.txt
$ time grep foobar big.txt
real 0m0.031s
user 0m0.031s
sys 0m0.001s
$ time while read -r -u 9
> do
> :
> done 9< ./big.txt
real 0m3.532s
user 0m3.113s
sys 0m0.266s
No-op is >100 times slower!
Limited data structures:
$ help declare
[Trimmed for brevity]
Set variable values and attributes.
Options:
-f restrict action or display to function names and definitions
Options which set attributes:
-a to make NAMEs indexed arrays (if supported)
-A to make NAMEs associative arrays (if supported)
-i to make NAMEs have the `integer' attribute
-n make NAME a reference to the variable named by its value
Everything else is a string.
No nested data structures:
No exceptions or try/catch/finally, just exit codes.
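With exit codes as the only error channel, a try/catch-like flow can be approximated by running the command as an if condition. A minimal sketch; risky is a hypothetical function invented for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical function: fails with exit code 3 unless given an argument.
risky() {
    [[ $# -gt 0 ]] || return 3
    printf 'Got %s\n' "$1"
}

# "try": a command used as a condition does not trigger errexit.
if output="$(risky)"
then
    exit_code=0
else
    # "catch": $? still holds the exit code of the failed condition.
    exit_code=$?
fi
printf 'risky exited with code %s\n' "$exit_code"
```

There is still no "finally"; a trap on EXIT is the closest equivalent.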
Very limited functions:
Too much state to keep in mind at all times:
local, file and exported variables (including functions)
exec redirects
$PWD
set and shopt
umask
#!/usr/bin/env bash
Works as long as bash is on the $PATH.
The shebang line is only relevant if you call the script directly:
foo.bash if it’s on the $PATH
./foo.bash
/full/path/to/foo.bash
Otherwise the shebang line is ignored:
. foo.bash uses the interpreter of the parent shell
sh foo.bash uses sh, which may be Bash, Dash or something else
. foo.bash and bash foo.bash also ignore the executable flag:
$ echo 'echo "$SHLVL"' > test.bash
$ ./test.bash
bash: ./test.bash: Permission denied
$ . ./test.bash
1
$ bash ./test.bash
2
Can you find any other differences between sourcing and running a script?
$ bash --noprofile --norc
$ cd "$(mktemp --directory)"
$ echo 'declare -p' > test.bash
$ chmod u+x test.bash
$ . test.bash first second > sourced.log
$ ./test.bash first second > run.log
$ git diff sourced.log run.log
Mostly $BASH_
variables.
Read left-to-right:
$ { echo info; echo error >&2; } > result.txt 2>&1
$ cat result.txt
info
error
$ { echo info; echo error >&2; } 2>&1 > result.txt
error
$ cat result.txt
info
One file per redirect:
$ echo foo > foo.txt
$ echo bar > bar.txt
$ echo > ./*.txt
bash: ./*.txt: ambiguous redirect
$ cat foo.txt bar.txt
foo
bar
cat is only needed in corner cases like combining stdin and a file:
$ echo foo > foo.txt
$ echo bar | cat foo.txt -
foo
bar
Most of the time, cat FILE | COMMAND can be simplified to COMMAND FILE.
Redirect the rest of the script:
exec > out.log 2> error.log
Print and save the output with tee:
$ echo foo | tee output.log
foo
$ cat output.log
foo
Use --append to append to the file.
Redirect a stream to a command and back again:
$ (echo out; echo foo >&2; echo bar >&2) 2> >(grep bar >&2)
out
bar
Different processes will process input at different rates. This way standard error and standard output can get out of sync:
$ { echo first >&2; echo second; echo third >&2; } | tee out.log
first
third
second
Running du --summarize /* as a non-root user prints a lot of Permission denied messages.
Silence these. Additionally:
du --summarize /* 2> >(grep --invert-match 'Permission denied' >&2)
Subshell environment for each subsequent command:
$ count=0
$ mount | grep '^tmpfs ' | while read
> do
> (( ++count ))
> done
$ echo "$count"
0
Context is lost.
Bring the important command into the current context:
$ count=0
$ while read -r -u 3
> do
> (( ++count ))
> done 3< <(mount | grep '^tmpfs ')
$ echo "$count"
7
Using a file descriptor above 2 avoids input being swallowed by tools such as SSH reading standard input inside the loop.
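A minimal sketch of the difference, using a hypothetical greedy function as a stand-in for a tool such as ssh that reads all of standard input:

```shell
#!/usr/bin/env bash
# Stand-in for a tool such as ssh which reads all of standard input.
greedy() {
    cat > /dev/null
}

# Loop input on standard input: greedy swallows the remaining lines.
stdin_iterations=0
while read -r line
do
    greedy
    (( ++stdin_iterations ))
done < <(printf '%s\n' one two three)

# Loop input on file descriptor 3: greedy cannot reach it.
# Standard input is /dev/null here only so the sketch never waits for a terminal.
fd_iterations=0
while read -r -u 3 line
do
    greedy
    (( ++fd_iterations ))
done 3< <(printf '%s\n' one two three) < /dev/null

printf '%s\n' "standard input: ${stdin_iterations}, file descriptor 3: ${fd_iterations}"
```

The first loop stops after one iteration because nothing is left to read; the second processes all three lines.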
All the exit codes in a pipeline:
$ (exit 2) | true | false
$ echo "${PIPESTATUS[@]}"
2 0 1
Application specific workarounds to get colour output:
grep --color=always […] | less --RAW-CONTROL-CHARS
Colour codes are characters.
Given a file with an IP per line (you can just use your own IP repeatedly), print the current time on each host.
$ while read -r -u 3 ip
> do
> ssh "$ip" date
> done 3< hosts.txt
Each command, including a pipeline, can only have one exit code. How is that determined? Hint:
$ false | false | false
$ false | false | true
$ false | true | false
…
Some tools have a flag to separate or terminate entries with NUL.
You cannot store NUL in a variable.
You cannot put NUL in a literal:
$ printf '%q\n' $'foo\0bar\0baz\0'
foo
Bash doesn’t like half the world’s files.
Single quotes for any literals without single quote:
$ printf '%s\n' '|\|o e$cape'
|\|o e$cape
Double quotes for strings with single quotes, command substitutions or variables:
$ subject='this'
$ printf '%s\n' "Can't $(basename /usr/bin/touch) ${subject}"
Can't touch this
Dollar single quotes for escape sequences:
$ printf '%s\n' $'first\nsecond'
first
second
You can mix quotes even in a single word:
$ "e"c'h'$'o' '"'"'"
"'
<< NAME is almost like a double quoted context:
$ cpu_count=8
$ cat << EOF
> [hardware]
> cpu_count=${cpu_count}
> EOF
[hardware]
cpu_count=8
Does a similar job to envsubst.
Backslash in << NAME is literal except unescaped before a newline:
$ cat << EOF
> a\
> b
> EOF
ab
$ cat << EOF
> c\d
> EOF
c\d
$ cat << EOF
> e\\
> f
> EOF
e\
f
Quotes are literal in << NAME:
$ cat << EOF
> '"
> EOF
'"
<< 'NAME' works like a single quoted context:
$ result=foo
$ cat << 'EOF'
> ${result}
> EOF
${result}
Here strings are syntactic sugar for echo WORD | my_command:
$ IFS=/ read -a directories -r <<< "$HOME"
$ printf '%q\n' "${directories[@]}"
''
home
username
Also moves the command into current context.
Never modify a running script. Bash reads scripts in chunks:
$ printf '%s\n' 'sleep 1' 'echo foo' > test.bash
$ bash test.bash & sed --in-place 's/foo/bar/' test.bash
The result is either foo or bar, but it is not reproducible!
Avoid word splitting in unquoted strings:
$ printf '%s\n' foo bar
foo
bar
$ printf '%s\n' foo\ bar
foo bar
\n is a printf escaped character, not a shell one.
Literal \n terminators:
$ printf '%s\\n' foo bar
foo\nbar\n
No newline at end of output, so it’s followed immediately by $PS1.
Escaping escape sequences:
$ printf %s\\\\n foo bar
foo\nbar\n
The number of backslashes always doubles, because each character has to be escaped separately in the next context.
Avoid multiple escape levels.
envsubst
printf '%q'
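A minimal sketch of printf '%q' doing the escaping instead of hand-written backslashes. Here eval stands in for any second level of parsing (such as ssh or bash -c); the variable names are made up for illustration:

```shell
#!/usr/bin/env bash
value=$'It\'s "quoted", $HOME and a\nnewline'

# %q escapes the value so that Bash can read it back as a single word.
escaped="$(printf '%q' "$value")"

# eval stands in for any second level of parsing, such as ssh or bash -c.
eval "round_trip=${escaped}"

[[ "$round_trip" == "$value" ]] && echo 'Identical after one round trip'
```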
$variable
Default value:
PATH="${PATH-/bin:/usr/bin}"
:- also matches empty value.
Use with set -o nounset.
$variable
Replacement value:
result="${1+defined}"
:+ only matches non-empty value.
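The difference between the plain and colon forms only shows up for variables which are set but empty. A small sketch with made-up variable names:

```shell
#!/usr/bin/env bash
unset missing
empty=''

default_unset="${missing-fallback}"      # unset: fallback is used
default_empty="${empty-fallback}"        # set but empty: empty value is kept
default_colon="${empty:-fallback}"       # :- also replaces the empty value
replacement_empty="${empty+replaced}"    # set, so replaced even though empty
replacement_colon="${empty:+replaced}"   # :+ leaves the empty value alone

printf '%q\n' "$default_unset" "$default_empty" "$default_colon" \
    "$replacement_empty" "$replacement_colon"
```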
$variable
The right hand side can be more complex:
$ csv=
$ entry='x'
$ csv="${csv:+"${csv},"}${entry}"
$ echo "$csv"
x
$ csv=foo,bar
$ entry='x'
$ csv="${csv:+"${csv},"}${entry}"
$ echo "$csv"
foo,bar,x
$variable
— 10m
exercise
Add non-empty $path to $PATH cleanly: there should be no leading or trailing colons even if $PATH started out empty.
$ PATH="${PATH:+"${PATH}:"}${path}"
$@ is where it’s at
Arguments beyond $9:
$ set -- {a..z}
$ echo ${26}
z
Arrays are zero-indexed, but
$0
is special — it usually
contains the script name.
$@ is where it’s at
Avoid $*:
$ set -- 'a b' 'c d'
$ for argument in "$*"
> do
> printf '%s\n' "$argument"
> done
a b c d
$ for argument in $*
> do
> printf '%s\n' "$argument"
> done
a
b
c
d
All arguments as a single word.
$@ is where it’s at
Use "$@":
$ for argument
> do
> printf '%s\n' "$argument"
> done
a b
c d
Each argument as a separate word.
Default for loop target, no need for in "$@".
$@ is where it’s at
Named arguments handling skeleton:
set -o errexit
arguments="$(getopt --options='' --longoptions='case-sensitive,case-insensitive,help,exclude:' --name='script-name' -- "$@")"
eval set -- "$arguments"
unset arguments
while true
do
    case "$1" in
        [continued…]
    esac
done
$@ is where it’s at
Boolean options (“flags”):
--case-sensitive)
    case_sensitive=1
    shift
    ;;
--case-insensitive)
    unset case_sensitive
    shift
    ;;
$@ is where it’s at
Usage instructions:
--help)
    echo 'script-name [--case-sensitive|--case-insensitive] [--help] [--exclude=PATTERN ...] [--] FILES'
    exit 0
    ;;
$@ is where it’s at
Repeating key/value arguments:
--exclude)
    excludes+=("$2")
    shift 2
    ;;
$@ is where it’s at
End of options separator & unhandled arguments:
--)
    shift
    break
    ;;
*)
    echo "Unhandled option $(printf '%q' "$1"). Please report to …" >&2
    exit 2
    ;;
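The pieces above can be assembled roughly as follows. This is only a sketch: it assumes the enhanced getopt from util-linux, and parse_arguments and files are hypothetical names wrapping the skeleton in a function (hence return rather than exit) so it can be tried in an interactive shell.

```shell
#!/usr/bin/env bash
# Assumes the util-linux (enhanced) getopt; parse_arguments is a hypothetical wrapper.
parse_arguments() {
    local arguments
    arguments="$(getopt --options='' \
        --longoptions='case-sensitive,case-insensitive,help,exclude:' \
        --name='script-name' -- "$@")" || return 2
    eval set -- "$arguments"

    unset case_sensitive
    excludes=()
    while true
    do
        case "$1" in
            --case-sensitive)
                case_sensitive=1
                shift
                ;;
            --case-insensitive)
                unset case_sensitive
                shift
                ;;
            --help)
                echo 'script-name [--case-sensitive|--case-insensitive] [--help] [--exclude=PATTERN ...] [--] FILES'
                return 0
                ;;
            --exclude)
                excludes+=("$2")
                shift 2
                ;;
            --)
                shift
                break
                ;;
            *)
                echo "Unhandled option $(printf '%q' "$1")." >&2
                return 2
                ;;
        esac
    done
    # Whatever is left after "--" is the list of files.
    files=("$@")
}

parse_arguments --exclude='.git' --exclude='.svn' -- --actual-filename.txt
printf 'Exclude: %s\n' "${excludes[@]}"
printf 'File: %s\n' "${files[@]}"
```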
$@
is where it’s at — 10m
exercise
What does argument handling do in each case?
bash test.bash --case-sensitive . /some/path
bash test.bash --help
bash test.bash --exclude='.git' --exclude='.svn' -- --actual-filename.txt
bash test.bash --blah /some/path
$@
is where it’s at — solution
What does argument handling do in each case?
bash test.bash --case-sensitive . /some/path
case_sensitive=1
@=('.' '/some/path')
bash test.bash --help
[Prints help message and returns from script with exit code 0]
bash test.bash --exclude='.git' --exclude='.svn' -- --actual-filename.txt
excludes=('.git' '.svn')
@=('--actual-filename.txt')
bash test.bash --blah /some/path
[Prints “script-name: unrecognized option '--blah'” and returns from script with exit code 1 from getopt]
Collect non-$IFS characters:
$ matches=($(grep --only-matching . <<< $'some foo\nother foo'))
$ echo "${matches[@]}"
s o m e f o o o t h e r f o o
No quotes to enable word splitting.
Collect $IFS-separated or -terminated “words”:
$ while IFS=$'\t' read -a cells -r
> do
> echo Line
> printf '%s\n' "${cells[@]}"
> done <<< $'column 1\tcolumn 2\nvalue 1\tvalue 2'
Line
column 1
column 2
Line
value 1
value 2
Append to an array:
$ characters=({a..z})
$ characters+=({0..9})
$ echo "${characters[@]}"
a b c d e f g h i j k l m n o p q r s t u v w x y z 0 1 2 3 4 5 6 7 8 9
Can be sparse:
$ characters=(a b c)
$ unset 'characters[1]'
$ characters+=([25]=z)
$ echo "${characters[@]}"
a c z
$ echo "${#characters[@]}"
3
Don’t loop from 0 through $(( "${#name[@]}" - 1 )).
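With a sparse array the element count says nothing about which indices exist. A sketch of looping over the actual indices with "${!name[@]}" instead (visited is a made-up variable for illustration):

```shell
#!/usr/bin/env bash
characters=(a b c)
unset 'characters[1]'
characters+=([25]=z)

# "${!characters[@]}" expands to the indices which are actually set.
visited=()
for index in "${!characters[@]}"
do
    visited+=("${index}=${characters[$index]}")
done

printf '%s\n' "${visited[@]}"
```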
Create an array of all the executables on your $PATH.
$ IFS=: read -a paths -r <<< "$PATH"
$ for path in "${paths[@]}"
> do
> executables+=("$path"/*)
> done
Associative arrays use:
$ declare -A abbreviations=(['GNU HURD']="GNU's not Unix! HIRD of Unix-replacing daemons" ['sed']='stream editor')
$ declare -p abbreviations
declare -A abbreviations=([sed]="stream editor" ["GNU HURD"]="GNU's not Unix! HIRD of Unix-replacing daemons" )
$ printf '%s\n' "${!abbreviations[@]}"
sed
GNU HURD
$ echo "${abbreviations['sed']}"
stream editor
Not insertion ordering.
Not numerically indexable.
Must be declared:
$ example=([key]='value')
$ declare -p example
declare -a example=([0]="value")
Print ‘key [length of value]’ for each key:
declare -A nicks=(['Bill Hicks']='William Melvin Hicks' ['Gandhi']='Mohandas Karamchand Gandhi')
[Your code]
Gandhi 26
Bill Hicks 20
$ for name in "${!nicks[@]}"
> do
> printf '%s %s\n' "$name" "${#nicks[$name]}"
> done
Old style conditionals are easy to break:
$ bash --noprofile --norc -o xtrace
$ [ $foo = 'bar' ]
+ '[' = bar ']'
bash: [: =: unary operator expected
Only two arguments (ignoring “]”) so Bash assumes a unary operation.
Old style conditionals have no way of grouping expressions such as “(A or B) and C”.
Use command conditionals and grouping:
{ [[ "$foo" = 'a' ]] || [[ "$foo" = 'b' ]]; } && [[ "$bar" = 'c' ]]
-a
and
-o
Left associative:
foo || bar && baz
≡
{ foo || bar; } && baz
$ false || echo failure && echo success
failure
success
foo && bar || baz
≡
{ foo && bar; } || baz
$ false && echo success || echo failure
failure
Break it up:
if some_command
then
echo success
else
echo failure
fi
Write a single expression to check whether $x is between 0 and $x_max or $y is between 0 and $y_max (-ge is ≥ and -le is ≤).
{
[[ "$x" -ge 0 ]] && [[ "$x" -le "$x_max" ]];
} || {
[[ "$y" -ge 0 ]] && [[ "$y" -le "$y_max" ]];
}
Run in the same shell as the parent script:
$ shell_pid() {
> echo $$
> }
$ diff <(shell_pid) <(echo $$)
$ echo $?
0
Use return to return an exit code from the function without exiting the script:
$ escape_key_value_pairs() {
> if [[ $# -eq 0 ]] || [[ $(( $# % 2 )) -ne 0 ]]
> then
> echo "Use: ${FUNCNAME} [KEY VALUE]..." >&2
> return 1
> fi
> printf '%s=%q\n' "$@"
> }
$ escape_key_value_pairs PS1 'My prompt > $ ' PS2
Use: escape_key_value_pairs [KEY VALUE]...
$ echo $?
1
Functions get their own argument list.
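A quick sketch of what that means for $# and "$@"; count_arguments is a hypothetical function used only for illustration:

```shell
#!/usr/bin/env bash
count_arguments() {
    # Inside the function, $# and "$@" refer to the function's own arguments.
    echo "$#"
}

set -- first second                        # the outer argument list
function_count="$(count_arguments a b c)"  # 3
outer_count="$#"                           # still 2 after the call

printf 'function: %s, outer: %s\n' "$function_count" "$outer_count"
```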
Use local to declare function scope variables:
$ directory="$(mktemp --directory)"
$ filename='.bashrc_local'
$ save_current_prompt() {
> local directory="$HOME"
> escape_key_value_pairs PS1 "$PS1" PS2 "$PS2" >> "${directory}/${filename}"
> }
$ save_current_prompt
$ tail --lines=2 ~/.bashrc_local
PS1=\\\$\␠
PS2=\>\␠
$ echo "$directory"
/tmp/tmp.1ORclnSGb9
This function pollutes the surrounding variable namespace:
reverse() {
arguments=("$@")
for index in $(seq $(( $# - 1 )) -1 0)
do
printf '%s ' "${arguments[$index]}"
done
printf '\n'
}
Change it so that none of the variables it assigns are propagated to the outer scope.
local arguments index
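For reference, the function with that one line added might look like this (a sketch; everything else is unchanged from the exercise):

```shell
#!/usr/bin/env bash
reverse() {
    # Both variables now go out of scope when the function returns.
    local arguments index
    arguments=("$@")
    for index in $(seq $(( $# - 1 )) -1 0)
    do
        printf '%s ' "${arguments[$index]}"
    done
    printf '\n'
}

reverse a b c
```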
Numeric contexts:
$(( index++ )) expands to the result
index+=1 only if declare -i index first
(( index++ ))
[[ "$index" -eq 0 ]]
Integers only:
$ [[ 1.1 -eq 1 ]]
bash: [[: 1.1: syntax error: invalid arithmetic operator (error token is ".1")
Use tools like bc for more powerful maths.
0 starts octal:
$ echo $(( 077 ))
63
0x starts hex:
$ echo $(( 0xff ))
255
Case insensitive.
N# starts base N (2-64):
$ echo $(( 64#a ))
10
$ echo $(( 64#A ))
36
$ echo $(( 64#@ ))
62
$ echo $(( 64#_ ))
63
Case insensitive if base ≤ 36.
Comes after variable expansion:
$ msb=BE
$ lsb=EF
$ echo $(( "0x${msb}${lsb}" ))
48879
Beware leading zeros:
$ month=08
$ (( month++ ))
bash: let: 08: value too great for base (error token is "08")
Find out in August!
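One way around this, sketched here with made-up variable names, is to force base 10 with the N# prefix before doing any arithmetic:

```shell
#!/usr/bin/env bash
month=08

# 10# forces base 10, so the leading zero no longer means octal.
next_month=$(( 10#$month + 1 ))

# printf restores the zero padding if it is needed.
printf -v padded '%02d' "$next_month"
echo "$padded"
```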
Type coercion:
$ foo=one
$ [[ "$foo" -eq 0 ]] && echo 'equal'
equal
Sum an array of hexadecimal number strings without the 0x prefix.
For example, (ffff 11) should sum to 65552 (65535 + 17).
$ declare -i sum=0
$ for number in "${numbers[@]}"
> do
> sum+="0x${number}"
> done
Build up argument lists or commands using arrays:
$ cat excludes.txt
foo bar
baz
$ while read -r -u 9 exclude
> do
> excludes+=(--regexp "$exclude")
> done 9< excludes.txt
$ set -o xtrace
$ grep --invert-match "${excludes[@]}" <<< $'foo bar\nfoo baa\n'
+ grep --color=auto --invert-match --regexp 'foo bar' --regexp baz
foo baa
Only a single command and its arguments:
Read newline-terminated strings:
$ while read -r line
> do
> echo "$line"
> done < <(printf '%s\n' 'foo' 'bar')
foo
bar
A “line” in *nix operating systems.
Read newline-separated or -terminated strings:
$ while read -r line || [[ -n "$line" ]]
> do
> echo "$line"
> done < <(printf $'foo\nbar')
foo
bar
read populates the variable, then fails when it reaches EOF without a trailing newline.
Read NUL-terminated strings:
$ cd "$(mktemp --directory)"
$ touch 'backslash\separated' $'newline\nseparated' 'space separated'
$ while IFS= read -d '' -r filename
> do
> printf '%q\n' "$filename"
> done < <(find . -mindepth 1 -exec printf '%s\0' {} +)
./backslash\\separated
$'./newline\nseparated'
./space\ separated
Read $IFS-terminated words:
$ read -r first second rest <<< ' aye bee cee   dee '
$ printf '%q\n' "$first" "$second" "$rest"
aye
bee
cee\ \ \ dee
Trims leading, trailing and separating $IFS characters.
How many characters does $result contain?
$ result="$(printf '%s' $'foo\n\n')"
$ echo "${#result}"
3
$() removes trailing newlines.
$() workaround:
$ result="$(printf '%s' $'foo\n\n'; printf x)"
$ result="${result%x}"
$ echo "${#result}"
5
<<< (here string) is bad in a different way:
$ wc --bytes <<< $'foo\n\n'
6
Unconditionally adds a newline.
Newline-preserving redirects:
$ printf $'foo\n\n' | wc --bytes
5
$ printf $'foo\n\n' > result.txt
$ wc --bytes result.txt
5 result.txt
$ wc --bytes < <(printf $'foo\n\n')
5
echo vs. printf:
$ echo foo | xxd -cols 1
00000000: 66 f
00000001: 6f o
00000002: 6f o
00000003: 0a .
echo adds newline (0x0a) at end of output.
echo vs. printf:
$ printf '%s' foo | xxd -cols 1
00000000: 66 f
00000001: 6f o
00000002: 6f o
printf formats arguments.
Save script arguments to a file, and reuse them in another script.
Hint: The only character which can’t be in a string is NUL (\0 in printf).
Hint: read’s -r flag avoids treating backslashes specially.
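# First script: save each argument terminated by NUL
for argument
do
    printf '%s\0' "$argument"
done > arguments.bin
# Second script: restore the arguments; IFS= avoids trimming whitespace
set --
while IFS= read -d '' -r argument
do
    set -- "$@" "$argument"
done < arguments.bin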
Command “success:”
$ if true
> then
> echo 'Success'
> fi
Success
Defined as exit code 0.
Completely application specific. For example, zero in arithmetic expressions:
$ count=0
$ echo $?
0
$ (( count=0 ))
$ echo $?
1
$ (( count++ ))
$ echo $?
1
$ (( count++ ))
$ echo $?
0
$ printf '%s\n' "$count"
2
Some fairly well-documented numbers in /usr/include/sysexits.h / $(nix eval --raw nixpkgs.glibc.dev.outPath)/include/sysexits.h:
$ grep '^#define ' /usr/include/sysexits.h
#define EX_OK 0 /* successful termination */
#define EX__BASE 64 /* base value for error messages */
#define EX_USAGE 64 /* command line usage error */
#define EX_DATAERR 65 /* data format error */
#define EX_NOINPUT 66 /* cannot open input */
#define EX_NOUSER 67 /* addressee unknown */
#define EX_NOHOST 68 /* host name unknown */
#define EX_UNAVAILABLE 69 /* service unavailable */
#define EX_SOFTWARE 70 /* internal software error */
#define EX_OSERR 71 /* system error (e.g., can't fork) */
#define EX_OSFILE 72 /* critical OS file missing */
#define EX_CANTCREAT 73 /* can't create (user) output file */
#define EX_IOERR 74 /* input/output error */
#define EX_TEMPFAIL 75 /* temp failure; user is invited to retry */
#define EX_PROTOCOL 76 /* remote error in protocol */
#define EX_NOPERM 77 /* permission denied */
#define EX_CONFIG 78 /* configuration error */
#define EX__MAX 78 /* maximum listed value */
When a command terminates on a fatal signal whose number is N, Bash uses the value 128+N as the exit status.
$ kill -l INT
2
$ sleep 1d
^C
$ echo $?
130
Don’t exit "$error_count"!
$ bash
$ exit 256
exit
$ echo $?
0
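Exit codes are a single byte, so values wrap around at 256. A sketch of the wrap-around and one possible way to avoid it; exit_status_of and safe_exit_code are names made up for illustration:

```shell
#!/usr/bin/env bash
# Run exit in a subshell and report the status the parent sees.
exit_status_of() {
    (exit "$1")
    echo "$?"
}

wrapped="$(exit_status_of 256)"   # 256 wraps around to 0, "success"

# One option: collapse the count to 0 (no errors) or 1 (some errors).
error_count=256
safe_exit_code=$(( error_count > 0 ))

printf 'wrapped: %s, safe: %s\n' "$wrapped" "$safe_exit_code"
```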
#!/usr/bin/env bash
set -o errexit
temporary_directory="$(mktemp --directory)"
[…]
mkdir "$temporary_directory"
#!/usr/bin/env bash
set -o errexit -o noclobber
temporary_directory="$(mktemp --directory)"
echo "Start" > "${temporary_directory}/script.log"
[…]
echo "End" > "${temporary_directory}/script.log"
/tmp/[…]/script.log: cannot overwrite existing file
#!/usr/bin/env bash
set -o errexit -o noclobber -o nounset
temporary_directory="$(mktemp --directory)"
echo "Start" > "${temporary_directry}/script.log"
temporary_directry: unbound variable
#!/usr/bin/env bash
set -o errexit -o noclobber -o nounset -o pipefail
grep foo "$1" | cut --delimiter=':' --fields=1 | grep --invert-match bar
Easier than $PIPESTATUS.
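A small sketch of what pipefail changes: without it a pipeline’s exit code is that of its last command, with it any failing command makes the pipeline fail.

```shell
#!/usr/bin/env bash
set +o pipefail
false | true
without_pipefail=$?   # exit code of the last command only

set -o pipefail
false | true
with_pipefail=$?      # the failing command is no longer hidden

printf 'without: %s, with: %s\n' "$without_pipefail" "$with_pipefail"
```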
Umask:
$ cd "$(mktemp --directory)"
$ umask 0022
$ touch first
$ ls -l first
-rw-r--r-- […] first
$ umask 0077
$ touch second
$ ls -l second
-rw------- […] second
Don’t litter!
cleanup() {
rm --force --recursive "${temporary_directory-}"
}
trap cleanup EXIT
temporary_directory="$(mktemp --directory)"
mktemp --directory is atomic.
mktemp result is only accessible by owner (umask 0077).
Print debugging information on demand:
trap env USR1
kill -USR1 $!
Triggers after the currently running command.
Reload configuration in long-running process:
trap read_configuration HUP
kill -HUP "$server_pid"
Start with a script which processes standard input until EOF:
#!/usr/bin/env bash
while read -r line
do
: # Omitted
done
You have no visibility of how far it has processed.
Modify it to react to SIGUSR1 by printing the value of
$line
.
trap 'echo "$line"' USR1
Commands can be defined in several ways, for example:
$ type -a [
[ is a shell builtin
[ is /usr/bin/[
Presented in order of decreasing precedence.
Don’t use which to determine what will be run!
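type knows about keywords, functions and builtins, which an external which cannot see. A minimal sketch using type -t; the cd wrapper function is hypothetical and only there to show a function shadowing a builtin:

```shell
#!/usr/bin/env bash
# type -t reports what kind of command Bash would actually run.
keyword_kind="$(type -t '[[')"
builtin_kind="$(type -t cd)"
file_kind="$(type -t ls)"

# A function takes precedence over a builtin of the same name.
cd() {
    builtin cd "$@"
}
function_kind="$(type -t cd)"
unset -f cd

printf '%s\n' "$keyword_kind" "$builtin_kind" "$file_kind" "$function_kind"
```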
The precedence order:
To find documentation:
help command for builtins
man command for executable files
command --help if neither works
kill -9 "$!"
Now you have to clean up manually.
Run a command with a timeout:
$ timeout 1s sleep 2s
$ echo $?
124
kill "$!"
timeout="$(date --date='now + 1 minute' +%s)"
while [[ "$(date +%s)" -lt "$timeout" ]]
do
if kill -0 "$!"
then
sleep 0.1
else
exit 0 # Win
fi
done
exit 1 # Fail
kill -9
something is
broken
What you see is not always what you think you see:
$ printf '%s\n' “foo” ‘foo’
“foo”
‘foo’
“Typographic” quotes are not syntactic.
Usually caused by WYSIWYG web framework.
What you see is not always what you think you see:
$ grep —fixed-strings foo <<< foobar
grep: foo: No such file or directory
Em dash ≠ double dash.
Usually caused by WYSIWYG web framework.
Never copy straight to a terminal:
git clone
/dev/null
echo "Hi! I'm a Trojan horse,
what do you not need on this machine?"
git
clone
git@gitlab.com:engmark/advanced-shell-scripting-with-bash.git
Even command line editors are vulnerable.
Use a graphical editor if at all possible.
ShellCheck can find many common issues:
$ shellcheck --shell=bash - <<< 'while read line; do :; done'
In - line 1:
while read line; do :; done
      ^--^ SC2162: read without -r will mangle backslashes.
           ^--^ SC2034: line appears unused. Verify use (or export if used externally).

For more information:
  https://www.shellcheck.net/wiki/SC2034 -- line appears unused. Verify use (...
  https://www.shellcheck.net/wiki/SC2162 -- read without -r will mangle backs...
$ bash --noprofile --norc -o xtrace
$ find "$(egrep --only-matching '/usr/[^:]+(:|$)' <<< "$PATH" | head --lines=1 | head --bytes=-2)" -mindepth 1
++ egrep --only-matching '/usr/[^:]+(:|$)'
++ head --bytes=-2
++ head --lines=1
+ find /usr/local/bin -mindepth 1
/usr/local/bin/foo
Standard input is not shown.
“+” ($PS4) is repeated for each level of nesting, for example inside a command substitution.
Pipelined commands run simultaneously, so ordering is not guaranteed.
set -o xtrace inside scripts.
strace prints system calls and signals:
$ strace -e openat cat /dev/null
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/dev/null", O_RDONLY) = 3
+++ exited with 0 +++
man 1 strace
lsof lists files currently open by programs:
$ nc example.org 80 &
[3] 9999
$ lsof -p $!
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
nc 9999 username cwd DIR 254,3 4096 8919615 /home/username
nc 9999 username rtd DIR 254,2 4096 2 /
nc 9999 username txt REG 254,2 39608 2921200 /usr/bin/netcat
nc 9999 username mem REG 254,2 84016 2885772 /usr/lib/libresolv-2.27.so
[…]
nc 9999 username 0u CHR 136,0 0t0 3 /dev/pts/0
nc 9999 username 1u CHR 136,0 0t0 3 /dev/pts/0
nc 9999 username 2u CHR 136,0 0t0 3 /dev/pts/0
nc 9999 username 3u IPv4 4538734 0t0 TCP machine-name:46748->example.org:http (ESTABLISHED)
man 8 lsof
netstat prints networking information:
$ sudo netstat --listening --numeric --program --tcp | sed --quiet '1,2p;/ssh/p'
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 1234/sshd
tcp6 0 0 :::22 :::* LISTEN 1234/sshd
man 8 netstat
/proc has lots of runtime information:
$ journalctl --catalog --follow --unit=sshd | grep "$USER" &
$ ls -lA /proc/$!/fd
total 0
lr-x------ 1 username username 64 May 7 14:32 0 -> 'pipe:[5181666]'
lrwx------ 1 username username 64 May 7 14:32 1 -> /dev/pts/3
lrwx------ 1 username username 64 May 7 14:32 2 -> /dev/pts/3
man 5 procfs
Find the maximum number of files your shell (PID $$) can open. Hint: /limits.
Extra exercise: Find the PID of the journalctl command:
$ journalctl --catalog --follow --unit=sshd | grep "$USER" &
cat /proc/$$/limits
ls -l /proc/$!/fd, then ls -l /proc/*/fd/1 | grep INODE
*.bash in the current directory in current locale’s alphabetical order:
ls *.bash | while read file
do
something $file
done
Expelled from PhD programme, computer ground to dust and buried using the secret rituals of the church of Stéphane Chazelas.
*.bash in the current directory in current locale’s alphabetical order:
for file in ./*.bash
[…]
👍
*.bash including dotfiles in the current directory in current locale’s alphabetical order:
shopt -s dotglob
for file in ./*.bash
[…]
👍
*.bash except for foo.bash in the current directory in current locale’s alphabetical order:
shopt -s extglob
for file in ./!(foo).bash
[…]
😅
*.bash in and below the current directory in current locale’s alphabetical order:
shopt -s globstar
for file in ./**/*.bash
[…]
😕
Universal ordering:
export LC_COLLATE='C'
for file in ./*
[…]
Or LC_COLLATE='en_NZ.utf8'
…
Let me just curl dict://dict.org/d:collate
🧐
files=(./*)
for (( index = ${#files[@]} - 1; index >= 0; index-- ))
do
something "${files[$index]}"
done
😟
Non globbable pattern:
while IFS= read -d '' -r -u 9 path
do
something "$path"
done 9< <(find . \( -type d -regex '^.*/\.git$' -prune -false \) -o -type f -exec printf '%s\0' {} +)
😭
Combining all of the above 😉
Explain every part of the previous command to someone. For reference:
while IFS= read -d '' -r -u 9 path
do
something "$path"
done 9< <(find . \( -type d -regex '^.*/\.git$' -prune -false \) -o -type f -exec printf '%s\0' {} +)
find . finds all files in the current directory and child directories.
\( expression \) overrides find expression precedence.
-type d matches directories.
-regex '^.*/\.git$' matches filenames (actually directories because of the previous expression) ending with ‘/.git’.
-prune stops find from descending into matching directories.
-false makes the entire expression false.
-o means "or". Since the previous expression was false we always process the expressions after this.
-type f matches plain files (not directories).
-exec some command {} + runs a command suffixed with as many filenames as possible, if necessary running the command multiple times with different sets of files.
printf '%s\0' prints any subsequent arguments terminated with \0, aka. NUL.
<(some command) creates a named pipe (or /dev/fd entry) allowing the command output to be treated as a file.
some command 9< file opens the file for reading on file descriptor 9 of the command.
while condition command; do inner commands; done runs inner commands as long as condition command returns a zero exit code.
IFS= some command empties the internal field separator during some command, avoiding any trimming of characters when word splitting.
read options path reads a single piece of the input stream into the variable path. Options:
-d '' sets the input stream separator to NUL.
-r ensures that backslashes are treated literally.
-u 9 sets the input stream to file descriptor 9.
some command "$path" uses the now safe value in the path variable.
Conclusion: arguments ≫ menus or prompts.
read
loops
Conclusion: support standard input and output before files.
Conclusion: exit at the first sign of a problem.
Conclusion: keep it really simple.