Beginning shell scripting with Bash

Copyright © 2022 Victor Engmark

Creative Commons Attribution-ShareAlike 4.0 International License

Huge thanks to Toitū Te Whenua Land Information New Zealand for letting me release this with an open license

Made with reveal.js

Goal

Learn how to learn Bash via vocabulary, documentation, filesystem, tools, and safety

Prerequisites

  • Some familiarity with programming
  • A Bash prompt

Introductions

Who am I?

Who are you?

  • What sort of programming are you already familiar with?
  • What do you want to use Bash for?

Hints & tips

Why learn Bash

  • Used everywhere on Linux
  • Dangerous and powerful
  • Blazing fast processing of big datasets

Vocabulary

Terminal emulator

Provides the window you type into, mouse support, colour palette, and search.

GNOME Console, PuTTY, and XTerm are popular terminal emulators.

Standard input stream

Provides program input.

Normally connected to the terminal, but can be connected to a file using COMMAND < FILE.

File descriptor 0, /dev/fd/0.

Standard output stream

Provides the "normal" program output.

Prints to the terminal by default.

Can be redirected to standard input of another command using WRITER | READER or to a file using COMMAND > PATH.

File descriptor 1, /dev/fd/1.

Standard error stream

Provides error-related program output.

Prints to the terminal by default.

File descriptor 2, /dev/fd/2.
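
A minimal sketch of the three streams working together. The function name report and the temporary files are made up for illustration.

```shell
# Hypothetical function that writes one line to each output stream.
report() {
    printf 'result\n'        # standard output (file descriptor 1)
    printf 'warning\n' >&2   # standard error (file descriptor 2)
}

work_dir="$(mktemp -d)"

# Send each output stream to its own file.
report > "$work_dir/out.txt" 2> "$work_dir/err.txt"

# Connect a file to standard input and count its lines.
line_count="$(wc -l < "$work_dir/out.txt")"

# A pipe only carries standard output; standard error is discarded here.
piped="$(report 2> /dev/null | tr '[:lower:]' '[:upper:]')"

printf 'lines: %s, piped: %s\n' "$line_count" "$piped"
```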

Shell

A read-eval-print loop (or REPL) ↻:

  1. Reads what you press on the keyboard.
  2. Evaluates the input.
  3. Prints the result.
  • Pressing a results in "a" being printed.
  • Pressing Enter results in the command being executed.

Bash is a popular shell 😉.

Secure Shell (SSH)

Encrypted network protocol, similar to the TLS protocol used for HTTPS. An SSH client connects to an SSH server, which launches a new shell on the server for the client to communicate with.

PuTTY and OpenSSH are popular SSH clients.

Synopsis definition

Summary of how to run a command.

  • lowercase means literal
  • UPPERCASE means placeholder
  • NAME… (ellipsis) means one or more
  • [bracketed] means optional
  • first|second (pipe) means mutually exclusive
  • = separates an option name and a value

Synopsis example

foo [--verbose] [--config=PATH] pack|unpack PATTERN FILE…

Username

A short, single word, representing a system user for authentication purposes.

How to find documentation

type -a WORD

Tells you the types of the word, highest precedence first.

Examples: type -a type, type -a bash

help WORD

For internal Bash builtins and keywords.

Examples: help help, help type, help cd

COMMAND --help and COMMAND -h

Self-documenting commands like grep.

Examples: grep --help, ls --help

man COMMAND

Manual pages.

Examples: man man, man grep

info COMMAND

Similar to man pages, but with a more complex, website-like structure. The content is usually also available as a man page.

Examples: info info, info grep

Web sites

Filesystem

File

A "normal" file is a collection of bytes.

Directory

A file containing other files and directories. Known as a folder in graphical systems.

Path

A string referencing a file's location in the filesystem.

Examples:

  • / is called the root directory, the ancestor of all other paths
  • /home/USERNAME is normally the home directory of user USERNAME

Working directory

The directory path associated with the current shell, useful to simplify operations on files with similar paths.

Available with the pwd command and the PWD variable.

Relative path

A path relative to the working directory.

Examples:

  • . refers to the working directory
  • .. refers to the parent directory of the working directory
  • ../log refers to a sibling file called log
  • ./foo/bar refers to a grandchild file bar inside the foo child directory

Glob

Pattern matching syntax for files. * matches zero or more characters, and ? matches exactly one character.

Example: foo?bar* matches foo bar and foo-bar.txt, but not foobar.txt.

Not to be confused with regular expressions!
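
A small sketch to try the example pattern, assuming a throwaway directory from mktemp. The variable names are made up for illustration.

```shell
# Create the three example files in a throwaway directory.
glob_dir="$(mktemp -d)"
touch "$glob_dir/foo bar" "$glob_dir/foo-bar.txt" "$glob_dir/foobar.txt"

# The shell expands the glob into a list of matching paths before the
# command runs. The quotes protect the directory name, while the
# pattern itself stays unquoted so that it is expanded.
matches=("$glob_dir"/foo?bar*)

# Print only the file names, by stripping everything up to the last slash.
printf '%s\n' "${matches[@]##*/}"
```

This should print the two matching names; foobar.txt is left out because ? needs exactly one character between foo and bar.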

Tools

Quick intro to popular tools.

Most tools which take files can also take standard input if no files are specified.

ls [PATH…] lists files

Takes PATH… to list particular files/directories.

Should never be used in scripts (1, 2); use patterns like for file in ./* and find + while read instead.
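
The two patterns could look something like this. It is only a sketch: the directory and file names are made up, and the NUL-delimited -print0 / read -d '' pairing is one common way to do find + while read, not the only one.

```shell
# Throwaway directory with an awkwardly named file.
list_dir="$(mktemp -d)"
touch "$list_dir/plain.txt" "$list_dir/two words.txt"

# Pattern 1: glob loop. Each path arrives as one intact word.
glob_count=0
for file in "$list_dir"/*; do
    printf 'glob: %s\n' "$file"
    glob_count=$((glob_count + 1))
done

# Pattern 2: find + while read, with NUL-delimited paths so that even
# newlines in file names are safe.
find_count=0
while IFS= read -r -d '' file; do
    printf 'find: %s\n' "$file"
    find_count=$((find_count + 1))
done < <(find "$list_dir" -type f -print0)
```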

cd DIRECTORY changes the working directory

Example: ../tmp/foo.txt refers to foo.txt in a sibling directory called tmp. If you cd ../tmp you can instead refer to it as ./foo.txt.

Should not be used in scripts; debugging is easier if you use the original paths.

echo [OPTION…] [VALUE…] prints values

Should not be used in shell scripts, since it can't tell whether a parameter starting with a hyphen is an option or a value. Use printf instead.

printf FORMAT [VALUE…] prints formatted values

Examples:

  • printf '%s\n' VALUE… emulates echo
  • printf '%04d-%02d-%02d\n' YEAR MONTH DAY prints a date
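
A short sketch of the difference between echo and printf. The variable names and the date values are made up for illustration.

```shell
# A value that happens to look like an echo option.
value='-n'

# echo treats it as the "no trailing newline" option and prints nothing.
echo_output="$(echo "$value")"

# printf treats everything after the format string as data.
printf_output="$(printf '%s\n' "$value")"

# The format is reused until all the values are consumed.
list_output="$(printf '%s\n' one two)"

# Zero-padded fields for a date.
date_output="$(printf '%04d-%02d-%02d\n' 2022 1 31)"

printf 'echo: [%s] printf: [%s] date: [%s]\n' "$echo_output" "$printf_output" "$date_output"
```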

cat FILE… concatenates files

Example: cat intro.txt body.txt outro.txt > full.txt.

cut OPTION… [FILE…] prints selected parts of lines

Example: cut --delimiter=$'\t' --fields=1,3,5- in.tsv prints the first, third, and fifth onward fields from a tab-separated file.
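
To try it without an existing file, here is a sketch which builds a one-line sample first. The sample content is made up, and the long options assume GNU cut.

```shell
# Build a one-line tab-separated sample with six fields.
tsv_file="$(mktemp)"
printf 'a\tb\tc\td\te\tf\n' > "$tsv_file"

# Keep field 1, field 3, and everything from field 5 onward.
selected="$(cut --delimiter=$'\t' --fields=1,3,5- "$tsv_file")"

printf '%s\n' "$selected"
```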

df [PATH…] prints disk (actually filesystem) fullness

Example: df --human-readable prints info on all the filesystems.

du [PATH…] prints disk usage of files

Example: du --si --summarize prints the size of the current directory in human-readable form, such as 3G for three gigabytes.

See also ncdu - interactive and user-friendly.

find [OPTION…] [STARTING POINT…] [EXPRESSION] prints file paths matching the expression

Example: find / -maxdepth 2 -type f -iname 'tmp*' prints paths to "regular" files (not directories, devices, etc) whose names start with "tmp" (ignoring case) and which are either directly in the root directory or directly in one of its subdirectories.

grep PATTERN [FILE…] prints lines matching the regular expression pattern

Example: grep '^2020-01-01.*foo$' my.log prints lines starting (^) with "2020-01-01", followed by anything (.*), then foo at the end ($).

You can also specify multiple patterns.

See also ripgrep (rg).

head [FILE…] prints the first ten lines of each file

Examples:

  • head --lines=2 prints the first two lines
  • head --lines=-2 prints every line except the last two

tail [FILE…] prints the last ten lines of each file

Examples:

  • tail --lines=2 prints the last two lines
  • tail --lines=+2 prints every line from the second onward (that is, skipping the first line)
  • tail --follow prints the last ten lines, then any new lines coming into the file.
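
A sketch comparing the head and tail line options on the same five-line sample. The sample text and variable names are made up, and the long options assume the GNU versions of the tools.

```shell
sample="$(printf '%s\n' one two three four five)"

first_two="$(head --lines=2 <<< "$sample")"
all_but_last_two="$(head --lines=-2 <<< "$sample")"
last_two="$(tail --lines=2 <<< "$sample")"
from_second="$(tail --lines=+2 <<< "$sample")"

# Combining the two picks out a range: lines 2 to 4.
middle="$(head --lines=4 <<< "$sample" | tail --lines=+2)"

printf '%s\n' "$middle"
```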

less [FILE…] is an interactive viewer or "pager"

For example, pipe long output to this command to browse it while the command is running. Handy shortcuts:

  • / to search forward
  • ? to search backward
  • s to save standard input to file
  • v to open FILE in $EDITOR

nano [FILE] edits text files

Lists the available Ctrl-key (^…) and Alt-key (M-…) shortcuts at the bottom of the terminal.

See also vim & emacs - much more powerful, but much more difficult.

read -r reads a single line into the REPLY variable

Typically used to read a file line-by-line in a loop:

while read -r; do
    COMMAND "$REPLY"
done < FILE

Or from a command with done < <(COMMAND).

(You should always use -r.)
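
A sketch of why -r matters, using a made-up sample file with a backslash and some leading spaces.

```shell
# A sample file with a backslash and leading spaces.
lines_file="$(mktemp)"
printf '%s\n' 'C:\temp' '  indented' > "$lines_file"

# With -r and no variable name, each line lands in REPLY unchanged.
raw_lines=()
while read -r; do
    raw_lines+=("$REPLY")
done < "$lines_file"

# Without -r, read treats backslashes as escape characters and drops them.
# (Done on purpose here; shellcheck would rightly complain.)
read < "$lines_file"
mangled="$REPLY"

printf 'with -r: %s, without: %s\n' "${raw_lines[0]}" "$mangled"
```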

readarray reads lines into the MAPFILE variable

Example: readarray -t < <(grep '^error: .*' my.log) creates an array of all the search results (after trimming the newline at the end of each line). Loop over the results using for line in "${MAPFILE[@]}".
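
The same example as a runnable sketch, with a made-up log file standing in for my.log.

```shell
# Made-up log file with one informational line and two errors.
log_file="$(mktemp)"
printf '%s\n' 'info: started' 'error: disk full' 'error: no route' > "$log_file"

# -t trims the trailing newline from each line; with no array name
# given, the lines go into MAPFILE.
readarray -t < <(grep '^error: .*' "$log_file")

for line in "${MAPFILE[@]}"; do
    printf 'Found: %s\n' "$line"
done
```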

shuf shuffles lines

Example: shuf --head-count=4 --repeat dictionary.txt prints four random lines, allowing more than one copy of the same line.

sort sorts lines

Example: sort --output=.gitignore --unique .gitignore sorts the .gitignore file in-place, removing any duplicate lines.

Specify a collation locale like LC_COLLATE=en_NZ.utf-8 to ensure a consistent sort order across systems.
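
A sketch of both points. The C locale stands in for en_NZ.utf-8 only because it is always installed, and LC_ALL is used because it overrides LC_COLLATE if the environment already sets it. File content and variable names are made up.

```shell
# In the C locale, lines sort by byte value: uppercase before lowercase.
sorted_c="$(printf '%s\n' b A a B | LC_ALL=C sort)"

# In-place sort with duplicates removed. --output makes it safe to
# write to the same file that is being read.
ignore_file="$(mktemp)"
printf '%s\n' dist build dist > "$ignore_file"
LC_ALL=C sort --output="$ignore_file" --unique "$ignore_file"

printf '%s\n' "$sorted_c"
cat "$ignore_file"
```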

uniq deals with line uniqueness

Beware: "unique" here means "different from the previous line", not globally unique. Compare with sort --unique, which first needs to sort the file.

Examples:

  • uniq --count prefixes lines with the number of occurrences
  • uniq --skip-fields=2 avoids comparing the first two (whitespace-separated) fields
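
A sketch of the "previous line" behaviour, using a made-up five-line sample.

```shell
words="$(printf '%s\n' b a b a a)"

# Only adjacent duplicates collapse: the two trailing "a" lines.
adjacent="$(uniq <<< "$words")"

# Sorting first makes all duplicates adjacent.
global="$(sort <<< "$words" | uniq)"

# Count occurrences, most frequent first.
counts="$(sort <<< "$words" | uniq --count | sort --numeric-sort --reverse)"

printf '%s\n' "$counts"
```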

wc [FILE…] is the word counter

Can also count --lines, --chars.

bc [FILE…] is a basic calculator

Runs as a REPL without a file.

Much more advanced than Bash's integer-only 64-bit arithmetic.

Example: bc --mathlib <<< '4*a(1)' gives you π (4 × arctan(1)).

jq FILTER [FILE…] extracts, modifies, and creates JSON

Interactive examples at jqplay.org.

xsltproc STYLESHEET [FILE…] extracts, modifies, and creates XML using XSLT

Examples are too big, sorry! 😞

GNU parallel runs commands in parallel

Example: parallel git -C {} fetch ::: ~/*/.git/.. runs git fetch in every Git repository directly inside the home directory.

tee prints input and saves it to a file

Typically used to see and record program output at the same time: grep '^error: .*' my.log | tee errors.log
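
A sketch of tee in the middle of a pipeline, with a made-up log file standing in for my.log.

```shell
tee_dir="$(mktemp -d)"
printf '%s\n' 'info: ok' 'error: bad' > "$tee_dir/my.log"

# tee copies standard input to the file and to standard output, so the
# matches are both saved and still available to the rest of the pipeline.
match_count="$(grep '^error: .*' "$tee_dir/my.log" | tee "$tee_dir/errors.log" | wc -l)"

printf 'matches: %s\n' "$match_count"
```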

script records a shell session in a file called typescript

Read with less --raw-control-chars typescript.

shellcheck [FILE…] reports lints in shell scripts

Excellent self-learning tool. Playground at shellcheck.net.

shfmt FILE… auto-formats shell scripts

screen allows you to pause and resume a shell session

Handy if you don't want to keep a connection alive while running a long process remotely.

Does not survive a server restart.

bash is the shell itself

bash --noprofile --norc -o xtrace starts a shell with very basic configuration (--noprofile --norc) and debug printing of commands (-o xtrace).

Safety

Bash parses commands in surprising ways

  • Quote variable references
  • Use the simplest syntax possible

Filenames can be weird

Quote variable references.
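
A sketch of what goes wrong without quotes, assuming the default IFS. The file name and the count_args helper are made up for illustration.

```shell
# A perfectly legal file name with a space in it.
file='my report.txt'

# Hypothetical helper that reports how many arguments it received.
count_args() {
    printf '%s\n' "$#"
}

# Unquoted: the shell splits the value at the space.
# shellcheck disable=SC2086  # Unquoted on purpose, to show the problem.
unquoted_count="$(count_args $file)"

# Quoted: the value stays one word, whatever it contains.
quoted_count="$(count_args "$file")"

printf 'unquoted: %s, quoted: %s\n' "$unquoted_count" "$quoted_count"
```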

Use arrays for lists

Concatenating strings runs into issues with whitespace and more.
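
A sketch of the difference, assuming the default IFS. The file names and variable names are made up.

```shell
# Fragile: one string holding two file names. Splitting it back into
# words cannot tell separators from spaces inside a name.
files_string='first file.txt second.txt'
string_items=0
# shellcheck disable=SC2086  # Unquoted on purpose, to show the problem.
for item in $files_string; do
    string_items=$((string_items + 1))
done

# Robust: an array keeps each element intact, even when appending.
files_array=('first file.txt')
files_array+=('second.txt')
array_items=0
for item in "${files_array[@]}"; do
    array_items=$((array_items + 1))
done

printf 'string: %s items, array: %s items\n' "$string_items" "$array_items"
```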

Use safety settings for strict execution

set -o errexit -o nounset -o pipefail
shopt -s failglob inherit_errexit
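
A sketch of what the three set options change. Each check runs in its own child bash so that a failure there does not stop the surrounding script; the variable names are made up.

```shell
# nounset: referencing an unset variable is an error, not an empty string.
if bash -o nounset -c 'printf "%s\n" "$no_such_variable"' 2> /dev/null; then
    nounset_result='passed'
else
    nounset_result='failed'
fi

# pipefail: a pipeline fails if any command in it fails, not just the last.
if bash -o pipefail -c 'false | true'; then
    pipefail_result='passed'
else
    pipefail_result='failed'
fi

# errexit: the shell exits at the first failing command, so the printf
# after false never runs.
errexit_output="$(bash -o errexit -c 'false; printf "still running\n"' || true)"

printf 'nounset: %s, pipefail: %s, errexit output: [%s]\n' \
    "$nounset_result" "$pipefail_result" "$errexit_output"
```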

Resources

Thank you!