The Shell

What is the shell

The shell is a programming environment, just like Python or Ruby, and so it has variables, conditionals, loops, and functions.

missing:~$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
missing:~$ which echo
/bin/echo
missing:~$ /bin/echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

We can find out which file is executed for a given program name using the which program. We can also bypass $PATH entirely by giving the path to the file we want to execute.
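To make the lookup concrete, here is a rough sketch of what resolving a name against $PATH involves. The function name find_in_path is made up for this illustration, and the real shell also considers builtins, aliases, and functions, which this sketch ignores.

```shell
# find_in_path is a hypothetical helper: it walks the directories listed in
# $PATH and prints the first one that holds an executable with the given name.
find_in_path () {
    local name="$1" dir
    local IFS=:                     # split $PATH on colons
    for dir in $PATH; do
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    return 1                        # no directory in $PATH has it
}

find_in_path ls
```

The printed path depends on the machine; on many systems it is /bin/ls or /usr/bin/ls.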

Connecting programs

The simplest form of redirection is < file and > file. These let you rewire the input and output streams of a program to a file respectively:

missing:~$ echo hello > hello.txt
missing:~$ cat hello.txt
hello
missing:~$ cat < hello.txt
hello
missing:~$ cat < hello.txt > hello2.txt
missing:~$ cat hello2.txt
hello

cat is a program that concatenates files. When given file names as arguments, it prints the contents of each of the files in sequence to its output stream.

You can also use >> to append to a file. Where this kind of input/output redirection really shines is in the use of pipes. The | operator lets you “chain” programs such that the output of one is the input of another:

missing:~$ ls -l / | tail -n1
drwxr-xr-x 1 root root 4096 Jun 20 2019 var
missing:~$ curl --head --silent google.com | grep --ignore-case content-length | cut --delimiter=' ' -f2
219
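As a small sketch of appending and piping together, the snippet below writes three lines to a file in a throwaway directory and then pulls out the last one. The file name log.txt and the variable names are arbitrary choices for the example.

```shell
# Work in a throwaway directory so no real files are touched.
tmpdir=$(mktemp -d)

echo first  >  "$tmpdir/log.txt"    # > creates (or truncates) the file
echo second >> "$tmpdir/log.txt"    # >> appends to it
echo third  >> "$tmpdir/log.txt"

# Pipe: the output of tail becomes the input of tr.
last_upper=$(tail -n1 < "$tmpdir/log.txt" | tr '[:lower:]' '[:upper:]')
echo "$last_upper"
# prints THIRD
```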

A versatile and powerful tool

  • sudo
    As its name implies, it lets you “do” something “as su” (short for “super user”, or “root”). For example:
    $ echo 3 | sudo tee brightness

Shell Script

variables and functions

To assign variables in bash, use the syntax foo=bar and access the value of the variable with $foo. Note that foo = bar will not work, since it is interpreted as calling the foo program with the arguments = and bar.
Strings in bash can be defined with ' and " delimiters, but they are not equivalent. Strings delimited with ' are literal strings and will not substitute variable values whereas " delimited strings will.

foo=bar
echo "$foo"
# prints bar
echo '$foo'
# prints $foo
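The two quoting styles can be mixed. In this short sketch (the variable names are arbitrary), a value defined with single quotes keeps its literal $USER even when it is later substituted inside double quotes, because substitution happens only once.

```shell
greeting=hello                  # no spaces around =
name='World $USER'              # single quotes: $USER stays literal
message="$greeting, $name"      # double quotes: $greeting and $name are substituted
echo "$message"
# prints hello, World $USER
```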

As with most programming languages, bash supports control flow techniques including if, case, while and for. Similarly, bash has functions that take arguments and can operate on them. Here is an example of a function that creates a directory and cds into it.

mcd () {
    mkdir -p "$1"
    cd "$1"
}

special variables

Here $1 is the first argument to the script/function. Unlike other scripting languages, bash uses a variety of special variables to refer to arguments, error codes, and other relevant variables. Below is a list of some of them. A more comprehensive list can be found here.

$0 - Name of the script
$1 to $9 - Arguments to the script. $1 is the first argument and so on.
$@ - All the arguments
$# - Number of arguments
$? - Return code of the previous command
$$ - Process identification number (PID) for the current script
!! - Entire last command, including arguments. A common pattern is to execute a command only for it to fail due to missing permissions; you can quickly re-execute the command with sudo by doing sudo !!
$_ - Last argument from the last command. If you are in an interactive shell, you can also quickly get this value by typing Esc followed by . or Alt+.
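A few of these variables can be seen in action with a small function. The name show_args is hypothetical; inside a function, $1, $@ and $# refer to the function's own arguments rather than the script's.

```shell
# show_args is a hypothetical function used only for illustration.
show_args () {
    echo "count: $#"
    echo "first: $1"
    for arg in "$@"; do         # "$@" keeps each argument as one word
        echo "arg: $arg"
    done
}

show_args "one two" three
# prints:
# count: 2
# first: one two
# arg: one two
# arg: three

# $? holds the return code of the previous command; capture it right away,
# because the next command overwrites it.
false || status=$?
echo "return code of false: $status"
# prints return code of false: 1
```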

Commands will often return output using STDOUT, errors through STDERR, and a Return Code to report errors in a more script-friendly manner. The return code or exit status is the way scripts/commands have to communicate how execution went. A value of 0 usually means everything went OK; anything different from 0 means an error occurred.

Exit codes can be used to conditionally execute commands using && (and operator) and || (or operator), both of which are short-circuiting operators. Commands can also be separated within the same line using a semicolon (;). The true program will always have a 0 return code and the false command will always have a 1 return code. Let’s see some examples:

false || echo "Oops, fail"
# Oops, fail

true || echo "Will not be printed"
#

true && echo "Things went well"
# Things went well

false && echo "Will not be printed"
#

true ; echo "This will always run"
# This will always run

false ; echo "This will always run"
# This will always run

Another common pattern is capturing the output of a command in a variable. This can be done with command substitution: whenever you write $( CMD ), the shell executes CMD, takes its output, and substitutes it in place. For example, if you do for file in $(ls), the shell will first call ls and then iterate over those values.
A lesser-known, similar feature is process substitution: <( CMD ) executes CMD, places the output in a temporary file, and substitutes the <() with that file’s name. This is useful when commands expect values to be passed by file instead of by STDIN. For example, diff <(ls foo) <(ls bar) will show differences between files in dirs foo and bar.
Let’s see an example that showcases some of these features. It iterates through the arguments we provide, greps each file for the string foobar, and appends it to the file as a comment if it’s not found:

#!/bin/bash

echo "Starting program at $(date)" # Date will be substituted

echo "Running program $0 with $# arguments with pid $$"

for file in "$@"; do
    grep foobar "$file" > /dev/null 2> /dev/null
    # When pattern is not found, grep has exit status 1
    # We redirect STDOUT and STDERR to a null register since we do not care about them
    if [[ $? -ne 0 ]]; then
        echo "File $file does not have any foobar, adding one"
        echo "# foobar" >> "$file"
    fi
done

[[ ]] is one way to write a comparison. See the manual page for test for details.
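As a brief sketch of other comparisons that can go inside [[ ]], the function below classifies a path. The name describe is made up for this example; -z, -d, -f, -gt and -lt are operators documented in the test manual page.

```shell
#!/usr/bin/env bash
# describe is a hypothetical function used only for this illustration.
describe () {
    local path="$1"
    if [[ -z "$path" ]]; then       # -z: string is empty
        echo "empty"
    elif [[ -d "$path" ]]; then     # -d: path is a directory
        echo "directory"
    elif [[ -f "$path" ]]; then     # -f: path is a regular file
        echo "file"
    else
        echo "missing"
    fi
}

describe /
# prints directory
describe ""
# prints empty

count=3
# -gt and -lt compare integers; && combines conditions inside [[ ]]
if [[ "$count" -gt 2 && "$count" -lt 10 ]]; then
    echo "count is between 3 and 9"
fi
# prints count is between 3 and 9
```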

shell globbing

  • Wildcards - Whenever you want to perform some sort of wildcard matching, you can use ? and * to match one or any amount of characters respectively. For instance, given files foo, foo1, foo2, foo10 and bar, the command rm foo? will delete foo1 and foo2 whereas rm foo* will delete all but bar.

  • Curly braces {} - Whenever you have a common substring in a series of commands, you can use curly braces for bash to expand this automatically. This comes in very handy when moving or converting files.

    convert image.{png,jpg}
    # Will expand to
    convert image.png image.jpg

    cp /path/to/project/{foo,bar,baz}.sh /newpath
    # Will expand to
    cp /path/to/project/foo.sh /path/to/project/bar.sh /path/to/project/baz.sh /newpath

    # Globbing techniques can also be combined
    mv *{.py,.sh} folder
    # Will move all *.py and *.sh files


    mkdir foo bar
    # This creates files foo/a, foo/b, ... foo/h, bar/a, bar/b, ... bar/h
    touch {foo,bar}/{a..h}
    touch foo/x bar/y
    # Show differences between files in foo and bar
    diff <(ls foo) <(ls bar)
    # Outputs
    # < x
    # ---
    # > y
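Because the shell expands a pattern before the command runs, echo is a safe way to preview what a command such as rm foo? would receive. The sketch below recreates the files from the wildcard example in a throwaway directory; the variable names are arbitrary.

```shell
tmpdir=$(mktemp -d)
touch "$tmpdir/foo" "$tmpdir/foo1" "$tmpdir/foo2" "$tmpdir/foo10" "$tmpdir/bar"

# $( ... ) runs in a subshell, so the cd does not affect the current shell.
one_char=$(cd "$tmpdir" && echo foo?)
any_chars=$(cd "$tmpdir" && echo foo*)

echo "$one_char"
# prints foo1 foo2
echo "$any_chars"
# prints foo foo1 foo10 foo2
```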

    Some differences between shell functions and scripts

  • Functions have to be in the same language as the shell, while scripts can be written in any language. This is why including a shebang for scripts is important.

  • Functions are loaded once when their definition is read. Scripts are loaded every time they are executed. This makes functions slightly faster to load, but whenever you change them you will have to reload their definition.

  • Functions are executed in the current shell environment whereas scripts execute in their own process. Thus, functions can modify environment variables, e.g. change your current directory, whereas scripts can’t. Scripts receive, by value, the environment variables that have been exported using export.

  • As with any programming language, functions are a powerful construct to achieve modularity, code reuse, and clarity of shell code. Often shell scripts will include their own function definitions.
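The process difference can be observed directly. In this sketch, bash -c stands in for running a separate script, and the names go_tmp, local_only and shared are made up for the example; it also assumes /tmp exists.

```shell
start_dir=$PWD

# A function runs in the current shell, so its cd persists.
go_tmp () {
    cd /tmp
}
go_tmp
after_function=$PWD             # now /tmp

cd "$start_dir"

# A script runs in its own process; its cd is gone when it exits.
bash -c 'cd /tmp'
after_script=$PWD               # still the starting directory

# Only exported variables are passed on to the child process.
local_only=1
export shared=2
bash -c 'echo "local_only=${local_only:-unset} shared=${shared:-unset}"'
# prints local_only=unset shared=2
```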

Finding how to use commands

  1. -h or --help: pass this flag to the command itself for a short usage summary
  2. man: the manual page, with more detail
  3. tldr: short, example-driven use cases

Finding files

  • find
    All UNIX-like systems come packaged with find, a great shell tool to find files. find will recursively search for files matching some criteria. Some examples:

    # Find all directories named src
    find . -name src -type d
    # Find all python files that have a folder named test in their path
    find . -path '*/test/*.py' -type f
    # Find all files modified in the last day
    find . -mtime -1
    # Find all tar.gz files with size in range 500k to 10M
    find . -size +500k -size -10M -name '*.tar.gz'

    Beyond listing files, find can also perform actions over files that match your query. This property can be incredibly helpful to simplify what could be fairly monotonous tasks.

    # Delete all files with .tmp extension
    find . -name '*.tmp' -exec rm {} \;
    # Find all PNG files and convert them to JPG
    find . -name '*.png' -exec convert {} {}.jpg \;
  • fd
    fd is a simple, fast, and user-friendly alternative to find. It offers some nice defaults like colorized output, default regex matching, and Unicode support. It also has, in my opinion, a more intuitive syntax. For example, the syntax to find a pattern PATTERN is fd PATTERN.

  • locate
    Faster than find, since it searches a prebuilt index rather than the file system itself, but it matches on file name only. The differences between locate and find are described here.

search based on file content

  • grep
  • ripgrep(rg)
    # Find all python files where I used the requests library
    rg -t py 'import requests'
    # Find all files (including hidden files) without a shebang line
    rg -uu --files-without-match "^#\!"
    # Find all matches of foo and print the following 5 lines
    rg foo -A 5
    # Print statistics of matches (# of matched lines and files )
    rg --stats PATTERN
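For comparison, a minimal grep sketch along the same lines. It builds two sample files in a throwaway directory (the file names and contents are invented for the example) and assumes a GNU or BSD grep, both of which support -r.

```shell
tmpdir=$(mktemp -d)
printf 'import requests\nprint("hi")\n' > "$tmpdir/a.py"
printf 'import os\n' > "$tmpdir/b.py"

# -r: recurse into directories, -l: list only the names of matching files
matches=$(grep -rl 'import requests' "$tmpdir")
echo "$matches"
# prints the path of a.py inside the temporary directory

# -c: count matching lines, -i: ignore case
count=$(grep -ci 'IMPORT' "$tmpdir/a.py")
echo "$count"
# prints 1
```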