Here are some portable Bourne shell idioms that I find useful to remember for scripting. The Bourne shell does much more than most users realize, and the ksh and bash extensions are rarely essential. (From the command-line, bash and ksh are vastly more useful.)

My favorite reference book is "Portable Shell Programming --- An Extensive Collection of Bourne Shell Examples" by Bruce Blinn from Prentice Hall.

Get information on a built-in bash command with ``help''. It's much easier than reading the full bash man page at http://www.gnu.org/software/bash/manual/bash.html

For Bash suggestions, I recommend this bash FAQ: http://mywiki.wooledge.org/BashFAQ/

* Text filtering commands *

Administration from scripts and the command-line often benefits from pipes of text filtering commands. Here are some that are easy to overlook or forget.

--
o ``mmencode'' converts to and from base64 and "quoted-printable" formats for email. Search for the ``metamail'' package. Unfortunately, this has become hard to find. Alternatively ``uuencode -m'' converts to base64, and ``uudecode -m'' converts from base64. Or decode and encode quoted-printable and base64 with
=>
perl -pe 'use MIME::QuotedPrint; $_=MIME::QuotedPrint::decode($_);'
perl -pe 'use MIME::QuotedPrint; $_=MIME::QuotedPrint::encode($_);'
perl -pe 'use MIME::Base64; $_=MIME::Base64::encode($_);'
perl -pe 'use MIME::Base64; $_=MIME::Base64::decode($_);'
<=
URL-encode a string with
=>
perl -ne 'chomp; s/([^-_.~A-Za-z0-9])/sprintf("%%%02X", ord($1))/seg; print "$_\n"'
<=
o Convert non-ASCII utf-8 characters to numeric character references for html, and back (the reverse direction accepts both decimal and hexadecimal references):
=>
perl -C -pe 's/([^\x00-\x7f])/sprintf("&#%d;", ord($1))/ge;'
perl -C -pe 's/&\#(\d+);/chr($1)/ge;s/&\#x([a-fA-F\d]+);/chr(hex($1))/ge;'
<=
o ``uniq'' lets you remove duplicated lines from a sorted file.
o Count the number of times a given line occurs with either of these. The second sorts by descending count, then by the text of the line.
=>
sort | uniq -c | sort -n
sort | uniq -c | sort -k1,1nr -k2
<=
o Break one word per line with
=>
perl -pe 's/\s+/\n/g'
<=
o Combine separate lines into a single line of words with
=>
paste -s -d" "
<=
o Add up numbers that arrive one per line
=>
paste -s -d+ | bc
<=
o ``comm'' compares two sorted files and lets you suppress the lines unique to either file or the lines common to both.
o ``cat -s'' never prints more than one blank line in a row.
o Remove all blank lines with
=>
perl -ne 'print if /\S/'
<=
o Print lines starting with one containing FOO and ending with one containing BAR.
=>
sed -n '/FOO/,/BAR/p'
<=
o Print lines other than those starting with one containing FOO and ending with one containing BAR.
=>
sed -n '/FOO/,/BAR/!p'
<=
o ``diff3 -m'' merges changes in files edited from a common ancestor.
o ``fold'' breaks lines to a given width, and ``fmt'' will reformat lines into paragraphs.
o ``dirname'' and ``basename'' let you extract the directory and filename from a full path to a file.
o ``namei'' breaks a pathname into pieces and follows symbolic links.
o ``expand'' and ``col -x'' replace tabs by spaces.
o ``col -b'' removes backspaces from a file.
o ``cat -v'' shows non-printing characters as ascii escapes.
o ``sed '1,10d' '' deletes the first 10 lines.
o ``sed -n '3p' '' and ``sed -n '3{p;q}' '' both print the third line, but the latter is more efficient because it stops reading after that line.
o ``sed '/foo/q' '' truncates a file after the first line containing ``foo''.
o ``sed -ne '/foo/,/bar/p' '' prints everything from the line containing ``foo'' to the line containing ``bar''.
o Align space-delimited fields into orderly columns with ``column -t''.
o Right justify queries with
=>
printf "%40s" "Do you want to delete? [y/N] "
<=
o Convert dos text files to unix, and vice versa:
=>
dos2unix file.txt
unix2dos file.txt
tr -d \\r < win.txt > unix.txt           # if you can't find dos2unix
sed -e 's/$/\r/' < unix.txt > win.txt    # if you can't find unix2dos (needs GNU sed)
<=
o ``cat -n'' and ``nl'' number lines.
o Both of these perform string substitution, but the latter allows more general regular expressions:
=>
sed -e 's/oldtext/newtext/g'
perl -pe 's/oldtext/newtext/g'
<=
Here's how to replace straight double quotes with opening and closing quotes for TeX:
=>
< in.tex perl -pne 's%\B"\b%``%g' | perl -pne "s%\b\"\B%''%g" > out.tex
<=
o Use ``iconv'' to convert between character encodings.
o Here are two ways to find string patterns (regular expressions) in a file:
=>
grep 'pattern' filename [file] [< file]
perl -ne 'print if /pattern/' [file] [< file]
<=
o Print the first and third columns of each line:
=>
awk '{print $1,$3}'
perl -lane 'print "$F[0] $F[2]"'
while read a b c d ; do echo "$a $c" ; done
<=
o Convert to lower-case:
=>
tr '[A-Z]' '[a-z]'
tr '[:upper:]' '[:lower:]'
perl -pe 'tr/[A-Z]/[a-z]/'
perl -pe '$_ = lc'
<=
o Simple character substitutions and deletions may be simplest with ``tr''.
=>
tr -d '\r'      # delete carriage returns
tr '\n' '\0'    # replace newlines by null characters
<=
=>
$ echo 1-2a-3b | tr "[1-9]" "[2-9]" | tr '-' '_' | tr -d 'a'
2_3_4b
<=
o You can pipe into a loop with ``read -r''. Here is a complicated way to cat a text file, piping in and out of a loop.
=>
cat file | while read -r a; do echo "$a" ; done | cat
<=
o To read lines in pairs from two files try
=>
paste file1 file2 | while read -r a b ; do echo "$a $b" ; done
<=
o Divide words one per line, then sum them as numbers:
=>
$ echo 1 2 3.1 | perl -pe 's/\s+/\n/g' | perl -e '$s=0; while (<>) {$s += $_;} ; print "$s\n";'
6.1
<=
o Reverse the order of lines with ``tac'', and the characters within each line with ``rev''.
o Sort a list of dependencies with ``tsort''.
o Shuffle lines randomly with ``shuf''.
Generate shuffled integers with
=>
$ shuf -i1-100 -n3
93
57
71
<=
o Generate random lottery numbers between 1 and 292201338:
=>
$ echo "($RANDOM + 32768*($RANDOM + 32768*$RANDOM)) % 292201338 + 1" | bc
130237776
<=
_

* Files and directories *

--
o Select text (non-binary) files with one of these
=>
\ls | perl -lne 'print if -T'
perl -le 'for (glob "*") {print if -T }'
perl -le 'print for grep -T, <*>'
<=
The perl algorithm for detecting text files is very good.
o To do something to files with goofy names, including spaces and dashes, delimit the files with null characters instead of whitespace or newlines.
=>
find . -type f -print0 | xargs -r0 ls
<=
Or read one line at a time:
=>
cd "$dir1" && find . -type f | while read -r f ; do
  if [ ! -f "$dir2/$f" ] ; then
    echo "$f is in $dir1 but not in $dir2"
  fi
done
<=
o See if a directory contains any files, including broken links.
=>
has_files() { set -- "$1"/.[!.]* "$1"/*; test -e "$1" || test -e "$2" || test -L "$1" || test -L "$2"; }
if has_files "${dir}" ; then echo "${dir} has files" ; else echo "${dir} is empty" ; fi
<=
See if files of a certain type exist:
=>
if [ "`printf '%s' *.par`" != '*.par' ] ; then echo "has par files" ; fi
[or]
test "`printf '%s' *.par`" != '*.par' && echo "has pars" || echo "no pars"
<=
o ``readlink -f'' will fully resolve what a symbolic link points to. It still succeeds when only the last component of the target is missing, so use GNU ``readlink -e'', which requires every component to exist, to find all bad symbolic links:
=>
find . -type l | while read -r f ; do
  if ! readlink -e "$f" >/dev/null 2>&1
  then echo "$f" ; fi ; done
<=
o To see the canonical path for the current directory, you can use either of these:
=>
readlink -f .
pwd -P
<=
_

* Variables *

--
o To see if a variable matches a regular expression, combine ``if'' and ``grep''. For example to see if the name of a file begins with a dot, try
=>
if echo "$filename" | grep '^[.]' >/dev/null
then echo yes ; else echo no ; fi
<=
``expr'' also has support for limited regular expressions.
=>
if [ `expr "$filename" : '[.].*'` -ne 0 ]
then echo yes ; else echo no ; fi
<=
o Use ``read -r'' to avoid tokenizing filenames with spaces. Here's how to find all files whose names contain a space, and replace each run of spaces by an underscore. (Remove the leading ``echo'' once the printed commands look right.)
=>
find . -name '* *' | while read -r f ; do
  echo mv "$f" "`echo "$f" | sed 's/[ ][ ]*/_/g'`"
done
<=
o For simple integer arithmetic use ``expr'':
=>
N=`expr "$N" + 3`
<=
o For arbitrary-precision floating-point math, use ``bc -l''
=>
# Get pi to 10 places with arctangent (bc man page)
PI=`echo "scale=10; 4*a(1)" | bc -l`
# Expensive calculation of zero (Craig Artley):
ZERO=`echo "c($PI/4)-sqrt(2)/2" | bc -l`
<=
o ``seq 1 100'' generates all integers between 1 and 100. To iterate a loop 100 times, try
=>
for i in `seq 1 100` ; do ... ; done
<=
o You can set the environment of a subprocess by defining a variable on the same line. The current shell is not affected. In the second example the shell expands ``$x'' before the temporary assignment takes effect, so the old value is printed.
=>
$ x=doggie sh -c 'echo x=$x'
x=doggie
$ x=pig ; x=doggie echo x=$x
x=pig
<=
o Test that a string has non-zero length with
=>
if [ -n "$string" ] ; then echo "not empty" ; fi
<=
The ``-n'' is actually the default for a string expression, so you can omit it:
=>
if [ "$string" ] ; then echo "not empty" ; fi
<=
o There are several good ways to set default values for environment variables. Many do this
=>
if [ ! "$VARIABLE" ] ; then VARIABLE="default value" ; fi
export VARIABLE
<=
A simple alternative is
=>
: ${VARIABLE:="default value"}
export VARIABLE
<=
The colon at the beginning of the line is necessary as a no-op that allows its arguments to be evaluated.
o Rarely you may want to accept a variable defined as an empty string. If so, then omit the colon before the equals sign when setting the default.
=>
: ${VARIABLE="default value"}
export VARIABLE
<=
To test whether a variable is defined, even if empty, test
=>
if [ "${VARIABLE+x}" ] ; then echo DEFINED ; fi
<=
o In bash, echo the names of all variables starting with X with ``echo ${!X*}''.
o To check whether a series of variables are defined, try
=>
for V in JAVA_HOME SSH_AGENT_PID TEXMFDIR NETHACKOPTIONS ; do
  eval v="\$$V"
  if [ ! "$v" ] ; then echo "You must define $V" ; fi
done
<=
_

* Running commands *

--
o Use ``"$@"'' when passing command-line arguments unaltered to subprocesses. This is equivalent to passing ``"$1" "$2" ...'', but the first version works properly for no arguments.
o Test the processing of arguments, like this
=>
$ set a 'b c' d
$ for i in "$@" ; do echo "|$i|" ; done
|a|
|b c|
|d|
$ for i in "$*" ; do echo "|$i|" ; done
|a b c d|
$ for i in $* ; do echo "|$i|" ; done
|a|
|b|
|c|
|d|
<=
o See what runtime options you may have set with these
=>
set -o; bind -p; shopt -p; stty -a
<=
For example, bash edits command lines in emacs mode by default. Change to vi with
=>
set -o vi
<=
In emacs mode, you can edit your command in the editor named by your ``$EDITOR'' environment variable with ``ctrl-x ctrl-e''. In vi mode, use ``esc-v''. See ``help fc'' for more.
o Repeat the last argument of the previous command with ``!$''. Repeat all arguments without the command with ``!*''.
o To guarantee that a background process outlives the current shell, add extra parentheses like this:
=>
( command & )
<=
Otherwise, when your current shell exits (for example, when you leave X or an ssh session), it may terminate all processes that have the shell as their parent. The extra parentheses start a subshell that exits as soon as the command is spawned in the background. The background process is then reparented to process ID 1. This is a command-line version of the "double fork."
o Repeat until a command succeeds:
=>
while ! cvs -z 3 -q update -dPA ; do echo -n . ; sleep 60 ; done
<=
o Make a progress bar (loop while waiting on a process)
=>
sleep 10 &
while ps -p $! >/dev/null; do echo -n . \
; sleep 1 ; done ; echo
or
while pidof mozilla-bin > /dev/null ; do echo -n . ; sleep 1 ; done ; echo
<=
``pgrep -f'' or ``killall -0'' are alternatives to ``pidof'' for this purpose.
_

* Manipulating paths *

--
o Loop over the elements of a PATH by tokenizing with the character ':'. This leaves ``IFS'' changed, so restore it afterward if later commands rely on the default word splitting.
=>
IFS=':' ; for dir in $PATH ; do echo "$dir" ; done
<=
o Check for the existence of an executable version of a command in your PATH:
=>
checkPath() { IFS=':' ; for dir in $PATH ; do if [ -x "$dir/$1" ] ; then return 0; fi ; done; return 1; }
if checkPath commandName ; then ... ; fi
<=
o Here is my preferred way to modify a PATH
=>
# Arguments currentpath newelement [after]
# addtopath a:b c -> c:a:b
# addtopath a:b c after -> a:b:c
# addtopath a:b a -> a:b
addtopath () {
  P=$1
  E=$2
  O=$3
  if [ ! "$P" ] ; then
    P="$E"
  elif ! echo "$P" | grep -E "(^|:)$E($|:)" >/dev/null ; then
    if [ "$O" = "after" ] ; then
      P="$P:$E"
    else
      P="$E:$P"
    fi
  fi
  echo "$P"
}
# example
PATH=`addtopath "$PATH" /usr/local/bin after`
<=
_

* Common script chores *

--
o Debug the script with ``set -x''.
o Make a script exit immediately after any failed command with ``set -e''.
o Process flags in a script. Loop on the remaining argument count rather than over ``"$@"'', because each ``shift'' changes the positional parameters during the loop:
=>
while [ $# -gt 0 ] ; do
  case "$1" in
    -a) FLAG_A=1
        shift ;;
    -b) FLAG_B="$2"
        shift ; shift ;;
    --) shift ; break ;;
    *)  break ;;
  esac
done
<=
o Print help from a script:
=>
if [ $# -lt 1 -o "$1" = "-h" -o "$1" = "-help" -o "$1" = "--help" ] ; then
cat <<-END
Usage: `basename $0` [-flag] arg1 [arg2]
More information.
END
exit
fi
<=
o Handle errors with functions:
Often an error exit is handled most cleanly with a function.
=>
print_usage_and_exit() {
cat <<-END
Usage: `basename $0` arg1 arg2 [arg3]
The first two arguments are required.
END
exit 1
}
if [ $# -lt 2 ] ; then
  print_usage_and_exit
fi
<=
o Here's a robust way to locate the directory containing a script, following symbolic links. (Taken from the launch script of ``FindBugs''.)
=>
program="$0"
while [ -h "$program" ]; do
  link=`ls -ld "$program"`
  link=`expr "$link" : '.*-> \(.*\)'`
  if [ "`expr "$link" : '/.*'`" = 0 ]; then
    dir=`dirname "$program"`
    program="$dir/$link"
  else
    program="$link"
  fi
done
script_directory=`dirname "$program"`
script_directory=`cd "$script_directory" && /bin/pwd`
<=
o Trapping signals to stop scripts:
Ever try to interrupt your script, then discover that it killed only one command and continued to the next? Force a complete exit by adding the following line early in your script.
=>
trap "exit 1" 1 2 3 15
<=
You can also trap normal and error exits:
=>
# force script to exit when any command fails
set -e
# Trap on any exit
trap "echo Always called before exit" 0
# Trap on error exit only (the ERR trap is a bash and ksh extension)
trap "echo Error exit was called " ERR
echo "Next command will fail"
# Returns error code of 1
false
echo "Will not see this comment"
<=
o Process ID's
Get the process ID of the current shell as ``$$'', of the parent process with ``$PPID'', and of the most recently backgrounded child process with ``$!''. Interactively, you can see child PID's with ``jobs -p''.
o Here's how to ask a yes or no question, with a default of no. It checks whether the first letter is a y or Y and ignores leading spaces.
=>
echo -n "Do you want to continue? [y/N]: "
read answer
if expr "$answer" : ' *[yY].*' > /dev/null; then
  echo Continuing
else
  echo Quitting
  exit
fi
<=
o Here's how to ask for a password without echoing the characters. The trapping ensures that an interrupt does not leave the echoing off.
=>
stty -echo
trap "stty echo ; echo 'Interrupted' ; exit 1" 1 2 3 15
echo -n "Enter password: "
read password
echo "Your password is \"$password\""
stty echo
<=
Gnome and other frameworks often allow simple scripting of GUIs:
=>
password=`zenity --entry --text "Enter password:"`
<=
_

* File descriptors *

--
o Redirecting output file descriptors
Here are common ways to capture the standard output and standard error of a single command in a log file:
=>
command >file.log 2>&1
command 2>&1 | tee file.log
<=
o If you have a script with many commands, you can have them all write to the same log file by default:
=>
# save default standard output in file descriptor 10
exec 10>&1
# redirect standard output to a log file.
exec >file.log
# redirect standard error to same log file
exec 2>&1
# close stdin
exec 0<&-
# This command will write to log file
command
# echo to default standard output instead of log file
echo "Visible message" 1>&10
<=
Avoid file descriptor 5, which bash already uses. (``ulimit -n'' should show many available file descriptors.) Two-digit descriptors such as 10 work in bash and ksh, but a strictly POSIX shell guarantees only descriptors 0 through 9.
o Avoid writing to stdout if it is not connected to a terminal:
=>
test -t 1 && echo "Connected to a terminal"
<=
o Open a socket
In bash, associate a file descriptor, say 4, with a socket, and close it again, with
=>
exec 4<>/dev/tcp/$hostname/$port
exec 4<&-
<=
A more portable solution is to use ``nc''. Listen on a port with
=>
nc -l -p 3535
<=
Connect to a remote host port like
=>
echo 'GET /' | nc hostname 80
<=
An even more general utility is ``socat'', which also handles Unix sockets.
_
o Hostname lookups on linux
General utilities are ``dig'', ``nslookup'', ``host'', ``hostname''. Get an IP address for a specific hostname:
=>
host samplehostname | sed 's/.* //'
<=
Get a hostname for an IP address:
=>
nslookup 123.123.123.123 | grep 'name = ' | sed 's/.*name = //'
<=
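
The ``sed 's/.* //' '' filter keeps the last word of every line, so it also passes through IPv6 and mail-handler lines when ``host'' prints them. Here is a minimal sketch of a stricter filter. It assumes the usual "NAME has address ADDR" line format of the BIND ``host'' utility, and the function name ``ipv4_from_host'' is made up for this example.

```shell
# ipv4_from_host: read `host` output on stdin and print only IPv4 addresses.
# Assumption: address lines look like "NAME has address A.B.C.D".
# Lines such as "NAME has IPv6 address ..." or "NAME mail is handled by ..."
# have a different third field and are skipped.
ipv4_from_host() {
    awk '$2 == "has" && $3 == "address" { print $4 }'
}

# Typical use (needs network access):
#   host samplehostname | ipv4_from_host
```

Reading from standard input keeps the filter independent of the network, so you can try it on saved output before using it in a script.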