September 10th, 2007
Moving Up on Mini-Scripting
After my previous post on ad hoc scripting I had an example come up in my own work. I needed to search through some web logs for all hits in August to three servers from a pair of IP addresses. Below is the actual command line script I used, except I changed some identifying information and extended the script from a single line to multiple lines for clarity. This script reduced the logs to a file that records which web server was hit, which IP address the request came from, and the date and time of the request:
let D=1; while (( D <= 31 )); do
  DD=$(printf "%02d" $D)
  for M in server-01 server-02 server-03 ; do
    gzip -cd $M/weblog.200708$DD.gz | \
      perl -anle 'print "'$M' $F[0] $F[3]" if $F[0] =~ /^10\.1\.2\.[34]$/'
  done
  let D=D+1
done > ~/log_report
There are a few things I want to point out about this mini-script:
- Iteration is performed on an index value ($D), tested using Kornshell’s double parentheses syntax, and incremented using a let statement.
- It makes use of a Perl one-liner to rearrange the data on each line and print only the data that is relevant to the task at hand.
- I send the output of the entire thing to a file by redirecting it after the final “done” statement.
None of these things is really a big deal to me anymore, but I have noticed that many people either do not know or do not think to do them. Once you realize they are possible, you can come up with more powerful mini-scripts from your command line.
I noted before that I use the Z shell, a Bourne compatible shell. While the above mini-script will not work in the Bourne shell itself, it should work in the Kornshell and the Bourne Again shell as well, though your mileage may vary.
Incrementing an Index
The generic script code for iterating over an increasing numeric index is:
let I=0; while (( I < 10 )); do
  ... your code here ...
  let I=I+1
done
If your shell is a traditional Bourne shell that does not support let-statements and double parentheses notation, then you may do this:
I=0; while [ $I -lt 10 ]; do
  ... your code here ...
  I=`echo $I + 1 | bc`
done
If you have a small set of numbers to iterate over then it is often still faster to just type them all out in a for-statement, but that stops being true when you have a large set of numbers.
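As a small worked sketch of the index loop, here is a function that prints zero-padded numbers, much like the $DD value in my log script. The name padded_days is made up for this illustration. I have also swapped in POSIX arithmetic expansion, $(( )), in place of let; that is my substitution here, chosen on the assumption that your sh is POSIX-compliant, because it then runs in plain sh as well as ksh, bash, and zsh without calling out to bc.

```shell
# padded_days is a hypothetical helper name, used only for this illustration.
# It prints the integers FIRST..LAST, one per line, zero-padded to two digits.
padded_days() {
    first=$1
    last=$2
    d=$first
    while [ "$d" -le "$last" ]; do
        printf '%02d\n' "$d"
        d=$((d + 1))    # POSIX arithmetic expansion; no external bc needed
    done
}

padded_days 8 10
```

Pass the arguments without leading zeros (8, not 08), since some shells read a leading zero inside $(( )) as octal.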
Invoking Perl and/or Ruby
This technique should not be alien to someone who writes shell scripts. Even if you never thought to invoke Perl and/or Ruby in your scripts, you have probably used sed and awk. In the above example I could have used awk instead of Perl, but Perl seemed like a better choice at the time. Ruby and Perl have flags that make them more useful on the command line, and in fact they can both become more powerful replacements for sed and awk. Some flags are:
-n: I call this the awk-mode flag. It causes both Ruby and Perl to loop on each line of the input, putting that line of input into $_.
-p: I call this the sed-mode flag. It causes both Ruby and Perl to loop on each line of input like -n, but also prints the value of $_ at the end of the loop (presumably you have modified it).
-a: This is often used with -n, though it could also be used with -p. As each line of input is read, it is split and the resulting array stored in @F (Perl) or $F (Ruby). I used this feature in the above mini-script to limit which fields are output to the report I generated.
-l: This flag causes lines output to be terminated with a newline. It also causes lines of data input via -n and -p to be auto-chomped.
-e: The most important flag for one-liners! This causes Perl and Ruby to take their commands from the argument given after -e. It may be specified multiple times to issue more code from the command line.
-i: If you want to edit a file in-place you can use this flag to tell Perl or Ruby to edit each file of input instead of leaving those files untouched. You can also opt to provide an extension to be used for a backup of the input. Note that because of the syntax for the optional extension, you should check the man page for usage.
Redirecting Output
You can redirect output from any part of a script. Most people focus on redirection of individual commands, but that causes one to have to keep appending data to an output file. One does not need to do that! Just redirect your output after the “done” and you will capture all of the output from all of the commands inside the loop. Spiffy, eh?
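Here is a minimal sketch of that idea. The host names echo my earlier example, and the temporary file is just a stand-in for ~/log_report; both are assumptions for illustration only.

```shell
report=$(mktemp)    # scratch file standing in for ~/log_report

# A single redirection after "done" captures the output of every command
# run inside the loop, so no command needs its own ">>" append.
for host in server-01 server-02 server-03; do
    echo "checking $host"
    echo "finished $host"
done > "$report"

cat "$report"
```

Because the file is opened once for the whole loop, a plain ">" is enough; it does not get truncated on each pass.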