I was recently handed a collection of Apache web server logs to parse for statistics. My first step was to work out the date range covered by each log file, which only takes a look at the first and last line of each log. Here’s a one-liner for that:

$ zcat access_log.20070709.all.gz | tee \
>(head -n1) \
>(tail -n1) &>/dev/null

192.123.89.64 - - [22/Aug/2006:15:49:37 -0400] "GET / HTTP/1.1" 200 242 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050921 Red Hat/1.0.7-1.4.1 Firefox/1.0.7"
192.168.75.13 - - [09/Jul/2007:16:11:47 -0400] "GET /app/showXmlDataContent.do HTTP/1.1" 200 28145 "-" "Java/1.5.0_08"

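head closes its filehandle after reading the requisite number of lines, so tee’s next write to that pipe fails, and that makes baby tee cry. The &>/dev/null on tee sends its stderr to /dev/null, which hides the ‘Broken pipe’ error. It also discards tee’s stdout, which would otherwise dump the entire log to the terminal. The redirect masks any other tee error that could arise too, but in this simple usage it’s not a concern.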
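If hiding tee’s errors is a concern, one alternative is to avoid the early-exiting reader altogether. The sketch below is not the tee approach above: it swaps in a single awk pass that remembers the first and last line, so no pipe is closed early and nothing needs to be silenced. The function name first_last is just an illustrative choice.

```shell
# Print the first and last line of standard input in a single pass.
# A one-line input is printed once; empty input prints nothing.
first_last() {
  awk 'NR == 1 { print } { last = $0 } END { if (NR > 1) print last }'
}

# Try it on stand-in data rather than a real log:
printf '%s\n' alpha beta gamma | first_last
```

With a real log this would be used as: zcat access_log.20070709.all.gz | first_last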

Note that the >(command) syntax for a temporary pipe (bash calls this process substitution) does not allow a space between the > and the (.

The command inside each temporary pipe can itself be a pipeline, which is handy for cleaning up the output:

$ zcat access_log.20070709.all.gz | tee \
>(head -n1 | cut -d' ' -f4) \
>(tail -n1 | cut -d' ' -f4) &>/dev/null

[22/Aug/2006:15:49:37
[09/Jul/2007:16:11:47
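Since the goal was the date range of each log file, the same idea can be wrapped in a loop over several files. This is a rough sketch, not the one-liner above: it uses awk rather than tee so that each file’s range lands on one line in a predictable order. The function name log_ranges and the file glob in the usage comment are made up for illustration, and it assumes the timestamp is the fourth space-separated field, as in the sample lines shown earlier.

```shell
# Print "<file> <first timestamp> <last timestamp>" for each gzipped log.
# Assumes field 4 looks like "[22/Aug/2006:15:49:37"; substr() drops the "[".
log_ranges() {
  for f in "$@"; do
    gzip -dc "$f" | awk -v name="$f" '
      NR == 1 { first = substr($4, 2) }
      { last = substr($4, 2) }
      END { if (NR > 0) print name, first, last }'
  done
}

# Usage (hypothetical file names):
#   log_ranges access_log.*.gz
```

gzip -dc is used here as the equivalent of zcat. Files that decompress to nothing produce no output line.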
