
I was recently handed a collection of Apache web server logs to parse for statistics. The first step taken was to assess the date ranges covered by each log file. That’s a simple procedure of looking at the first and last line of the logs. Here’s a one-liner for that:

$ zcat access_log.20070709.all.gz | tee \
>(head -n1) \
>(tail -n1) &>/dev/null
- - [22/Aug/2006:15:49:37 -0400] "GET / HTTP/1.1" 200 242 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050921 Red Hat/1.0.7-1.4.1 Firefox/1.0.7"
- - [09/Jul/2007:16:11:47 -0400] "GET /app/ HTTP/1.1" 200 28145 "-" "Java/1.5.0_08"

head closes its filehandle after reading the requisite number of lines and that makes baby tee cry. So, I’m directing tee’s stderr to /dev/null to mask the ‘Broken pipe’ error. That will also mask any other tee error that could arise, but in this simple usage it’s not a concern.

Note that the >(command) syntax for a temporary pipe does not allow a space between the > and the (.

The temporary pipe command can be piped to clean up the output:

$ zcat access_log.20070709.all.gz | tee \
>(head -n1 | cut -d' ' -f4) \
>(tail -n1 | cut -d' ' -f4) &>/dev/null
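The first/last-line trick can be sanity-checked on a small inline stream (sample data here, not log lines):

```shell
# Print only the first and last line of a stream in a single pass (bash).
# tee's own stdout copy, and the eventual broken-pipe complaint, are discarded.
printf 'line1\nline2\nline3\n' | tee \
>(head -n1) \
>(tail -n1) &>/dev/null
```

This prints line1 and line3; the order can vary because the two process substitutions run concurrently.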


Lifehacker has a nice write-up on Firefox web browsing with SOCKS proxies. The tip about network.proxy.socks_remote_dns was new to me and I will have to play with that sometime. Safari, my primary browser, seems to resolve DNS requests at the proxy by default, so that saves me the hassle in the meantime.

One of the take-home messages of the Lifehacker entry is that you can run “ssh -D 1080” on your workstation, then configure Firefox (as well as most other browsers) to use a SOCKS proxy at localhost port 1080. This provides encrypted communications between your workstation and the server (great when your workstation is on an untrusted wireless network) and lets you masquerade as the server (useful when accessing websites that are behind a firewall or that restrict access by IP address).
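The basic recipe looks like this (the hostname is a placeholder):

```shell
# On the workstation: open a SOCKS proxy on local port 1080 that
# tunnels all proxied traffic through 'server' over SSH.
ssh -D 1080 user@server

# Then point the browser at SOCKS host "localhost", port 1080.
```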

Very simple, extremely handy. But what if you want to use a remote server that is behind a firewall and only accessible via a gateway machine?

----------------               -------------         -------------
| workstation  |               |           |         |  server   |
|              | --------------|  gateway  | ------- |           |
|(web browser) |               |           |         |  (SOCKS)  |
----------------               -------------         -------------

In that case you have to tunnel through the gateway to get to the SOCKS server running on the server. In this post I’m going to walk through building up the ssh command that will achieve such a tunnel. I will then present an alternate, more generic method.
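To anticipate where this is headed, one possible shape is a chained tunnel (hostnames are placeholders; this is a sketch, not necessarily the exact command the walkthrough arrives at):

```shell
# Forward local port 1080 to port 1080 on the gateway, and from the
# gateway start a SOCKS (-D) session to the server on that same port.
# The browser then talks to localhost:1080 exactly as before.
ssh -t -L 1080:localhost:1080 user@gateway ssh -D 1080 user@server
```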


Last week SpikeLab posted a set of benchmarks for scp, tar over ssh, and tar over netcat. The loser in SpikeLab’s environment was scp, coming in roughly three times slower than tar over ssh (10m10s vs. 3m18s, respectively) for a directory of two hundred 10MB files.

Interesting. I had never noticed scp performing any worse than ssh, but then again I had never compared them directly. I decided to run my own unscientific tests on my servers to see if I’m in the same boat.

The Hardware

sender: RedHat EL4, OpenSSH_3.9p1, OpenSSL 0.9.7a Feb 19 2003
16GB memory, 4x Dual Core AMD Opteron
receiver: Apple OSX Server, OpenSSH_4.5p1, OpenSSL 0.9.7l 28 Sep 2006
1GB memory, 2GHz PowerPC G5

These are on a 1Gb/s network. There are four router hops between them and potentially competing traffic on the network so I repeated the tests a few times during non-peak hours to minimize effects of traffic interference.

The Tests

First I created a directory of two hundred 10MB files.

$ until [ $(ls | wc -l) -gt 199 ]; do i=$((i+1)); dd if=/dev/urandom of=$i bs=10k count=1k; done

Then I ran a series of scripted file transfer tests, modeled after SpikeLab’s tests. The script is posted at the end of this post.

$ ./

The Results

A representative result of one of the benchmarks for transferring urandom-generated files across the network is shown in Table 1.

Table 1.

command   compression                time
scp       no                         251.04s
scp       ssh                        262.37s
tar       no                         264.99s
tar       ssh                        267.34s
tar       gzip                       324.88s
tar       ssh and gzip               331.32s
tar       no, blowfish encryption    279.94s
nc        no                          69.45s
nc        gzip                       219.53s

In contrast to SpikeLab’s results, I saw no significant difference between scp and tar over ssh. Also, in my environment the addition of gzip compression to the tar transfers had a detrimental impact on the performance. Compare that with SpikeLab’s results in which the gzip compression significantly improved the transfer rates.

I can think of a few reasons for the transfer rate differences between the two sites. Different versions or build options of SSH could affect the results. ssh and scp also honor a number of options set in configuration files, so the command lines I’m showing don’t tell the whole story; those behind-the-scenes configurations may be affecting results.

The effects of gzip compression I see could be explained by the randomness of the files being compressed. The Table 1 results used files generated from the contents of /dev/urandom. If I repeat the tests with files composed uniformly of NULL characters from /dev/zero, then gzip gives a marked improvement in the transfer times (Table 2). The more random the contents of a file, the less compression gzip can achieve. In fact, if the contents are fully random, as is the case here, no compression can take place and the compressed file will actually be slightly larger due to gzip’s accounting overhead stored with the file. So in some cases gzip introduces compute overhead with no reduction in data sent over the wire. The NULL files from /dev/zero compress nicely – the 2GB directory can be compressed down to a 2MB tarball – so the bandwidth savings are substantial.
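The compressibility difference is easy to demonstrate directly with gzip at its default level:

```shell
# 1MB of random bytes: gzip cannot compress it, and its header and framing
# overhead make the output slightly LARGER than the input.
dd if=/dev/urandom bs=1M count=1 2>/dev/null | gzip -6 | wc -c

# 1MB of NULs: compresses down to roughly a kilobyte.
dd if=/dev/zero bs=1M count=1 2>/dev/null | gzip -6 | wc -c
```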

Table 2.
Trials using non-random files generated from /dev/zero

command   compression                time
scp       no                         271.80s
scp       ssh                        264.76s
tar       no                         269.62s
tar       ssh                        272.20s
tar       gzip                        78.33s
tar       ssh and gzip                76.25s
tar       no, blowfish encryption    277.29s
nc        no                          78.51s
nc        gzip                        78.12s

Interestingly, enabling compression in scp/ssh had no real effect on the NULL files, although it should use the same zlib compression algorithm and the same default compression level (6) as gzip. The CPU on the receiver appears to be the limiting factor with gzip compression over netcat, so no improvement was seen there. The previous ssh results used SSH protocol version 2. If I use protocol version 1, with and without compression, I do see a dramatic difference (Table 3).

Table 3.
SSH-1 and /dev/zero data

command   compression     time
scp       no              587.42s
scp       ssh              98.88s
tar       no              687.80s
tar       ssh              93.92s
tar       gzip             78.05s
tar       ssh and gzip     87.90s

As an aside, I tested SSH’s blowfish cipher, which is reportedly faster than the default AES. However, I saw no improvement in the transfer rate from using that algorithm (Table 1).

In summary, I think all this highlights the need to benchmark specific environments and adjust accordingly. Your mileage may vary.


I do not have a root password for many of the servers I interact with, so I cannot SSH directly in as the root user. Also, the ssh daemons are wisely configured with ‘PermitRootLogin’ set to ‘no’, so a password would be moot anyway. I do have sudo permissions on the servers, so I can connect under my username and sudo the privileged commands as needed. Glazed-eye screen-staring started when I needed to rsync a remote directory that was readable only by root. How do I get rsync to run under sudo on the remote server? I did some searching and here are some options I found.

Option 1. Set NOPASSWD in the /etc/sudoers file.

crashingdaily ALL= NOPASSWD:/usr/bin/rsync

Then use the --rsync-path option to specify the sudo wrapper.

rsync -a -e "ssh" --rsync-path="sudo rsync" /archive

Option 2. For interactive usage, I can pre-activate sudo and then run rsync as in Option 1.

stty -echo; ssh sudo -v; stty echo

rsync -a -e "ssh" --rsync-path="sudo rsync" /archive

The “stty -echo” and “stty echo” temporarily disable and re-enable the display of keyboard input, to prevent the sudo password from being echoed to the screen.

Credits: Wayne Davison and Julian Cowley

Option 3. If sudo is not available, it may be possible to use “su”. I was unable to get this to work – su seems to insist on a tty, and I get the error ‘standard in must be a tty’. (In this case I do have a root password to use with su, so that’s not the issue.)

Create a wrapper script, /usr/local/bin/su-rsync, on the remote server and make it executable.

#!/bin/sh
exec su - -c "rsync $*"

Then call that script with the --rsync-path option.

rsync -a -e "ssh" --rsync-path=/usr/local/bin/su-rsync /archive

Credit: Wayne Davison

Option 4. Set ‘PermitRootLogin’ to ‘yes’ on the remote server and use SSH key authentication to login directly as the root user. This isn’t really an option for me but I throw it out there for sake of completeness.



Re: how to use option for rsync

rsync using sudo via remote shell

If you don’t know where you are going, any road will take you there.
– Lewis Carroll

My production servers reside behind a perimeter firewall in a data center. A minimal set of ports are open to the world, notably port 22 for sshd and port 80 for the Apache webservers which proxy requests to one of several Tomcat instances. The Tomcat ports are blocked at the data center’s perimeter firewall which means no direct access to Tomcat’s manager interfaces. But that’s OK, there are several options for reaching the Tomcat manager from outside the data center. I’ll glance over three options and then delve into a fourth option that is the gooey center of this posting.


It’s the simple things in life… Sometimes you simply want to know the canonical path for your current working directory.

Here is a symbolic link to a directory.

$ ls -go /home/crashingdaily/symlink
lrwxrwxrwx 1 14 Mar 29 10:54 /home/crashingdaily/symlink -> im/a/real/path

I change to the directory using the symbolic link.

$ cd /home/crashingdaily/symlink

The ‘pwd’ bash builtin does not resolve symbolic links, so it reports that I’m in the ‘symlink’ directory.

$ pwd
/home/crashingdaily/symlink

‘/bin/pwd’ reports the canonical path.

$ /bin/pwd
/home/crashingdaily/im/a/real/path

I can leverage this to quickly change my path from symlinked to real.

$ pwd
/home/crashingdaily/symlink

$ cd `/bin/pwd`

$ pwd
/home/crashingdaily/im/a/real/path

Also see

readlink -f .

readlink is advantageous if you need to resolve paths outside your current working directory.

Update: Sometimes the simpler things are right under your nose.

pwd -P


cd -P .

were sitting there all along. No need for /bin/pwd (for bash at least). RTFM, indeed.
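A quick end-to-end check of all three approaches, using a throwaway directory (the names here are just for illustration):

```shell
# Build a symlinked directory, cd through the symlink, and compare
# the logical vs. physical views of the working directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/real/path"
ln -s real/path "$tmp/symlink"
cd "$tmp/symlink"
pwd            # logical path, ending in .../symlink
pwd -P         # physical path, ending in .../real/path
readlink -f .  # same as pwd -P
```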

[update: This issue is reportedly fixed in Net::SSH::Perl 1.34]

I ran into an issue with the Perl module Net::SFTP that drives the file transfers in a script that I use. Net::SFTP is built on top of Net::SSH::Perl, a pure Perl implementation of the SSH-2 protocol.

The issue is that after 1 GB of data is transferred the connection hangs indefinitely.

I found something similar reported in the Net::SSH::Perl bug tracker a couple of years ago, though in that case the hang occurred after one hour rather than after 1 GB of data transferred. The bug report was rejected by the maintainer just the other day with the comment to “try the latest version”.

Well, I have the latest version of Net::SSH::Perl and Net::SFTP from CPAN (1.30) and still have the issue.

I did some digging.

On the server side I have

OpenSSH_3.9p1, OpenSSL 0.9.7a Feb 19 2003

If I turn on sshd debugging I get this logged at the time of the transfer stalling:

Feb 15 23:21:55 olive sshd[19497]: debug1: need rekeying
Feb 15 23:21:55 olive sshd[19497]: debug1: SSH2_MSG_KEXINIT sent

The client Perl script using Net::SFTP with debugging on, reports:

sftp: In read loop, got 8192 offset 1069547520
sftp: Sent message SSH2_FXP_READ I:130563 O:1069555712
Warning: ignore packet type 20
channel 1: window 16358 sent adjust 16410
sftp: Received reply T:103 I:130563
sftp: In read loop, got 8192 offset 1069555712
sftp: Sent message SSH2_FXP_READ I:130564 O:1069563904

I did some more digging and found that the ‘packet type 20’ that is being ignored corresponds to the SSH2_MSG_KEXINIT sent by the server and is a request to initiate session rekeying.

The rekeying was originally something that the commercial SSH2 package did, and early on it sometimes caused problems when talking to OpenSSH clients.

That was then. This is now, and it seems that OpenSSH 3.9 knows all about rekeying sessions (I think this was added in 3.6?). The OpenSSH 3.9 server I was connecting to automatically attempts rekeying with SSH-2 clients after ~1GB of data transfer, and as far as I can tell there’s no configuration to turn it off. There is a poorly documented ‘RekeyLimit’ option for that version that is supposed to control how much data is transferred before rekeying, but it had no effect in my hands and I do not see it regulating anything in the source code (though I’m not very good at C, so maybe I’m missing something).

The proper fix I suppose would be to patch Net::SSH::Perl so it implements rekeying. However, I’m not nearly well versed enough in the SSH-2 protocol to do that, so I needed to work around this issue.

After further digging around in the OpenSSH source code I discovered that the server is compiled with a list of known clients that do not support session rekeying (see client strings with SSH_BUG_NOREKEY in OpenSSH compat.c).

Net::SSH::Perl reports itself to the server as version 1.30 (which it is) as the logging from the OpenSSH server in debug mode demonstrates:

Feb 18 13:48:09 olive sshd[18746]: debug1: Client protocol version 2.0; client software version 1.30

‘1.30’ is not on OpenSSH’s compat.c list of SSH_BUG_NOREKEY clients. I could add it to the list and recompile but I don’t have admin access to the sshd server I use in production (the debug messages were from my test server). Therefore the fix has to be in the client.

So, I changed the $VERSION of Net::SSH::Perl to ‘OpenSSH_2.5.2’. The OpenSSH server recognizes it’s talking to a non-rekeying client and skips the session rekeying process. Now large transfers complete without hanging.
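One way the change can be made is a quick in-place edit (a sketch: perldoc -l locates the installed module, the sed expression assumes the module has a simple `$VERSION = '1.30';` line, and writing to the module path may require root):

```shell
# Find the installed Net::SSH::Perl module file.
module=$(perldoc -l Net::SSH::Perl)

# Back up the original, then rewrite the version string so the server
# matches the client against its SSH_BUG_NOREKEY list.
cp "$module" "$module.orig"
sed -i "s/^\\\$VERSION = .*/\\\$VERSION = 'OpenSSH_2.5.2';/" "$module"
```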

I have a shell script to manage and report on my Tomcat instances. I wanted the ‘status’ portion of the script to report on instance uptime (which, by the way, has improved significantly since switching to JRockit). The script was already reporting the PID of the parent tomcat process so I shoved in this one-liner that takes that PID and gets the elapsed time from ps. I filter the result through grep and sed to get a clean human-readable output.

uptime=`ps -o etime $PID | grep -v ELAPSED | sed 's/\s*//g' | sed "s/\(.*\)-\(.*\):\(.*\):\(.*\)/\1d \2h/; s/\(.*\):\(.*\):\(.*\)/\1h \2m/; s/\(.*\):\(.*\)/\1m \2s/"`

echo $uptime

The output is formatted as one of days & hours, hours & minutes, or minutes & seconds.

6d 08h
03h 23m
20m 56s
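The sed rules can be exercised in isolation on sample etime strings (the fmt helper and the sample values are just for illustration):

```shell
# Apply the same three substitutions the one-liner uses:
# days-hh:mm:ss -> "Nd Nh", hh:mm:ss -> "Nh Nm", mm:ss -> "Nm Ns".
fmt() {
  echo "$1" | sed "s/\(.*\)-\(.*\):\(.*\):\(.*\)/\1d \2h/; s/\(.*\):\(.*\):\(.*\)/\1h \2m/; s/\(.*\):\(.*\)/\1m \2s/"
}
fmt "6-08:54:13"   # -> 6d 08h
fmt "03:23:56"     # -> 03h 23m
fmt "20:56"        # -> 20m 56s
```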

Anyone got a better or different way?

ls -d /usr/local/tomcat_instances/{InstanceA,InstanceB,InstanceC,InstanceD}/conf/Catalina/localhost | xargs -I{} cp /usr/local/tomcat_instances/Instance_Template/conf/Catalina/localhost/ROOT.xml {}
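Note that the commas in a brace expansion must not be followed by spaces, or bash treats the whole thing as a literal string. A quick check:

```shell
# With no spaces, bash expands the braces into separate words:
echo {A,B,C}     # -> A B C

# With spaces after the commas, no expansion happens at all:
echo {A, B, C}   # -> {A, B, C}
```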

rsync -a -e "ssh ssh" :/logs /sync/logs

That is all.

