I ran into a fun problem today. The HTTP session cookie for one of my websites was not being retained in the Safari web browser. Packet sniffing clearly showed the server was sending a ‘Set-Cookie’ in the HTTP header. Safari accepted the cookie just fine with other virtual hosts having identical code base and similar configurations for Apache webserver and Tomcat. FireFox 3 accepted the cookie from all the virtual hosts. It was quite a perplexing issue.

It turns out that Safari, Version 3.1.1 at least, does not retain cookies from hosts with an underscore (_) in its name and my troubled website host did indeed have one in its name. I changed the underscore to a dash (-) and Safari was happy.

Who woulda thunk?

Related:

Discussion of the issue from the Ruby Forum
Internet Explorer probably would have given me the same behaviour, but I didn’t test it.
Host name specifications are documented in RFC 952 and RFC 1035. The underscore is not among valid characters.

Here is a tip summarized from comp.unix.shell

The problem:

> For the following TAB-delimited records, I want to count number of
> records with column-2 == -1   (should be 2)
> ===== file.txt ======
> AAA    -1    2008-07-14
> BBB    -14   2008-07-15
> CCC    -20   2008-07-16
> DDD    -1    2008-07-16
> ===========
> I tried:
>   grep -c -- "-1\t" file.txt
> which is not working

A solution offered for ksh93/zsh/bash shells:

grep -c -- $'-1\t' file.txt

I will add another alternative

grep -c '-1^V^I' file.txt

Where ^V^I means type ctrl-v and ctrl-i to enter a tab character.

Related:

Advanced Bash-Scripting Guide Example 34-1
Insert ASCII Control Characters in Text

Oracle is today’s new toy for me. And by toy I mean frequent source of frustration. One issue I battled this morning was getting dbca to start. It would just hang at the command line prompt with no feedback about what it was or wasn’t doing. netca and oem would launch just fine so it didn’t seem to be a Java or X11 configuration issue.

The Oracle forums were of no help but I did find a workaround here. What to do when dbca does not start?

So, I yanked my cable a bit, then, remembering I’m on a wireless Linux laptop, shutdown the network service. Then dbca launched. Of course then dbca couldn’t connect to the databases so I had to start the network again. Good times.

This doesn’t solve the fundamental problem - I certainly couldn’t shutdown the network in a production environment and I doubt it would fly on an Oracle certification exam but at least I’m able to continue with my Oracle adventures.

I frequently use Perl’s in place file editing from the command line. What I didn’t consider until it bit me today is that the file ownership can change using this method.

Here’s the original file, owned by tomcat_6 and only readable by user and group.

$ ls -l web.xml
-rw-rw----  1 tomcat_6 tomcat 49384 Jul 10 11:38 web.xml

I belong to the tomcat group, so have write permissions to the file. The enclosing directory is also tomcat group writable. The importance of this is noted below.

Using the ‘perl pie’ one-liner to make an in place edit if the file:

$ perl -p -i -e 's;<session-timeout>\d+</session-timeout>;\
<session-timeout>1440</session-timeout>;' web.xml

Now the file is owned by me and my default group.

$ ls -l web.xml
-rw-rw----  1 crashing  daily 49384 Jul 10 21:55 web.xml

Most critically, now the file is no longer readable by the tomcat processes. This little change prevented my Tomcat server from starting. Ouch.

sed is a little nicer. It changes the owner but not the group.

$ sed -i 's;<session-timeout>.*</session-timeout>;\
<session-timeout>1445</session-timeout>;g' web.xml
$ ls -l web.xml 
-rw-rw----  1 crash tomcat 49385 Jul 10 22:44 web.xml

Neither the Perl nor the sed one-liners work if the directory is not writable because Perl and sed require unlinking the original file and replacing it with a new version.

The winner for both maintaining file ownership and working if the directory is not writable is ed.

$ ed - "web.xml" <<EOF
,s;<session-timeout>[[:digit:]]*</session-timeout>;\
<session-timeout>1440</session-timeout>;
w
EOF

ed truly does an in place edit. Nice. If only I could remember the syntax.

The Tiger and Leopard releases of MacOS X include an implementation of BSD’s dummynet. Dummynet is “a system facility that permits the control of traffic going through the various network interfaces“.

There are many uses for this feature. I use it as part of my website development to simulate a slow network connection. Many of the users of our websites are in developing countries with slow, dialup-speed, network connections. By using a couple of quick commands I can throttle my connection to the webserver down to similar speeds. As such, I can feel their pain even though I’m on a snazzy gigabit connection, two hops away from the webserver.

The following series of commands will slow my communications to and from the webserver down to 56K modem speeds. It only affects http connections (to any web server, not just mine). My other network connections - ssh, for example - operate with native network performance.

$ sudo ipfw add pipe 1 src-port http
$ sudo ipfw add pipe 1 dst-port http
$ sudo ipfw pipe 1 config bw 56kbit/s

Adam Knight’s Traffic Shaping in Mac OS X is a good starter tutorial.

Additional articles and documentation.

Here is a brief look at the Linux commands, type and hash.

To start, I open a new terminal and check for the location of the foo executable using the which command. (foo is a trivial shell script I concocted for illustrative examples in this posting.)

$ which foo
/usr/local/bin/foo

which tells me the location of the executable file in my $PATH but that is not necessarily what will be executed when I call foo on the command line. To learn that, I use the type command.

$ type foo
foo is /usr/local/bin/foo

type reports how a word will be interpreted if used as command name. In this case, it is telling me that using foo as a command will execute the file /usr/local/bin/foo. That happens to be what which also reported, but, as we will see in the following examples, that will not always be the case.

I can run the foo command to print its release version.

$ foo --version
foo release 1.0

Consider if I install a new foo, version 2.0, in my ~/bin directory, leaving the 1.0 version in /usr/local/bin/foo. I have positioned ~/bin ahead of /usr/local/bin in my $PATH and the which command confirms the new 2.0 version is found first.

$ which foo
/home/crashingdaily/bin/foo

However, when I call for foo to be executed it still executes the old 1.0 version.

$ foo --version
foo release 1.0

Hmm, what’s going on? The which command reported the first executable found in my path but type will be more informative. It will tell us precisely what file, function, builtin, keyword or alias is associated with a given command name.

$ type foo
foo is hashed (/usr/local/bin/foo)

This is reporting that the shell has saved the meaning of foo in a hash table as /usr/local/bin/foo (the 1.0 version). Caching the command in a hash table is an optimization that saves the shell from having to search $PATH every time. A given shell does this the first time a command is run in that shell instance. The hashed values survive for the life of the shell instance. Start a new shell, e.g. by opening a new terminal or invoking a subshell, and you start a new hash table for that shell process.

The hash table can also be manually manipulated to clear or set values. Enter the hash command. This command is used to print and edit the shell’s command hash table.

The entire hash table can be cleared

$ hash -r

or you can delete a specific entry

$ hash -d foo

Having cleared the hash table, invoking foo now calls the first in my path.

$ foo --version
foo release 2.0

And this copy is now placed in the hash table.

$ type foo
foo is hashed (/home/crashingdaily/foo)

I can also print that entry of the hash table to get its value. However, do not use this as a substitute for type as it only deals with files. More on that shortly.

$ hash -t foo
/home/crashingdaily/foo

As you might imagine, if I delete the executable whose value has been hashed,

$ rm /home/crashingdaily/foo

the command is not found when I attempt to use it,

$ foo --version
-bash: /home/crashingdaily/bin/foo: No such file or directory

until I update the hash. I can clear the hash as before or I can explictly set the value like so

$ hash -p /usr/local/bin/foo foo
$ type foo
foo is hashed (/usr/local/bin/foo)
$ foo --version
foo release 1.0

Keep in mind that the hash table caches (and the hash command reports) file and path names. It does not deal with keywords, functions, builtins or aliases which may be invoked by a command name. So, type is the utility to use for learning what will execute when a command name is invoked.

To further illustrate type usage, I will define a function named foo.

$ function foo { echo 'hello world'; }

Because function names take precedence over file executables having the same name, invoking foo executes the function instead of the shell script. Now, type reports

$ type foo
foo is a function
foo ()
{
    echo 'hello world'
}

and invoking the name gives us the expected function output.

$ foo
hello world

Using type with the -a option reports all the known values of foo

$ type -a foo
foo is a function
foo ()
{
    echo 'hello world'
}
foo is /home/crashingdaily/bin/foo
foo is /usr/local/bin/foo

In summary, use type to learn what a given command name currently means to the shell - there could be more than one meaning. Use hash to manipulate cached associations of command names with files (and be aware that the command name may be associated with a non-file that takes precedence).

Additional documentation for type and hash is availble via the help command.

$ help type
type: type [-afptP] name [name ...]
    For each NAME, indicate how it would be interpreted if used as a
    command name...
$ help hash
hash: hash [-lr] [-p pathname] [-dt] [name ...]
    For each NAME, the full pathname of the command is determined and
    remembered....

History Meme

$ history|awk '{a[$2]++} END{for(i in a){printf "%5d\t%s\n",a[i],i}}'|sort -rn|head
  77  whoami
  74  tic
  72  gawk
  70  od
  70  bash
  69  shred
  66  killall
  58  sleep
  55  id
  55  play

Wake the media, alert the dog. I installed Windows XP Pro into a Parallels virtual machine (VM) today.

Disabling ‘Acceleration’ in the VM was required for successful installation. With Acceleration enabled, the installation kept hanging at the “installing devices” stage (this could be progressed sometimes by forced restarting of the VM) and spontaneously rebooting.

From Problems with installing Windows XP SP0 in VM in Parallels’ knowledge base:

1. Change the acceleration mode of the VM:
2. Open the VM's configuration page.
3. In the Resources list select Options.
4. Click the Advanced tab.
5. Set the Acceleration level to Normal or Disabled.
After installing Windows XP in the VM, you can change the Acceleration level back to High.

The group I work with publishes web sites that front relational databases. Our core system infrastructure includes a custom Tomcat/Struts-based web application proxied through an Apache web server. Development of our software involves software engineers working on the web application, web design gurus focused on the web interface, programmers writing CGI scripts and beta testers poking at our progress.

To facilitate individualized development, we have a model whereby everyone on the project gets a personal Apache virtual host in which to install the web application and its sundry bits. Some people need more than one virtually hosted web site and sometimes a web site needs to be thrown up for a quick isolated test and then thrown away.

To ease the ups and downs of the virtual hosts, I use wildcard A records in the DNS for our domains. This allows me to quickly configure (and un-configure) an Apache virtual host without having to formally register (and un-register) the host name in the nameserver. So I can configure a name-based virtual host for the Apache web server:

<VirtualHost *:80>
    ServerAlias vh1.crashingdaily.com
    DocumentRoot /var/www/vh1.crashingdaily.com/html
</VirtualHost>

and instantly http://vh1.crashingdaily.com/ is a valid URL. Of course a real virtual host configuration would be more involved and the development files need to be installed to /var/www/vh1.crashingdaily.com/html but hopefully you get the idea.

Now, what happens if someone attempts to use host name for which there is no configured virtual host - say, they mistype the public production site as, ww.crashingdaily.com? Well, the wildcard A-record will send the user to our development server at IP address of the DNS wildcard and the Apache web server, not finding a virtual host configured for ‘ww.crashingdaily.com’, will use its default virtual host to answer the request.

For name-based virtual hosting, the first virtual host defined during Apache’s configuration phase catches any request to an unknown server name. This is a problem when an end user trying for our production site mistypes the host - they end up at a password protected development site and, depending on our ever changing vhost configurations, possibly a site unrelated to our project.

To get our production users back on track when they are using a malformed host name, I set up a wildcard Apache virtual host that answers for all host names not explicitly configured.

<VirtualHost *:80>
    ServerAlias *.crashingdaily.com
    Redirect 301 / http://www.crashingdaily.com/
</VirtualHost>

This configuration goes on our development server to which wildcard DNS records resolve. Our production sites are on a separate machine so we can’t just add the ServerAlias to the configuration for the ‘www’ virtual host.

This was a successful solution for our public users, they were redirected to the public site, but caused confusion for our developers when they mistyped a host name for a development site or, more commonly, when the host name is typed correctly but the site was not a successfully configured virtual host in Apache. Like our public users, the developers would get redirected to the public production site but, in this case, that is not the desired action. And because the production site frequently looks like the development site, it is not immediately clear what has happened.

I needed to make the wildcard virtual host return a real message to all users, developer or public. I wanted to keep the host exclusively virtual with no physical files on the server. I came up with the following.

<VirtualHost *:80>
    ServerAlias *.crashingdaily.com
    Redirect 404 /
    ErrorDocument 404 "No such site. Check the URL speling. Our main site is \
                       <a href='http://crashingdaily.com/'>http://crashingdaily.com/</a>"
</VirtualHost>

With this configuration all requests to the wildcard virtual hosts are remapped by the Redirect rule to trigger a HTTP response of 404 Not Found. The configuration also specifies the custom ErrorDocument to return to the client when a 404 error is encountered.

Now when a developer or public user accesses a non-existent website within our domain they get a clear indication of the problem rather than being transparently redirected to another, possibly unintended, site.

Related:

Apache mod_alias - provides the Redirect rule.
Architectural Concerns on the use of DNS Wildcards


(Our real hosts and IP addresses have been changed for this posting to protect my ass.)

BASH Cures Cancer has yet another interesting post this week. This one, entitled “Command Substitution and Exit Status“, leverages the exit status returned from a BASH sub-shell. That alone is a useful tip but what tickled me was the author’s trick of shortening the delay in a running loop by starting a newer, shorter sleep loop that kills off the longer-running sleep process. Clever. The thought of dueling shell processes - “I’m going to sleep for a minute.”, “Oh, no you’re not!” - makes me chuckle. Sigh. I am easily amused.

Categories

 

September 2008
M T W T F S S
     
1234567
891011121314
15161718192021
22232425262728
2930  

Latest del.icio.us