The group I work with publishes web sites that front relational databases. Our core system infrastructure includes a custom Tomcat/Struts-based web application proxied through an Apache web server. Development of our software involves software engineers working on the web application, web design gurus focused on the web interface, programmers writing CGI scripts and beta testers poking at our progress.

To facilitate individualized development, we have a model whereby everyone on the project gets a personal Apache virtual host in which to install the web application and its sundry bits. Some people need more than one virtually hosted web site and sometimes a web site needs to be thrown up for a quick isolated test and then thrown away.

To ease the ups and downs of the virtual hosts, I use wildcard A records in the DNS for our domains. This allows me to quickly configure (and un-configure) an Apache virtual host without having to formally register (and un-register) the host name in the nameserver. So I can configure a name-based virtual host for the Apache web server:

<VirtualHost *:80>
    ServerAlias vh1.crashingdaily.com
    DocumentRoot /var/www/vh1.crashingdaily.com/html
</VirtualHost>

and instantly http://vh1.crashingdaily.com/ is a valid URL. Of course, a real virtual host configuration would be more involved, and the development files still need to be installed to /var/www/vh1.crashingdaily.com/html, but hopefully you get the idea.

Now, what happens if someone attempts to use a host name for which there is no configured virtual host, say, by mistyping the public production site as ww.crashingdaily.com? The wildcard A record sends the user to our development server, since that is the IP address the wildcard resolves to, and the Apache web server, not finding a virtual host configured for ‘ww.crashingdaily.com’, uses its default virtual host to answer the request.

For name-based virtual hosting, the first virtual host defined during Apache’s configuration phase catches any request to an unknown server name. This is a problem when an end user trying for our production site mistypes the host: they end up at a password-protected development site and, depending on our ever-changing vhost configurations, possibly a site unrelated to our project.
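To make the default-host behavior concrete, here is a minimal sketch with two name-based virtual hosts. The second host, vh2.crashingdaily.com, and its DocumentRoot are made up for illustration, and the sketch uses ServerName where the earlier example uses ServerAlias; only the ordering matters here.

```apache
# Apache 2.2 also needs "NameVirtualHost *:80" before these blocks; 2.4 does not.

# Defined first, so this host also answers any request on *:80 whose Host
# header matches no ServerName or ServerAlias (for example ww.crashingdaily.com).
<VirtualHost *:80>
    ServerName vh1.crashingdaily.com
    DocumentRoot /var/www/vh1.crashingdaily.com/html
</VirtualHost>

# Hypothetical second development host; it answers only for its own name.
<VirtualHost *:80>
    ServerName vh2.crashingdaily.com
    DocumentRoot /var/www/vh2.crashingdaily.com/html
</VirtualHost>
```

With this layout a request for ww.crashingdaily.com is served from vh1's DocumentRoot, which is exactly the surprise described above.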

To get our production users back on track when they are using a malformed host name, I set up a wildcard Apache virtual host that answers for all host names not explicitly configured.

<VirtualHost *:80>
    ServerAlias *.crashingdaily.com
    Redirect 301 / http://www.crashingdaily.com/
</VirtualHost>

This configuration goes on our development server to which wildcard DNS records resolve. Our production sites are on a separate machine so we can’t just add the ServerAlias to the configuration for the ‘www’ virtual host.

This was a successful solution for our public users, who were redirected to the public site, but it caused confusion for our developers when they mistyped a host name for a development site or, more commonly, when they typed the host name correctly but the site was not successfully configured as a virtual host in Apache. Like our public users, the developers were redirected to the public production site, which in their case was not the desired action. And because the production site frequently looks like the development site, it was not immediately clear what had happened.

I needed to make the wildcard virtual host return a real message to all users, developer or public. I wanted to keep the host exclusively virtual with no physical files on the server. I came up with the following.

<VirtualHost *:80>
    ServerAlias *.crashingdaily.com
    Redirect 404 /
    ErrorDocument 404 "No such site. Check the URL speling. Our main site is \
                       <a href='http://crashingdaily.com/'>http://crashingdaily.com/</a>"               
</VirtualHost>

With this configuration, all requests to the wildcard virtual host are remapped by the Redirect rule to trigger an HTTP response of 404 Not Found. The configuration also specifies the custom ErrorDocument to return to the client when a 404 error is encountered.
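One possible layout for the development server is sketched below. It assumes Apache serves a request from the first virtual host, in configuration order, whose ServerName or ServerAlias matches the Host header, so the explicit hosts are listed ahead of the wildcard. The vh1 block repeats the earlier example with ServerName, and the shortened error text and the comments are illustrative rather than a copy of a real configuration.

```apache
# Explicit development hosts come first so the wildcard below cannot shadow them.
<VirtualHost *:80>
    ServerName vh1.crashingdaily.com
    DocumentRoot /var/www/vh1.crashingdaily.com/html
</VirtualHost>

# Catch-all for every other name under the domain; listed last.
<VirtualHost *:80>
    ServerAlias *.crashingdaily.com
    # With a non-3xx status, Redirect takes no target URL and returns that status.
    Redirect 404 /
    # A quoted string is sent as the body of the 404 response; no file on disk is needed.
    ErrorDocument 404 "No such site. Check the URL."
</VirtualHost>
```

Requesting a misspelled name such as ww.crashingdaily.com should then come back with a 404 status and the custom text, which is easy to confirm with a command-line client that prints response headers.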

Now when a developer or public user accesses a non-existent website within our domain they get a clear indication of the problem rather than being transparently redirected to another, possibly unintended, site.

Related:

Apache mod_alias – provides the Redirect rule.
Architectural Concerns on the use of DNS Wildcards


(Our real hosts and IP addresses have been changed for this posting to protect my ass.)
