In search of agile infrastructure for web applications

Many advances have been made in agile software development. Frameworks such as Ruby on Rails embody agile principles by making software easy to write, easy to test, and above all, easy to change.

If only we could say the same for the infrastructure on which we develop and deploy web applications. Not every application can be a Rails one, especially where it's been up and running for years. The systems we deploy on often rely on configuration files scattered all over the filesystem, and can be complex results of years of change.

By extension, the systems we develop on often don't exactly mirror where we deploy. If we're lucky and careful, we have a staging server that mirrors the live environment, but that alone isn't much of an advantage. Among other things, we need ways to evolve and refactor the live environment, and to have our development environments easily track those changes.

I'd like to call such an environment "agile infrastructure". Infrastructure that doesn't hamper developers, and allows live configurations to change and evolve. Infrastructure that allows new things to be tried with minimum cost, and can provide the best information possible to help future planning.

Tools enable techniques, of course. It's a lot easier to be agile if you're programming in Rails rather than plain old PHP. So what are the tools enabling agile infrastructure?

Techniques I've found useful

While there's a long way to go, I'd like to describe some of the techniques I've been using to help create a flexible environment for developing a sizeable PHP/MySQL application.

Operating system

An operating system must be easy to bring into a known state with minimum interference. For this, I've found Ubuntu or Debian to be an ideal choice. Two of the main reasons are the constancy of stable releases and the fact that you rarely need to stray outside the distribution's servers to get the software you need.

Deployment

Deploying an application shouldn't break the cleanliness of your OS build. I prefer to package deployed applications into .deb files. This enables me to build on several advantages of the package management system:

  • configuration file handling: deployed machines can keep individual config in /etc, which is preserved across software updates.
  • no random state: you know exactly the state of the deployed application, with no random files left hanging around.
  • natural workflow: packages take a little bit of effort to build and deploy, which focuses developers on getting them right. If things do go wrong, it's very easy to roll back one version.
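
As a rough illustration of the configuration file point, here is a minimal Python sketch of staging a package tree by hand. It assumes a plain dpkg-deb build rather than any particular packaging helper; the function name, package name and paths are hypothetical. Files listed in DEBIAN/conffiles are the ones dpkg treats as configuration and preserves on upgrade.

```python
from pathlib import Path
import tempfile


def stage_package(root, name, version, maintainer, description, conffiles):
    """Write DEBIAN/ metadata and default config files under a staging root.

    conffiles maps absolute install paths (e.g. /etc/myapp/app.conf)
    to their default contents. All names here are illustrative.
    """
    root = Path(root)
    debian = root / "DEBIAN"
    debian.mkdir(parents=True, exist_ok=True)

    control = (
        f"Package: {name}\n"
        f"Version: {version}\n"
        "Architecture: all\n"
        f"Maintainer: {maintainer}\n"
        f"Description: {description}\n"
    )
    (debian / "control").write_text(control)

    # Place each default config file where it will be installed.
    for install_path, contents in conffiles.items():
        target = root / install_path.lstrip("/")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(contents)

    # dpkg keeps local edits to files listed here when the package is upgraded.
    listing = "".join(path + "\n" for path in sorted(conffiles))
    (debian / "conffiles").write_text(listing)

    # The command that would build the .deb; it is not executed here.
    return ["dpkg-deb", "--build", str(root)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cmd = stage_package(
            tmp, "myapp", "1.0", "Dev <dev@example.com>",
            "Example web application", {"/etc/myapp/app.conf": "debug=0\n"},
        )
        print(" ".join(cmd))
```

The application's own files would be copied into the same staging tree before running the returned command.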

Source control

The case for source control, thankfully, doesn't need making these days. Yet not every source control system is created equal. Developers shouldn't be constrained by source control: it should be cheap and easy to try new ideas and merge them in later.

Not only cheap branching, but also easy merging, is required to keep development agile while still retaining the benefits of source control. Many people flock to Subversion these days, predominantly because it fixes some of CVS' more egregious misfeatures. However, merging can still be pretty difficult in Subversion.

I prefer to use a system allowing easy merging, such as svk or bazaar. Because so many of us use laptops these days, disconnected operation is also a huge boon. Bazaar-NG offers both repository-oriented and completely decentralized operation, giving the best of both worlds. It also wins the competition for least setup overhead.

Virtualization

Previously a high-powered technology, cheap virtualization is now with us. Many identical virtual machines can be quickly created to enable testing and experimentation. Deployment to virtual machines can enable hardware independence for applications, reducing another big headache.

Large organisations have it within their reach to bring new hardware online easily, and have invested much in management systems for it. Virtualization brings this within reach of small and one-man development teams.

Desktop-based virtualization

Systems such as VMware and Parallels untether developers from their desks. I use the OS X desktop daily for its productivity software, but an Ubuntu virtual machine for a large proportion of development work. A simple keypress lets me flick between the two. The advantages of having multiple desktops to hand for web development are well known. (In fact, thanks to WINE and IES4Linux I can run Internet Explorer cheaply too from the same virtual machine.)

Xen virtualization

After the initial novelty wears off, one of the most tedious tasks in the world is bringing new machines online. Combined with the tendency development environments have to get crufty as new ideas and tools are tried out, this makes for increasing disparity between the development and deployment environments.

Wouldn't it be nice just to take a clean machine off the shelf, check out the source, and pick up again? Using Xen for Linux, this cheap virtualization is possible.

Sadly, Xen isn't yet a packaged part of Ubuntu, but getting it up and running on Ubuntu Dapper isn't too hard. Warning: here's where things get pretty deep. Skip this bit if you're not interested in the nitty-gritty.

I followed these steps:

  • Followed instructions in the Ubuntu wiki to get the first domain (vm) up and running
  • Installed the xen-tools package, which provides wrappers for easily creating virtual machines
  • Adjusted some of the xen-tools configuration (/etc/xen-tools/hook.d/20-setup-apt) to set up for Ubuntu rather than Debian
  • Once the virtual machine was made, adjusted its config to allow for the kernel initrd and configure the network bridge I was using
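
The last step, adjusting the guest's config, can be sketched as follows. Xen guest config files use Python syntax, so this minimal sketch simply renders the relevant settings as text. The function name and the example kernel, initrd and bridge values are assumptions for illustration, and the disk and root settings that xen-tools generates are left out.

```python
def render_domu_config(name, kernel, ramdisk, bridge, memory_mb=128):
    """Return Xen guest config text for the settings discussed above.

    All paths and names passed in are illustrative; disk and root
    settings generated by xen-tools are deliberately not included.
    """
    settings = [
        ("name", name),
        ("memory", memory_mb),
        ("kernel", kernel),
        # The initrd that goes with the Xen kernel.
        ("ramdisk", ramdisk),
        # Attach the guest's network interface to the chosen host bridge.
        ("vif", [f"bridge={bridge}"]),
    ]
    # repr() yields valid Python literals, which is what Xen expects here.
    return "".join(f"{key} = {value!r}\n" for key, value in settings)


if __name__ == "__main__":
    print(render_domu_config(
        "alpha", "/boot/vmlinuz-xen", "/boot/initrd.img-xen", "xenbr0"))
```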

Getting over the DNS mountain

If you want it to be easy to bring new virtual machines up and down, you need to adopt a different approach to managing your DNS. You need to know how to predictably connect to a virtual machine. Yet it's going to get very boring and error-prone to manage the IP address space manually in your DNS, or even to map the MAC addresses of your virtual machines in your DHCP configuration.

Help is at hand in the form of mDNS (also known as Rendezvous / Bonjour / ZeroConf). In the host machine, and in each virtual machine, I ensured the avahi-daemon and avahi-utils packages were installed, providing mDNS services. (You should also check that mdns is present in the hosts entry of /etc/nsswitch.conf.)
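
The nsswitch.conf check is easy to automate. The following is a minimal sketch, not a full parser: it assumes that any service on the hosts line whose name starts with "mdns" (for example mdns, mdns4 or mdns4_minimal) counts as mDNS being enabled. The function name is hypothetical.

```python
def hosts_uses_mdns(nsswitch_text):
    """Return True if the hosts entry lists a service starting with 'mdns'."""
    for raw in nsswitch_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line.startswith("hosts:"):
            continue
        services = line[len("hosts:"):].split()
        return any(service.startswith("mdns") for service in services)
    # No hosts entry found at all.
    return False


if __name__ == "__main__":
    with open("/etc/nsswitch.conf") as conf:
        print(hosts_uses_mdns(conf.read()))
```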

What all this does is allow the machines (virtual or otherwise) on the network to resolve names using mDNS. So if I create 3 VMs, alpha, beta and gamma, I can access these right away using the hostnames alpha.local, beta.local and gamma.local.
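
From a script's point of view, these names behave like any other hostname once mDNS is in the resolver path. Here is a small sketch using Python's standard socket module; the helper names are hypothetical, and the resolver is passed in as a parameter only so the logic can be exercised without a live network.

```python
import socket


def mdns_name(vm_name):
    """Map a VM name such as 'alpha' to its mDNS hostname 'alpha.local'."""
    return f"{vm_name}.local"


def resolve_vms(vm_names, resolver=socket.gethostbyname):
    """Return {hostname: address}, with None for names that fail to resolve."""
    addresses = {}
    for vm in vm_names:
        host = mdns_name(vm)
        try:
            addresses[host] = resolver(host)
        except OSError:
            # socket.gaierror is a subclass of OSError.
            addresses[host] = None
    return addresses


if __name__ == "__main__":
    for host, address in resolve_vms(["alpha", "beta", "gamma"]).items():
        print(host, address or "not found")
```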

Hey presto! Easily make machines appear and disappear without any need for sysadmin involvement. All the extra steps to add avahi installation can be easily scripted with xen-tools, to reduce the number of commands needed to birth a new machine to one.
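
As a sketch of what that single command might look like when wrapped in a script, the following builds an invocation of xen-create-image from the xen-tools package. The choice of the --hostname, --dist and --dhcp options, the "dapper" default and the function names are assumptions for illustration. The avahi installation itself is assumed to live in a xen-tools hook script and is not shown.

```python
import subprocess


def create_image_command(hostname, dist="dapper", dhcp=True):
    """Build the argument list for a xen-create-image call (assumed options)."""
    cmd = ["xen-create-image", f"--hostname={hostname}", f"--dist={dist}"]
    if dhcp:
        # Let the guest pick up its address via DHCP; mDNS supplies the name.
        cmd.append("--dhcp")
    return cmd


def birth_vm(hostname, run=subprocess.run):
    """Create a new guest; run is injectable so this can be dry-run."""
    return run(create_image_command(hostname), check=True)


if __name__ == "__main__":
    # Print the command rather than executing it.
    print(" ".join(create_image_command("alpha")))
```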

With this ease of creating new machines, it's much more straightforward to experiment with server topology and conduct experiments on server environments.

Conclusion

Agile development is necessarily constrained by the infrastructure on which it is conducted. The infrastructure itself is constrained by the tools and platforms of choice.

By appropriate tool choice we can reduce the commitment and overhead of infrastructure. This enables developers to get on with the job they do best, and makes experimentation and evolution of infrastructure much simpler.

When programming, it's dangerous to fall in love with your code, as the best solution may often involve throwing half of it away. The same hazard exists with infrastructure, with bizarre and awkward effects propagating back into code and operations. Get yourself a low-commitment, agile environment, and you'll be able to keep up with the pace.


You are reading the weblog of Edd Dumbill, writer, programmer, entrepreneur and free software advocate.
Copyright © 2000-2012 Edd Dumbill