3 steps to better code deploys.
We've all been there. It's crunch time, a critical bug has been found and you need to deploy a fix ASAP. It's an easy one to fix, a one-liner. But you know it takes 2 hours to deploy the fix. Possibly longer. And you have a ton of other things you really need to be doing rather than waiting around while a computer does something.
Having multiple stages of deployment is a good thing. Dev, Stage and Production environments are all essential. But they magnify every little bit of time wasted in each deployment pipeline. You need gates, but you need to be able to get through them fast, while making sure they close behind you.
Automate the deployment
Probably one of the first steps is to automate the deployment of code to the target environment. This generally has two phases:
- Packaging the code
- Configuration management
Packaging the code can be broadly defined. It could refer to an automated git checkout onto the server, but more likely it is some kind of language-specific packaging (a Python package, jar file or Ruby gem) or an operating system package (you can check out a guide to doing the last two here and here).
Configuration Management is the actual configuration of the server or image, and the installation of the deployed software. Puppet, Chef, Salt, Ansible and Fabric are all reasonable choices. By automating the deployment of code you get a reliable, easy and quick process for pushing out code.
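As a rough illustration of the two phases, here is a minimal Python sketch that builds a language package and then hands over to a configuration management tool. It assumes the `build` package and Ansible are installed. The inventory path, the playbook name `site.yml` and the `app_version` variable are hypothetical names chosen for the example, not something your setup has to use.

```python
import shlex
import subprocess


def build_deploy_plan(version, inventory, playbook="site.yml"):
    """Return the two deployment phases as an ordered list of commands.

    The playbook name and the app_version variable are assumptions
    made for this sketch.
    """
    package = ["python", "-m", "build", "--wheel", "--outdir", "dist"]
    configure = [
        "ansible-playbook", "-i", inventory, playbook,
        "--extra-vars", f"app_version={version}",
    ]
    return [("package", package), ("configure", configure)]


def run_plan(plan, dry_run=True):
    """Run each phase in order; a failing command raises and stops the run."""
    executed = []
    for phase, command in plan:
        print(f"[{phase}] {shlex.join(command)}")
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(command, check=True)
        executed.append(phase)
    return executed
```

Calling `run_plan(build_deploy_plan("1.2.3", "inventories/stage"))` only prints the commands. Passing `dry_run=False` would actually execute them, so packaging has to succeed before configuration management starts.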
Automate the testing
Once your code is deployed you need to make sure that it has fixed the critical bug and has caused no regressions. The best way to speed this up is to have a suite of automated tests at the unit, functional and integration levels. By deployment time, all the unit and functional tests should already have passed in a Continuous Integration environment (such as Jenkins or BuildBot). Having fast, repeatable and automated unit/functional test suites is critical to being able to deploy quickly, reliably and with confidence.
Integration testing should test the flow of information between services, or even just a workflow that requires multiple steps and hits multiple parts of your infrastructure. It's easy to end up with a bunch of manual steps for integration testing, but it's much faster and more reliable to have these automated. If they aren't automated yet, there is no time like the present. As a side note, the integration tests should really be idempotent, so that they can be re-run no matter what state the system is in.
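To make the idempotency point concrete, here is a small Python sketch. `FakeOrderService` is a hypothetical in-memory stand-in for whatever real service your workflow touches, and the order workflow itself is invented for illustration. The pattern is what matters: clear any leftovers first, use operations that are safe to repeat, and clean up afterwards.

```python
class FakeOrderService:
    """Hypothetical in-memory stand-in for a real service under test."""

    def __init__(self):
        self.orders = {}

    def put_order(self, order_id, item):
        # Upsert: creating and overwriting both leave the same end state.
        self.orders[order_id] = {"item": item, "status": "new"}

    def ship_order(self, order_id):
        self.orders[order_id]["status"] = "shipped"

    def get_order(self, order_id):
        return self.orders.get(order_id)

    def delete_order(self, order_id):
        # Deleting a missing order is not an error.
        self.orders.pop(order_id, None)


def check_order_workflow(service, order_id="integration-test-order"):
    """Exercise a multi-step workflow without assuming a clean system."""
    service.delete_order(order_id)  # clear leftovers from a crashed run
    try:
        service.put_order(order_id, "widget")
        service.ship_order(order_id)
        return service.get_order(order_id)["status"] == "shipped"
    finally:
        service.delete_order(order_id)  # leave the system as we found it
```

Because the check removes stale data before it starts and tidies up in `finally`, it can be run twice in a row, or after an aborted run, and still give the same answer.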
Speed up deployment
So you have the building blocks for deploying code to your environments in an automated way. But it's still taking too long. So what can you do?
Deploying code from an OS package rather than a code checkout or language package is much faster. This is because most packages have external dependencies that need to be downloaded. With a git checkout or language package you generally download them at deploy time, whereas with an OS package this can be done at packaging time. Additionally, installing an OS package is usually little more than a glorified file copy and is therefore pretty fast.
To speed this up even further you can host a software mirror close to your image to reduce network round trips (I wrote up a guide to hosting your own apt-get repository here). This can contain your own software and also any other system packages required for the deployment.
When it comes to configuration management, Fabric is probably the slowest option, because it runs commands over ssh in a procedural rather than declarative way. The others mostly work out the difference between the current and desired state of the system and apply only that. They can also operate in parallel, which is particularly useful when deploying identical images.
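The declarative idea can be sketched in a few lines of Python. This is not how any of the tools above are implemented. It is a toy model in which each host's state is a hypothetical mapping of package name to version, and only the differences from the desired state turn into actions.

```python
from concurrent.futures import ThreadPoolExecutor


def plan_changes(current, desired):
    """Return only the actions needed to move `current` to `desired`.

    Both arguments map package name -> version.
    """
    actions = []
    for name, version in sorted(desired.items()):
        if current.get(name) != version:
            actions.append(("install", name, version))
    for name in sorted(set(current) - set(desired)):
        actions.append(("remove", name, current[name]))
    return actions


def converge(hosts, desired, max_workers=8):
    """Plan every host in parallel; converged hosts get an empty plan."""
    names = sorted(hosts)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        plans = pool.map(lambda name: plan_changes(hosts[name], desired), names)
        return dict(zip(names, plans))
```

A host that already matches the desired state gets an empty plan, which is where the time saving comes from. A procedural script would run every step on every host regardless.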
If deploying to some kind of IaaS platform (such as EC2 or Rackspace) you can speed up deployment time by building some common software into a base image. This can be very useful in a very homogeneous environment, because you can have large parts of the configuration management done at image packaging time rather than deployment time. It is less of an advantage in a heterogeneous software environment, however, as any unused software installed is a waste of time and can add complexity to the system.
Alternatively, rather than packaging individual pieces of software, you could package entire server images. Things like Docker have recently brought this to the forefront of deployment strategies, but Netflix, for example, has been building individual AMIs for each deploy for a while. This means that all the configuration management is done once and the result is deployed many times. For deployments with multiple images this could be a major speed advantage.
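A back-of-the-envelope model shows why this pays off as the number of servers grows. The two functions below count total machine-minutes, and every number used with them is a made-up illustration rather than a measurement.

```python
def per_server_config_minutes(servers, boot_min, configure_min):
    """Every server boots a base image, then runs configuration management."""
    return servers * (boot_min + configure_min)


def baked_image_minutes(servers, boot_min, configure_min, bake_min):
    """Configure once while baking the image, then each server only boots."""
    return configure_min + bake_min + servers * boot_min
```

With a hypothetical 2 minute boot, 10 minutes of configuration and 5 minutes to bake an image, 20 servers cost 240 machine-minutes when configured individually and 55 when booted from a baked image. For a single server the baked image is actually slower (17 minutes against 12), which matches the point that the advantage shows up when the same image is deployed many times.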
In a lot of these strategies we are moving time spent further down the pipeline, in deployment, back up to packaging. This means that an investment in the servers doing your CI builds could have a large payoff in terms of speed, multiplied across every build and every developer. Correct CI server resourcing is therefore something that should not be overlooked.
When looking to streamline processes within an organisation it's important to look at repetitive, time-consuming tasks as low-hanging fruit. This is especially important when those tasks get in the way of development velocity and are essentially wasted time. Hopefully by taking a few of the above steps, deployment will no longer be the compilation of our generation.
As always, comments, suggestions and clarifications are welcome in the comments section below.