Automate Release Checks

March 8th, 2010

Context

  • Each release of your product has some unique aspect to it, for example an intended customer-specific configuration.
  • Releases of your product frequently fail in some downstream step, though not necessarily the same step each time.
  • Verifying correct operation of each build empirically is time-consuming or costly.

Therefore

  1. Write a program which is the simplest possible wrapper around the step that produces the artifact consumed downstream.
  2. Each time a failure causes a flow interruption, modify the program to detect this problem with the artifact and fail loudly, refusing to produce the artifact.
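
The two steps above might be sketched as follows. This is only a minimal illustration in Python, not the wrapper from the instance described later: the names `release_check`, `build_release`, `ReleaseCheckError`, and the `customer` input are all hypothetical.

```python
from typing import Callable, Dict, List

# A check inspects the release inputs and returns human-readable problems.
# An empty list means the check passed.
Check = Callable[[Dict[str, str]], List[str]]

CHECKS: List[Check] = []


class ReleaseCheckError(Exception):
    """Raised instead of producing an artifact that is known to fail downstream."""


def release_check(check: Check) -> Check:
    """Register a check. Add one each time a downstream failure is traced to the artifact."""
    CHECKS.append(check)
    return check


@release_check
def customer_name_is_set(inputs: Dict[str, str]) -> List[str]:
    # Hypothetical example of a problem that was once found downstream.
    if not inputs.get("customer", "").strip():
        return ["customer name is empty"]
    return []


def build_release(inputs: Dict[str, str], build: Callable[[Dict[str, str]], str]) -> str:
    """Run every registered check, then build only if none reported a problem."""
    problems = [problem for check in CHECKS for problem in check(inputs)]
    if problems:
        raise ReleaseCheckError("refusing to build release:\n  " + "\n  ".join(problems))
    return build(inputs)
```

The wrapper starts out with no checks at all, which keeps it cheap to introduce. It only grows when a real downstream failure justifies a new check.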

Discussion

This pattern operates on the general principle that a shorter feedback loop for failures will have less negative effect on effort hours and throughput. This is the same principle embodied by test-driven development applied in a slightly different context. To build on the analogy, consider instituting a policy of adding the automated release check to your artifact-builder before correcting the downstream problem, and verifying the release check by attempting to release a new build without correcting the original problem.
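
As a rough sketch of that policy, assuming checks are plain functions that return a list of problems: the helper `confirm_check` and the example check `port_is_numeric` are hypothetical names, and the port failure is invented for illustration.

```python
from typing import Callable, Dict, List

Check = Callable[[Dict[str, str]], List[str]]


def confirm_check(check: Check, broken: Dict[str, str], corrected: Dict[str, str]) -> None:
    """Show that a new check fails on the uncorrected inputs and passes on the corrected ones."""
    if not check(broken):
        raise AssertionError("check did not detect the original problem")
    leftover = check(corrected)
    if leftover:
        raise AssertionError("check still fails after the correction: %s" % leftover)


def port_is_numeric(inputs: Dict[str, str]) -> List[str]:
    # Hypothetical check written after a downstream failure caused by a bad port value.
    port = inputs.get("port", "")
    return [] if port.isdigit() else ["port %r is not a number" % port]


# Step 1: the check is written first and run against the inputs that failed downstream.
# Step 2: only then are the inputs corrected and the check run again.
confirm_check(port_is_numeric, broken={"port": "80a"}, corrected={"port": "8080"})
```

Running the check against the still-broken release first is what shows that the check detects the problem, in the same way a new unit test is expected to fail before the fix goes in.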

This is a kind of test-driven development which is fed by downstream regressions. It is an attempt to introduce a positive ratchet effect.

If you do not have some unique aspect to each release, you could simply add unit tests which fail the build if in-repository artifacts are somehow inconsistent or incorrect. Even though we don’t usually think to add unit tests to verify, say, properties of configuration files stored in the repository, these unit tests provide a tighter feedback loop than this pattern and should be preferred.
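
A unit test over a configuration file might look something like the sketch below. The file contents, section names, and option names are made up, and the inline `DEFAULTS_INI` string stands in for a file that would normally be read from the repository.

```python
import configparser
import unittest

# Stand-in for a configuration file stored in the repository (contents are hypothetical).
DEFAULTS_INI = """\
[server]
port = 8080
log_level = info

[client]
server_port = 8080
"""


def load(text: str) -> configparser.ConfigParser:
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


class ConfigConsistencyTest(unittest.TestCase):
    def setUp(self):
        self.config = load(DEFAULTS_INI)

    def test_client_points_at_server_port(self):
        # Two options in different sections that must agree with each other.
        self.assertEqual(self.config["client"]["server_port"], self.config["server"]["port"])

    def test_log_level_is_known(self):
        self.assertIn(self.config["server"]["log_level"], {"debug", "info", "warning", "error"})
```

Tests like these run with the rest of the suite (for example via `python -m unittest`), so an inconsistent configuration file breaks the build long before anyone tries to release it.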

Observed Instances

One

We have a product where frequent, customer-specific releases are made to an independent QA department. Two files are delivered, an installer exe and an ini file. The ini file is read by the installer exe and causes the installer to install the correct components and install the proper configuration into the registry. The install is a lengthy process (twenty minutes) on machines which have a lot of software installed, discouraging developers from running the installer. Also, verifying correct operation requires setting up a protocol emulator which may not be available.

The ini file layout is very brittle. It contains many options which must be consistent with other options. There are also many options which must be specified for every component in the configuration. Yet other options can cause some other options to be ignored entirely. The install itself rarely fails; instead, it leaves the machine configured in an unusable state.
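
To make the three kinds of rule concrete, here is a small sketch of a checker built on Python's `configparser`. It is not the real format: the `[install]` section, the `components` list, and the `version`, `install_dir`, and `use_default_dirs` options are all assumptions invented for the example.

```python
import configparser
from typing import List

# All section and option names below are hypothetical.
REQUIRED_PER_COMPONENT = ("version",)


def check_ini(text: str) -> List[str]:
    """Return the problems found in the ini text; an empty list means none were detected."""
    ini = configparser.ConfigParser(interpolation=None)
    ini.read_string(text)
    if not ini.has_section("install"):
        return ["missing [install] section"]

    problems: List[str] = []
    listed = ini.get("install", "components", fallback="")
    components = [name.strip() for name in listed.split(",") if name.strip()]
    use_default_dirs = ini.getboolean("install", "use_default_dirs", fallback=False)

    for name in components:
        # Options which must be consistent with other options:
        # every listed component needs its own section.
        if not ini.has_section(name):
            problems.append("component %s is listed but has no [%s] section" % (name, name))
            continue
        # Options which must be specified for every component.
        for option in REQUIRED_PER_COMPONENT:
            if not ini.has_option(name, option):
                problems.append("[%s] is missing required option %s" % (name, option))
        # Options which cause other options to be ignored entirely.
        if use_default_dirs and ini.has_option(name, "install_dir"):
            problems.append("[%s] install_dir is ignored because use_default_dirs is on" % name)
    return problems
```

Returning every problem at once, rather than stopping at the first, means one rejected build can surface several mistakes in the same file.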

A significant portion of my time was spent in the QA lab troubleshooting failed installs. When I first started maintaining this software, it seemed like one third of a typical day might be spent in the QA lab. Since QA is on an isolated network, I went through many thumb drives, never remembering which machine I had left the last one in.

I wrote a wrapper which reads the ini file and then builds the install. This also had the benefit of allowing us to remove unused components from an install. For each failure found in QA, I made the wrapper detect the underlying problem in the ini file, fail loudly, and abort before building an install.

I haven’t been in the QA lab in more than four months. I have kept the same thumb drive for perhaps a year, though I’m no longer sure what’s on it.

This has had an unexpected compounding positive effect: The most likely causes of an install failing are now either procedural or due to configuration changes made for testing in QA. Previously, the most likely cause of an install failing was that it had been packaged incorrectly. As a result of this shift, development is no longer the first on the scene of a failed install. Instead, the techs responsible for deploying the software to the customer handle most QA installation issues.

It might seem that a more fundamental problem in this instance is the brittleness of the configuration format, which admits self-inconsistency. This is probably true; however, I could not determine a way to re-engineer the format piecemeal, and customer commitments (as well as time logged in the QA lab!) made such a large project unlikely on a maintenance-mode product. Also, the current installer has all sorts of difficult-to-understand logic with undocumented intent. This solution was truly cheap to get running.
