By Dave Farley
At LMAX, where I worked for a while, they have extensive, world-class automated acceptance testing. LMAX tests every aspect of its system, and this is baked into the development process. No story is deemed complete unless every acceptance criterion associated with it has a passing, automated, whole-system acceptance test.
This is a minimum: usually there is more than one acceptance test per acceptance criterion. This raises the question: “What is an acceptance test?” I recently had a discussion on exactly this topic with some friends, trying to define the scope of acceptance tests more clearly. This was triggered by an article published by Mike Wacker of Google, who claimed that it was not practical to keep “end-to-end tests passing”.
My former colleague Adrian replied, in essence, that LMAX has been living with exactly this kind of complex end-to-end test for the past eight or nine years. This sparked a debate on the meaning of end-to-end testing which I will skip for now. I will use the term “acceptance testing” to mean the sort of testing described in the Google article; I think their intent is what I mean by acceptance tests. There is a serious problem to address here, that of test maintainability.
As soon as you adopt an extensive automated testing strategy you also take on the problem of living with your tests. I don’t know the details of Google’s testing approach, but there are several things in Mike’s article that suggest that Google is succumbing to some common problems:
Firstly, their feedback cycle is too long! The article talks about building and testing the latest version of a service “every night”. That is acceptable in a few limited, difficult circumstances – if you are burning your software into hardware devices, for example. Otherwise, it is unacceptably slow and will compromise the value and maintainability of your tests.
As my ex-colleague Mike Roberts used to say: “Continuous is more often than you think”. Testing every night is too slow; you need valuable feedback much more frequently than that. I think that you should be aiming for commit stage feedback in under 5 minutes (under 10 is survivable, but unpleasant) and acceptance stage feedback in under 30 minutes (60 is survivable, but unpleasant). I think that unit testing alone is insufficient, for some of the reasons that the Google article cites.
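To make those feedback targets concrete, here is a minimal sketch of a check that a build could run against its own stage timings. The name `rate_feedback`, the `BUDGETS_MINUTES` table and the three labels are hypothetical, invented for this illustration. The boundary handling (strictly under the target, up to and including the survivable limit) is an assumption, not something the targets above specify.

```python
# Hypothetical feedback budgets in minutes: (target, survivable limit).
# The numbers come from the targets in the text; the structure is an assumption.
BUDGETS_MINUTES = {
    "commit": (5, 10),
    "acceptance": (30, 60),
}


def rate_feedback(stage: str, minutes: float) -> str:
    """Classify how long a pipeline stage took against its budget."""
    target, survivable = BUDGETS_MINUTES[stage]
    if minutes < target:
        return "on target"
    if minutes <= survivable:
        return "survivable"
    return "too slow"


# A nightly run, roughly eight hours or more between results, is far outside the budget.
print(rate_feedback("acceptance", 8 * 60))
```

A team might fail the build, or at least raise a warning, whenever a stage is rated anything other than "on target", so that slow feedback is treated as a defect rather than allowed to drift.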
There are hints of other problems. “Developers like it because it off-loads most, if not all, of the testing to others”. I think that this is a common anti-pattern. It is vital that developers own the acceptance tests. In the very early stages of a test’s creation someone in a different role may sketch it, but developers are the people who will break the tests and so they are the people who are best placed to fix them and maintain them.
This is, for me, an essential part of the Continuous Delivery feedback loop. I have never seen a successful automated testing effort based on a separate QA team writing and maintaining tests. The testing effort always lags, and there is no “cost” to the development team of completely invalidating the tests. Make the developers own the maintenance of the tests and you fix this problem. Prevent release candidates that fail any test from progressing by implementing a deployment pipeline, and make it a developer’s priority to keep the system in a “releasable state” – meaning “all tests pass”.
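The gating rule can be sketched in a few lines. This is only an illustration of the idea that a release candidate stops at the first failing stage. `ReleaseCandidate`, `run_pipeline` and the stage names are hypothetical, and a real deployment pipeline would run builds and test suites rather than simple callables.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ReleaseCandidate:
    """Hypothetical record of one build moving through the pipeline."""
    version: str
    passed_stages: list[str] = field(default_factory=list)
    releasable: bool = False


def run_pipeline(
    candidate: ReleaseCandidate,
    stages: list[tuple[str, Callable[[ReleaseCandidate], bool]]],
) -> ReleaseCandidate:
    """Run stages in order; a candidate that fails any stage goes no further."""
    for name, all_tests_pass in stages:
        if not all_tests_pass(candidate):
            return candidate  # rejected: later stages never see this candidate
        candidate.passed_stages.append(name)
    candidate.releasable = True  # "releasable" means every stage passed
    return candidate


# Stand-in stages; each callable represents "run this stage's tests".
stages = [
    ("commit", lambda rc: True),
    ("acceptance", lambda rc: False),  # a single failing acceptance test
    ("deploy", lambda rc: True),
]
result = run_pipeline(ReleaseCandidate("1.1"), stages)
print(result.releasable, result.passed_stages)
```

The point of the design is that there is no way around the gate: a candidate is either releasable because everything passed, or it is not releasable at all.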
The final vital aspect of acceptance tests is that they should be simple to create and easy to understand. This is all about ensuring that the infrastructure supporting your acceptance tests is appropriately designed. Allow for a clear separation of the “What” from the “How”. We want each test case to assert only “What” the system under test should do, not “How” it does it. This means that we need to abstract the specification of test cases from the technicalities of interacting with the system under test.
The Google article is right that unit tests, particularly those created as part of a robust TDD process, are extremely valuable and effective. They do, though, only tell part of the testing story. Acceptance tests, which test your system in life-like circumstances, are to me a fundamental part of an effective testing strategy. Although theoretically you could cover everything you need in unit tests, in practice we are never smart enough to figure that out. Evaluating our software from the perspective of our users is at the core of a CD testing strategy.
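One way to picture this separation is a small test vocabulary layered over a driver. The sketch below is an assumption about how such a layering could look, not a description of LMAX’s infrastructure. `ShopDsl`, `InMemoryDriver` and the ordering scenario are all hypothetical, and the in-memory driver stands in for whatever would really drive a UI or API.

```python
from __future__ import annotations

from typing import Protocol


class Driver(Protocol):
    """The "How": the technicalities of talking to the system under test."""

    def register_user(self, name: str) -> None: ...
    def place_order(self, user: str, item: str, quantity: int) -> None: ...
    def orders_for(self, user: str) -> list[tuple[str, int]]: ...


class InMemoryDriver:
    """Hypothetical stand-in for a real UI or API driver, so the sketch runs."""

    def __init__(self) -> None:
        self._orders: dict[str, list[tuple[str, int]]] = {}

    def register_user(self, name: str) -> None:
        self._orders[name] = []

    def place_order(self, user: str, item: str, quantity: int) -> None:
        self._orders[user].append((item, quantity))

    def orders_for(self, user: str) -> list[tuple[str, int]]:
        return list(self._orders[user])


class ShopDsl:
    """The "What": the vocabulary that test cases are written in."""

    def __init__(self, driver: Driver) -> None:
        self._driver = driver

    def given_a_registered_user(self, name: str) -> None:
        self._driver.register_user(name)

    def when_user_orders(self, user: str, item: str, quantity: int) -> None:
        self._driver.place_order(user, item, quantity)

    def then_order_is_recorded(self, user: str, item: str, quantity: int) -> None:
        assert (item, quantity) in self._driver.orders_for(user)


def user_can_place_an_order(dsl: ShopDsl) -> None:
    """The test case itself mentions no URLs, screens, or message formats."""
    dsl.given_a_registered_user("alice")
    dsl.when_user_orders("alice", item="book", quantity=2)
    dsl.then_order_is_recorded("alice", item="book", quantity=2)


user_can_place_an_order(ShopDsl(InMemoryDriver()))
```

If the way the system is driven changes, only the driver needs to change. The test case, which states what the user should be able to do, stays as it is.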
So here are my guidelines for a successful test strategy:
- Automate virtually all of your testing.
- Don’t look to tests to verify, look to them to falsify.
- Don’t release if a single test is failing.
- Do automate user scenarios as acceptance tests.
- Do focus on short feedback loops (roughly 5 minutes for commit stage tests and 45 minutes for acceptance tests).
You can find a video of me presenting in a bit more detail on some of these topics here: https://vimeo.com/channels/pipelineconf/123639468