I was nothing short of blown away over the past few days when some comments I made on Twitter about UI automation caused a lot of folks to raise their eyebrows.
Here’s the tweet in question.
I’m not against discussions on the invalidity of the test automation pyramid.
If you don’t like it, you use whatever model you want as long as it suggests you write AS FEW UI TESTS AS POSSIBLE.
seriously – stop your infatuation with UI tests
— Alan Page (@alanpage) May 3, 2018
Feedback (blowback?) ranged from accusations of harmful blanket statements to lectures on how what I really meant was “checking”, not testing – with a handful of folks who seemed worried that the world I was describing was too scary compared to the world where they lived. Testing evolves at different speeds in different places, so the last point, at least, was expected. But I stand by the sentiment in that tweet.
The test automation pyramid is a model for thinking about the distribution of your tests. Mike Cohn first (I think) wrote about it here. In my opinion (and observation), there is an unhealthy obsession in software testing with writing “automation” (where “automation” means UI automation, i.e. using tools like Selenium to drive the UI of the system under test). A few folks have recently complained about the pyramid (“it’s not a pyramid, it’s a triangle!”; “there are more than 3 types of testing”, etc.).
“All Models are wrong, some are useful” — George Box
It’s a good model, with a lot of practical application. Two key takeaways from the pyramid model are:
- Write tests at the lowest level where they can find the bug (a minimal sketch follows this list)
- Minimize the amount of top/UI level tests
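To make the first takeaway concrete, here is a minimal sketch of testing at the lowest possible level. The function and its rule are hypothetical (nothing like this appears in the post); the point is that a unit test exercises the logic directly and runs in milliseconds, where a Selenium test covering the same rule would need to launch a browser, navigate, and submit a form just to reach this one line of code.

```python
# Hypothetical example: a share-name validation rule tested at the unit level.
# A UI test for the same rule would drive a browser through a form; this one
# calls the function directly and runs in milliseconds.

def validate_share_name(name: str) -> bool:
    """Hypothetical production code: share names must be 1-80 characters
    and may not contain filesystem-reserved characters."""
    reserved = set('\\/:*?"<>|')
    return 0 < len(name) <= 80 and not (set(name) & reserved)

def test_rejects_reserved_characters():
    assert not validate_share_name("my:share")

def test_rejects_empty_name():
    assert not validate_share_name("")

def test_accepts_plain_name():
    assert validate_share_name("team-docs")
```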
I’m passionate about the second point for three main reasons:
1. UI tests are flaky. This was true 25 years ago when I first wrote UI automation, and it’s true today. They’re just not as reliable and trustworthy as lower-level tests.
2. Despite the fact that reliable UI tests are difficult to write, we (as an industry) seem to think that UI automation is a reasonable entry point to the world of coding for “manual” testers. UI automation is a horrible way to start programming. Shell (or other) scripts to help set up test environments or generate test data would be a much better use of time (and achieve more success) than learning to code by writing UI tests; see the data-generation sketch after this list.
3. UI tests are s l o w. This is fine if you have a handful of tests, but a huge issue if you have hundreds or thousands of UI tests.
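As a concrete example of that kind of starter scripting, here is a minimal sketch of a test-data generator. The file name and record shape are my own assumptions, but a script like this teaches loops, functions, and file I/O without any of the brittleness of driving a UI.

```python
# A minimal test-data generator: the kind of small, useful script that is a
# far better first programming project for a tester than UI automation.
# The output file and record shape here are illustrative assumptions.
import csv
import random
import string

def random_name(length: int = 8) -> str:
    """Return a random lowercase string to use as a fake username."""
    return "".join(random.choices(string.ascii_lowercase, k=length))

def generate_users(path: str, count: int = 100) -> None:
    """Write `count` fake user records to a CSV file for tests to consume."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["username", "email"])
        for _ in range(count):
            name = random_name()
            writer.writerow([name, f"{name}@example.test"])

if __name__ == "__main__":
    generate_users("test_users.csv")
```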
Note – if you have a large amount of testing that can only be done at the UI level, that’s a big red testability flag you should probably address before investing in expensive UI testing.
Let’s assume for a moment that the problems with UI automation stability have been solved (companies like testim.io have used ML to make some strides in this area, and despite the entry-path problem I mentioned above, there is improvement in automation tools and tester skills). If we go with this assumption, then points #1 and possibly #2 above are no longer issues.
Point #3, however, is not solvable. Tests that automate the UI are slow. Way slow. Like a glacier stuck in molasses slow. I once wrote a UI-based networking test to create a folder, share it, connect to it, write files to it, delete the files, and then unshare the folder. That test took a little less than two minutes. The problem was, I needed to test that process for every possible character in isolation (due to issues with DBCS code pages on non-Unicode Windows; the details would fill pages of no-longer-relevant information). On Chinese Windows, for example, this was (IIRC) somewhere near 8,000 characters.
I wrote an API-level test that covered the entire code page, including varying lengths of folder and share names, and ran in under 5 minutes (and in less than a minute for Western code pages). Of course, we still did spot checking (both exploratory and via some UI automation), but testing at the level closest to where we could find the bugs was the most efficient, in both proximity and speed.
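The original was written against Windows APIs in MS Test, so the sketch below is only a rough modern translation of the idea: sweep a large character range through the filesystem API directly, with no UI in the loop. The character range and checks are illustrative, not a real DBCS code-page enumeration.

```python
# A rough sketch (not the original test) of sweeping a character range at the
# API/filesystem level instead of through the UI. A real DBCS sweep would
# enumerate the actual code page (~8,000 characters on Chinese Windows).
import os
import tempfile

def test_single_character_folder_names():
    candidates = [chr(cp) for cp in range(0x4E00, 0x4E80)]  # illustrative CJK slice
    with tempfile.TemporaryDirectory() as root:
        for ch in candidates:
            folder = os.path.join(root, ch)
            os.mkdir(folder)                      # create the folder
            assert os.path.isdir(folder)          # name round-trips through the API
            file_path = os.path.join(folder, "f.txt")
            with open(file_path, "w", encoding="utf-8") as f:
                f.write("x")                      # write a file into it
            os.remove(file_path)                  # delete the file
            os.rmdir(folder)                      # clean up the folder
```

Each iteration here is a few filesystem calls rather than a two-minute UI walkthrough, which is what makes a sweep of thousands of cases tractable.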
Another view of tests I like is the size model from Google. Rather than dwell too much on what makes a test a unit or integration test, think of tests in sizes, where tests of a certain duration are classified at different levels. This model works well (and sidesteps the pyramid complaints I’ve seen recently), but it doesn’t have a visualization.
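As a hypothetical way to put the size model into practice (the marker names and duration budgets below are my assumptions, not Google’s exact definitions), tests can be tagged by size so each tier runs on its own cadence:

```python
# A sketch of Google's test-size idea expressed as pytest markers.
# Marker names and the budgets in the comments are illustrative; register
# custom markers in pytest.ini to avoid warnings.
import pytest

@pytest.mark.small       # e.g. sub-second, no network or disk
def test_share_name_lowercasing():
    assert "DOCS".lower() == "docs"

@pytest.mark.medium      # e.g. minutes, localhost services allowed
def test_share_roundtrip_against_local_server():
    ...

@pytest.mark.large       # e.g. anything slower; full system, UI allowed
def test_end_to_end_share_flow_in_browser():
    ...
```

Running `pytest -m small` on every change keeps the fast tier fast, while the medium and large tiers can run on a slower cadence.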
So – without further babbling, I created this alternate view – The Test Automation Snowman.
Use it, or ignore it. But I still beg you to consider writing far fewer UI based tests.
Ancient Google Test Sizes:
https://testing.googleblog.com/2010/12/test-sizes.html
I think your argument makes complete sense, and your automation snowman is a good diagram as well. One thing I feel is missing from the diagram is a quantification of “slow”, “slightly slower”, and so on. I believe that people who have never written a unit or integration test have a different concept of what is slow. Someone who writes only UI tests thinks that a 30-minute UI test is slow, while someone like myself, who knows that a unit test can run in milliseconds, considers anything longer than a 30-second UI test to be slow. I used to be the former too 🙂
So I think that by adding quantification to your snowman, you could possibly avoid the miscommunication that can result from different perspectives.
The only reason I didn’t add numbers is that I think it’s fair to have some variance between types of applications and release cadences (for something shipping quarterly, I think it’s OK to have a suite of tests that takes 2 hours, but I’d never tolerate that for something deploying continuously).
But I think it’s worth calling it out now. Thanks.
I think one possible reason for the large amount of UI automation could be that testers are still black-box testing for many things, so using the browser is all they know.
I myself want to grow and move more toward learning to write tests at the lower levels, since the majority of my experience is writing UI tests.
You wrote UI automation 25 years ago?
Yep. For Windows apps using MS Test (later became Visual Test). The beta version was available in 1993, and I used it frequently.