Stop Writing Automation

After releasing The A Word, I didn’t plan on writing any more posts about automation. But, after pondering transitions in test, and after reading this post from Noah Sussman, I have a thought in my head that I need to share.

I don’t think testers should write automation. Star

I suppose I better explain myself.

All automation isn’t created equal. Automation works wonderfully for short confirmatory or validation tests. Unit, functional, acceptance, integration tests, and all other “short” tests lend themselves very well to automation. But I think it’s wasteful and inefficient to have testers write this automation – this should be written by the code owners. Testing your own code (IMO) improves design, prevents regression, and takes much, much less time than passing code off to another team to test.

That leaves the test team to write automation for end-to-end scenarios. There’s nothing wrong with that…except that writing end to end automated tests is hard (especially, as Noah points out, at the GUI level). The goal of automation (as touted by many of the vendors), is to enable running a bunch of tests automatically, so that testers will have more time for hands-on testing. In reality, I think that most test teams spend half of their time writing automation, half of their time maintaining and debugging automation, and half of their time doing everything else.

Let’s look at a typical scenario.


Pretty easy – you only need to automate three actions, add a bit of validation and you’re done!

If you’ve attempted something like this before, you know that’s not the whole story. A good automated test doesn’t just execute a set of user actions and then go on it’s way – you need to look at error conditions, and try to write the test in a way that prevents it from breaking.


It’s hard. It’s fragile. It’s a pain in the ass. So I say, stop doing it.

“But Alan – we have to test the scenario so that we know it works for our customers”. I believe that you want to know that the scenario works – but as much as you try to be the customer, nobody is a better representative of the customer than the customer. And – validating scenarios is quite a bit easier if we let the customers do it.

NOTE: I acknowledge that for some contexts, letting the customers generate usage data for an untested scenario is an inappropriate choice. But even when I’ve already tested the scenario, I still want (need!) to know what the customer experience is. As a tester, I can be the voice of the customer, but I am NOT the customer.

Now – let’s look at the same scenario:


But this time, instead of writing an automated test, I’ve added code (sometimes called instrumentation) to the product code that let’s me know when I’ve taken an action related to the scenario. While we’re at it, let’s add more instrumentation at every error path in the code.


Now, when a user attempts the scenario, you could get logs something like this:

05072014-1601-3143: Search started for FizzBuzz
05072014-1601-3655: Download started for FizzBuzz
05072014-1602-2765: Install started for FizzBuzz
05072014-1603-1103: FizzBuzz launched successfully


05072014-1723-2143: Search started for FizzBuzz
05072014-1723-3655: Download started for FizzBuzz
05072014-1723-2945: ERROR 115: Connection Lost


05072014-1819-3563: Search started for FizzBuzz
05072014-1819-3635: ERROR 119: Store not available.


From this, you can begin to evaluate scenario success by looking at how many people get through the entire scenario, and how many fail for particular errors.


or generate data like this:


And now we have information about our product that’s directly related to real customer usage, and the code to enable it is substantially more simple and easy to maintain than the traditional scenario automation.

StarA more precise way to put, I don’t think testers should write automation. is – I think that many testers in some contexts don’t need to write automation anymore. Developers should write more automation, and testers should try to learn more from the product in use. Use this approach in financial applications at your own risk!

Testing Trends…or not?

I read this article over the weekend about five emerging trends in software testing – Test Automation; Rise of mobile and cloud; Emphasis on security; Context-driven testing; and More business involvement.

I fully acknowledge that I work in a software development environment that isn’t like many others, but while reading the article, I really didn’t feel like any of those areas are “emerging” – all are fully emerged already. Sure, the trends are interesting to testers, but emerging? I could waste some space rebutting or commenting on the areas above, but instead, let me offer some alternate trends that I see inside of MS and from some of my colleagues who work elsewhere.

Fuzzier Role Definitions.  I don’t really like the terms “whole team approach” or “combined engineering”, but I do see software teams really figuring out how to work better together and leverage every team members strengths effectively. Great testers are working as Test Specialists and working much more broadly across the team. I expect the “lines” between software disciplines to fade even more in the future.

Developers Own More Testing. You can call them “checks” if you wish (I call them “short tests”, but software developers are beginning to own much bigger portions of traditional software testing. This is a good thing – it ensures that daily code quality is high, and gives test specialists a high quality product to work with.

Testing Live Sites. Mock-test environments typically do a poor job representing production environments. Other than brief sanity checks for the most critical components, many web service teams just roll their new bits straight to production, and then run their tests against the live system. With a good monitoring system (including the ability to stage rollouts and automatically roll back if needed), this is a safe, efficient, and frankly, practical method for testing services.

Data is HUGE. Many software teams have figured out that the best way to get an accurate representation of how customers use software is collect and analyze data from those same customers. A whole lot of traditional test activities can be replaced by product instrumentation, and an efficient method for getting product instrumentation back to the team for analysis. On a lot of teams, last year’s testers are this  year’s data analysts and data scientists. While not every tester is cut out for this role, this move to data analysis is a strong trend on a lot of software teams.

To critique myself for a moment, I think a lot of readers could say that none of these points are emerging either. That’s a fair point, since I know teams that have been doing everything above for years…but I’m just now seeing some of these trends “emerge” on multiple teams (and not just those testing web services or sites).

What trends do you see? Did I miss anything huge? Have the above four points already reached the tipping point of emergence?

Users, Usage, Usability, and Data

The day job (and a new podcast) have been getting the bulk of my time lately, but I’m way overdue to talk about data and quadrants.

If you need a bit of context or refresher on my stance, this post talks about my take on Brian Marick’s quadrants (used famously by Gregory and Crispin in their wonderful Agile Testing book); and I assert that the the left side of the quadrant is well suited for programmer ownership, and that the right side is suited for quality / test team ownership. I also assert that the right side knowledge can be obtained through data, and that one could gather what they need in production – from actual customer usage.

And that’s where I’ll try to pick up.


Agile Testing labels Q3 as “Manual”, and Q4 as “Tools”. This is (or can be) generally true, but I claim that it doesn’t have to be true. Yes, there are some synthetic tests you want to run locally to ensure performance, reliability, and other Q4 activities, but you can get more actionable data by examining data from customer usage. Your top-notch performance suite doesn’t matter if your biggest slowdown occurs on a combination of graphics card and bus speed that you don’t have in your test lab. Customers use software in ways we can’t imagine – and on a variety of configurations that are practically impossible to duplicate in labs. Similarly, stress suites are great – but knowing what crashes your customers are seeing, as well as the error paths they are hitting is far more valuable. Most other “ilities” can be detected from customer usage as well.

Evaluating Q3 from data is …interesting. The list of items in the graphic above is from Agile Testing, but note that Q3 is the quadrant labeled (using my labels) Customer Facing / Quality Product. You do Alpha and Beta testing in order to get customer feedback (which *is* data, of course), but beyond there, I need to make a bit larger leaps.

To avoid any immediate arguments, I’m not saying that exploratory testing can be replaced with data, or that exploratory testing is no longer needed. What I will  say is that not even your very best exploratory tester can represent how a customer uses a product better than the actual customer.

So let’s move on to scenarios and, to some extent, usability testing. Let’s say that one of the features / scenarios of your product is “Users can use our client app to create a blog post, and post it to their blog”. The “traditional” way to validate this scenario is to either make a bunch of test cases (either written down in advance(yuck) or discovered through exploration) that create blog entries with different formatting and options, and then make sure it can post to whatever blog services are supported. We would also dissect the crap out of the scenario and ask a lot of questions about every word until all ambiguity is removed. There’s nothing inherently wrong with this approach, but I think we can do better.

Instead of the above, tweak your “testing” approach. Instead of asking, “Does this work?”, or “What would happen if…?”, ask “How will I know if the scenario was completed successfully?” For example, if you knew:

  • How many people started creating a blog post in our client app?
  • Of the above set, how many post successfully to their blog
  • What blog providers do they post to
  • What error paths are being hit?
  • How long does posting to their blog take?
  • What sort of internet connection do they have?
  • How long does it take for the app to load?
  • After they post, do they edit the blog immediately (is it WYSIWYG)?
  • etc.

With the above, you can begin to infer a lot about how people use your application and discover outliers, answer questions; and perhaps, help you discover new questions you want to have answered. And to get an idea of whether they may have liked the experience, perhaps you could track things like:

  • How often do people post to their blog from our client app?
  • When they encounter an error path, what do they do? Try again? Exit? Uninstall?
  • etc.

Of course, you can get subjective data as well via short surveys. These tend to annoy people, but used strategically and sparsely, can help you gauge the true customer experience. I know of at least one example at Microsoft where customers were asked to provide a star rating and feedback after using an application – over time, the team could use available data to accurately predict what star rating customers would give their experience. I believe that’s a model that can be reproduced frequently.

Does a data-prominent strategy work everywhere? Of course not. Does it replace the need for testing? Don’t even ask – of course not. Before taking too much of the above to heart, answer a few questions about your product. If your product is a web site or web service or anything else you can update (or roll back) as frequently as you want, of course you want to rely on data as much as possible for Q3 and Q4. But, even for “thick” apps that run on a device (computer, phone, toaster) that’s always connected, you should also consider how you can use data to answer questions typically asked by test cases.

But look – don’t go crazy. There are a number of products, where long tests (what I call Q3 and Q4 tests) can be replaced entirely by data. But don’t blindly decide that you no longer need people to write stress suites or do exploratory testing. If you can’t answer your important questions from analyzing data, by all means, use people with brains and skills to help you out. And even if  you think you can get all your answers with data, use people as a safety net while you make the transition. It’s quite possible (probable?) to gather a bunch of data that isn’t actually the data you need, and then mis-analyze it and ship crap people don’t want – that’s not a trap you want to fall into.

Data is a powerful ally. How many times, as a tester, have you found an issue and had to convince someone it was something that needed to be fixed or customers would rebel? With data, rather than rely on your own interpretation of what customers want, you can make decisions based on what customers are actually doing. For me, that’s powerful, and a strong statement towards the future of software quality.

Alan and Brent talk testing…

Brent Jensen and I found some time together recently to talk about testing. For some reason, we decided to record it. Worse yet, we decided to share it! I suppose we’ll keep this up until we either run out of things to say (unlikely), or until the internet tells us to shut up (much more likely).

But for now, here’s Alan and Brent talk about Testing.

Subscribe to the ABTesting Podcast!

Subscribe via RSS
Subscribe via iTunes

Swiss Testing Day 2014

As you may have noticed, my blogging has slowed. I’ve been navigating and ramping up on a new team and helping the team shift roles (all fun stuff). I’ve also been figuring out how to work in a big org (our team is “small”, but it’s part of one of Microsoft’s “huge” orgs – and there are some learning pains that go with that.

But – I did take a few days off this week to head to Zurich and give a talk at Swiss Testing Day. I attended the conference three years ago (where I was impressed with the turnout, passion and conference organization), and I’m happy to say that it’s still a great conference. I met some great people (some who I had virtually met on twitter), and had a great time catching up with Adrian and the rest of the Swiss-Q gang (the company that puts on Swiss Testing Day).

I talked (sort of) about testing Xbox. I talked a bit about Xbox, and mostly about what it takes to build and test the Xbox. I stole a third or so from my Star West keynote, and was able to use a few more examples now that the Xbox One has shipped (ironically, the Xbox One is available in every country surrounding Switzerland, but not in Switzerland). Of note, is that now I’ve retired from the Xbox team (or moved on, at least), I expect that I’m done with giving talks about Xbox (although I expect I’ll use examples of the Xbox team in at least one of my sessions at Star East).

I owe some follow ups on my Principles post, and I promise to get to those soon. I’ll define “soon” later.

Some Principles

I’ve been thinking a lot less about testing activities lately, and much, much more on how we to make higher quality software in general. The theme is evident from my last several blog posts, but I’m still figuring out exactly what that means for me. What it boils down to, is a few principles that reflect how I approach making great software.

  1. I believe in teams made up of generalizing specialists – I dislike the notion of strong walls between software disciplines (and in a perfect world, I wouldn’t have separate engineering disciplines).
  2. It is inefficient (and wasteful) to have a separate team to conduct confirmatory testing (e.g. “checks” as many like to call them). This responsibility should lie with the author of the functionality.
  3. The (largely untapped) key to improving our ability to improve software quality comes from analysis and investigation of data (usage patterns, reliability trends, error path execution, etc.).

I haven’t written much about point #3 – that will come soon. Software teams have wasted huge amounts of time and money equating test cases and test pass rates to software quality, and have ignored trying to figure out if the software is actually useful.

We can do better.

A few links:

Riffing on the Quadrants

In 2003, Brian Marick introduced the concept of “Agile Quadrants” to describe the scope of testing[1] (later expanded on by Lisa Crispin and Janet Gregory[2]). Several people (including me) have expanded and elaborated on the quadrants to describe the scope and activities of testing.

Here’s a version of the testing quadrants.


One challenge I’ve seen called out before is that there are varying definitions of what a unit actually is. Many unit tests in existence today are actually functional tests, and some may even be better defined as acceptance tests.

I’ve noticed that we see the same behavior on the right side of the quadrant – items in Q4 sometimes slide up to Q3, and items in Q3 may slide down to Q4 depending on how the data (and definitions) are used. But note that while items may move up or down depending on definition or application, we don’t see activities move from right to left or left to right.

If we look a little closer, there’s a clear equivalence class between activities in Q1 and Q2 vs. activities in Q3 and Q4. The activities on the left are confirmatory tests, and the results (whether they’re pass fail, or an analysis failure) tell you exactly what to do – while the activities on the left require analysis or investigation. (noted indirectly here) Furthermore, almost everything on the right side is about (or can be about) data! As test organizations move towards more data analysis and investigation, this is important. Also note that in many contexts (especially, but not limited to web services), the activities in the right half of the quadrant can all be done in production.


Now, this is just a riff, so I’m not entirely sure where I’m going (or if there is a huge hole in my logic), but I’m beginning to see both a strong correlation between activities on the left side and on the right side of the quadrants, and a distinction in skill sets as well. Deep data analysis is quite a bit different from feature development, (and unit and functional testing skills as well).

I remain a fan and supporter of generalizing specialists, and I’m seeing a much bigger need for data specialists and people who can investigate through ambiguity to help build great teams (and yes, we’ll need people that can smoke jump into activities in any of these quadrants as well).

More riffing later as I continue to explore.



Australia 2013-2014

No testing related content in this post – just a quick trip report of our family vacation to Australia that I can point people to as necessary.

Sydney (part 1)

After a long flight, we landed in Sydney, where we immediately set out to explore a bit (I’ve found that walking in the sun is a great way to beat jet lag). We ate dinner at Darling Harbor, and got to bet at a reasonable time so we could get up for our flight the next day.

The Outback

Our first (extended) stop was in Ayers Rock (Uluru). Ayers RockThanks to the time change, we had no trouble getting up early to watch the sun rise and light up Uluru. After taking a few pictures, we drove a bit closer (we rented a car for this portion of the trip) so we could do the hike around the base. Uluru Trail

We got an early start, but the heat was well over 100f by the time we finished around 10am. Still, we were rewarded with some once in a lifetime views and scenes.

More Uluru

The next day we went on a similar exploration of the area known as Kata Tjuta. Here, we saw much more wildlife (including our first views of wild kangaroos). Again, we went out early to avoid the heat, and spent the hottest part of the day indoors. The next day (or maybe the day after), we checked out and drove (sort of) towards Alice Springs.

Alice SpringsPath to rim walk

We took a detour on the way to Alice Springs to visit Kings Canyon. With the kids (and given that we had a long day of driving ahead of us), we just hiked to the first lookout – but if I ever, for some reason, end up at Kings Canyon again, I would love to do the rim walk – the picture at the right shows the “entrance” to the rim walk – a massive carved staircase that takes hikers up to “rim level”.

On our way out of the park, I took a little grief from my travel companions for an unavoidable murder of a large (~2 foot) lizard who happened to be basking in the middle of the road as I drove. Fortunately, I redeemed myself a few hours later when I managed to miss a kangaroo who bounced across the road in front of me.

Our buddyWe arrived safe and sound in Alice Springs, and spent a few days relaxing and exploring a bit. We found our way to Desert Park, a zoo of sorts, filled with animals and plants native to the central Australian desert. It was there where we were able to meet the guy on the left. We drove directly from Desert Park to the Alice Springs airport.


We landed in Cairns early evening on December 24th, and had a small in-room Christmas celebration before grabbing some food and spending most of the rest of the day in the pool.

We spent the next two days snorkeling the Great Barrier Reef, and had an amazing time. The second day was only the boy and I, but we were able to track down both a ray and a shark (a two-foot long non-man-eating shark, but it was exciting anyway). After another day (or two – who remembers?), we were off to Brisbane.


Brisbane was essentially our gateway city to the beach, but we made sure we had a full day to soak up what we could. The Queensland museum had an exhibit by Cai Guo-Qiang that Museum 1was quite interesting. Pictures to the right and below (and more info here). We also spent a big chunk of time at the South Bank park swimming and lounging – and enjoying some great restaurants.

Museum 2

Gold Coast / Sunshine Coast

Next up was five or so days each on the Gold Coast and Sunshine Coast (both fairly short train-rides from Brisbane). The kids and I spent three days at Dream World and White Water world amusement parks (where I somehow was talked into riding this monstrosity.Kill me now

Other than that, we mostly played on the beach (or in the pool). I finished up reading a few books I’d been working through, and downloaded and read half a dozen novels as well.


I went to Melbourne last year as a quick side trip after giving a few presentations in Sydney. This time, with family in tow, I didn’t hit quite as many restaurants as last time, but we were able to tack on the incredible Melbourne zoo, and the aquarium. LemurI especially enjoyed hanging out with the lemurs, but I was impressed with just about every animal exhibit (note the lack of basketball in any of these pictures – that’s an inside joke for PK and KK).KeithTortoise

The aquarium was equally impressive – especially for the shark lover in our family. I’ll let the picture tell the story here.

Ba-dum   Ba-dum...

Sydney (part 2)

We spent the final four days of our trip back in Sydney. This time we stayed at the Holiday Inn (located right between the Opera House and the Sydney Bridge – with a rooftop pool that views both!). We forced the kids to walk across the Sydney Bridge (and back), and explored the aquarium (almost déjà vu, but there were quite a few different fish and sharks in the Sydney aquarium).Bridge from Opera HouseOpera House from pool

More bridge shots

We saw The Illusionists at the Opera House (in the concert hall), and the kids and I went to see a play in the downstairs playhouse the next day. We tried to see as much as we could before we left, but I’m sure we missed some must-see places (both in Sydney, and in our other adventures).

At this point, or body clocks are all back in the correct time zone, and we’re all slowly getting back into the rhythm of our west coast lives. It was a fantastic trip, and one we will remember for a long, long time.