An Angry Weasel Hiatus

I mentioned this on twitter a few times, but should probably mention it here for completeness.

I’m taking a break.

I’m taking a break from blogging, twitter, and most of all, my day job. Beyond July 4, 2014, I’ll be absent from the Microsoft campus for a few months – and most likely from social media as well. While I can guarantee my absence from MS, I may share on twitter…we’ll just have to see.

The backstory is that Microsoft offers an award – a “sabbatical” for employees of a certain career stage and career history. I received my award nine years ago, but haven’t used it yet…but now’s the time – so I’m finally taking ~9 weeks off beginning just after the 4th of July, and concluding in early September.

Now – hanging out at home for two months would be fun…but that’s not going to cut it for this long-awaited break. Instead, I’ll be in the south of France (Provence) for most of my break before spending a week in Japan on my way back to reality.

I probably won’t post again until September. I’m sure I have many posts waiting on the other side.

-AW

Alan and Brent are *still* talking…

In case you missed it, Brent and I are still recording testing podcasts. I stopped posting the announcements for every podcast post on this blog, but if you want to subscribe, you have a few choices.

  1. You can Subscribe via RSS
  2. You can Subscribe via iTunes
  3. You can bookmark the podcast blog
  4. You can follow me on twitter and look for the announcements (usually every other Monday)

And for those of you too busy to click through, here are the episodes so far:

  1. AB Testing – Episode 1
  2. AB Testing – Episode 2
  3. AB Testing – Episode 3
  4. AB Testing – Episode 4
  5. AB Testing – Episode 5

Stop, If You Want To…

Well – that was a fun post. The dust hasn’t quite settled, but a follow-up is definitely in order.

First, some context. I was committed to giving a lightning talk as part of STAR East’s “Lightning Strikes the Keynotes” hour. I purposely didn’t pick a topic before I left, and figured I would come up with something while I was there. On Wednesday morning (the day of the lightning talks), I was out for a run thinking about the conference when I had the idea to talk about testers writing fewer automated tests. Programmers should be writing more tests (ok, “checks” for those of you who insist), and just about every mention of automation I heard at the conference talked about the challenges in automating end-to-end scenarios and equal challenges in maintaining that automation – so I realized it was a topic worth exploring. I could have called the post Testers Should Stop Writing Some of Their Automation Because Programmers Should Do Some of It, and Some of Your Automation Isn’t Very Good Anyway and It Is Getting in the Way of Testing You Should Be Doing, but I chose a more controversial (and shorter) title instead. I purposely left out the types of automation (and other coding activities) that testers still should do, because I was afraid that if I did, it would distract from the main two points (programmers need to write a lot more tests, and testers spend too much time writing and maintaining bad automation).

Also – I only had five minutes to talk about it, so I stuck to the main points:

  1. Developers need to own more testing.
  2. Testers need to stop wasting time writing ineffective automation.

But the fun really began with the comments. I’ve never had more fun reading blog comments, twitter, and my mailbox (for some reason a bunch of people prefer to email me directly with comments rather than comment publicly). I thought I’d comment on a few areas where I got a lot of questions.

Does this work for both services and “thick” clients?

Although I didn’t call it out in my post, this sort of approach works really well with services. But – it can work with thick clients too; you just need to be a little more careful with your deployment, since rollback and monitoring won’t happen in real time the way they do for a service. I think mobile apps are a great example of where you may run experiments with a limited number of users, but Windows (or Mac) apps could follow the model as well. For always (or often) connected devices, I see no reason not to push updates – of course, these updates should probably go through a bit more testing than services before being pushed, because if something is broken, getting back to a safe state will take a bit of work.

The important thing to note for those still in disbelief is that deploying test items to production is done all of the time. eBay does it, Amazon does it, Netflix does it (I could go on, but believe me that it’s done a lot). I don’t have a link, but in the comments, Noah Sussman tells me that NASA does it.

What Do Testers Do? Where did my cheese go? You are annoying!

The fear of cheese moving is strong (“If developers do functional testing, what will I do?”). There’s a lot of testing activity left to do (even when you take away writing developers’ tests for them and wasting time on unneeded automation). Stress and performance suites and monitoring tools (for example) should give the coding testers on the team plenty of work to do. Data analysis is also necessary if you’re gathering data from customers. And thanks to Roberto for pointing out that sanity-checking UI changes, testing for accessibility or localization, or verifying color changes could all use an onsite tester (or sometimes some code) to help.

And honestly, now that I’ve made you think about it, there are a few places where testers writing automation is useful. But it’s about time that testers stopped trying to write automation for cases that shouldn’t be automated. Want to try logging in and out 500 times? Automate that (ideally NOT at the GUI level) – go for it. Want to automate the end-to-end scenario of setting up an account and logging in? Please don’t bother. Instead, just add some monitoring code that lets you know if login is failing (something like the sketch below) and save yourself some frustration.
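To make “monitoring code” slightly less abstract, here’s a minimal sketch of the idea in Python: the login path (or a log processor) records each outcome, and an alert condition fires when the recent failure rate crosses a threshold. The class, thresholds, and numbers are all invented for illustration; this isn’t a description of any real product’s monitoring.

    # Hypothetical sketch: watch login health from production signals instead of
    # automating the end-to-end login scenario. All names here are invented.
    from collections import deque
    from datetime import datetime, timezone

    class LoginMonitor:
        """Tracks recent login outcomes and flags an unhealthy failure rate."""

        def __init__(self, window_size=500, failure_threshold=0.05):
            self.window = deque(maxlen=window_size)   # rolling window of recent outcomes
            self.failure_threshold = failure_threshold

        def record(self, succeeded: bool) -> None:
            # Called from the login code path, or fed from the service's logs.
            self.window.append((datetime.now(timezone.utc), succeeded))

        def failure_rate(self) -> float:
            if not self.window:
                return 0.0
            failures = sum(1 for _, ok in self.window if not ok)
            return failures / len(self.window)

        def is_unhealthy(self) -> bool:
            # When this returns True, alert someone (or trigger a rollback).
            return len(self.window) == self.window.maxlen and \
                   self.failure_rate() > self.failure_threshold

    if __name__ == "__main__":
        monitor = LoginMonitor(window_size=100, failure_threshold=0.05)
        for i in range(100):
            monitor.record(succeeded=(i % 10 != 0))   # simulate a 10% failure rate
        print(f"failure rate: {monitor.failure_rate():.0%}, unhealthy: {monitor.is_unhealthy()}")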

One other thing – the point about developers owning more testing – that one is equally true for services and clients. It doesn’t make sense to me at all to have a separate team verify functional correctness. It’s not “too hard” or “too much work” for developers to write tests. Developers need to own writing quality code – and doing this requires that they write tests. I was surprised that some people felt that it was bad for developers to test their own code – but I suppose that years of working in silos will make you believe that there’s some sort of taint in doing so (but there’s not!).

This is obviously a point (and a change that is really happening now at a lot of companies) that causes no small amount of fear in testers. I get that, but ignoring the change, burying your head in the sand, or justifying why testers need to own functional testing isn’t going to help you figure out how to function when these changes hit your team.

Stop Writing Automation

After releasing The A Word, I didn’t plan on writing any more posts about automation. But, after pondering transitions in test, and after reading this post from Noah Sussman, I have a thought in my head that I need to share.

I don’t think testers should write automation.*

I suppose I better explain myself.

Not all automation is created equal. Automation works wonderfully for short confirmatory or validation tests. Unit, functional, acceptance, integration tests, and all other “short” tests lend themselves very well to automation. But I think it’s wasteful and inefficient to have testers write this automation – this should be written by the code owners. Testing your own code (IMO) improves design, prevents regression, and takes much, much less time than passing code off to another team to test.

That leaves the test team to write automation for end-to-end scenarios. There’s nothing wrong with that…except that writing end-to-end automated tests is hard (especially, as Noah points out, at the GUI level). The goal of automation (as touted by many of the vendors) is to enable running a bunch of tests automatically, so that testers will have more time for hands-on testing. In reality, I think that most test teams spend half of their time writing automation, half of their time maintaining and debugging automation, and half of their time doing everything else.

Let’s look at a typical scenario.

[image: a typical scenario]

Pretty easy – you only need to automate three actions, add a bit of validation and you’re done!

If you’ve attempted something like this before, you know that’s not the whole story. A good automated test doesn’t just execute a set of user actions and then go on its way – you need to look at error conditions, and try to write the test in a way that prevents it from breaking.

[image: the same scenario, with error handling]

It’s hard. It’s fragile. It’s a pain in the ass. So I say, stop doing it.

“But Alan – we have to test the scenario so that we know it works for our customers”. I believe that you want to know that the scenario works – but as much as you try to be the customer, nobody is a better representative of the customer than the customer. And – validating scenarios is quite a bit easier if we let the customers do it.

NOTE: I acknowledge that for some contexts, letting the customers generate usage data for an untested scenario is an inappropriate choice. But even when I’ve already tested the scenario, I still want (need!) to know what the customer experience is. As a tester, I can be the voice of the customer, but I am NOT the customer.

Now – let’s look at the same scenario:

[image: the same scenario]

But this time, instead of writing an automated test, I’ve added code (sometimes called instrumentation) to the product code that lets me know when I’ve taken an action related to the scenario. While we’re at it, let’s add more instrumentation at every error path in the code.

[image: the scenario with instrumentation added]
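To give a rough idea of what that instrumentation could look like, here’s a made-up sketch in Python. A real product would use its own telemetry or eventing library; the function names and log format here are invented to roughly match the sample log lines below.

    # Hypothetical instrumentation sketch: the product code emits a marker at each
    # step of the scenario, and on every error path, instead of relying on an
    # end-to-end automated test to exercise the flow.
    import logging
    from datetime import datetime

    logging.basicConfig(format="%(message)s", level=logging.INFO)
    log = logging.getLogger("scenario")

    def _stamp() -> str:
        # Timestamp loosely modeled on the sample log lines shown below.
        return datetime.now().strftime("%m%d%Y-%H%M-%S%f")[:18]

    def track_step(step: str, item: str) -> None:
        log.info("%s: %s started for %s", _stamp(), step, item)

    def track_success(item: str) -> None:
        log.info("%s: %s launched successfully", _stamp(), item)

    def track_error(code: int, message: str) -> None:
        log.info("%s: ERROR %d: %s", _stamp(), code, message)

    # Sprinkled through the (imaginary) product code:
    track_step("Search", "FizzBuzz")
    track_step("Download", "FizzBuzz")
    track_step("Install", "FizzBuzz")
    track_success("FizzBuzz")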

Now, when a user attempts the scenario, you could get logs that look something like this:

05072014-1601-3143: Search started for FizzBuzz
05072014-1601-3655: Download started for FizzBuzz
05072014-1602-2765: Install started for FizzBuzz
05072014-1603-1103: FizzBuzz launched successfully

or

05072014-1723-2143: Search started for FizzBuzz
05072014-1723-3655: Download started for FizzBuzz
05072014-1723-2945: ERROR 115: Connection Lost

or

05072014-1819-3563: Search started for FizzBuzz
05072014-1819-3635: ERROR 119: Store not available.


From this, you can begin to evaluate scenario success by looking at how many people get through the entire scenario, and how many fail for particular errors.

[image: scenario completion and failure counts]

or generate data like this:

[image: data generated from the instrumentation]

And now we have information about our product that’s directly related to real customer usage, and the code to enable it is substantially simpler and easier to maintain than the traditional scenario automation.
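For completeness, here’s a toy Python sketch of the kind of roll-up that turns those per-user logs into scenario numbers. The parsing is deliberately simplified and the log shapes are assumed from the examples above; a real pipeline would query whatever telemetry store the team uses.

    # Hypothetical analysis sketch: given one log per user attempt, count how many
    # attempts completed the scenario and which errors the rest hit.
    from collections import Counter

    def summarize(attempts):
        """attempts: list of log texts, one per user attempt at the scenario."""
        completed = 0
        errors = Counter()
        for log_text in attempts:
            error_lines = [line for line in log_text.splitlines() if "ERROR" in line]
            if error_lines:
                # e.g. "...: ERROR 115: Connection Lost" -> "ERROR 115: Connection Lost"
                errors[error_lines[0].split(": ", 1)[1]] += 1
            elif "launched successfully" in log_text:
                completed += 1
        return completed, errors

    # Toy data shaped like the sample logs above.
    attempts = [
        "Search started\nDownload started\nInstall started\nFizzBuzz launched successfully",
        "Search started\nDownload started\n05072014-1723-2945: ERROR 115: Connection Lost",
        "Search started\n05072014-1819-3635: ERROR 119: Store not available.",
    ]
    completed, errors = summarize(attempts)
    print(f"{completed} of {len(attempts)} attempts completed the scenario")
    for error, count in errors.most_common():
        print(f"  {count} failed with {error}")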

* A more precise way to put “I don’t think testers should write automation” is this: I think that many testers in some contexts don’t need to write automation anymore. Developers should write more automation, and testers should try to learn more from the product in use. Use this approach in financial applications at your own risk!

Testing Trends…or not?

I read this article over the weekend about five emerging trends in software testing – Test Automation; Rise of mobile and cloud; Emphasis on security; Context-driven testing; and More business involvement.

I fully acknowledge that I work in a software development environment that isn’t like many others, but while reading the article, I really didn’t feel like any of those areas are “emerging” – all have fully emerged already. Sure, the trends are interesting to testers, but emerging? I could waste some space rebutting or commenting on the areas above, but instead, let me offer some alternate trends that I see inside of MS and from some of my colleagues who work elsewhere.

Fuzzier Role Definitions. I don’t really like the terms “whole team approach” or “combined engineering”, but I do see software teams really figuring out how to work better together and leverage every team member’s strengths effectively. Great testers are working as test specialists and working much more broadly across the team. I expect the “lines” between software disciplines to fade even more in the future.

Developers Own More Testing. You can call them “checks” if you wish (I call them “short tests”), but software developers are beginning to own much bigger portions of traditional software testing. This is a good thing – it ensures that daily code quality is high, and gives test specialists a high quality product to work with.

Testing Live Sites. Mock-test environments typically do a poor job representing production environments. Other than brief sanity checks for the most critical components, many web service teams just roll their new bits straight to production, and then run their tests against the live system. With a good monitoring system (including the ability to stage rollouts and automatically roll back if needed), this is a safe, efficient, and, frankly, practical method for testing services.
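As a rough sketch of those mechanics, the loop below stages a new build out to progressively larger slices of traffic and rolls back if the error rate climbs. The deploy, rollback, and error-rate functions are placeholders for whatever rollout and monitoring tooling a team already has; none of this describes a specific pipeline.

    # Hypothetical staged-rollout sketch: push new bits to a small slice of
    # production, watch the live error rate, and roll back automatically if the
    # build looks unhealthy. Thresholds and stage sizes are made up.
    import time

    ROLLOUT_STAGES = [0.01, 0.10, 0.50, 1.00]   # fraction of traffic on the new build
    MAX_ERROR_RATE = 0.02
    SOAK_SECONDS = 600                          # how long to watch each stage

    def deploy(build_id: str, traffic_fraction: float) -> None:
        print(f"routing {traffic_fraction:.0%} of traffic to {build_id}")   # placeholder

    def rollback(build_id: str) -> None:
        print(f"rolling back {build_id}")   # placeholder

    def current_error_rate(build_id: str) -> float:
        return 0.001   # placeholder: read this from the monitoring system

    def staged_rollout(build_id: str) -> bool:
        """Returns True if the build reached 100% of traffic without tripping the alarm."""
        for fraction in ROLLOUT_STAGES:
            deploy(build_id, fraction)
            time.sleep(SOAK_SECONDS)            # let real traffic exercise the build
            if current_error_rate(build_id) > MAX_ERROR_RATE:
                rollback(build_id)
                return False
        return True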

Data is HUGE. Many software teams have figured out that the best way to get an accurate representation of how customers use software is to collect and analyze data from those same customers. A whole lot of traditional test activities can be replaced by product instrumentation and an efficient method for getting that instrumentation data back to the team for analysis. On a lot of teams, last year’s testers are this year’s data analysts and data scientists. While not every tester is cut out for this role, this move to data analysis is a strong trend on a lot of software teams.

To critique myself for a moment, I think a lot of readers could say that none of these points are emerging either. That’s a fair point, since I know teams that have been doing everything above for years…but I’m just now seeing some of these trends “emerge” on multiple teams (and not just those testing web services or sites).

What trends do you see? Did I miss anything huge? Have the above four points already reached the tipping point of emergence?

Users, Usage, Usability, and Data

The day job (and a new podcast) have been getting the bulk of my time lately, but I’m way overdue to talk about data and quadrants.

If you need a bit of context or a refresher on my stance, this post talks about my take on Brian Marick’s quadrants (used famously by Gregory and Crispin in their wonderful Agile Testing book); I assert that the left side of the quadrant is well suited for programmer ownership, and that the right side is suited for quality / test team ownership. I also assert that the right-side knowledge can be obtained through data, and that one could gather what they need in production – from actual customer usage.

And that’s where I’ll try to pick up.

[image: the Agile Testing quadrants]

Agile Testing labels Q3 as “Manual”, and Q4 as “Tools”. This is (or can be) generally true, but I claim that it doesn’t have to be true. Yes, there are some synthetic tests you want to run locally to check performance, reliability, and other Q4 concerns, but you can get more actionable data by examining data from customer usage. Your top-notch performance suite doesn’t matter if your biggest slowdown occurs on a combination of graphics card and bus speed that you don’t have in your test lab. Customers use software in ways we can’t imagine – and on a variety of configurations that are practically impossible to duplicate in labs. Similarly, stress suites are great – but knowing what crashes your customers are seeing, as well as the error paths they are hitting, is far more valuable. Most other “ilities” can be detected from customer usage as well.

Evaluating Q3 from data is…interesting. The list of items in the graphic above is from Agile Testing, but note that Q3 is the quadrant labeled (using my labels) Customer Facing / Quality Product. You do Alpha and Beta testing in order to get customer feedback (which *is* data, of course), but beyond that, I need to make somewhat larger leaps.

To avoid any immediate arguments, I’m not saying that exploratory testing can be replaced with data, or that exploratory testing is no longer needed. What I will say is that not even your very best exploratory tester can represent how a customer uses a product better than the actual customer.

So let’s move on to scenarios and, to some extent, usability testing. Let’s say that one of the features / scenarios of your product is “Users can use our client app to create a blog post, and post it to their blog”. The “traditional” way to validate this scenario is to make a bunch of test cases (either written down in advance (yuck) or discovered through exploration) that create blog entries with different formatting and options, and then make sure they can post to whatever blog services are supported. We would also dissect the crap out of the scenario and ask a lot of questions about every word until all ambiguity is removed. There’s nothing inherently wrong with this approach, but I think we can do better.

Instead of the above, tweak your “testing” approach. Instead of asking, “Does this work?”, or “What would happen if…?”, ask “How will I know if the scenario was completed successfully?” For example, if you knew:

  • How many people started creating a blog post in our client app?
  • Of the above set, how many post successfully to their blog?
  • What blog providers do they post to?
  • What error paths are being hit?
  • How long does posting to their blog take?
  • What sort of internet connection do they have?
  • How long does it take for the app to load?
  • After they post, do they edit the blog immediately (is it WYSIWYG)?
  • etc.

With the above, you can begin to infer a lot about how people use your application, discover outliers, answer questions, and perhaps discover new questions you want answered. And to get an idea of whether they may have liked the experience, perhaps you could track things like:

  • How often do people post to their blog from our client app?
  • When they encounter an error path, what do they do? Try again? Exit? Uninstall?
  • etc.
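To make that concrete, here’s a small Python sketch of the kind of summary those questions suggest, computed from posting-attempt telemetry. The event shape (and the toy records) are invented for the example; real data would come from whatever instrumentation the client app ships.

    # Hypothetical sketch: summarize blog-posting telemetry to answer a few of the
    # questions above. Each dict is one posting attempt reported by the client app.
    from statistics import median

    events = [
        {"provider": "WordPress", "succeeded": True,  "seconds": 3.2,  "retried_after_error": False},
        {"provider": "Blogger",   "succeeded": False, "seconds": None, "retried_after_error": True},
        {"provider": "WordPress", "succeeded": True,  "seconds": 4.8,  "retried_after_error": False},
    ]

    successes = [e for e in events if e["succeeded"]]
    failures = [e for e in events if not e["succeeded"]]
    print(f"{len(successes)}/{len(events)} posting attempts succeeded")

    providers = {}
    for e in successes:
        providers[e["provider"]] = providers.get(e["provider"], 0) + 1
    print("successful posts by provider:", providers)

    print("median time to post:", median(e["seconds"] for e in successes), "seconds")

    retries = sum(1 for e in failures if e["retried_after_error"])
    print(f"{retries}/{len(failures)} users who hit an error tried again")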

Of course, you can get subjective data as well via short surveys. These tend to annoy people, but used strategically and sparingly, they can help you gauge the true customer experience. I know of at least one example at Microsoft where customers were asked to provide a star rating and feedback after using an application – over time, the team could use the available data to accurately predict what star rating customers would give their experience. I believe that’s a model that can be reproduced frequently.

Does a data-prominent strategy work everywhere? Of course not. Does it replace the need for testing? Don’t even ask – of course not. Before taking too much of the above to heart, answer a few questions about your product. If your product is a web site or web service or anything else you can update (or roll back) as frequently as you want, of course you want to rely on data as much as possible for Q3 and Q4. But, even for “thick” apps that run on a device (computer, phone, toaster) that’s always connected, you should also consider how you can use data to answer questions typically asked by test cases.

But look – don’t go crazy. There are a number of products where long tests (what I call Q3 and Q4 tests) can be replaced entirely by data. But don’t blindly decide that you no longer need people to write stress suites or do exploratory testing. If you can’t answer your important questions from analyzing data, by all means, use people with brains and skills to help you out. And even if you think you can get all your answers with data, use people as a safety net while you make the transition. It’s quite possible (probable?) to gather a bunch of data that isn’t actually the data you need, and then mis-analyze it and ship crap people don’t want – that’s not a trap you want to fall into.

Data is a powerful ally. How many times, as a tester, have you found an issue and had to convince someone it was something that needed to be fixed or customers would rebel? With data, rather than rely on your own interpretation of what customers want, you can make decisions based on what customers are actually doing. For me, that’s powerful, and a strong statement towards the future of software quality.

Alan and Brent talk testing…

Brent Jensen and I found some time together recently to talk about testing. For some reason, we decided to record it. Worse yet, we decided to share it! I suppose we’ll keep this up until we either run out of things to say (unlikely), or until the internet tells us to shut up (much more likely).

But for now, here’s Alan and Brent talking about testing.

Subscribe to the ABTesting Podcast!

Subscribe via RSS
Subscribe via iTunes

Swiss Testing Day 2014

As you may have noticed, my blogging has slowed. I’ve been navigating and ramping up on a new team and helping the team shift roles (all fun stuff). I’ve also been figuring out how to work in a big org (our team is “small”, but it’s part of one of Microsoft’s “huge” orgs) – and there are some learning pains that go with that.

But – I did take a few days off this week to head to Zurich and give a talk at Swiss Testing Day. I attended the conference three years ago (where I was impressed with the turnout, passion, and conference organization), and I’m happy to say that it’s still a great conference. I met some great people (some of whom I had met virtually on twitter), and had a great time catching up with Adrian and the rest of the Swiss-Q gang (the company that puts on Swiss Testing Day).

I talked (sort of) about testing Xbox. I talked a bit about Xbox, and mostly about what it takes to build and test the Xbox. I stole a third or so from my STAR West keynote, and was able to use a few more examples now that the Xbox One has shipped (ironically, the Xbox One is available in every country surrounding Switzerland, but not in Switzerland). Of note: now that I’ve retired from the Xbox team (or moved on, at least), I expect that I’m done giving talks about Xbox (although I expect I’ll use examples from the Xbox team in at least one of my sessions at STAR East).

I owe some follow-ups on my Principles post, and I promise to get to those soon. I’ll define “soon” later.