I took some time recently to answer some questions about Combined Engineering, Data-Driven Quality, and the Future of Testing for the folks at the A1QA blog.
Check it out here.
We’re less than a week away from the sixth anniversary of How We Test Software at Microsoft (some chapters were completed nearly seven years ago).
Recently I was considering a new position at Microsoft, and one of my interviewers (a dev architect and also an author) said that he had been reading my book. I stated my concern that the book is horribly out of date, and that it doesn’t really reflect testing at Microsoft accurately anymore. In the ensuing discussion, he asked how I’d structure the book if I re-wrote it today. I gave him a quick brain dump – which in hindsight I believe is still accurate…and worth sharing here.
So – for those who may be curious, here’s the outline for the new (and likely never-to-be-written) edition of How We Test Software at Microsoft. For consistency with the first edition, I’m going to try to follow the original format as much as possible.
This section initially included chapters on Engineering at Microsoft, the role of the Tester / SDET, and Engineering Life cycles. The new edition would go into some of the changes Satya Nadella is driving across Microsoft – including flattening of organizational structures, our successes (and shortcomings) in moving to Agile methodologies across the company, and what it means to be a “Cloud First, Mobile First” company. I think I’d cover a little more of the HR side of engineering as well, including moving between teams, interviews, and career growth.
Part 1 would also dive deeply into the changes in test engineering and quality ownership, including how combined engineering teams work and an intro to what testers do. This is important because it’s a big change (and one still in progress), and it sets the tone for some of the bigger changes described later in the book.
Although I don’t think I could do Marick’s Testing Quadrants the justice that Lisa Crispin and Janet Gregory (and many others) give them, I think the quadrants (my variation presented on the left) give one view into how Microsoft teams can look at sharing quality when developers own the lion’s share of engineering quality (with developers generally owning the left side of the quadrant and testers generally owning the right side).
And finally, part 1 would talk about how community works within Microsoft – including quality communities, The Garage, and how we use email distribution lists and Yammer to share (or attempt to share) knowledge.
Fans of the book may remember that the original names for the second and third sections of the book were “About Testing” and “About Test Tools and Systems” – but I think talking about quality – first from the developer perspective, and then from the quality engineer / tester perspective (along with how we use tools to make our jobs easier) – is a better way to present the 2015 version of Testing at Microsoft.
I’d spend a few chapters showing examples of the developer workflow and typical tools – especially where the tools we use are available to readers of the book (e.g. unit testing and coverage tools available in Visual Studio and use of tools like SpecFlow for acceptance test development).
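To give a flavor of the developer-owned testing those chapters would show, here’s a minimal sketch – in Python for brevity rather than the Visual Studio / SpecFlow tooling named above, and with a made-up function purely for illustration:

```python
import unittest

# A hypothetical product function, tested by the developer who owns it.
def discount_price(price, percent):
    """Apply a percentage discount and round to cents."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

class DiscountPriceTests(unittest.TestCase):
    def test_typical_discount(self):
        self.assertEqual(discount_price(100.0, 25), 75.0)

    def test_zero_discount_leaves_price_unchanged(self):
        self.assertEqual(discount_price(19.99, 0), 19.99)

    def test_out_of_range_percent_is_rejected(self):
        with self.assertRaises(ValueError):
            discount_price(10.0, 150)
```

The point isn’t the framework – it’s that the tests live next to the code and run in the owner’s inner loop, not on a separate team’s schedule.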
This section would also include some digging into analysis tools, test selection, test systems, and some of the other big pieces we use across all products and disciplines to keep the big engines running.
I didn’t talk much about how Microsoft does Exploratory Testing in the original book – I’d fix that in the second edition.
I also thought about breaking this section into “developer owned” and “testing owned” sections, but that line isn’t clear across the company (or even across organizations), so I think I’d start as a progression from unit testing to exploratory testing and let readers draw their own ideas of where their team may want to draw the line.
There’s at least a book or two of information that could go in this section, and it represents the biggest shift over the last six years. The original chapters on Customer Feedback Systems and Testing Software Plus Services hinted about how we approach Data-Driven Quality (DDQ), but our systems, and the way we use them have matured massively, and there’s definitely a lot of great information worth sharing – and enough where it’s an easy decision to dedicate a full section and several chapters to the topic.
Much of what I talked about in this section of the original has either happened…or become obsolete. As in the first version, I would reserve this section for talking about engineering ideas still in the experimental stage, or ideas currently under adoption. Also, as in the original, I’d use this section as a platform for whatever essay ideas are chewing on me about where software engineering is going, and what it means to Microsoft.
In my 20+ years of testing, I’ve read at least a hundred different books on software testing, software development, leadership, and organizational change. I’ll include a list of books that have had a significant influence on my approach to and views of software engineering, publications that influenced the content of the book, and publications that readers may find helpful for further information.
My last post dove into the world of engineering teams / combined engineering / fancy-name-of-the-moment where there are no separate disciplines for development and test. I think there are advantages to one-team engineering, but that doesn’t mean that your team needs to change.
I’ve mentioned this before, but it’s worth saying again. Don’t change for change’s sake. Make changes because you think the change will solve a problem. And even then, think about what problems a change may cause before making a change.
In my case, one-team engineering solves a huge inefficiency in the developer to tester communication flow. The back-and-forth iteration of They code it – I test it – They fix it – I verify it can be a big inefficiency on teams. I also like the idea of both individual and team commitments to quality that happen on discipline-free engineering teams.
I see the developer to tester ping pong match take up a lot of time on a lot of teams. But I don’t know your team, and you may have a different problem. Before diving in on what-Alan-says, ask yourself, “What’s the biggest inefficiency on our team?” Now, brainstorm solutions to that inefficiency. Maybe combined engineering is one potential solution, maybe it’s not. That’s your call to make. And then remember that the change alone won’t solve the problem (I outlined some of the challenges in my last post as well).
OK. So your team is going to go for it and have one engineering team. What else will help you be successful?
In the I-should-have-said-this-in-the-last-post category, I think running a successful engineering team requires a different slant on managing (and leading) the team. Some things to consider (and why you should consider them) include:
This post started as a follow-up and clarification for my previous post – but has transformed into a message to leaders and managers on their role – and in that, I’m reminded how critically important good leadership and great managers are to making transitions like this. In fact, given a choice of working for great leaders and awesome managers on a team making crummy software with horrible methods, I’d probably take it every time…but I know that team doesn’t exist. Great software starts with great teams, and great teams come from leaders and managers who know what they’re doing and make changes for the right reasons.
If your team can be better – and healthier – by working as a single engineering team, I think it’s a great direction to go. But make your changes for the right reasons and with a plan for success.
I talk a lot (and write a bit) about software teams without separate disciplines for developers and testers (sometimes called “combined engineering” within the walls of MS – a term I find annoying and misleading – details below). For a lot of people, the concept of making software with a single team falls into the “duh – how else would you do it?” category, while many others see it as the end of software quality and end for tester careers.
Done right, I think an engineering team based approach to software development is more efficient, and produces higher quality software than a “traditional” Dev & Test team. Unfortunately, it’s pretty easy to screw it up as well…
If you are determined to jump on the fad of software-engineering-without-testers, and don’t care about results, here is your solution!
First, take your entire test team and tell them that they are now all developers (including your leads and managers). If you have any testers that can’t code as well as your developers, now is a great time to fire them and hire more coders.
Now, inform your new “engineering team” that everyone on the team is now responsible for design, implementation, testing, deployment, and overall end-to-end quality of the system. While you’re at it, remind them that because there are now more developers on the team, you expect team velocity to increase accordingly.
Finally, no matter what obstacles you hit, or what the team says about the new culture, don’t tell them anything about why you’ve made the changes. Since you’re in charge of the team, it’s your right to make any decision you want, and it’s one step short of insubordination to question your decisions.
It’s sad, but I’ve seen pieces (and sadder still, all) of the above section occur on teams before. Let’s look at another way to make (or reinforce) a change that removes a test team (but does not remove testing).
Rather than say, “We’re moving to a discipline-free organization” (the action), start with the cause – e.g. “We want to increase the efficiency of our team and improve our ability to get new functionality in customers’ hands. In order to do this, we are going to move to a discipline-free organizational structure. …etc.” I would probably even preface this with some data or anecdotes, but the point is that starting with a compelling reason for “why” has a much better chance of working than a proclamation in isolation (and this approach works for almost any other organizational change as well).
The core output (and core activity) of an engineering team is working software. To this end, this is where most (but not necessarily all) of the team should focus. The core role of a software engineer is to create working software (at a minimum, this means design, implementation, unit, and functional tests). IMO, the ability to do this is the minimum bar for a software engineer.
But…if you read my blog, you’re probably someone who knows that there’s a lot more to quality software than writing and testing code. I think great software starts with core software engineering, but that alone isn’t enough. Good software teams have learned the value of using generalizing specialists to help fill the gap between core-software engineering, and quality products.
Core engineering (aka what-developers-do) covers a large part of Great Software. Nitpicker note: the sizes of the circles may be wrong (or wrong for some teams) – this is just a model.
Engineering teams still have a huge need for people who are able to validate and explore the system as a whole (or in large chunks). These activities remain critically important to quality software, and they can’t be ignored. These activities (in the outer loop of my model) improve and enhance “core software engineering”. In fact, an engineering team could be structured with a separate team to focus on the types of activities in the outer loop.
I see an advantage, however, in engineering teams that include both circles above. It’s important to note that some folks who generally work in the “Core Engineering” circle will frequently (or regularly) take on specialist roles that live in the outer loop. A lot of people seem to think that on discipline-free software teams, everyone can do everything – which is, of course, flat out wrong. Instead, it’s critical that a good software team has (generalizing) specialists who can look critically at quality areas that span the product.
There also will/must be folks who live entirely in the outer ring, and there will be people like me who typically live in the outer ring, but dive into product code as needed to address code problems or feature gaps related to the activities in the outer loop. Leaders need to support (and encourage – and celebrate) this behavior…but with this much interaction between the outer loop of testing and investigation, and the inner loop of creating quality features, it’s more efficient to have everyone on one team. I’ve seen walls between disciplines get in the way of efficient engineering (rough estimate) a million times in my career. Separating the team working on the outer loop from core engineering can create another opportunity for team walls to interfere with engineering. To be fair, if you’re on a team where team walls don’t get in the way, and you’re making great software with separate teams, then there’s no reason to change. For some (many?) of us, the efficiency gains we can get from this approach to software development are worth the effort (along with any short-term worries from the team).
It’s really easy for a team to make the decision to move to one-engineering-team for the wrong reasons, and it’s even easier to make the move in a way that will hurt the team (and the product). But after working mostly the wrong way for nearly 20 years, I’m completely sold on making software with a single software team. But making this sort of change work effectively is a big challenge.
But it’s a challenge that many of us have faced already, and something everyone worried about making good software has to keep their eye on. I encourage anyone reading this to think about what this sort of change means for you.
I recently ran across this article on nytimes.com from over a year ago. Here’s the punch line (or at least one of them):
“We looked at tens of thousands of interviews, and everyone who had done the interviews and what they scored the candidate, and how that person ultimately performed in their job. We found zero relationship. It’s a complete random mess…”
I expect (hope?) that the results aren’t at all surprising. After almost 20 years at Microsoft and hundreds of interviews, it’s exactly what I expected. Interviews, at best, are a guess at finding a good employee, but often serve the ego of the interviewer more than the needs of the company. The article makes note of that as well.
“On the hiring side, we found that brainteasers are a complete waste of time. How many golf balls can you fit into an airplane? How many gas stations in Manhattan? A complete waste of time. They don’t predict anything. They serve primarily to make the interviewer feel smart.”
I wish more interviewers paid attention to statistics or articles (or their peers) and stopped asking horrible interview questions, and really, really tried to see if they could come up with better approaches.
So – why do we do interviews if they don’t work? Well, they do work – hopefully at least as a method of making sure you don’t hire a completely incapable person. While it’s hard to predict future performance based on an interview, I think they may be more effective at making sure you don’t hire a complete loser – but even this approach has flaws, as I frequently see managers pass on promising candidates for (perhaps) the wrong reasons out of fear of making a “bad hire”.
I’m mostly an “as appropriate” interviewer at Microsoft (this is the person on the interview loop who makes the ultimate hire / no-hire decision on candidates based on previous interviews and their own questions). For college candidates or industry hires, one of the key questions I’m looking to answer is, “Is this person worth investing 12-18 months of salary and benefits to see if they can cut it?” A hire decision is really nothing more than an agreement for a long audition. If I say yes, I’m making a (big) bet that the candidate will figure out how to be valuable within a year or so, and assume they will be “managed out” if not. I don’t know the stats on my hire decisions, but while my heart says I’m great, my head knows that I may be just throwing darts.
If I had a secret formula for what made people successful in tech jobs, I’d share. But here’s what I look for anyway:
The above isn’t a perfect list (and leaves off the “can they do the job?” question), but I think someone who can do the above can at least stay employed.
I’m back home and back at work…and slowly getting used to both. It was by far the best (and longest) vacation I’ve had (or will probably ever have). While it’s good to be home, it’s a bit weird getting used to working again after so much time off (just over 8 weeks in total).
But – a lot happened while I was gone that’s probably worth at least a comment or two.
Microsoft went a little crazy while I was out, but the moves (layoffs, strategy, etc.) make sense – as long as the execution is right. I feel bad for a lot of my friends who were part of the layoffs, and hope they’re all doing well by now (and if any are reading this, I almost always have job leads to share).
A lot of testers I know are riled up (and rightfully so) about ISO 29119 – which, in a nutshell, is a “standard” for testing that says software should be tested exactly as described in a set of textbooks from the 1980s. On one hand, I have the flexibility to ignore 29119 – I would never work for a company that thought it was a good idea. But I know there are testers who find themselves in a situation where they have to follow a bunch of busywork from the “standard” rather than provide actual value to the software project.
Honestly, I have to say that I don’t think of myself as a tester these days. I know a lot about testing and quality (and use that to help the team), but the more I work on software, the more I realize that thinking of testing as a separate and distinct activity from software development is a road to ruin. This thought is at least partially what makes me want to dismiss 29119 entirely – from what I’ve seen, 29119 is all about a test team taking a product that someone else developed, and doing a bunch of ass-covering while trying to test quality into the product. That approach to software development doesn’t interest me at all.
I talked with a recruiter recently (I always keep my options open) who was looking for someone to “architect and build their QA infrastructure”. I told them that I’d talk to them about it if they were interested, but that my goal in the interview would be to talk them out of doing that and give them some ideas on how to better spend that money.
I didn’t hear back.
It’s also been a long hiatus from AB Testing. Brent and I are planning to record on Friday, and expect to have a new episode published by Monday September 22, and get back on our every-two-week schedule.
I mentioned this on twitter a few times, but should probably mention it here for completeness.
I’m taking a break.
I’m taking a break from blogging, twitter, and most of all, my day job. Beyond July 4, 2014, I’ll be absent from the Microsoft campus for a few months – and most likely from social media as well. While I can guarantee my absence from MS, I may share on twitter…we’ll just have to see.
The backstory is that Microsoft offers an award – a “sabbatical” for employees of a certain career stage and career history. I received my award nine years ago, but haven’t used it yet…but now’s the time – so I’m finally taking ~9 weeks off beginning just after the 4th of July, and concluding in early September.
Now – hanging out at home for two months would be fun…but that’s not going to cut it for this long awaited break. Instead, I’ll be in the south of France (Provence) for most of my break before spending a week in Japan on my way back to reality.
I probably won’t post again until September. I’m sure I have many posts waiting on the other side.
In case you missed it, Brent and I are still recording testing podcasts. I stopped posting the announcements for every podcast post on this blog, but if you want to subscribe, you have a few choices.
And for those of you too busy to click through, here are the episodes so far:
Well – that was a fun post. The dust hasn’t quite settled, but a follow up is definitely in order.
First, some context. I was committed to giving a lightning talk as part of STAR East’s “Lightning Strikes the Keynotes” hour. I purposely didn’t pick a topic before I left, and figured I would come up with something while I was there. On Wednesday morning (the day of the lightning talks), I was out for a run thinking about the conference, when I had the idea to talk about testers writing fewer automated tests. I realized that if programmers should be writing more tests (ok, “checks” for those of you who insist), and that just about every mention of automation I heard at the conference talked about the challenges in automating end-to-end scenarios and equal challenges in maintaining that automation – that it was a topic worth exploring. I could have called the post, Testers Should Stop Writing Some of Their Automation Because Programmers Should Do Some of it, and Some of Your Automation Isn’t Very Good Anyway and it is Getting in the Way of Testing You Should Be Doing, but I chose a more controversial (and shorter) title instead. I purposely left out the types of automation (and other coding activities) that testers still should do, because I was afraid that if I did, that it would distract from the main two points (Programmers need to write a lot more tests, and Testers spend too much time writing and maintaining bad automation).
Also – I only had five minutes (to talk about it), so I stuck to the main points:
But the fun really began with the comments. I’ve never had more fun reading blog comments, twitter, and my mailbox (for some reason a bunch of people prefer to email me directly with comments rather than comment publicly). I thought I’d comment on a few areas where I had a lot of questions.
Although I didn’t call it out in my post, this sort of approach works really well with services. But – it can work with thick clients too, you just need to be a little more careful with your deployment, as rollback and monitoring won’t be in real time like your service. I think mobile apps are a great example of where you may run experiments with a limited number of users, but Windows (or Mac) apps could follow the model as well. For always (or often) connected devices, I see no reason not to push updates – of course, these updates should probably go through a bit more testing than services before being pushed, since if something is broken, getting back to a safe state will take a bit of work.
The important thing to note for those still in disbelief is that deploying test items to production is done all of the time. eBay does it, Amazon does it, Netflix does it (I could go on, but believe me that it’s done a lot). I don’t have a link, but in the comments, Noah Sussman tells me that NASA does it.
The fear of cheese moving is strong (“If developers do functional testing, what will I do?”). There’s a lot of testing activity left to do (even when you take away writing developers tests for them and wasting time on unneeded automation). Stress and performance suites and monitoring tools (for example) should give the coding testers on the team plenty of work to do. Data analysis is also necessary if you’re gathering data from customers. And thanks to Roberto for pointing out that sanity checking UI changes, testing for Accessibility or Localization, or color changes all could use an onsite tester (or sometimes some code) to help.
And honestly, now that I’ve made you think about it, there are a few places where testers writing automation is useful. But it’s about time that testers stopped trying to write automation for cases that shouldn’t be automated. Want to try logging in and out 500 times – automate that (ideally NOT at the GUI level)? Go for it. Want to automate the end-to-end scenario of setting up an account and logging in? Please don’t bother. Instead, just add some monitoring code that lets you know if login is failing and save yourself some frustration.
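To sketch what that login monitoring might look like (the event shape and the 5% threshold here are my own invented examples, not any real product’s):

```python
# Instead of an automated end-to-end login test, watch real logins and
# raise a flag when the failure rate climbs. Outcomes arrive as booleans
# (True = successful login); the alert threshold is an example value.

def login_failure_rate(outcomes):
    """Fraction of recent login attempts that failed."""
    if not outcomes:
        return 0.0
    return outcomes.count(False) / len(outcomes)

def should_alert(outcomes, threshold=0.05):
    """True when more than `threshold` of recent logins are failing."""
    return login_failure_rate(outcomes) > threshold
```

A few lines of monitoring like this tell you login is broken for real users – something a nightly GUI automation pass can’t.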
One other thing – the point about developers owning more testing – that one is equally true for services and clients. It doesn’t make sense to me at all to have a separate team verify functional correctness. It’s not “too hard” or “too much work” for developers to write tests. Developers need to own writing quality code – and doing this requires that they write tests. I was surprised that some people felt that it was bad for developers to test their own code – but I suppose that years of working in silos will make you believe that there’s some sort of taint in doing so (but there’s not!).
This is obviously a point (and a change that is really happening now at a lot of companies) that causes no small amount of fear in testers. I get that, but ignoring the change or burying your head in the sand, or justifying why testers need to own functional testing isn’t going to help you figure out how to function when these changes hit your team.
After releasing The A Word, I didn’t plan on writing any more posts about automation. But, after pondering transitions in test, and after reading this post from Noah Sussman, I have a thought in my head that I need to share.
I don’t think testers should write automation.
I suppose I better explain myself.
All automation isn’t created equal. Automation works wonderfully for short confirmatory or validation tests. Unit, functional, acceptance, integration tests, and all other “short” tests lend themselves very well to automation. But I think it’s wasteful and inefficient to have testers write this automation – this should be written by the code owners. Testing your own code (IMO) improves design, prevents regression, and takes much, much less time than passing code off to another team to test.
That leaves the test team to write automation for end-to-end scenarios. There’s nothing wrong with that…except that writing end to end automated tests is hard (especially, as Noah points out, at the GUI level). The goal of automation (as touted by many of the vendors), is to enable running a bunch of tests automatically, so that testers will have more time for hands-on testing. In reality, I think that most test teams spend half of their time writing automation, half of their time maintaining and debugging automation, and half of their time doing everything else.
Let’s look at a typical scenario.
Pretty easy – you only need to automate three actions, add a bit of validation and you’re done!
If you’ve attempted something like this before, you know that’s not the whole story. A good automated test doesn’t just execute a set of user actions and then go on its way – you need to look at error conditions, and try to write the test in a way that prevents it from breaking.
It’s hard. It’s fragile. It’s a pain in the ass. So I say, stop doing it.
“But Alan – we have to test the scenario so that we know it works for our customers”. I believe that you want to know that the scenario works – but as much as you try to be the customer, nobody is a better representative of the customer than the customer. And – validating scenarios is quite a bit easier if we let the customers do it.
NOTE: I acknowledge that for some contexts, letting the customers generate usage data for an untested scenario is an inappropriate choice. But even when I’ve already tested the scenario, I still want (need!) to know what the customer experience is. As a tester, I can be the voice of the customer, but I am NOT the customer.
Now – let’s look at the same scenario:
But this time, instead of writing an automated test, I’ve added code (sometimes called instrumentation) to the product code that lets me know when I’ve taken an action related to the scenario. While we’re at it, let’s add more instrumentation at every error path in the code.
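Here’s a minimal sketch of that instrumentation. The steps mirror the example scenario; the store object, error codes, and timestamp format are all hypothetical stand-ins:

```python
import time

class ScenarioLog:
    """Collects timestamped scenario events; a stand-in for whatever
    telemetry pipeline the product actually uses."""
    def __init__(self):
        self.lines = []

    def event(self, message):
        stamp = time.strftime("%m%d%Y-%H%M-%S")
        self.lines.append(f"{stamp}: {message}")

def acquire_app(name, store, log):
    """The search -> download -> install flow, instrumented at each
    step and at every error path (store behavior is hypothetical)."""
    log.event(f"Search started for {name}")
    if not store.search(name):
        log.event("ERROR 119: Store not available.")
        return False
    log.event(f"Download started for {name}")
    if not store.download(name):
        log.event("ERROR 115: Connection Lost")
        return False
    log.event(f"Install started for {name}")
    if not store.install(name):
        log.event(f"ERROR 121: Install failed for {name}")
        return False
    log.event(f"{name} launched successfully")
    return True
```

Note that this is product code, not test code – every real user running the scenario produces the data.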
Now, when a user attempts the scenario, you could get logs something like this:
05072014-1601-3143: Search started for FizzBuzz
05072014-1601-3655: Download started for FizzBuzz
05072014-1602-2765: Install started for FizzBuzz
05072014-1603-1103: FizzBuzz launched successfully

05072014-1723-2143: Search started for FizzBuzz
05072014-1723-3655: Download started for FizzBuzz
05072014-1723-2945: ERROR 115: Connection Lost

05072014-1819-3563: Search started for FizzBuzz
05072014-1819-3635: ERROR 119: Store not available.
From this, you can begin to evaluate scenario success by looking at how many people get through the entire scenario, and how many fail for particular errors.
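That evaluation can itself be a small log-analysis pass. This sketch assumes the lines are already grouped per user session and follow the “timestamp: message” shape of the example log:

```python
# Roll up per-session scenario logs into a simple funnel: how many
# sessions completed, and which errors ended the rest. Each session is
# a list of "timestamp: message" lines.

def scenario_funnel(sessions):
    completed = 0
    errors = {}
    for lines in sessions:
        if not lines:
            continue
        # Strip the timestamp prefix from the session's final line.
        last = lines[-1].split(": ", 1)[-1]
        if last.endswith("launched successfully"):
            completed += 1
        elif last.startswith("ERROR"):
            errors[last] = errors.get(last, 0) + 1
    return completed, errors
```

From a rollup like this, “percent of users who complete the scenario” and “top errors blocking the scenario” fall out almost for free.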
or aggregate the results into charts of completion and failure rates.
And now we have information about our product that’s directly related to real customer usage, and the code to enable it is substantially simpler and easier to maintain than the traditional scenario automation.
A more precise way to put “I don’t think testers should write automation” is: I think that many testers in some contexts don’t need to write automation anymore. Developers should write more automation, and testers should try to learn more from the product in use. Use this approach in financial applications at your own risk!