The Myths of Tech Interviews

I recently ran across this article on nytimes.com from over a year ago. Here’s the punch line (or at least one of them):

“We looked at tens of thousands of interviews, and everyone who had done the interviews and what they scored the candidate, and how that person ultimately performed in their job. We found zero relationship. It’s a complete random mess…”

I expect (hope?) that the results aren’t at all surprising. After almost 20 years at Microsoft and hundreds of interviews, it’s exactly what I expected. Interviews, at best, are a guess at finding a good employee, but often serve the ego of the interviewer more than the needs of the company. The article makes note of that as well.

“On the hiring side, we found that brainteasers are a complete waste of time. How many golf balls can you fit into an airplane? How many gas stations in Manhattan? A complete waste of time. They don’t predict anything. They serve primarily to make the interviewer feel smart.”

I wish more interviewers paid attention to statistics or articles (or their peers) and stopped asking horrible interview questions, and really, really tried to see if they could come up with better approaches.

Why Bother?

So – why do we do interviews if they don’t work? Well, they do work – hopefully at least as a method of making sure you don’t hire a completely incapable person. While it’s hard to predict future performance based on an interview, I think they may be more effective at making sure you don’t hire a complete loser – but even this approach has flaws, as I frequently see managers pass on promising candidates for (perhaps) the wrong reasons out of fear of making a “bad hire”.

I mostly serve as an “as appropriate” interviewer at Microsoft (this is the person on the interview loop who makes the ultimate hire / no-hire decision on candidates based on previous interviews and their own questions). For college candidates or industry hires, one of the key questions I’m looking to answer is, “Is this person worth investing 12-18 months of salary and benefits to see if they can cut it?” A hire decision is really nothing more than an agreement for a long audition. If I say yes, I’m making a (big) bet that the candidate will figure out how to be valuable within a year or so, and I assume they will be “managed out” if not. I don’t know the stats on my hire decisions, but while my heart says I’m great, my head knows that I may be just throwing darts.

What Makes a Good Tech Employee?

If I had a secret formula for what made people successful in tech jobs, I’d share. But here’s what I look for anyway:

  1. Does the candidate like to learn? To me, knowing how to figure out how to do something is way more interesting than knowing how to do it in the first place. In fact, the skills you know today will probably be obsolete in 3-5 years anyway, so you had better be able to give me examples of how you love to learn new things.
  2. Plays well with others – (good) software engineering is a collaborative process. I have no desire to hire people who want to sit in their office with the door closed all day while their worried teammates pass flat food under their door. Give me examples of solving problems with others.
  3. Is the candidate smart? By “smart”, I don’t mean you can solve puzzles or write some esoteric algorithm at my white board. I want to know if you can carry on an intelligent conversation and add value. I want to know your opinions and see how you back them up. Do you regurgitate crap from textbooks and twitter, or do you actually form your own ideas and thoughts?
  4. If possible, I’ll work with them on a real problem I’m facing and evaluate a lot of the above simultaneously. It’s a good method that I probably don’t use often enough (but will make a mental note now to do this more).

The above isn’t a perfect list (and it leaves off the “can they do the job?” question), but I think someone who can do the above can at least stay employed.

The Weasel Returns

I’m back home and back at work…and slowly getting used to both. It was by far the best (and longest) vacation I’ve had (or will probably ever have). While it’s good to be home, it’s a bit weird getting used to working again after so much time off (just over 8 weeks in total).

But – a lot happened while I was gone that’s probably worth at least a comment or two.

Borg News

Microsoft went a little crazy while I was out, but the moves (layoffs, strategy, etc.) make sense – as long as the execution is right. I feel bad for a lot of my friends who were part of the layoffs, and hope they’re all doing well by now (and if any are reading this, I almost always have job leads to share).

ISO 29119

A lot of testers I know are riled up (and rightfully so) about ISO 29119 – which, in a nutshell, is a “standard” for testing that says software should be tested exactly as described in a set of textbooks from the 1980s. On the one hand, I have the flexibility to ignore 29119 – I would never work for a company that thought it was a good idea. But I know there are testers who find themselves in a situation where they have to follow a bunch of busywork from the “standard” rather than provide actual value to the software project.

As for me…

Honestly, I have to say that I don’t think of myself as a tester these days. I know a lot about testing and quality (and use that to help the team), but the more I work on software, the more I realize that thinking of testing as a separate and distinct activity from software development is a road to ruin. This thought is at least partially what makes me want to dismiss 29119 entirely – from what I’ve seen, 29119 is all about a test team taking a product that someone else developed, and doing a bunch of ass-covering while trying to test quality into the product. That approach to software development doesn’t interest me at all.

I talked with a recruiter recently (I always keep my options open) who was looking for someone to “architect and build their QA infrastructure”. I told them that I’d talk to them about it if they were interested, but that my goal in the interview would be to talk them out of doing that and give them some ideas on how to better spend that money.

I didn’t hear back.

Podcast?!?

It’s also been a long hiatus from AB Testing. Brent and I are planning to record on Friday, and expect to have a new episode published by Monday September 22, and get back on our every-two-week schedule.

An Angry Weasel Hiatus

I mentioned this on twitter a few times, but should probably mention it here for completeness.

I’m taking a break.

I’m taking a break from blogging, twitter, and most of all, my day job. Beyond July 4, 2014, I’ll be absent from the Microsoft campus for a few months, and most likely from social media as well. While I can guarantee my absence from MS, I may share on twitter…we’ll just have to see.

The backstory is that Microsoft offers an award – a “sabbatical” – for employees of a certain career stage and career history. I received my award nine years ago but haven’t used it yet…now’s the time, so I’m finally taking ~9 weeks off beginning just after the 4th of July and concluding in early September.

Now – hanging out at home for two months would be fun…but that’s not going to cut it for this long-awaited break. Instead, I’ll be in the south of France (Provence) for most of my break before spending a week in Japan on my way back to reality.

I probably won’t post again until September. I’m sure I have many posts waiting on the other side.

-AW

Alan and Brent are /still/ talking…

In case you missed it, Brent and I are still recording testing podcasts. I stopped posting the announcements for every podcast post on this blog, but if you want to subscribe, you have a few choices.

  1. You can Subscribe via RSS
  2. You can Subscribe via iTunes
  3. You can bookmark the podcast blog
  4. You can follow me on twitter and look for the announcements (usually every other Monday)

And for those of you too busy to click through, here are the episodes so far:

  1. AB Testing – Episode 1
  2. AB Testing – Episode 2
  3. AB Testing – Episode 3
  4. AB Testing – Episode 4
  5. AB Testing – Episode 5

Stop, If You Want To…

Well – that was a fun post. The dust hasn’t quite settled, but a follow-up is definitely in order.

First, some context. I was committed to giving a lightning talk as part of STAR East’s “Lightning Strikes the Keynotes” hour. I purposely didn’t pick a topic before I left, and figured I would come up with something while I was there. On Wednesday morning (the day of the lightning talks), I was out for a run thinking about the conference when I had the idea to talk about testers writing fewer automated tests. I realized that since programmers should be writing more tests (ok, “checks” for those of you who insist), and since just about every mention of automation I heard at the conference talked about the challenges of automating end-to-end scenarios and the equal challenges of maintaining that automation, it was a topic worth exploring. I could have called the post Testers Should Stop Writing Some of Their Automation Because Programmers Should Do Some of It, and Some of Your Automation Isn’t Very Good Anyway and It Is Getting in the Way of Testing You Should Be Doing, but I chose a more controversial (and shorter) title instead. I purposely left out the types of automation (and other coding activities) that testers still should do, because I was afraid it would distract from the two main points (programmers need to write a lot more tests, and testers spend too much time writing and maintaining bad automation).

Also – I only had five minutes (to talk about it), so I stuck to the main points:

  1. Developers need to own more testing.
  2. Testers need to stop wasting time writing ineffective automation.

But the fun really began with the comments. I’ve never had more fun reading blog comments, twitter, and my mailbox (for some reason a bunch of people prefer to email me directly with comments rather than comment publicly). I thought I’d comment on a few areas where I had a lot of questions.

Does this work for both services and “thick” clients?

Although I didn’t call it out in my post, this sort of approach works really well with services. But it can work with thick clients too – you just need to be a little more careful with your deployment, as rollback and monitoring won’t be in real time like they are for a service. I think mobile apps are a great example of where you may run experiments with a limited number of users, but Windows (or Mac) apps could follow the model as well. For always (or often) connected devices, I see no reason not to push updates – of course, these updates should probably go through a bit more testing than services before being pushed, because if something is broken, getting back to a safe state will take a bit of work.
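To make the “limited number of users” idea concrete, here’s a minimal sketch (all names are hypothetical – this isn’t a real experimentation framework) of how a client app might decide whether a given user is in the rollout group for a new feature, using a stable hash of the user ID so the cohort doesn’t change between sessions:

```python
import hashlib

def in_rollout(user_id: str, feature: str, percent: int) -> bool:
    """Return True if this user falls inside the rollout percentage.

    Hashing user_id + feature gives each user a stable bucket per feature,
    so the same users stay in (or out of) the experiment across sessions.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < percent

# Example: ship a new install flow to 5% of users first, watch the data,
# then dial the percentage up (or back down) as confidence grows.
if in_rollout("user-12345", "new-install-flow", percent=5):
    print("show the new install flow")
else:
    print("show the current install flow")
```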

The important thing to note for those still in disbelief is that deploying test items to production is done all of the time. eBay does it, Amazon does it, Netflix does it (I could go on, but believe me that it’s done a lot). I don’t have a link, but in the comments, Noah Sussman tells me that NASA does it.

What Do Testers Do? Where Did My Cheese Go? You Are Annoying!

The fear of cheese moving is strong (“If developers do functional testing, what will I do?”). There’s a lot of testing activity left to do (even when you take away writing developers’ tests for them and wasting time on unneeded automation). Stress and performance suites and monitoring tools (for example) should give the coding testers on the team plenty of work to do. Data analysis is also necessary if you’re gathering data from customers. And thanks to Roberto for pointing out that sanity checking UI changes, testing for Accessibility or Localization, or checking color changes could all use an onsite tester (or sometimes some code) to help.

And honestly, now that I’ve made you think about it, there are a few places where testers writing automation is useful. But it’s about time that testers stopped trying to write automation for cases that shouldn’t be automated. Want to log in and out 500 times? Automate that (ideally NOT at the GUI level) – go for it. Want to automate the end-to-end scenario of setting up an account and logging in? Please don’t bother. Instead, just add some monitoring code that lets you know if login is failing and save yourself some frustration.
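For the curious, here’s roughly what I mean by “monitoring code” for login – a sketch only, with a made-up event stream and threshold, not a real alerting system:

```python
from collections import deque

class LoginMonitor:
    """Track recent login attempts and flag an elevated failure rate."""

    def __init__(self, window: int = 500, threshold: float = 0.05):
        self.results = deque(maxlen=window)   # True = success, False = failure
        self.threshold = threshold

    def record(self, success: bool) -> None:
        self.results.append(success)

    def failing(self) -> bool:
        if len(self.results) < self.results.maxlen:
            return False                      # not enough data yet
        failure_rate = self.results.count(False) / len(self.results)
        return failure_rate > self.threshold

monitor = LoginMonitor()
# In production these would be real login events from the live site;
# here we fake a stream with a 6% failure rate.
for outcome in [True] * 470 + [False] * 30:
    monitor.record(outcome)
if monitor.failing():
    print("ALERT: login failure rate above 5% - go look at login, not your test suite")
```

The point isn’t the twenty lines of code – it’s that this tells you login is broken for real users, which is the very thing the end-to-end test was trying to approximate.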

One other thing – the point about developers owning more testing – that one is equally true for services and clients. It doesn’t make sense to me at all to have a separate team verify functional correctness. It’s not “too hard” or “too much work” for developers to write tests. Developers need to own writing quality code – and doing this requires that they write tests. I was surprised that some people felt that it was bad for developers to test their own code – but I suppose that years of working in silos will make you believe that there’s some sort of taint in doing so (but there’s not!).

This is obviously a point (and a change that is really happening now at a lot of companies) that causes no small amount of fear in testers. I get that, but ignoring the change or burying your head in the sand, or justifying why testers need to own functional testing isn’t going to help you figure out how to function when these changes hit your team.

Stop Writing Automation

After releasing The A Word, I didn’t plan on writing any more posts about automation. But, after pondering transitions in test, and after reading this post from Noah Sussman, I have a thought in my head that I need to share.

I don’t think testers should write automation.*

I suppose I better explain myself.

All automation isn’t created equal. Automation works wonderfully for short confirmatory or validation tests. Unit, functional, acceptance, integration tests, and all other “short” tests lend themselves very well to automation. But I think it’s wasteful and inefficient to have testers write this automation – this should be written by the code owners. Testing your own code (IMO) improves design, prevents regression, and takes much, much less time than passing code off to another team to test.
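For what it’s worth, the kind of “short” test I’m talking about is something a code owner can write in a couple of minutes right next to the code. A trivial, hypothetical example (the function under test is invented):

```python
import unittest

def normalize_username(name: str) -> str:
    """Hypothetical product code: trim whitespace and lowercase a username."""
    return name.strip().lower()

class NormalizeUsernameTests(unittest.TestCase):
    def test_strips_whitespace_and_lowercases(self):
        self.assertEqual(normalize_username("  Alan "), "alan")

    def test_empty_string_stays_empty(self):
        self.assertEqual(normalize_username(""), "")

if __name__ == "__main__":
    unittest.main()
```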

That leaves the test team to write automation for end-to-end scenarios. There’s nothing wrong with that…except that writing end-to-end automated tests is hard (especially, as Noah points out, at the GUI level). The goal of automation (as touted by many of the vendors) is to run a bunch of tests automatically so that testers will have more time for hands-on testing. In reality, I think that most test teams spend half of their time writing automation, half of their time maintaining and debugging automation, and half of their time doing everything else.

Let’s look at a typical scenario.

[Image: a simple three-step user scenario]

Pretty easy – you only need to automate three actions, add a bit of validation and you’re done!

If you’ve attempted something like this before, you know that’s not the whole story. A good automated test doesn’t just execute a set of user actions and then go on its way – you need to look at error conditions, and try to write the test in a way that prevents it from breaking.

[Image: the same scenario with error handling and recovery steps added]
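To show what I mean, here’s a caricature of where that “simple” test ends up – the `app` driver is hypothetical, and the point is the retries and sleeps that accumulate, not any particular API:

```python
import time

MAX_RETRIES = 3

def run_install_scenario(app):
    """Caricature of an end-to-end GUI test once the defensive code creeps in."""
    for attempt in range(MAX_RETRIES):
        try:
            app.search("FizzBuzz")
            time.sleep(5)                       # wait for the results pane... hopefully
            if not app.results_visible():
                raise RuntimeError("results pane never showed up")
            app.click_download("FizzBuzz")
            time.sleep(30)                      # downloads are slow sometimes
            app.click_install("FizzBuzz")
            if not app.wait_for_launch(timeout=60):
                raise RuntimeError("app never launched")
            return True                         # finally - the happy path
        except Exception:
            app.restart()                       # try to get back to a known state
    return False                                # give up and file another flaky-test bug
```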

It’s hard. It’s fragile. It’s a pain in the ass. So I say, stop doing it.

“But Alan – we have to test the scenario so that we know it works for our customers”. I believe that you want to know that the scenario works – but as much as you try to be the customer, nobody is a better representative of the customer than the customer. And – validating scenarios is quite a bit easier if we let the customers do it.

NOTE: I acknowledge that for some contexts, letting the customers generate usage data for an untested scenario is an inappropriate choice. But even when I’ve already tested the scenario, I still want (need!) to know what the customer experience is. As a tester, I can be the voice of the customer, but I am NOT the customer.

Now – let’s look at the same scenario:

[Image: the same scenario]

But this time, instead of writing an automated test, I’ve added code (sometimes called instrumentation) to the product code that lets me know when I’ve taken an action related to the scenario. While we’re at it, let’s add more instrumentation at every error path in the code.

[Image: the scenario with instrumentation added at each action and error path]
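As a rough sketch of what that instrumentation might look like in the product code (the event wording mirrors the sample logs below, but this isn’t a real telemetry API – in practice these events would flow to your analytics pipeline, not a local log):

```python
import logging
from datetime import datetime

log = logging.getLogger("scenario.install")
logging.basicConfig(level=logging.INFO, format="%(message)s")

def emit(event: str) -> None:
    # Timestamp formatted to roughly match the sample logs below.
    log.info("%s: %s", datetime.now().strftime("%m%d%Y-%H%M-%S%f")[:18], event)

def install_app(name: str) -> None:
    emit(f"Search started for {name}")
    try:
        # ...the real search / download / install code lives here...
        emit(f"Download started for {name}")
        emit(f"Install started for {name}")
        emit(f"{name} launched successfully")
    except ConnectionError:
        emit("ERROR 115: Connection Lost")      # instrument the error paths too

install_app("FizzBuzz")
```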

Now, when a user attempts the scenario, you could get logs something like this:

05072014-1601-3143: Search started for FizzBuzz
05072014-1601-3655: Download started for FizzBuzz
05072014-1602-2765: Install started for FizzBuzz
05072014-1603-1103: FizzBuzz launched successfully

or

05072014-1723-2143: Search started for FizzBuzz
05072014-1723-3655: Download started for FizzBuzz
05072014-1723-2945: ERROR 115: Connection Lost

or

05072014-1819-3563: Search started for FizzBuzz
05072014-1819-3635: ERROR 119: Store not available.


From this, you can begin to evaluate scenario success by looking at how many people get through the entire scenario, and how many fail for particular errors.
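A minimal sketch of that evaluation, assuming you can group the log lines by user session (the sample logs above aren’t grouped, so that part is an assumption):

```python
from collections import Counter

def summarize(sessions):
    """Tally outcomes from per-user log sessions for the install scenario."""
    outcomes = Counter()
    for lines in sessions:
        if any("launched successfully" in line for line in lines):
            outcomes["completed"] += 1
        else:
            errors = [line for line in lines if "ERROR" in line]
            # Attribute the failure to the last error seen, or call it abandoned.
            outcomes[errors[-1].split(": ", 1)[-1] if errors else "abandoned"] += 1
    return outcomes

sessions = [
    ["Search started for FizzBuzz", "Download started for FizzBuzz",
     "Install started for FizzBuzz", "FizzBuzz launched successfully"],
    ["Search started for FizzBuzz", "Download started for FizzBuzz",
     "ERROR 115: Connection Lost"],
    ["Search started for FizzBuzz", "ERROR 119: Store not available."],
]
print(summarize(sessions))
# Counter({'completed': 1, 'Connection Lost': 1, 'Store not available.': 1})
```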

[Image: chart of scenario completions and failures]

or generate data like this:

[Image: example scenario data]

And now we have information about our product that’s directly related to real customer usage, and the code to enable it is substantially simpler and easier to maintain than the traditional scenario automation.

*A more precise way to put “I don’t think testers should write automation” is this: I think that many testers, in some contexts, don’t need to write automation anymore. Developers should write more automation, and testers should try to learn more from the product in use. Use this approach in financial applications at your own risk!

Testing Trends…or not?

I read this article over the weekend about five emerging trends in software testing – Test Automation; Rise of mobile and cloud; Emphasis on security; Context-driven testing; and More business involvement.

I fully acknowledge that I work in a software development environment that isn’t like many others, but while reading the article, I really didn’t feel like any of those areas are “emerging” – all are fully emerged already. Sure, the trends are interesting to testers, but emerging? I could waste some space rebutting or commenting on the areas above, but instead, let me offer some alternate trends that I see inside of MS and from some of my colleagues who work elsewhere.

Fuzzier Role Definitions. I don’t really like the terms “whole team approach” or “combined engineering”, but I do see software teams really figuring out how to work better together and leverage every team member’s strengths effectively. Great testers are working as Test Specialists and working much more broadly across the team. I expect the “lines” between software disciplines to fade even more in the future.

Developers Own More Testing. You can call them “checks” if you wish (I call them “short tests”), but software developers are beginning to own much bigger portions of traditional software testing. This is a good thing – it ensures that daily code quality is high, and gives test specialists a high-quality product to work with.

Testing Live Sites. Mock test environments typically do a poor job of representing production environments. Other than brief sanity checks for the most critical components, many web service teams just roll their new bits straight to production, and then run their tests against the live system. With a good monitoring system (including the ability to stage rollouts and automatically roll back if needed), this is a safe, efficient, and frankly practical method for testing services.
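A bare-bones sketch of that flow – the deploy and rollback functions are placeholders for whatever your pipeline actually provides, and the URLs are made up:

```python
import urllib.request

CRITICAL_ENDPOINTS = [
    "https://example.com/health",
    "https://example.com/api/login/ping",
]

def smoke_test() -> bool:
    """Hit a few critical endpoints on the live system after a rollout."""
    for url in CRITICAL_ENDPOINTS:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status != 200:
                    return False
        except OSError:
            return False
    return True

def deploy(build): ...       # placeholder: push new bits to a slice of production
def roll_back(): ...         # placeholder: revert to the last known-good build

def release(build) -> None:
    deploy(build)
    if not smoke_test():
        roll_back()
        raise RuntimeError("smoke test failed against production; rolled back")
```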

Data is HUGE. Many software teams have figured out that the best way to get an accurate picture of how customers use software is to collect and analyze data from those same customers. A whole lot of traditional test activities can be replaced by product instrumentation and an efficient pipeline for getting that instrumentation data back to the team for analysis. On a lot of teams, last year’s testers are this year’s data analysts and data scientists. While not every tester is cut out for this role, this move to data analysis is a strong trend on a lot of software teams.

To critique myself for a moment, I think a lot of readers could say that none of these points are emerging either. That’s a fair point, since I know teams that have been doing everything above for years…but I’m just now seeing some of these trends “emerge” on multiple teams (and not just those testing web services or sites).

What trends do you see? Did I miss anything huge? Have the above four points already reached the tipping point of emergence?

Users, Usage, Usability, and Data

The day job (and a new podcast) have been getting the bulk of my time lately, but I’m way overdue to talk about data and quadrants.

If you need a bit of context or a refresher on my stance, this post talks about my take on Brian Marick’s quadrants (used famously by Gregory and Crispin in their wonderful Agile Testing book); I assert that the left side of the quadrant is well suited for programmer ownership, and that the right side is suited for quality / test team ownership. I also assert that the right-side knowledge can be obtained through data, and that one could gather what they need in production – from actual customer usage.

And that’s where I’ll try to pick up.

[Image: the testing quadrants]

Agile Testing labels Q3 as “Manual”, and Q4 as “Tools”. This is (or can be) generally true, but I claim that it doesn’t have to be true. Yes, there are some synthetic tests you want to run locally to ensure performance, reliability, and other Q4 activities, but you can get more actionable data by examining data from customer usage. Your top-notch performance suite doesn’t matter if your biggest slowdown occurs on a combination of graphics card and bus speed that you don’t have in your test lab. Customers use software in ways we can’t imagine – and on a variety of configurations that are practically impossible to duplicate in labs. Similarly, stress suites are great – but knowing what crashes your customers are seeing, as well as the error paths they are hitting is far more valuable. Most other “ilities” can be detected from customer usage as well.
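As a toy example of the kind of analysis that customer data enables (the field names and numbers are invented), here’s a grouping of reported load times by hardware configuration to surface the slow combinations your lab never sees:

```python
from collections import defaultdict
from statistics import median

# Each record is telemetry reported from a real customer machine.
reports = [
    {"gpu": "VendorA-100", "bus": "PCIe3", "load_ms": 180},
    {"gpu": "VendorA-100", "bus": "PCIe3", "load_ms": 210},
    {"gpu": "VendorB-20",  "bus": "PCIe2", "load_ms": 2400},
    {"gpu": "VendorB-20",  "bus": "PCIe2", "load_ms": 1900},
]

by_config = defaultdict(list)
for r in reports:
    by_config[(r["gpu"], r["bus"])].append(r["load_ms"])

# Worst configurations first - the ones you probably don't have in the lab.
for config, times in sorted(by_config.items(),
                            key=lambda kv: median(kv[1]), reverse=True):
    print(config, "median load:", median(times), "ms")
```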

Evaluating Q3 from data is …interesting. The list of items in the graphic above is from Agile Testing, but note that Q3 is the quadrant labeled (using my labels) Customer Facing / Quality Product. You do Alpha and Beta testing in order to get customer feedback (which *is* data, of course), but beyond that, I need to make somewhat larger leaps.

To avoid any immediate arguments, I’m not saying that exploratory testing can be replaced with data, or that exploratory testing is no longer needed. What I will say is that not even your very best exploratory tester can represent how a customer uses a product better than the actual customer.

So let’s move on to scenarios and, to some extent, usability testing. Let’s say that one of the features / scenarios of your product is “Users can use our client app to create a blog post, and post it to their blog”. The “traditional” way to validate this scenario is to make a bunch of test cases (either written down in advance (yuck) or discovered through exploration) that create blog entries with different formatting and options, and then make sure the app can post to whatever blog services are supported. We would also dissect the crap out of the scenario and ask a lot of questions about every word until all ambiguity is removed. There’s nothing inherently wrong with this approach, but I think we can do better.

Instead of the above, tweak your “testing” approach. Instead of asking, “Does this work?”, or “What would happen if…?”, ask “How will I know if the scenario was completed successfully?” For example, if you knew:

  • How many people started creating a blog post in our client app?
  • Of the above set, how many posted successfully to their blog?
  • Which blog providers do they post to?
  • What error paths are being hit?
  • How long does posting to their blog take?
  • What sort of internet connection do they have?
  • How long does it take for the app to load?
  • After they post, do they edit the blog immediately (is it WYSIWYG)?
  • etc.

With the above, you can begin to infer a lot about how people use your application, discover outliers, answer questions, and perhaps discover new questions you want answered (a sketch of what instrumenting for these questions might look like follows the next list). And to get an idea of whether they may have liked the experience, perhaps you could track things like:

  • How often do people post to their blog from our client app?
  • When they encounter an error path, what do they do? Try again? Exit? Uninstall?
  • etc.
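Here’s that sketch – the event names and properties are invented, and in a real product emit_event would send to your analytics pipeline rather than print:

```python
import json
import time

def emit_event(name: str, **properties) -> None:
    """Hypothetical telemetry call; a real one would ship the event off-box."""
    print(json.dumps({"event": name, "time": time.time(), **properties}))

# Instrument the blog-post scenario so the questions above can be answered
# from real usage instead of test cases.
emit_event("blog_post_started")
emit_event("blog_post_published",
           provider="wordpress",          # which blog providers do people use?
           duration_ms=2350,              # how long does posting take?
           connection="wifi")             # what sort of connection do they have?
emit_event("blog_post_failed",
           provider="blogger",
           error="AUTH_EXPIRED")          # which error paths are being hit?
```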

Of course, you can get subjective data as well via short surveys. These tend to annoy people, but used strategically and sparingly, they can help you gauge the true customer experience. I know of at least one example at Microsoft where customers were asked to provide a star rating and feedback after using an application – over time, the team could use available data to accurately predict what star rating customers would give their experience. I believe that’s a model that can be reproduced frequently.
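As a toy illustration of that last idea (the numbers are completely made up, and the real models are surely richer than this), you could fit a simple line between an observed usage signal and the star ratings customers actually gave, then use it to estimate ratings for users you never surveyed:

```python
from statistics import linear_regression  # Python 3.10+

# Made-up training data: fraction of sessions that ended in an error,
# paired with the star rating the customer actually gave.
error_rates  = [0.01, 0.05, 0.10, 0.20, 0.30]
star_ratings = [4.8, 4.4, 3.9, 3.1, 2.4]

slope, intercept = linear_regression(error_rates, star_ratings)

def predicted_rating(error_rate: float) -> float:
    return round(slope * error_rate + intercept, 1)

print(predicted_rating(0.15))   # estimated rating for users we never surveyed
```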

Does a data-prominent strategy work everywhere? Of course not. Does it replace the need for testing? Don’t even ask – of course not. Before taking too much of the above to heart, answer a few questions about your product. If your product is a web site or web service or anything else you can update (or roll back) as frequently as you want, of course you want to rely on data as much as possible for Q3 and Q4. But, even for “thick” apps that run on a device (computer, phone, toaster) that’s always connected, you should also consider how you can use data to answer questions typically asked by test cases.

But look – don’t go crazy. There are a number of products where long tests (what I call Q3 and Q4 tests) can be replaced entirely by data. But don’t blindly decide that you no longer need people to write stress suites or do exploratory testing. If you can’t answer your important questions from analyzing data, by all means, use people with brains and skills to help you out. And even if you think you can get all your answers with data, use people as a safety net while you make the transition. It’s quite possible (probable?) to gather a bunch of data that isn’t actually the data you need, then mis-analyze it and ship crap people don’t want – that’s not a trap you want to fall into.

Data is a powerful ally. How many times, as a tester, have you found an issue and had to convince someone it was something that needed to be fixed or customers would rebel? With data, rather than rely on your own interpretation of what customers want, you can make decisions based on what customers are actually doing. For me, that’s powerful, and a strong statement towards the future of software quality.