Test Design for Automation

I’ve been pondering test automation recently. Maybe it’s because of my participation in the upcoming STP Summit (note: shameless self-promotion), but that’s only the surface of it. I’ve complained about misguided test automation efforts before, but it’s more than that too. For every tester who cries out that 100% automation is the only way to test software, someone else is simultaneously stating that only a human (with eyes and a brain) can adequately test software.

The answer, of course, is in the middle.

But I worry that even for those who have figured out that this isn’t an all-or-nothing proposition, many testers have no idea at all how to design an automated test – which means that they don’t know how to design a test in the first place. The problem I see most often is in the separation of automated and human testing. When approaching a testing problem, you have failed if your first approach is to think about how you’re going to (or not going to) automate. The first step – and the most important – is to think about how you’re going to test. From that test design effort, you can deduce which aspects of testing could be accomplished more efficiently with automation (and which couldn’t).

A common variation of this is the automation of manual tests. It pains me to hear that some testers design scripted test cases, and then automate those actions. This tells me two things: the manual test cases suck, and the automation (probably) sucks. A good human brain-engaged test case never makes a good automated test case, and automating a scripted scenario is rarely a good automated test (although occasionally, it’s a start). Some teams even separate the test “writers” from the “automators” – which, to me, is a perfect recipe for crap automation.

An example would be good here. Imagine an application with the following requirements / attributes:

  • The application has a “Roll” button that, when pressed, generates five random numbers between 0 and 9 (inclusive)
  • The application totals the five random numbers and displays the sum in the “Total” field.
  • There are no user-editable fields in the application

For those of you with no imagination, this is how I imagine it.

[image: mock-up of the application – a Roll button, a Total field, and five number boxes]

From a manual only perspective, layout, user experience, and interaction are definitely areas that need to be investigated, and that are usually best done manually. If I were to write a few scripted manual test cases for this (not that I would), they may look something like this:

Test Case 1

  1. Press Roll button
  2. Count (use a calculator or an abacus if necessary) to verify that the value in the Total field matches the sum of the values below

Test Case 2

  1. Press Roll
  2. Ensure that the values in the lower fields are within 0-9 inclusively
  3. Repeat at least n times

Test Case 3

  1. Press Roll
  2. Ensure that the value in the top section is between 0 and 45
  3. Repeat

I have two complaints about the above tests. The first is that executing them manually is about as exciting as watching a banana rot, and the second is that (as I predicted), they’re not very good automated tests either.

When designing tests, it’s important to think about how automation can (or can’t) make the testing more efficient. What I hope you’ve realized already (and if not, please take a moment to think about the huge testing problem we haven’t talked about yet with this application) is that we’ve done nothing to test for randomness or distribution.

Testing randomness is fun because it’s harder than most people bother to think about. We don’t have to get it perfect for this example, but let’s at least think about it. In addition to the above test cases (granted, the third test case above may be redundant with the first), we need to think about distribution of values within the bottom five boxes. Given the functional goal we’re shooting for, we can probably hit all those test cases and more in a reasonably designed automated test.

Pseudo-code:

Loop 100,000 (or more) iterations
    Press Roll button
    Verify that the sum of the output values and the Total field are identical
    Verify that each output value is between 0 and 9
    Store counts of the values from output boxes 1-5
End Loop

Examine distribution of numbers
Examine sequences of numbers
Other pattern-matching activities as needed

The first loop takes care of the main functionality testing, but where automation really helps here is in the analysis of the output. I’d expect a statistically plausible distribution of values, repeated values, and sequences.
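To make the pseudo-code concrete, here’s a minimal Python sketch. `RollApp` is a hypothetical stand-in for the application’s object model (not its UI – exactly the kind of logic-level hook I’d want), and the distribution check is a deliberately crude tolerance test, not a rigorous statistical analysis:

```python
import random
from collections import Counter

class RollApp:
    """Hypothetical logic-level model of the app -- a stand-in for the real
    object model, so the test never touches the UI."""
    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        self.values = []
        self.total = 0

    def roll(self):
        # "Press Roll": five random digits, and the Total field gets their sum.
        self.values = [self.rng.randint(0, 9) for _ in range(5)]
        self.total = sum(self.values)

def run_roll_tests(app, iterations=100_000):
    """Automated version of the pseudo-code loop: verify the sum and the
    0-9 range on every roll, and collect counts for later analysis."""
    counts = Counter()
    for _ in range(iterations):
        app.roll()
        assert app.total == sum(app.values), "Total doesn't match sum of values"
        assert all(0 <= v <= 9 for v in app.values), "Value outside 0-9"
        counts.update(app.values)
    return counts

def looks_uniform(counts, iterations, tolerance=0.03):
    """Crude distribution check: each digit should appear in roughly one
    tenth of the 5 * iterations observations."""
    expected = iterations * 5 / 10
    return all(abs(counts[d] - expected) / expected < tolerance
               for d in range(10))
```

With 100,000 iterations that’s 500,000 observations, so a 3% tolerance is generous; a real analysis would use a proper chi-squared test and also look at sequences and repeats, which this sketch leaves out.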

That’s stuff you want to use automation to solve. Yet, I keep discovering stories of testers who either don’t bother to test stuff like that at all, or put their automation effort into trying to navigate the subtleties of UI testing (yet something else I have an opinion about).

I don’t care whether you’re an automator, a tester, a coder, or a cuttlefish – your goal in testing is not to automate everything, nor is it to validate the user experience through brain-engaged analysis. Your job is to use the most appropriate set of tools and test ideas to carry out your testing mission. You can’t do that when you decide how you’re going to test before you start thinking about what you’re going to test.

Notes:  I don’t know why I picked 100k for the loop – it may be overkill, but it’s also something you have an option of doing if you’re automating the test. I suppose you could do that manually, but you will go insane…or make a mistake…or both.

There’s more to this concept, and follow ups…maybe. The big point is that I worry that people think that coded tests replace human tests, when they really enhance human testing – and also that human testers fail to see where coded tests can help them improve their testing. I think this is a huge problem – and that it’s one of those problems that sounds bad, but is actually much, much worse than that…but I don’t know how to get that point across.

I should also point out that, given my loathing of GUI automation, I’d really, really hope that the pseudo-code I scratched out above could be accomplished by manipulating a model or an object model or the underlying logic directly. I want my logic/functional tests to test the logic and functionality of the application – not its flaky UI.

Titles for Testers

I’ve had a few conversations on twitter over the past few weeks about tester titles. My conclusion is that test titles are meaningless, but let’s look at why I think that, and why it doesn’t matter.

Let’s say you’ve been testing for eight years, given a conference presentation or two, and written a bit about testing on a blog. You’re definitely experienced – given the turnover in test roles, some may even call you an “expert” – you may or may not be an expert, but chances are you are at least above average in the testing skills department.

Now – your company just folded and you’re looking for a new job. You go to monster.com or dice.com and you search for “test”. You see a huge list of titles like:

  • Test Engineer (and Test Engineer II, Test Engineer III, etc.)
  • Quality Analyst
  • Testing Analyst
  • Automation Engineer
  • Test Architect
  • Test Automation Developer
  • Test Manager
  • Director of Test
  • Senior Test Engineer
  • many, many more…

OK – which one do you apply for? I’ll save you a bit of work and tell you that the job descriptions are mostly the same. I’ll give you some advice – if you’re in the job market, do not look for a job based on title – the titles don’t mean anything. With title being out of the picture, I’d suggest looking for companies and locations that interest you and applying for whatever test jobs they have available. Companies may be advertising for a “Test Engineer”, hoping to get someone with half a brain about testing, but would be delighted to have someone with experience. You may think, “How could I take a Test Engineer job when my qualifications dictate that I’m at least a Test Engineer III?” I’ll say it again – titles don’t matter. It’s what you do, and not what you’re called.

As a side note, I give this same advice to people changing jobs within Microsoft where we have standard titles to match career stage. If you find a product you want to work on, but you’re overqualified for the positions they have open, talk to them anyway. Most test managers I know are usually more than happy to up-level their team.

Now let’s fast forward. You have a new job, the work is challenging, and you’re kicking ass. Your job title is merely tester, and you’re worried that people will look down on you because you don’t have a more prestigious title like Senior Test Engineer, or Senior Software Quality Specialist, or Super Tester from Another Planet!

One last time: in software testing, titles just don’t matter. Software testing is still a relatively new profession and definitions are still evolving for nearly every aspect of the profession. That’s ok (and expected). Focus on what you do (and on kicking ass), and you’ll be just fine.

Or – if consistent titles are that important to you, you can go get a job as a plumber or garbage man (ahem – sanitation engineer).

My Love (or Hate) Affair with Meetings

Years ago, I began to get annoyed with meetings. I wasn’t quite sure what I didn’t like about them, so I attended as few meetings as possible. Of course skipping all meetings is somewhat of a CLM (career limiting move), so I had to be selective. Sometimes I left meetings energized and felt like they were a great use of my time, but mostly, I wished I could have had the hour back plus compensation for emotional damage.

Eventually, I discovered Pat Lencioni’s Death by Meeting and gathered some insight. Just like when I began to study testing and was able to classify my testing knowledge with more widely understood patterns and concepts, Lencioni succinctly described why some meetings worked and some didn’t, and I was able to relate that learning to my own experiences. I learned why some (but not all) meetings need agendas, and why conflict is necessary for essential brainstorming. I haven’t perfected every meeting I attend (or run), but the concepts from that book remain in my consciousness. I’ve become sort of a sucker for Lencioni’s writings since then and think all of his stories are fascinating insights into team growth and leadership.

I got a bit of a boost a few weeks ago when I read Read This Before Our Next Meeting by Al Pittampalli. This book is a manifesto on how to run a good meeting – and more importantly on why bad meetings are killing us. Having worked in organizations where status and informational meetings are abundant makes the points in this book hit home, and reminds me why we can, and must, do better about how we hold meetings and make decisions.

I love a good meeting, and after reading Pittampalli’s book, reflecting on Lencioni’s book, and recalling my own meeting experiences, I’m reminded of why. Brainstorming, collaboration, and a level of conflict all contribute to a meeting where the thoughts of the many are far better than the sum of their parts. These are the meetings that energize me.

I’m also reinvigorated to eliminate the bad meetings from my calendar. Meetings where the goal is consensus or shoving information down attendees’ throats don’t belong in successful organizations. We can do better, and I’m inspired to try.

Give ‘em What They Need

What do you do when your manager | project manager | dev manager | other leader asks you for some bit of data that you know is useless? Say that they want something like code coverage progress and test pass rates, but you know that those metrics aren’t truly useful.

Side Note: waitaminute – some of you may be thinking those aren’t useless at all. The way I see it, coverage is only interesting in how you use it to discover potential holes in testing. Test pass rates aren’t interesting to me at all. I can have a 99.99% pass rate, but if the one failing test “accidentally” erases the customer’s hard drive, I’m far more worried about that single bug than the pass rate. Even if the pass rate is 95%, what I’m most worried about is what types of bugs are in that failing 5%.

But people will still ask for those things, because they think they are useful. So, what do you do?

One approach would be to tell them they’re stupid and that you’re not going to give them useless data. Give that a shot and let me know how it goes.

On the other hand, you could just give them what they want – give them test pass rates and coverage information, and go back to work. That’s easy. (Note – looking at these two examples is an application (or variation) of my “find the middle” technique.)

What you want to do is give them what they want, but most importantly, give them what they need. The coverage report gets a lot more interesting – and applicable – if you include, for example, a list of tests you’ve added as a result of analyzing code coverage. It’s ok to share the test pass rate too, but include a breakdown of the bugs causing the failures. You’ve given them what they asked for (pass rates), but you’ve steered the conversation towards risk (what they need). For bonus points, you can look for ways to intermingle the two requests (e.g. are you finding new bugs when adding tests discovered through coverage analysis?).
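The idea can be sketched in a few lines of Python – the report carries the pass rate they asked for, plus the risk information they actually need. All field names and data here are invented for illustration:

```python
# Hypothetical sketch: report what they asked for (pass rate) alongside what
# they need (failure breakdown and coverage-driven test additions).

def build_status_report(results, failure_bugs, tests_added_from_coverage):
    """results: list of (test_name, passed) tuples.
    failure_bugs: mapping of failing test name -> short bug description."""
    total = len(results)
    passed = sum(1 for _, ok in results if ok)
    return {
        "pass_rate": round(100.0 * passed / total, 1),    # what they asked for
        "failing_bugs": failure_bugs,                      # what they need
        "new_tests_from_coverage": tests_added_from_coverage,
    }

report = build_status_report(
    results=[("t1", True), ("t2", True), ("t3", False), ("t4", True)],
    failure_bugs={"t3": "data loss on save -- blocks release"},
    tests_added_from_coverage=["error path: disk full during save"],
)
# The pass rate reads as a healthy 75%, but the failing_bugs entry is the
# real story -- that's the conversation about risk.
```

The point isn’t the code, of course – it’s that the same data, packaged with context, answers the question the requester actually has.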

One of the keys to success here is to understand (or at least guess) why they want the data in the first place. They probably want some answer – project status, test progress…something. Figure it out, and give them whatever data you think they need to get their question answered. If you’re off track, reload and try again. You will find the right formula. The easy path is to give people what they want, but if you’re serious about improvement, start giving people what they need.

In the Middle

Should you automate everything, or nothing? Should you test everything, or nothing? How about leadership – should you dictate every detail of what your team should do, or give them no guidance at all? The answer to all of these questions – as you’d expect – is “somewhere in the middle”.

In my experience, most people handle the “how much” question when dealing with a range of potential solutions by starting with a reasonable mid-point and working from there – e.g. “let’s automate half of our tests”, or, “We’ll test what’s most important”, or, “I’ll give my team some guidance, and then give them some freedom in how they deal with the details.”

Those options are reasonable, so the technique seems to work. It, in fact, does work – but I think it can be better.

A brainstorming technique I use (someone please tell me if I’ve inadvertently stolen the concept) is to first spend a reasonable amount of time focusing on the extremes – because often, some great ideas for “the middle” come out of that brainstorming. Think, for example, about what you’d do if you tested everything (yes, I know, impossible, but think about it). You’d likely use an army of vendors, need some sort of coverage metrics, etc. Then think about what you’d do if you tested nothing (you may have developers own unit and functional tests, and would rely on customer feedback for scenarios, etc.). In the end, there may be something you take from both brainstorming sessions when you figure out what “the middle” looks like.

How about we try another example. Let’s say you are testing application compatibility with version 2.0 of the “Weasel” operating system. There were 100 applications written for Weasel 1.0, and you have copies of all of them. How many of those applications do you test? If you go straight to the middle (which, again, isn’t a bad choice), you’d probably prioritize the apps by sales numbers and test the top n apps based on how much time you have. Not a bad solution, and one I’d feel comfortable bringing to the team leaders.

But let’s think about the extremes for a bit. What would testing all 100 applications look like? We’d definitely need to outsource the testing – but in order to do that, we’d need some clear directions on what “testing” an application entailed. We could write separate notes for each application, but we could probably come up with something generic (install, uninstall, copy/paste, print, major features, etc.) that could work. This solution is certainly going to be too expensive for Weasel management to approve, but we’re just brainstorming.

Now think for a while what it would be like to test none of the apps (and not piss off customers). Well, if none of the programming interfaces used by the apps changed, they’d all probably still work. But this is Weasel 2.0, so of course we’re going to tweak the APIs. So, maybe it’s possible to profile the APIs used by the Weasel 1.0 apps and diff that against the APIs we’re changing in Weasel 2.0 and then develop an API test suite that ensured API compatibility. There may be something here…
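That profiling-and-diffing idea is easy to sketch with plain set operations. Everything below – app names, API names, the shape of the data – is made up for illustration; a real profiler would generate the usage data:

```python
# Hypothetical sketch of the "test none of the apps" brainstorm: profile the
# APIs each Weasel 1.0 app calls, intersect with the APIs changing in 2.0,
# and target the compatibility suite at only that surface.

apis_used_by_app = {
    "WeaselWriter": {"OpenFile", "SaveFile", "PrintDoc"},
    "WeaselPaint":  {"OpenFile", "DrawShape", "CopyToClipboard"},
}

apis_changed_in_v2 = {"PrintDoc", "DrawShape"}

def apps_at_risk(usage, changed):
    """Map each app to the changed APIs it actually calls -- the only
    surface the API compatibility suite needs to cover."""
    return {app: used & changed
            for app, used in usage.items()
            if used & changed}

risk = apps_at_risk(apis_used_by_app, apis_changed_in_v2)
# -> {'WeaselWriter': {'PrintDoc'}, 'WeaselPaint': {'DrawShape'}}
```

An app whose API usage doesn’t intersect the changed set drops out of the result entirely – which is the whole appeal of this extreme: the test effort shrinks to exactly the APIs that moved.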

Of course, neither of these solutions is the right answer (nor are my brainstorming sessions complete), but I’ll bet that if you try this approach the next time you’re dealing with a range of possible solutions, you’ll come up with some new ideas on what you may choose to do “in the middle”.

Why?

Phil Kirkham confirmed his birthright as a tester by asking the quintessential tester question regarding my last post.

Why?

To be fair, he actually asked:

quite a schedule – so what do you get out of it ? Or conversely, if you didn’t go to these events what do you think you would be missing ?

This is something I think about myself (and occasionally need to explain to my management chain, as time at these events is time away from my day job), and I thought there may be enough here to discuss to merit another post (you may, of course, disagree).

The first event I mentioned, the UCIF event, is pretty much part of my job, so I suppose what I get out of it is that I get paid. I think it will also make my overall UCIF work easier, so it’s definitely a good thing for everyone involved (except, perhaps, my family).

The other gigs (PNSQC, German Testing Day, Intel, and the Webinar) are where the answer is (possibly) more interesting. If I didn’t go (and present) at these events, I could still get my day job done (and in fact, you could argue that I’d be more effective at my day job because I wouldn’t be missing work). If I were a consultant, I would be using the events to drum up business, but that’s not the case here.

I think the best explanation has two parts. The first is that I sort of like engaging in the test community. PNSQC is a no-brainer since it’s close by and I love Portland (thus it’s the only conference where I’ve asked to present (via abstract submission) in the last several years). But PNSQC – and all of the other gigs on my list are great opportunities to meet testers and discuss hard problems in testing (and some solutions for these problems). I like talking about testing, and talking with non-MS testers gives me (I think) a better perspective on what’s happening in the world. I certainly don’t agree to every speaking offer, but I do try to accept a few opportunities every year. It just so happens that my opportunities this year are all consolidated into a five week period. I may do something in the spring, but as of now, I think the fall flurry will be it for me for a while.

I also like to think there’s value to Microsoft as well (at least my manager and I do). As I thought when writing HWTSAM, there’s value in sharing what happens at MS with the worldwide testing community. I often say, “in the absence of information, people will make stuff up”. I try to reduce the amount of stuff people make up by sharing what I can about how teams at MS approach the practice of making quality software at a large scale. Sometimes it works, sometimes it does not – but I feel the effort is worth it.

So, what do I get out of it? I’d like to think I get better software testing – either through ideas I share with others, or by refining my ideas based on feedback, or brand new ideas I can bring back to MS. I think it’s worth it.

Fall Travel

I just got back from (yet another) vacation with the family – it’s been fun spending some extra time with the family this summer, but I think we’re (almost) done now, and it’s time to get back to work and to start thinking about transitioning the kids into sleep schedules that accommodate school hours.

With the approaching end of summer comes my inevitable fall travel schedule.

  • I’ll be in Austin, Texas on September 13 & 14 to attend the UCIF members conference.
  • I will be at PNSQC in Portland on October 10 & 11. I will be giving a presentation on Customer Focused Test Design.
  • I’m giving a webinar on October 12 – details coming soon.
  • I’ll be in Israel October 26-28 speaking at Microsoft and Intel. This is my first trip to Israel, and I’m quite excited about this trip.
  • I’ll be giving a keynote at German Testing Day on November 9. I met some of the folks who put on this event last year, and I’m utterly honored to have been invited to speak at this event.

Considering I actually have a day job, this schedule is probably a bit rough, but I’m excited to have these opportunities to speak and to meet testers from around the world.

If you’re in the area of any of these events and want to meet up, please let me know (or come talk to me at one of the sessions if you’re attending).

Living Virtually

I wrote about using virtual machines for testing in HWTSAM, and gave a talk at STAR West a few years back on using virtual machines for testing. What I remember most about the talk was the group of VMWare employees sitting about 10 rows back (I think to ensure that I didn’t say anything bad about them – I didn’t). In fact, I’m a fan of virtual machines no matter where they come from – they’re incredibly useful, and in many cases, an underused productivity aid.

One point I seem to forget to mention when I’m talking about virtual machines is how convenient they are beyond testing purposes. I wanted to share a recent experience of mine, but there’s a bit of a story leading to the punch line, so feel free to skip ahead a few paragraphs if you’re busy.

I have three computers in my office at work. My main “dev / test box” is a win2k8 server (for reasons beyond the scope of this post, we build our product only on server machines). We have fancy tools that let us build and deploy (for testing) quickly on this single machine, and in order to not break the consistency of this niceness, we never, ever install any shipped Office bits on these machines – only bits that we generate through the build process. The consistent use of machine and build configurations eliminates nearly all of the “it doesn’t work that way on my machine” or “my build is broken” issues.

Because real users don’t use our dev machines, I also have a “test machine” – which I use for (surprise, surprise), testing. Because I’m working on the next version of Office, that’s all it runs most of the time.

Then I have my laptop. Since I use it for everything else, it runs (usually – but see below) Office 2010, and I use it for email, docs, etc. About a month ago, I needed to install our pre-beta bits on my laptop. In general, we can run our new stuff side by side with the shipped stuff, and it works fine. However, in a long bout of yak-shaving one day, I had to uninstall all versions of Office from my laptop. But I still needed the latest bits to reproduce a bug I found, so to save time, I just installed the new pre-beta bits.

You see where I’m at now. I don’t have any machines running shipped Office bits anymore. While the pre-release apps all work quite well, they’re a bit…unrefined. I’m a big fan of ‘eating my own dogfood’, but sometimes I need the confidence that comes from using something a bit more baked.

I almost started to install the shipped Office bits back on my laptop when I realized two things. The first was that I was extremely worried that I’d spend a few hours every week tweaking things on my laptop – installing and uninstalling to clear up weirdness, or keeping things in sync while switching between different versions of the same app. I want to continue running daily builds as much as I can, and knew I’d end up switching between versions all the time, and I was worried about the time hit. The second thing I realized was that a virtual machine could be a nice solution for me.

It only took me an hour or two to set up a Hyper-V VM (hosted on my “dev box”), install Windows, Office 14, and sync documents with my Live Mesh account, and I had a VM that acted so much like a ‘real’ PC that after a month, I still forget sometimes that it’s not a real machine. I use our prerelease bits for nearly everything, but can rely on the VM (which I connect to over Terminal Services) when I feel the need to work with shipped software – for example, my VM currently has a copy of Word open with my PNSQC paper. It’s cool, because I was working on it last night (connected from home), and just left Word open. When I logged on to the VM this morning from work, it was right where I left it. It’s so damn convenient, in fact, that I can’t imagine not having a “work” VM anymore, and I can see a point in the not-too-distant future where we all use virtual machines in the cloud more than physical computers.

– written and published from a virtual machine

…and Now I’m Back

My interlude is over, and I’m back to blogging – at least that’s the plan, and I don’t see any reason why I won’t be back on the blog-waves on a semi-regular basis.

My summer was crazy with work. I probably let myself get spread too thin, and I paid the price in context switches and deadlines, but I delivered what I planned to deliver. One nice thing that happened amidst all of this work is that I had a chance to practice and polish my personal kanban process. I’m a big fan of personal kanban and should probably share my approach to the tool sometime.

But I took a break for more reasons than being busy. Every few months, it seems, I have a personal crisis about the future of software testing. I sort of find it ridiculous that there are so many self-proclaimed experts in a field that is at best barely defined. Everyone seems to have a definition of what testing is, and what testers do. I think that’s perfectly fine…until testers begin to discount (or mock) the testing work of others just because their definitions of roles and approaches don’t match. I bet what I do is different than what most of you do – but we’re probably all still testers (of some form), and I expect we can find some value in the diversity of our approaches.

Then, after reading one too many articles from testing “experts” that failed to grasp many of the basic concepts of what they were writing about, I began to fear that we’re taking a step backwards for every step we advance. Rather than wrestle with so many things outside of my control, I decided it was time for me to focus on what I could control and give myself a break from the testing community. I love you all, but sometimes you drive me crazy.

I didn’t plan to solve any of my internal strife during my time away, and I didn’t. But – I think I’m better equipped to focus on what I can control and continue to share what I find valuable in my world. I still believe in a bright future for software testing, but we need more open, seeking minds to get us there.

An Interlude

As you may have noticed, I’ve had to take a bit of a break from blogging. I’ve kicked out at least a post or two a week for a few years now, but I’m not dry on ideas, and I definitely want to follow up on my recent Test This post (before a follow up becomes irrelevant).

The challenge is that my day job is kicking my ass. I expected the UCIF test and certification work I’m involved with would take about 20% of my time (actually – they told me to expect 20% – I was hoping to get away with a bit less), but lately it’s been making a much bigger dent in my work day. Currently, I’m preparing a progress report for the board, and a members only webinar for later this month, along with trying to gain some ground on the roadmap for the working group.

The Lync team is bearing down on an internal milestone and there are a lot of loose ends (both technical and pseudo-political) to tie up in the next several weeks. Yearly performance calibrations are going on at the same time, so along with my normal levels of coaching and mentoring, I’ve had a few extra sessions of tester therapy in the past weeks.

And then there’s the not-really-work stuff that I still need to do. Top of the list is finishing my PNSQC paper, and I’m also juggling and negotiating some potential speaking engagements for the fall. For the first time ever, I’ve also had to start saying no to some internal speaking requests (sorry – if you read my blog, I hope you ask me again later this summer).

So – once I find a light at the end of the tunnel (or when the day job finally breaks me), I’ll be back.