Archive for the ‘Software Testing’ category

LOL – UR AUTOMASHUN SUCKZ!

September 6th, 2010

There’s a pretty good presentation from the folks at Electric Cloud making the rounds on “why your automation isn’t” (and other variations on that title). The premise is that testers spend too much time babysitting (supposedly) automated tests, and that testers end up doing a lot of manual work in order to keep their automation running. I know nothing about their products, but they have some reasonably good ideas and apparently some tooling to help.

But it’s not enough. Crappy automation and crappy automation systems are an insanely huge problem. But we tend to ignore it because – hey – we’re running lots of automation, and that must be good. To be clear, I have nothing against test automation (although I’m leery of a lot of GUI automation). Done well, it absolutely aids any testing effort. The other 99% of the time I bet it’s actually slowing you (and your team) down. Please, please, get it through your head that your job as a tester is to test software and not to create the largest test suite known to humankind.

…and here are some ideas to help you get to automation that doesn’t suck.

The obvious place to start is with your code. Do you treat your test code like production code? Do you do code reviews? Do you run static analysis tools? Do you step through the tests with the debugger to ensure they are doing what you think they are? Do you trust the results from your tests – i.e. if a test fails, are you confident that there’s a product bug? If you’re spending hours every day grepping through test results and system logs to try to figure out what happened, your automation sucks.

Now think about how your tests are run. Are they automatically built for you every day (or more frequently), and distributed to a test bed of waiting (physical and virtual) machines? Or do you walk around a lab full of computers manually typing in command lines? Or do you just run all of your automation on the spare machine in your office, then upload your results to some spreadsheet on a share where nobody can ever find it? In other words, do your tests execute automatically…or do they suck?

What about failures? Do you look at your automation failures in the morning, gather up some supporting data, then enter a bug? Or are bugs entered automatically – including additional information like logs, call stacks, screen shots, trace information, and other relevant data? When a bug is resolved as fixed, do you go run that failing test manually to ensure that the bug was fixed – or does your automation system automatically take care of manual (and mundane) tasks such as this for you? How about reporting – how do you generate the all important “Test Result Report”? Does your automation system take care of it, or is it a largely manual task?

Do you really have automation – or just a suck-filled wrapper around some suck-infested tests? It’s ok, be honest. It’s your choice what to do, but I bet the maturity of how you test software has much more to do with end user quality than the number of tests you have (mostly because the latter metric is useless, but you get the point).

I don’t have a solution to sell you – but I can give you an architecture for an end-to-end, true automation system for free. The chapter I wrote for Beautiful Testing covers this exact topic – and I’ve finally got around to posting it here. It’s all yours – read it and comment here if you’d like – or read it and delete it to free up some disk space. Regardless of my chapter, I still recommend you buy the book – like the other authors, I don’t make any money, but the proceeds go to buy mosquito nets to prevent malaria in Africa. More importantly, the book is friggin’ cool and I think every tester should own it.

KTHXBBYE

p.s. No idea at all why I decided to use a lolcat title – it just sorta felt right.

Putting the pieces together

September 1st, 2010

My last two posts (one on Tester DNA, and the most recent on worries about the future) were written in isolation. I started the DNA post many months ago, and the future post at least a year ago. I picked up and put down both posts several times before finally completing them. For a bit of perspective, I currently have nine blog posts in my draft folder. Sometimes I think of something I want to write about, but realize that I don’t know what to say. When this happens, I start writing anyway – no matter what comes out. When I feel like I’ve written enough to dump my thoughts, I save a draft. Then, I take a look at my drafts folder every week or so and see if there’s anything I want to finish – most just sit there until I give up and delete them. Overall, though, I would guess that a tenth or so of my posts, at most, sit in the draft folder before being posted – for blog posts, I favor much more of a stream of consciousness approach over heavy editing (which is easier for me, but I imagine it can be difficult for my readers).

Anyway…I ended up posting two random articles in a row from my drafts folder backlog. I didn’t mean for the posts to be related, but they are – I guess – or maybe I’m stretching things. At any rate, in response to my “the future is bleak” post, Rikard Edgen asked which types of bugs I thought testers would miss. I thought I’d (kind of) answer that, and also explain why systems thinking is so important for testers.

Let’s start small – so small that we may not even need testers! Here’s a bug free program**.

#include <stdio.h>

int main(void)
{
    printf("Hello World");
    return 0;
}

Not very exciting – it doesn’t really do anything (**and it can’t be localized). Overall, it’s pretty useless, so for example’s sake, let’s expand our view of small to a small class or unit of functionality – you know, about the size you’d write unit tests for. You may have a bit of functionality – for example, selecting a book from a database. I like to picture a unit of functionality like this:

[Image: Look - here's what a UNIT looks like]

The unit is much bigger than the “hello world” example above, but with good design and good unit testing, it’s quite possible to write an entire unit that’s nearly bug free (from a functional level at least). In fact – I often speak of a dream of mine where testers never find unit level functional bugs, and that developer tools and approaches eradicate these types of errors in software forever…but I dream a lot.

People expect software to be useful, and an application that does nothing but select a book from a database wouldn’t be very useful at all – people probably want to add or read reviews, see publisher information, sort book lists, or add the book to a shopping cart for purchase. By the time the software ends up being usable (or marketable), it has a lot of units.

[Image: A whole bunch of units]

Now – even if all of the units are well designed and well unit tested, I would bet money and stock options that the application has plenty of bugs. This is because many bugs will occur where these units connect into larger pieces of functionality. In 1999, NASA lost $125 million because of a metric mismatch between two units (I bet the units functioned perfectly in isolation). These “connection points” (or dependencies) are a pretty common place for bugs to occur. The good testers understand where the pieces connect and “see the big picture” of the application in order to guide their testing. The testers with the systems thinking bit in their DNA do pretty well at this, but it can be challenging for even the best of testers as applications get larger. At some point testers will need to use analysis tools to help them understand the connection points and dependencies, but their brain is still valuable in understanding the big picture of how the bits work together.
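To make the connection-point idea concrete, here is a minimal sketch in C. Everything in it is made up for illustration: the function names, the numbers, and the way one unit hands an impulse to the other are assumptions, not the actual orbiter software. Each unit is correct against its own (hypothetical) spec, and the bug only exists in the wiring between them.

```c
#include <assert.h>

/* 1 lbf = 4.4482216152605 N (exact by definition). */
#define NEWTONS_PER_LBF 4.4482216152605

/* Hypothetical unit A: reports thruster impulse in pound-force seconds.
   Correct according to its own spec. */
double unit_a_impulse_lbf_s(double force_lbf, double seconds)
{
    return force_lbf * seconds;
}

/* Hypothetical unit B: expects impulse in newton-seconds and returns the
   resulting change in velocity (m/s). Also correct on its own. */
double unit_b_delta_v(double impulse_n_s, double mass_kg)
{
    assert(mass_kg > 0.0);
    return impulse_n_s / mass_kg;
}

/* The connection point, wired incorrectly: A's output goes straight into B
   and nobody converts the units. */
double delta_v_buggy(double force_lbf, double seconds, double mass_kg)
{
    return unit_b_delta_v(unit_a_impulse_lbf_s(force_lbf, seconds), mass_kg);
}

/* The connection point, wired correctly: convert at the boundary. */
double delta_v_fixed(double force_lbf, double seconds, double mass_kg)
{
    double impulse_n_s =
        unit_a_impulse_lbf_s(force_lbf, seconds) * NEWTONS_PER_LBF;
    return unit_b_delta_v(impulse_n_s, mass_kg);
}
```

Unit tests for unit_a_impulse_lbf_s and unit_b_delta_v would all pass, yet delta_v_buggy is off by a factor of roughly 4.45. Only a test that exercises the two units together (or a tester who knows where they connect) has a chance of catching it.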

Now – as applications get bigger, this gets harder. It’s impossible for a single tester to keep the big picture of something like SQL Server or Excel in their head – even a “trivial” application like the authoring client I’m using to write this post has a lot of moving parts and can be difficult to keep track of from a systems view.

One of my worries about the future is that trivial applications won’t exist anymore – ok – they’ll exist, but they’ll be part of much larger systems. In the examples above, those units that connected together into functional pieces became functional pieces that connect together into applications. Now, envision a world where applications (including services and devices) connect into mega-applications, services, and a world of interconnected devices.

[Image: Oh, good god - what have we done?]

If you believe my theory about bugs occurring at connection points, and that hugely interconnected software systems of the (near) future will look like this, you should realize that this is going to be just a bit more difficult to test. My stance is that you just can’t test a system such as this the same way you test a simple application. Sure, we’ll need to design the software so there are fewer connection points – or that the connection points are well managed and tested (you know – better than they did on the orbiter team). That also means that it would be beneficial if testers could test the design of the software (and that the design is testable). I expect testers will use sophisticated tools to help target their testing and recognize where error prone areas exist (and be able to run the right types of tests to reduce risk). An exploratory approach will still be prominent, but the challenge is where to look first, and what to do when you get there. This is where I think the tester DNA – problem solving, quick learning, and big picture thinking will be critical. And it will still be extremely difficult to test such a system.

Of course, it’s a valid argument to just build a system first then wait for test to catch up. That’s pretty much what’s occurred in software engineering up to today, and software is still advancing (although software quality, arguably, is not). My worry is that gigantic software systems are even more prone to critical errors due to an exponential number of code paths as well as the “death by a thousand cuts” effect of thousands of components – each containing “only a few” bugs.

My opinion is that we actually need to improve the way we do testing (actually, the way we create software from beginning to end) before we can pull off building a system like this. There’s a good chance I’m completely wrong…but maybe I’m right.

Will we survive the future of software?

August 29th, 2010

I’ve been thinking about writing this post for years, but have had it floating around between my head and a draft for the last few months. I’ve held on to the post because even though it’s not my intent, I’m probably going to piss off someone who takes my points the wrong way or misinterprets what I say. I’m still not sure where I’m going with it, but I need to try to get these thoughts out sometime, and now is as good a time as any.

I’m worried about the future of software testing. To be fair, I’m probably as much to blame as others, but I’m so bothered about this that I’m actually beginning to get a bit depressed. But I suppose I should explain myself before I rant any more – so let’s start with the current state of software testing.

But before that (and in the hope of defusing the hard message a bit), let me say that I’m quite happy with what’s been happening in software testing recently. The “voice” of software testing is gaining momentum, and there are many active participants in discussing the basics of software testing so that the new crop of testers has a strong base of knowledge for the future. We have brilliant people emerging in the world of software testing, and I love the passion they have for the craft. It’s great to see so many testers online discussing their experiences and working hard to hone their skills. But we’re still talking about the same stuff testers were talking about ten years ago. The testing community is abuzz with discussions about basic tester skills, basic automation approaches and basic approaches to measurement. The discussions are lively, and people are learning – but they’re learning the same things testers were learning last year, the year before that, and the year before that. I like presenting at and attending conferences like STAR and other similar conferences – but those conferences are targeted at new testers, and talk topics are dominated by the same topics that have been around for years. I checked out the STAR conference proceedings from ten years ago (you can look too), and they look remarkably similar to this year’s conference. In some cases, the arguments and points are more refined than ten years ago, but a remarkable number of those talks could just as easily show up on any conference program this year.

The state of the art in software testing will go nowhere if we never look beyond the basics. Highlighting flaws in simple applications written by the IT department of a small company (or parking garage) is interesting, and it develops basic skills, but it does nothing to help us understand better ways to test huge interconnected systems or services under constant load. The future software systems shown in many science fiction movies may never come to light – and it won’t be because of technology or cost factors – it will be because we won’t know how to test these systems adequately.

One way to view the current body of knowledge in software testing is something like the triangle below. At the bottom of the triangle is the “101” level of software testing. This is important stuff. It’s the basis of the skills, approaches, and techniques that are the core of software testing. My triangle is disproportionate, as this is probably where 99% of the software testers spend their time (ok – to be fair, some never even make it on to the triangle). The big problem here, as I said above, is that this enables the advancement of our future, but does little to nothing to help the future.

[Image: testing triangle]

In the middle section of the triangle, new ideas and approaches do occur semi-frequently, but we more often than not view them as game-changers when the impact on the future is relatively minimal. Like the bottom section of the triangle, we need thinking in this area, but it’s not going to take us where we need to go. The top section of the triangle contains the true game changers: the new thoughts (or more likely, thoughts and ideas from some other area applied to testing) that will bring us into a new era.

Let me return to my worry (and a problem I mentioned above). My fear is this: The software testing we are doing today will be inadequate for the software of tomorrow. It doesn’t matter how good we are at the basic skills, it won’t be enough. What I fear is going to happen is that we will go ahead and create the software of tomorrow but we’ll try to test it like we test software today. For example’s sake, envision the systems of “Minority Report” – most of the pieces of technology in this movie have been demonstrated, but never with the flawless interconnection that Hollywood showed us. Do we have the skills and tools (and people) to test such a system today? Not. Even. Close.

What I’m afraid will happen, is that we’ll eventually build a similar system, and we will try to test it with the tools, approaches, processes (and many of the people) who are testing software today.

And there will be massive software failure, and very bad things will happen. At least then we may find enough motivation to advance the state of the art in testing.

But we don’t have to wait. If you’ve taken a reasonably close look at the software of today you know that we could use some innovation and advancement now. The right thing to do is to start moving beyond the basics and figure out how we can do better testing today. How do we test massively complex systems? How do we do so efficiently? Are the software testers of today the right people to test the software of the future?

Moving up the triangle is a hard move to make. Testing consultants make their money in the bottom of the triangle, so why would they have any motivation to make a move outside of the cash flow? But it’s not the consultant’s fault either. The testing profession has a high enough turnover that there are a huge number of new testers every year looking to get their foothold in the bottom rung of that triangle. Also consider that most of the testers today aren’t testing huge systems used by millions of users – they’re testing one-off IT apps used by hundreds (or dozens) of people who don’t care if the software has a few minor issues. There’s no need (or time) for most of these testers to ever get out of the bottom rung of the triangle.

It’s a vicious circle, but we have to find a way out. The future depends on it.

I wish I could say that my next blog post will have all the answers (it won’t). Instead, I’m going to continue to study, learn, and experiment. Somewhere out there is the knowledge that we’ll need to survive the software of the future, and I want to be part of making that future software successful. I hope that some of you can help make it happen as well.

Tester DNA

August 28th, 2010

In HWTSAM, we (Ken, actually) talked a bit about tester DNA – that bit of mental goo that makes some people better (or at least more prone to being) testers. As I’ve been talking to (potential) testers lately, I’ve had a chance to dwell on this a bit more. What is it that makes a good tester? Given that the answer to that question requires context, I won’t answer that exactly – instead, I thought I’d share some of my thoughts on what I look for in testers.

Testing is broad and evolving – nobody knows everything about it, so the ability to learn quickly is critical (this aids in problem solving as well). If you’re the type that takes a long time to ramp up in new technologies or concepts, you may struggle as a tester. Probably more critical is a passion for learning. Good testers don’t wait for ideas to come to them. Instead, they seek out knowledge – they not only learn what they know they need to learn, they find ways to learn what they don’t know (i.e. they strive to resolve second level ignorance). I believe that the big innovations in testing will come from applying knowledge from outside the field of software and software testing. In order to advance the state of the art in testing, we need testers who seek knowledge – and who are able to apply those abstract concepts to solve some of our big problems in test.

But – to take care of that last sentence, you’re going to need people who can see the big picture – systems thinkers. There are numerous people who claim to be systems thinkers, but systems thinking takes practice as well as some innate ability (or DNA) to be beneficial – and it’s much harder than many people think. Often when I interview testers, I ask a “testing” question that has two parts to solve. The first (the question I actually ask them) is obvious and has a solution that is difficult enough that they solve it as they would any other question. However – there’s a hidden problem in the question. The good testers quickly see the secondary problem as the far more difficult problem to solve and focus their answer on solving the underlying problem. These are the systems thinkers – they know to look at the whole rather than the parts and know that understanding interactions and patterns is key to good problem solving. The great testers – and there are only a few of these – can actually solve the problem reasonably well (frankly, I worry about testers recognizing the problem more than solving it, but I’m frequently impressed by testers who nail every aspect of this question).

Nitpicker’s moment – for those of you who will take this opportunity to gripe about SDETs, no part of solving this question relies on programming skills. It does require that you can think and see beyond the obvious. I’m not going to put the question on my blog, because I still want to use it. I would be happy to discuss it with you privately (or via an IM session) if you’re insanely curious.

There are numerous other skills I look for in testers, but I consider those to be supportive and of the “more is better” category. For instance, if you are completely disorganized, you may not be successful, but you don’t have to be the most anal note taker either. You need some degree of organization, self motivation, confidence, and trust to be successful, but those really only show up on my radar if you truly suck at them.

As I re-read this post, I realize that the things I mentioned above are also the things that make testers successful in the long term – so it makes sense that’s what I look for when hiring testers. And I think that’s good!

Some thoughts on remote work

August 19th, 2010

Recently, I spent two weeks working remotely (i.e. far, far away from my office). The opportunity was there, and I happen to work on a product that makes working (and interacting) remotely straightforward. Since I know some of you who read this blog work remotely (and others would like to), I thought I’d share some of my thoughts on the experience.

First off, I can say that I was highly productive – I got a huge amount of work done. But – I have to admit, it wasn’t the same work I would have done if I was in the office. I worked on a lot of semi-deep technology problems (e.g. tools, implementations, strategies, processes, etc.) that I would have done in spurts over the next several months. It’s nice to have the research and groundwork done now, and in the long run, it will be beneficial that I got a jump start on this work. I just would have done less of it and more interacting if I was in the office.

I realized this after just a few days of work, and have spent most of the time since then pondering why this is the case. A big part of this is my role on the team. I don’t own a testing area (or even a testing technique). In some ways, I’m a consultant for our test team – answering questions, giving advice, and providing guidance where it’s requested or needed. But the questions and advice and guidance rarely come out of a well formed email. Most develop out of informal conversations – many of which I stumble upon in the hallway, by the coffee machine, or in the lunch room. When you’re working remotely, you usually don’t have good opportunities to take part in these conversations (note, that I have ideas now, and we’re working on them internally).

I attended meetings remotely (video and desktop sharing worked great), and for the first time in my career, I actually wished for more meetings, as I was able to participate just as if I was in the room (including my typical smart-ass comments). But by the time I finished the two weeks of remote work, I was starting to feel pretty isolated. I had proposed a few discussion topics to team members, but surprisingly, most preferred to wait for the discussion until I returned (I think that’s a Redmond culture issue that we need to think about). I did spend almost the entire first two days I was back meeting with people and talking about all of the sorts of things that don’t end up in email, so it was nice to feel immediately connected.

My hunch is that working remotely would be a no-brainer for people with more “task-oriented” work (e.g. a job solely focused on testing, development, or writing) – as long, of course, as they had the discipline to focus on work. I’ll have to track down a few industry colleagues who work remotely some time and ask about their experiences and thoughts.

I also think that remote work can be an option for me – but it will take some culture change (and technology tweaks) to be completely successful. As it turns out, after those two days in Redmond, my parents needed me to visit and help out for a while, so I’m working remotely again. Even two days in, it’s been a better experience, and I think I’ll continue to learn a lot (it also helps that I’m just two hours away, and can, and will, commute to Redmond frequently when needed).

If you have your own thoughts or experiences, I’d love to hear about them.

ET and Me

August 15th, 2010

I’m a big fan of exploratory testing. I used the approach long before I knew what it was called and think it’s the core of good testing. It’s so ingrained in the general approaches I’ve used in my career that I often don’t differentiate between exploratory testing and plain-old-testing (I’ve gone as far as to say explicitly that all good testing is exploratory in nature, but as with most times I’ve used the word “all”, I’ve found exceptions).

I’ve been working on my ET skills for over 18 years, so it’s nice to see so much recent emphasis on the approach in blogs and books. As I said, I think it’s the core of good testing, so a strong foundation in ET can only help testing improve overall. As with any hot topic in any field, there are many strong advocates, and there are definitely “camps” of thought on what exactly ET is. On a side note, those of you who know me, know that I am certainly not afraid to take jabs on just about any of the subjects I’m passionate about – both in person, and in this blog. I once made a small (a few words) comment in a blog post about ET and “belonging to a club” that brought down a deluge of email reactions that I still ponder frequently. To this day, I both regret the remark (it was a thoughtless jab), and remain somewhat dumbfounded by the reactions to the comment.

Anyway…I recently introduced ET to my (still sort of new) team at Microsoft. Actually, I didn’t really introduce it as much as I revealed it, as I discovered a natural talent throughout the team that went far beyond what I was capable of teaching them.

But let me back up a bit…

I have given several presentations to my team on a variety of testing topics. I’m a big believer in testers having a big “toolbox” and knowledge on when and where to use those tools. I noticed early on in my time on the team that some testers would get so caught up in “running tests”, that they forgot to engage their brains and think about what they were doing. I also noticed that although most testers were experts in their feature area, that some had limited knowledge of the rest of the product. So, at a regularly scheduled brown bag (lunch time tech talk), I gave an intro / overview on ET. To follow up, I asked if there were any volunteers who would like to take part in an ET session with me to practice (with the secondary goal of learning more about the overall product). I quickly had a group of four volunteers, so I set up a time, and we were off to the races.

The format was:

  • 90 minute meeting.
  • I took the first 10 to talk about our goals for the meeting (Learn ET, Learn about the product, and Learn about tools that may help us with ET). Finding bugs is a probable, but not necessary side effect of these goals.
  • For the next 75 minutes, we tested together.
  • Last 5 minutes was a quick debrief and sharing of thoughts.
  • I took notes during the meeting on what we discovered (and then wrote them up for distribution afterwards).

    At the first session, I didn’t know at all what to expect. I didn’t know if people would learn, and didn’t know if we’d find bugs (I worried that if we didn’t find bugs that people wouldn’t see it as successful). I was also worried about engagement – what if people got stuck and “checked out”?

    It turns out that I didn’t really need to worry. The room rocked with engagement. We agreed on a random place within the application to start, I threw out a few ideas, thought out loud for a moment, and before I knew it, I couldn’t keep up with the comments and bugs and excitement in the room. I was blown away by how well it went (again, it had nothing to do with me – I just pointed them in the right direction).

    Based on the success, I tried another session. The same worries came to mind. The first session went so well that I had a high bar to live up to, and figured that it may have been due to the “early adopters” who signed up so quickly for the first session. Once again, I was proven wrong as a completely different set of people filled the room with ET energy.

    Since then, I’ve moderated four more sessions (including one via teleconference) and have had similar results in each of them. Better yet, some of those attendees have conducted their own ET sessions within their own team (and had similar success). Individuals are also using the approach outside of specific sessions.

    Some points to share include:

    • We’ve had 3-4 attendees (plus me) for each session, and I’m pretty happy with the learning experience. I think this size group is big enough that people can learn from each other rapidly, and small enough that everyone gets to be heard. I also don’t think I could keep up on notes with a larger group.
    • The session length also seems to work well. People are engaged throughout and there’s barely a slow down before the session ends (one attendee mentioned that they felt guilty because the session didn’t feel like work!).
    • Everyone should get used to thinking out loud – especially early in the session. It helps people learn and build off of each other’s ideas.
    • The focus on learning has worked well for us. I think that anytime testing focuses on finding bugs, it veers off track.

    And that’s pretty much it. We’ll definitely continue to hone our ET skills, and I’m sure we’ll continue to experiment, add to our skill set, and tweak our ideas and processes as we go.

    Scaling Code Coverage

    August 12th, 2010

    I’m going to do one more (I think) post on this subject. Markus asked the following about my latest post:

    But how do I answer the question for “what’s the coverage of your testing?” for a multiple component-based application, consisting of a C/C++ major component, an application server, and customized business logic?

    The answer is sort of in this post, but it’s worth elaboration.

    First and foremost, if someone asks “what’s the coverage of your testing”, you can answer in multiple ways. You can say “we’ve tested the requirements”, or “we have an average of 70% line coverage throughout the product”, or “we’ve covered all of the key customer scenarios”. But what you probably should ask is, “What do you really want to know?” Explain that “testing coverage” probably isn’t the best way to think about it, and “identifying missing tests” is a better train of thought.

    The sample I used was a simple command line app, but we usually test much larger systems, so let me also explain how you could scale code coverage from my previous example to a larger application. The solution I prefer isn’t complex – you just start at a high level, then drill down until you get to something actionable. Let’s say my overall product code coverage is 60%. That tells me that 40% of my code is untested. For the sake of easy math, let’s say the application has a million lines of code. That’s 400,000 untested lines of code. Which ones do we look at first?

    Let’s drill down into the four main feature areas of the (fictitious) app.

    Feature Area   Priority   Lines of Code   Code Coverage
    UI Controls    3          100,000         60%
    Engine         1          400,000         60%
    Web Server     2          400,000         70%
    Utilities      4          100,000         50%

    OK – I notice that the Utilities only have 50% code coverage, but it’s the lowest priority, so I won’t start there. Instead, I’m going to look for testing holes in our pri 1 area (Engine). Let’s dig in again.

    Engine Drill Down   Lines of Code   Code Coverage
    Core                200,000         60%
    File Access         75,000          75%
    Protocols           50,000          53%
    Filters             75,000          56%

    Protocols are the worst (although we’ll definitely need to look at Filters as well at some point). Digging in further tells us …

    Protocols Drill Down   Lines of Code   Code Coverage
    XML Transform          20,000          80%
    Format Codes           10,000          55%
    Models                 20,000          40%

    Looks like I’m missing test cases for Models…

    Models Drill Down   Lines of Code   Code Coverage
    File1.cpp           1,000           5%
    File2.cpp           1,000           70%
    File3.cpp           2,000           70%
    File4.cpp           500             60%
    File5.cpp           500             60%

    Oh wow – I’ve barely tested the functionality in one of the files (File1.cpp). One step deeper confirms the story.

    File1.cpp Drill Down   Code Coverage
    SomeFunc               10%
    SomeOtherFunc          0%
    SomePeculiarFunc       0%
    ImportantFunc          3%

    From here, I’d look at each of the functions in this file and see what testing I’m missing that may help me hit this code. Once again, all I’ve done is use the code coverage data to guide my testing. Once I’ve looked at what I may be missing in File1.cpp, I may back out and look at Format Codes, or I may back all the way out to Filters. I still use my knowledge of risk and priority to tell me where to investigate, then use the coverage data as a tool to help guide me in the right direction.
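    As a rough sketch of the drill-down described above (not any real team’s tooling), the two selection rules can be written in a few lines of C: pick the highest-priority area at the top level, then pick the lowest-coverage child at each level below. The Area struct, the function names, and the kFeatures and kEngine arrays are all made up for illustration; the numbers are copied from the first two tables.

```c
#include <stddef.h>

/* One row of a coverage report. All names here are illustrative. */
typedef struct {
    const char *name;
    int priority;       /* 1 = most important; 0 when the level is not ranked */
    long lines;         /* lines of code in the area */
    int coverage_pct;   /* percent of lines executed by tests */
} Area;

/* Lines in this area that no test has executed. */
long UntestedLines(const Area *a)
{
    return a->lines * (100 - a->coverage_pct) / 100;
}

/* Top level: choose by risk and priority (lowest number wins). */
const Area *HighestPriority(const Area *areas, size_t count)
{
    const Area *best = NULL;
    for (size_t i = 0; i < count; i++) {
        if (best == NULL || areas[i].priority < best->priority) {
            best = &areas[i];
        }
    }
    return best;
}

/* Lower levels: choose the child with the lowest coverage. */
const Area *LowestCoverage(const Area *areas, size_t count)
{
    const Area *worst = NULL;
    for (size_t i = 0; i < count; i++) {
        if (worst == NULL || areas[i].coverage_pct < worst->coverage_pct) {
            worst = &areas[i];
        }
    }
    return worst;
}

/* Data copied from the feature area and Engine tables. */
const Area kFeatures[] = {
    { "UI Controls", 3, 100000, 60 },
    { "Engine",      1, 400000, 60 },
    { "Web Server",  2, 400000, 70 },
    { "Utilities",   4, 100000, 50 },
};

const Area kEngine[] = {
    { "Core",        0, 200000, 60 },
    { "File Access", 0,  75000, 75 },
    { "Protocols",   0,  50000, 53 },
    { "Filters",     0,  75000, 56 },
};
```

    With this data, HighestPriority picks Engine, LowestCoverage over the Engine rows picks Protocols, and UntestedLines reports 23,500 unexecuted lines there. The code only suggests where to look next; deciding whether those lines deserve tests is still a judgment call about risk.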

    An important thing to reiterate is that I’m still not concerned with improving the code coverage number – I just want to use the data to see what I’m missing. Sure, the number will go up anyway, but that’s not the point. The point is reducing risk by identifying missing test areas.

    Code Coverage with Manual Testing

    August 11th, 2010

    Here’s a tip. Want me to write about something in particular? Ask me a question. It’s that easy. Tim Coulter had a question about my last blog post. He asked:

    Would love to hear how you check code coverage of manual testing, actually. Is that just “feature coverage” (break it up into logical blocks and test those)?

    I realized that a lot of people may not be measuring code coverage during manual testing – but it’s quite valuable to do so, as measuring code coverage still helps you find missing tests.

    Let’s consider a silly command line app that does math operations. It takes three parameters: the first two are the values to operate on, and the third is the operator to execute. For example, a command line of 2 3 + would return a value of 5. Let’s test it (manually).

    c:\example>calcul8r.exe
    Usage:
    calcul8r [value] [value] [operation]
    for example:
      calcul8r 4 2 +
    
    c:\example>calcul8r.exe 2 3 +
    The result of 2 + 3 is 5
    
    c:\example>calcul8r.exe 4 2 -
    The result of 4 - 2 is 2
    
    c:\example>calcul8r.exe 5 5 *
    The result of 5 * 5 is 25
    
    c:\example>calcul8r.exe 15 3 /
    The result of 15 / 3 is 5

     

    OK – looks like it sort of works. But are there test cases we forgot? Let’s look at what code coverage tells us.

    1   int DoMath(int val1, int val2, int valOperator)
    2   {
    3         int result = 0xbaadf00d;  /* sentinel returned if no operation runs */
    4         switch (valOperator)
    5         {
    6         case '+':
    7               result = val1 + val2;
    8               break;
    9         case 'x':
    10        case '*':
    11              result = val1 * val2;
    12              break;
    13
    14        case '/':
    15        case '\\':
    16              if (val2 == 0)
    17              {
    18                    printf("Invalid input. Cannot divide by zreo");
    19              }
    20              else
    21              {
    22                    result = val1 / val2;
    23              }
    24              break;
    25
    26        case '-':
    27              result = val1 - val2;
    28              break;
    29
    30        default:
    31              printf("Unknown oprator - %d\n", valOperator);
    32              break;
    33        }
    34        return result;
    35  }

    The “meat” of this app is in the DoMath function, and the results show that I missed a few lines of code in my first round of tests. The first two lines I missed (lines 9 and 15) are a bit of a discovery. It looks like the app can take the letter x as well as an * (asterisk) for multiplication, and a backslash (\) as well as a forward slash (/) for division. The functionality is the same, but it’s interesting nonetheless.

    Oh – but I was stupid and forgot to test for divide by zero. It’s nice that there is some error handling for that, but if you look closer, you’ll see that executing this particular test would reveal a typo in the output string.

    I also forgot to test with an invalid operator – and wouldn’t you know it, there’s another typo there. What I’ve done is use the code coverage data to guide my design for new tests. This app is simple, but the idea works on a larger scale too. I’ve known testers to discover large areas of functionality they weren’t aware of before, or (more often) hidden functionality within the areas they had tested extensively.
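    The missing cases could be written down as a small set of checks. This is only a sketch: it carries its own condensed copy of DoMath (with the operator passed to the last printf so it compiles cleanly), and the NO_RESULT macro and RunCoverageGuidedTests function are names invented here.

```c
#include <stdio.h>

/* The sentinel from the listing, given a name for the checks below. */
#define NO_RESULT ((int)0xbaadf00d)

/* Condensed copy of DoMath; the typos in the messages are kept on purpose. */
int DoMath(int val1, int val2, int valOperator)
{
    int result = NO_RESULT;
    switch (valOperator)
    {
    case '+':
        result = val1 + val2;
        break;
    case 'x':
    case '*':
        result = val1 * val2;
        break;
    case '/':
    case '\\':
        if (val2 == 0)
        {
            printf("Invalid input. Cannot divide by zreo");
        }
        else
        {
            result = val1 / val2;
        }
        break;
    case '-':
        result = val1 - val2;
        break;
    default:
        printf("Unknown oprator - %d\n", valOperator);
        break;
    }
    return result;
}

/* Returns the number of failed checks; 0 means every new case passed. */
int RunCoverageGuidedTests(void)
{
    int failures = 0;
    failures += (DoMath(5, 5, 'x') != 25);         /* 'x' as multiply */
    failures += (DoMath(15, 3, '\\') != 5);        /* backslash as divide */
    failures += (DoMath(1, 0, '/') != NO_RESULT);  /* divide by zero path */
    failures += (DoMath(1, 2, '?') != NO_RESULT);  /* unknown operator path */
    return failures;
}
```

    RunCoverageGuidedTests returns 0 against this copy. Notice that these checks only look at return values, so they execute the two lines with typos without noticing either one. Spotting those still takes someone reading the output, or a test that captures what gets printed.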

    An important thing to remember is that once I’ve added those test cases (and reported those bugs), I’ll have 100% code coverage. Remember, though, that 100% code coverage doesn’t mean bug free. In fact, this app doesn’t check for integer overflow. Sure enough, check this out:

    c:\example>calcul8r.exe 2147483647 1 +
    The result of 2147483647 + 1 is -2147483648

    Oops! Remember – just because the code is “covered”, it doesn’t mean it’s tested. Please don’t forget that (and don’t let your managers forget either).
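    For what it’s worth, here is one way a guard for the addition case might look. It’s a minimal sketch, and AddWouldOverflow is a hypothetical helper, not something in the app. In C, signed integer overflow is undefined behavior, so the check has to happen before the addition rather than by inspecting the result afterwards.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 when a + b does not fit in an int. */
int AddWouldOverflow(int a, int b)
{
    if (b > 0) {
        return a > INT_MAX - b;   /* sum would exceed INT_MAX */
    }
    if (b < 0) {
        return a < INT_MIN - b;   /* sum would fall below INT_MIN */
    }
    return 0;                     /* adding zero never overflows */
}
```

    AddWouldOverflow(INT_MAX, 1) returns 1, while AddWouldOverflow(2, 3) returns 0. A boundary test like the first one runs exactly the same line of DoMath as 2 3 + does, which is why the coverage report can’t tell the two apart.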

    Thanks Tim, for prompting this post – I hope you (and others) find it helpful.

    An Approach to Code Coverage

    August 9th, 2010

    My team at Microsoft is starting to use code coverage a little more diligently. Code coverage has been used for some time on the team, but we’re just now getting to a common approach and recording mechanism. There are a few things about our approach that are a bit different from what I’ve seen most other people use, so I thought I’d share them here.

    As a quick aside, we measure code coverage using block coverage (block coverage is similar to line coverage except that it groups contiguous non-branching statements into a block for counting purposes). I’ve found that Bullseye has a very good primer on code coverage measurement for those who are interested in more than I feel like writing about today.
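    To make the idea of a block concrete, here is a toy illustration with hand-rolled counters standing in for what a coverage tool records. Everything in it (Clamp, g_blockHit, BlockCoveragePercent) is invented for the example, and real tools work on compiled code and may draw block boundaries differently.

```c
/* Hand-rolled counters standing in for a coverage tool's bookkeeping. */
enum { BLOCK_COUNT = 3 };
static int g_blockHit[BLOCK_COUNT];

int Clamp(int value, int limit)
{
    /* Block 0: straight-line statements up to the branch, counted once. */
    g_blockHit[0] = 1;
    int result = value;
    int isOver = value > limit;
    if (isOver) {
        /* Block 1: runs only when the branch is taken. */
        g_blockHit[1] = 1;
        result = limit;
    }
    /* Block 2: code after the branch. */
    g_blockHit[2] = 1;
    return result;
}

/* Percentage of blocks executed so far (integer division). */
int BlockCoveragePercent(void)
{
    int hit = 0;
    for (int i = 0; i < BLOCK_COUNT; i++) {
        hit += g_blockHit[i];
    }
    return hit * 100 / BLOCK_COUNT;
}
```

    Calling Clamp(1, 10) alone touches two of the three blocks, so BlockCoveragePercent reports 66. Adding Clamp(20, 10) takes the branch and brings it to 100.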

    A small group of us worked on the toolset, process and strategy for our team’s use of code coverage. One topic that came up was what target percentage of code coverage we should set as a goal. Years ago, when I was on the CE team, I helped build and deploy our code coverage toolset. Once our tools were working and we measured coverage, we set a goal for the next release (65% if I remember correctly). We cranked up the number a few percentage points each release, and I think we were approaching 80% by the time I left the team. As I type this, I remember something about making a promise to shave my head if we got higher than 80%. That’s only funny because about a year ago, I did shave my head (it’s since grown back).

    That history is interesting, because I had been anticipating the question (target for code coverage percentage), and I was (and am) adamant about the code coverage goal I would like our team to shoot for.

    I’m adamant that we have no goal for code coverage.

    The problem with having a target goal for code coverage is that code coverage will improve – i.e. you get what you measure. Wait a minute – shouldn’t improving code coverage be a good thing? It is (for reasons I’ll explain below), but here’s what usually happens in practice.

    Let’s say I own three features – one major feature with high customer impact, one area with low customer impact, and one that’s somewhere in the middle. Now, let’s assume that my team has a goal of 70% code coverage (measured as an average across the components I own). I measure code coverage, and here are my results.

                     Feature 1 – High Impact   Feature 2 – Medium Impact   Feature 3 – Low Impact
    Code Coverage    70%                       60%                         50%

     

    Now, if your goal is to get your average to 70% (or even if your goal is a minimum of 70%), you are going to put your testing efforts into testing medium and low impact areas. Shooting for a code coverage goal can convince testers to throw out their prior knowledge of risk and impact, and instead focus on “improving the number” – by testing stuff that probably doesn’t need more testing, while ignoring potentially important stuff. If you never measured code coverage in the first place, would you really spend time doing additional testing on the medium and low impact areas of the product (with more effort needed for the low impact area)?
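    The arithmetic behind that is simple enough to sketch. The Feature struct and the two functions below are made up for illustration; the coverage numbers come from the table above, and the 70% goal is the one from the example.

```c
#include <stddef.h>

/* Illustrative record: impact 1 = high, 2 = medium, 3 = low. */
typedef struct {
    const char *name;
    int impact;
    int coverage_pct;
} Feature;

/* Percentage points a feature falls short of a minimum coverage goal. */
int GapToGoal(const Feature *f, int goal_pct)
{
    int gap = goal_pct - f->coverage_pct;
    return gap > 0 ? gap : 0;
}

/* Index of the feature a numbers-driven plan works on hardest. */
size_t BiggestGap(const Feature *features, size_t count, int goal_pct)
{
    size_t worst = 0;
    for (size_t i = 1; i < count; i++) {
        if (GapToGoal(&features[i], goal_pct) >
            GapToGoal(&features[worst], goal_pct)) {
            worst = i;
        }
    }
    return worst;
}

/* Coverage numbers from the table above. */
const Feature kOwned[] = {
    { "Feature 1", 1, 70 },
    { "Feature 2", 2, 60 },
    { "Feature 3", 3, 50 },
};
```

    Against a 70% goal the gaps come out as 0, 10, and 20 points, and BiggestGap returns index 2, the low impact feature. The goal says nothing about impact, so the plan it produces doesn’t either.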

    It’s time to discuss the goal of measuring code coverage. It’s foolish to think that higher code coverage has anything at all to do with product quality. It only measures whether code is executed on at least one path. One of my common “Alan-isms” is this:

    The only thing that 80% code coverage tells you is that 20% of your code is completely untested

    What code coverage does do is point you to holes in your testing. The goal of measuring code coverage is helping you (as a tester) understand what is not being tested. Once you discover what’s not being tested, you can make a choice based on risk and impact on whether you need to add additional tests – or if your time is better spent elsewhere. By not having a percentage goal for code coverage, I hope the team can focus on improving tests and testing rather than improving a number.

    By the way – you can replace the words “code coverage” with “test automation” above and tell a pretty similar story.

    Another thing we’re starting to do is measure code coverage on check-ins. We’re fairly late in the cycle now, so we want to be careful of regressions and ensure that all of the code coming into the product is well tested. What we do is filter our code coverage view to only look at the changed lines of code in the change list. Say we have a binary that has 10k blocks of code, but the latest check-in only changed (or added) 25 blocks. We can filter our testing and coverage on just those 25 blocks and ensure that we’ve at a minimum executed each of them at least once during our initial testing. This helps a lot with regressions, as we can ensure that we look at error cases and other little-used paths very close to the check-in rather than waiting until those errors pop up at a much later date.

    This gives us the potential for 100% code coverage on every changed line of code.

    But I would never make that a goal :}
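    For the curious, the check-in filter boils down to intersecting two sets: the blocks that changed and the blocks that ran. The sketch below assumes coverage is available as one flag per block and the change list as an array of block indexes. The function name and the tiny sample arrays are hypothetical, and this is not a description of our actual toolset.

```c
#include <stddef.h>

/* Counts how many changed blocks were executed during testing.
   covered[i] is nonzero when block i ran; changed[] holds block indexes. */
size_t CoveredChangedBlocks(const unsigned char *covered, size_t blockCount,
                            const size_t *changed, size_t changedCount)
{
    size_t hit = 0;
    for (size_t i = 0; i < changedCount; i++) {
        /* Ignore indexes outside the binary rather than reading past the end. */
        if (changed[i] < blockCount && covered[changed[i]]) {
            hit++;
        }
    }
    return hit;
}

/* Made-up sample: an 8-block binary where a check-in touched blocks 2, 3, 6. */
const unsigned char kCovered[8] = { 1, 1, 0, 1, 0, 0, 1, 1 };
const size_t kChanged[3] = { 2, 3, 6 };
```

    Here CoveredChangedBlocks(kCovered, 8, kChanged, 3) returns 2, which says block 2 changed and never ran. That one block is the place to go look, long before anyone thinks about the other five.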

    The broken bullet anti-pattern

    August 4th, 2010

    I’ve been meaning to write about this particular anti-pattern for a while now, as I think it contributes far too much to the lack of progress in advancing software testing. Up until five minutes ago, I called this the anti-silver bullet theory, in reference to Fred Brooks’s 1986 paper, No Silver Bullet, as much as the Silver Bullet idiom in general. “Silver Bullets” refer to solutions that are extremely (or completely) effective for a given situation (like killing werewolves). In software, there are no silver bullets – there’s no practice, tool, language, approach, technique, or whatever that will solve all of your problems for you (and if some tool vendor tells you differently, don’t believe them!).

    The broken bullet is the backwards version of the silver bullet. In the broken bullet anti-pattern, people dismiss ideas, approaches, etc. just because they are not a silver bullet. I caught myself applying the broken bullet a few weeks ago in a discussion about GUI automation – I don’t like most GUI automation because it’s fragile and rarely achieves enough return to justify the investment and maintenance, but I made the mistake of dismissing GUI automation as a solution just because I knew it didn’t work everywhere, and it was easy to get wrong – even though it could have worked well in this particular situation. Fortunately I caught myself before I made too much of a fool of myself.

    Unfortunately, many others seem to embrace this anti-pattern regularly. The conversations usually go something like this:

    Tester: hey everyone, I’m checking out floober as a test approach – seems like it will help me

    Broken Bullet: don’t waste your time – floober is mostly a myth and doesn’t work unless you use your brain. Here, I wrote a paper…

    Tester: thanks for the help – I’ll go back to what I was doing before

    Broken Bullet: no problem – glad to keep you on the right track

    Sometimes it’s more proactive. I can’t go a week without seeing an article or blog post saying “Don’t do X” – “here are all the ways it can go poorly for you and ruin your product / team / company / life. Stay away and don’t even think about X”

    The problem is that X (and floober for that matter) do work (if used carefully), and may be good solutions for some teams (and likely great solutions for others) – but will likely never get the attention they deserve because of broken bullets.

    My call to action (if you care) is this: If you believe in No Silver Bullets – that there are no magic solutions to solve your software challenges – then you should also believe that there are no (ok, few) universally bad practices. Some practices are indeed much easier to get wrong, but that should only scare you – not stop you.

    And the next time you see someone dismiss something because it can fail, tell them to take their broken bullets and leave you alone.