Tooth of the Weasel

Failure to Launch

Posted on May 11, 2017 by Alan Page

In my role on Teams, I was “in charge” of quality – which eventually turned into everything from the moment code was checked in until it was deployed to our end-users. At one point during development, we had a fully usable product with no known blocking issues. We were missing key features, performance was slow sometimes, and we had a few UI tweaks we knew we needed to make. In what would seem like a weird role for a tester, I pushed and pushed to release our product to more (internal) users. Those above me resisted, saying it “wasn’t ready.”

I was concerned, of course, with creating a quality product, but I was also concerned about whether we were creating the right product. I wanted to know if we were building the right thing. To paraphrase Eric Ries – you don’t get value from your engineering effort until it’s in the hands of customers. I coined the phrase “technical ‘self-satisfaction’” to describe the process where you engineer and tweak (and re-tweak) only for you or your own team. While the product did improve continuously, I still believe it would have improved faster had we released more often.

In my previous post, I talked about how it’s OK to wait for a future release to get users that next important feature. While I truly believe there’s no reason to rush, I’m absolutely not against getting customers early access to minimal features (or a minimum-minimum viable product – MMVP).

The decision on whether to release now or later isn’t a contradiction. It’s a choice (mostly) of how well you can validate the business or customer value of the feature in use – (and if possible, or necessary, remove the feature). If you have analytics in place that enable you to understand how customers are using the feature, and if that feature is valuable, it’s a lot easier to make the decision to “ship” the feature to customers. On the other hand, if you’re shipping blind – i.e. dumping new functionality on customers and counting on twitter, blog posts, and support calls to discover if customers find value in the feature, I suggest you wait. And perhaps investigate new lines of work.

One thing I consistently ask teams to do during feature design is to include how they plan to measure the value of the feature to customers or business value. Often, only a proxy metric is available, but those work way better than nothing at all. Just as BDD makes you think about feature behavior before implementation, this approach (Analysis Driven Development?) makes you think about how you’ll know if you’ve made the right thing before you start building the wrong thing.

Short story is that an analytics system that allows you to evaluate usage and other relevant data in production, along with a deployment system that allows you to quickly fix (or roll back) changes means that you can pretty much try whatever you want with customers. If you don’t have this net, you need to be very careful. There’s a fine line between the fallacy of now, and a failure to learn.

The Fallacy of Now

Posted on May 4, 2017 by Alan Page

A long time ago (for most of us), we built products over a long period of time, and hoped that customers liked what we built. It was important to give them all of the features they may need, as we wouldn’t be able to get them new features until the next release, which was usually at least a year away.

Today, we (most of us) try to understand what features our customers need and we try to please our customers. We’ve moved from we-make-it-you-take-it releases to we-listen-and-learn releases. We ship more often – most apps under development these days ship new features quarterly, monthly, or even more often.

The Joy of Features

One thing I’ve seen throughout my career is the excitement of getting a feature to customers. It’s a wonderful feeling to give a customer a feature they love. Maybe they’ll tweet about it; maybe they’ll blog; maybe news of the feature will trend on reddit! For whatever reason, getting features to customers is such an exciting task that it can sometimes overshadow stability.

A decade or more ago, this mindset made sense. If the feature doesn’t “ship” now, it will be years before it’s available. But it’s just not as important today. If your product ships monthly, and you decide a feature isn’t ready for the monthly release, it’s a maximum of two months (assuming you canceled the feature on the first day) until the feature gets to customers at the end of the second month. Yes, two months is forever. And if you have a quarterly release, six months may seem like a million years. But my bet is that if you gamble and try and shove features in early, you’ll end up with a pile of half-finished features that don’t help customers at all. If you’re really, really lucky, you won’t lose too many of them.

The challenge is a mindset problem. People are motivated by progress, and seeing your feature move through the pipe is exciting. But if you have a predictable ship schedule, it’s like missing your train. If you miss your train (in any city with reasonable public transportation), you don’t cry and freak out, because you know another train is coming along soon. If you miss getting your feature into this month’s release, you know you’ll make the next release. It’s ok to miss your train.

When I worked on MS Teams, we shipped a new web front end every week. Every seven days. Still, every week someone requested that we hold the release for one more feature. If you wait one day, they said, we can make this really cool thing happen. Every week, I said, “nope – the really cool thing can happen as scheduled next week”.

Now

Whatever you’re doing, it doesn’t need to happen now. For 99.9% of the features you’re working on, your customers don’t need it now. They may ask for it, or you may tell them about it and excite them, but 99.9% of the time, they can wait. I know at least some of you are disagreeing, but I’m going to be a dick and just tell you that you’re wrong.

The point is that I believe customers (you remember that we make software to help people, right?) want to trust our software. Sure, they want new ways to solve their problems and new functionality that makes their experience awesome, but they want it to work. We certainly don’t need to make perfect software. However, we need to weigh whether the positive value of the new feature minus the distraction of unreliability or other flaws still results in a net positive for our customers.

It’s a challenge to get this balance right. It’s something I enjoy doing from my role (whether it’s QA, release management, tester, or whatever suits the situation). It’s hard, and it’s sticky, and it’s a big systems problem.

And that’s why I like it so much.

Two new…schools?

Posted on April 20, 2017 by Alan Page

It’s been six years since James Whittaker proclaimed that test was dead (and many people still haven’t figured out what he really meant), and since then testing has continued to change dramatically.

For some of us.

But not for others.

Earlier this week, I linked to an article titled “Stop Hiring Testers”. The title, as you can imagine, scared the pants off of a bunch of afraid-for-their-job testers, but the content embraced the change in testing that many of us have been riding for a while now. Every time I see a person, an organization, or a company realize that they’ve left their old world of testing behind, I see an article like this, along with the accompanying cries of those who don’t want to admit that change is happening.

One thing that strikes me is that every time someone makes a statement like “test is dead”, or “stop hiring testers”, a lot of testers scream; but some inevitably say, “of course – this isn’t anything new to me”.

But that’s expected, I recognize that there are different worlds of testing out there. For lack of a better word, I see two different schools (which sucks given the existing connotations with this word, but is applicable considering that I could see testers self-selecting for these schools) of test prominent in discussions about testing on the internet.

The first, and likely the larger, is the “traditional” test-as-information-provider (and often, test-last) school. This is how most (I have no metrics to back this up) testing is probably done. Most testing books, as well as the Wikipedia page on software testing, describe many aspects of this approach. Testing in this school is typically done by an independent test team, and involves a wide variety of analysis into the state of the software in order to provide information about the product for stakeholders. Admittedly, I’m quite familiar with this school, as I documented Microsoft’s version of this approach in How We Test Software at Microsoft. Attention to requirements and risk assessment are often key parts of this approach. I’d guess that a test phase of software development is prominent among practitioners of this school regardless of whether the team considers themselves “Agile” or not.

This school isn’t wrong by any means, but I believe it’s fading away. For what it’s worth (using replies to the Stop Hiring Testers post mentioned above as data), people in this school generally don’t like the idea that it’s going away.

Plus ca change…

What I’m seeing more and more of, is the test-always, or test-as-quality-accelerant – or maybe, with a nod to Atlassian, the quality-assistance school. In more and more teams, testers are taking on the role of quality assistant, quality coach, or quality experts. Rather than existing in a separate team, they are members of feature teams and take on the responsibility for ensuring that the entire team embraces a quality culture. Yes, they absolutely bring testing expertise to their product or feature area, and they test the product more than anyone else on the team; but they also help other developers write and improve their own testing and think a lot about the customer experience. Good developers are extremely capable of writing tests (including service monitoring and diagnostic tools) for their own code, but testing experts ensure that the end-to-end experience and other big picture items (e.g. load or perf) are covered as part of the daily work.

Teams with testers like these typically do not have a release phase since they are (typically) releasing often or always.

It’s worth mentioning that some testers would say that when you do many of these activities mentioned in the previous paragraph, you are no longer testing. I agree with this sentiment – and believe it’s part of the evolution of the role. As test teams evolve towards the quality-accelerant school, test “experts” will take on several roles beyond testing. You could say that testers in this school are getting themselves into the quality assurance business.

Definitely some food for thought here, and something I expect to ponder (and write) about more.

Why Unity?

Posted on March 24, 2017 by Alan Page

A lot of people, both co-workers and not, have asked me why I “chose” Unity for my post-Microsoft career. Although I documented (sort of) why I broke up with Microsoft, and talked in a few other places about my role, I guess I haven’t publicly shared why I’m at Unity (vs any other tech company).

I had musings about leaving Microsoft for quite a while. In fact, I composed a version of my breakup blog post in my head as far back as 2010 when I almost took a role at another large tech company. It may be worth sharing that I almost accepted roles at several other companies over the years, and halfway pursued several others as well.

So, why Unity and why not any of those other companies?

The short answer is that I’m picky, and that Unity is my unicorn.

I wanted a role that was both challenging and where I could leverage my experience. I wanted to work in services (or, at the very least, in an Agile environment). I didn’t want to move, and I didn’t want to travel too much. I wanted to work on a product with avid and vocal customers.

Then along came Unity – a second time, actually, as I first talked with my current manager about Unity over a year ago. The role at the time wasn’t quite right for my strict requirements, but when we restarted our conversation a few months ago, Unity felt a lot more like a good fit. I met with a few people (all people I now work with almost every day). I liked them, and they liked me. After a few short conversations with HR to get the financial side in order, and after a few longer conversations with my family about the move, I accepted the offer.

As soon as I made the decision, I knew it was the right one. I’ve had zero regrets, and every day I feel better about the move. So much to do, but so excited to do it.

That’s why.

Testing Smarter

Posted on March 17, 2017 by Alan Page

The folks at Hexawise just published an interview with me as part of their new “Testing Smarter” series. It’s the typical stuff plus one mini-rant.

The interview is here, and there’s a(n empty) reddit thread as well.

Forty days in

Posted on March 10, 2017 by Alan Page

Calendar math says I’m a few hours into the 39th day since I started at Unity. I’m still well within the honeymoon window, but my optimism and excitement about working here continue to grow. It’s just a pretty damn cool place to work.

I’ve met most of my team face to face, and will spend more time with team members in Copenhagen and Helsinki next week. I still have half a dozen people in Austin and Montreal whom I will track down and meet in person sometime before I hit my hundredth day. I’ve spent a lot of time learning about Unity (I made a simple game), and a bigger chunk of time learning how Unity services work – and more importantly how they’re built, deployed, and tested.

I’ve also spent a lot of time thinking about my role – or in general, the role of a manager of a team of “embedded” testers. Organizationally, I’m the Quality Director for all Unity services – but everyone on my team is an integrated member of a feature team (for the record, one minor complaint I have with the phrase “embedded tester” is that it can sound like a foreign body inserted into a functioning team rather than a test and quality specializing generalist who is an equal member of a feature team). I’ve embraced the words of Steve Denning in The Leader’s Guide to Radical Management, and provide a framework for the team – and then get out of their way (quote likely plagiarized, but I can’t recommend Denning’s work enough, so take the time to read it yourself).

In a search to find the exact quote (I have a hard copy of the book…but at home…), I wasn’t surprised to see that I’ve heaped praise on Denning before and expanded on the phrase above.

Give your organization a framework they understand, and then get out of their way. Give them some guidelines and expectations, but then let them work. Check in when you need to, but get out of the way. Your job in 21st century management is to coach, mentor, and orchestrate the team for maximum efficiency – not to monitor them continuously or create meaningless work. This is a tough change for a lot of managers – but it’s necessary – both for the success of the workers and for the sanity of managers. Engineering teams need the flexibility (and encouragement) to self-organize when needed, innovate as necessary, and be free from micro-management.

Given that my team is globally distributed, works on a large number of feature areas, and is highly skilled and motivated, any approach within five hundred miles of anything resembling micro-management would be silly. My role will include helping the team balance priorities; facilitating learning, collaboration, and community; coaching; communication; and, of course, a bit of management administrivia (budgets, approvals, planning, etc.).

I’ve been taking some notes on some of the biggest differences I’ve noticed (culture, practice, tools) between my job for the past 40 days, and my job for the previous ~8000 days. Someday soon, I’ll dedicate an entire post to these observations.

While talking with a dev lead earlier this week, I told him I felt like I was on the edge of almost being briefly effective. I’ll try to keep heading in that direction and post the victories (and setbacks) here as they happen.

Oh the tests I’ll run

Posted on February 27, 2017 by Alan Page

Last week, Katrina Clokie (@katrina_tester) asked this question on twitter:

Has anyone dynamically ordered automated checks so that those most likely to fail are executed first, then the build can fail fast?

— Katrina Clokie (@katrina_tester) February 24, 2017

I gave a few abbreviated answers based on my experience, and promised to write up a bit more, as this is something I’ve played with quite a bit before. I sort of meant to copy and paste an email I sent to an internal team (at msft) a few months back, but alas – I don’t have access to that email anymore :}

Why select tests?

A lot of people will ask, “If I can run every test we ever wrote in five minutes, why does this matter?” If this is the case, of course it doesn’t matter. I’m all for parallelization and leveraging things like selenium grid to run a massive number of tests at the same time; but it’s not always possible. On the Xbox team, for example, we had a limited (although large) number of systems we could use for testing, so test selection / prioritization / ordering was something we had to do.

Basics of Test Selection

OK – so you’ve decided you need to select the tests most likely to fail to run first. Easy(ish) – just run the tests that exercise all the code that changed!

This is quite a bit easier than it sounds – that is if you’re already using code coverage tools. Now is a good time to remind you that code coverage is a wonderful tool, but a horrible metric. Test selection benefits from the former part of that statement. For every automated test in your suite, periodically collect the coverage information just for that test, and save it somewhere (I suggest a database, but you can shove it in json or excel if you feel the need). Now, you know exactly which lines, functions, or blocks (depending on your coverage tool) are hit by each test.

The slightly harder part may be to figure out which lines of code have changed (or have been added, or removed) since the last time the tests were run (which may be the last check-in, the last day, or longer). I’ll leave it as an exercise to map source control information to the same database / json / excel as mentioned above, but once you have this mapping, test selection is just picking the tests that hit changed lines.

But there are a lot of caveats with this approach. If you’re changing a low level component, every test will hit it. As an example, I used this approach many years ago on a test system that communicated with external devices over winsock. Every change to winsock told us that we needed to run every test. While probably a correct approach, it didn’t really help with prioritization. You’ll also find that often enough, there aren’t any tests to cover the changed code – and I’ll let you figure out what to do when you have tests that hit code that was removed (hint: maybe run the test once anyway to make sure it fails).
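
For what it’s worth, here’s a minimal sketch of the selection step in Python. It assumes you’ve already collected per-test coverage and a set of changed lines; the data layout, the function names (select_tests, uncovered_changes), and the sample file names are all made up for illustration and aren’t tied to any particular coverage tool or source control system.

```python
from typing import Dict, List, Set, Tuple

Line = Tuple[str, int]  # (source file, line number); hypothetical layout


def select_tests(coverage: Dict[str, Set[Line]], changed: Set[Line]) -> List[str]:
    """Return tests that hit at least one changed line, most hits first."""
    hits = {test: len(lines & changed) for test, lines in coverage.items()}
    selected = [test for test, count in hits.items() if count > 0]
    # Sort by descending hit count; break ties by name for a stable order.
    return sorted(selected, key=lambda test: (-hits[test], test))


def uncovered_changes(coverage: Dict[str, Set[Line]], changed: Set[Line]) -> Set[Line]:
    """Return changed lines that no test in the map touches."""
    covered: Set[Line] = set().union(*coverage.values())
    return changed - covered


# Made-up sample data: per-test coverage and the lines changed since the last run.
coverage = {
    "test_login": {("auth.py", 10), ("auth.py", 11), ("net.py", 5)},
    "test_logout": {("auth.py", 20), ("net.py", 5)},
    "test_report": {("report.py", 3)},
}
changed = {("auth.py", 10), ("auth.py", 11), ("net.py", 5), ("billing.py", 7)}

print(select_tests(coverage, changed))
print(uncovered_changes(coverage, changed))
```

The second function covers one of the caveats above: it lists the changed lines that no test touches, which is worth knowing before you trust the selection. Note that a change to a line every test hits (like net.py here, if all three tests covered it) would select everything, which is the winsock problem all over again.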

Heuristics

What I’ve found, is that coverage is a good start – and may be enough for most teams (among teams who can’t run all of their tests quickly on every build). But adding some other selection factors (or heuristics) and applying some weights can take you a bit farther.

Some heuristics I’ve used in the past for test prioritization / selection include:

Has the test found a bug before? Some tests are good at finding product bugs. I give these tests more weight.
When was the last time the test ran? If a test has run every day for a year and never failed, I don’t give it much weight. We testers are always paranoid that the moment we choose not to run a test, a regression will appear. This weighted heuristic helps combat the conundrum of running the test that never fails vs. fear of missing the regression.
How flaky is the test? If you never have flaky tests, skip this one. For everyone else, it makes sense to run tests that return false positives less often (or at the end of the test pass).
How long does the test take? I put more weight on tests that run faster.

Then I give each of these a weight. You can give each a whole number (e.g. 1-5, or 1-10), a decimal value, or whatever. Then, do some math to turn the full set of weights into a value, and then sort tests by value. Voila – my tests are prioritized. As you run the tests and learn more, you can tweak the numbers.

You can add more test meta-data as needed, but the above is a minimum. For example, with just the above, you could run something like:

run the most important tests that will complete in under 15 minutes

Using whatever command line arguments would support the statement above, you can limit the test run based on test time (and optionally add even more weight to tests that run quickly).
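
Here’s a rough Python sketch of the weighting and the “under 15 minutes” selection. Everything in it is an assumption for illustration: the SuiteEntry fields, the weight values, the caps, and the scoring formula are placeholders you’d tune for your own suite, and the budget fill is a simple greedy pass rather than anything clever.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SuiteEntry:
    name: str
    bugs_found: int      # product bugs this test has caught before
    days_since_run: int  # days since the test last ran
    flaky_rate: float    # fraction of runs that were false failures (0.0 to 1.0)
    duration_s: float    # typical run time in seconds


# Hypothetical weights; tweak them as you run the tests and learn more.
WEIGHTS = {"bugs": 3.0, "stale": 1.0, "flaky": 2.0, "speed": 1.0}


def score(entry: SuiteEntry) -> float:
    """Combine the four heuristics into one value; higher runs earlier."""
    return (
        WEIGHTS["bugs"] * min(entry.bugs_found, 5)               # proven bug finders
        + WEIGHTS["stale"] * min(entry.days_since_run, 30) / 30  # not run lately
        - WEIGHTS["flaky"] * entry.flaky_rate                    # penalize flakiness
        + WEIGHTS["speed"] / (1 + entry.duration_s / 60)         # favor fast tests
    )


def select_within_budget(entries: List[SuiteEntry], budget_s: float) -> List[SuiteEntry]:
    """Walk tests from highest score down, keeping each one that still fits."""
    chosen: List[SuiteEntry] = []
    used = 0.0
    for entry in sorted(entries, key=score, reverse=True):
        if used + entry.duration_s <= budget_s:
            chosen.append(entry)
            used += entry.duration_s
    return chosen


# Made-up suite data.
suite = [
    SuiteEntry("checkout_e2e", bugs_found=4, days_since_run=1, flaky_rate=0.0, duration_s=300),
    SuiteEntry("login_smoke", bugs_found=1, days_since_run=0, flaky_rate=0.0, duration_s=30),
    SuiteEntry("old_stable", bugs_found=0, days_since_run=30, flaky_rate=0.0, duration_s=60),
    SuiteEntry("flaky_upload", bugs_found=2, days_since_run=2, flaky_rate=0.5, duration_s=700),
]

# "Run the most important tests that will complete in under 15 minutes."
for entry in select_within_budget(suite, budget_s=15 * 60):
    print(entry.name, round(score(entry), 2))
```

In this sample, flaky_upload scores second highest but doesn’t fit in the time left after checkout_e2e, so the greedy pass skips it and picks up the two shorter tests instead. That’s a design choice; you could just as easily stop at the first test that doesn’t fit.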

Probably a lot more nuance here, but the concept of test selection is probably something any tester working with automation should know a bit about.

Just in case.

The New World

Posted on February 8, 2017 by Alan Page

I mentioned on twitter that Barack Obama and I both left our old jobs on the same day. The world has been a very different place for both of us since then.

Twitter is full of politics – and I’m completely ok with that, and happy to join in with my own opinions and thoughts. I’ve marched, I’ve protested, and I’ve done a lot of learning (and re-learning) about civics, politics, and done what I can to become as educated as possible.

Meanwhile, I have a new job, and a lot to learn about a new company, code, practices, people, and culture. I absolutely love it so far, and I know I’m only scratching the surface of what there is to learn. I’ve asked a lot of questions (even a few that actually seemed smart), but I’m still at the stage where everything I discover unveils three more things I need to learn.

In eight days at the company, I’ve spent two days in San Francisco (where we’re hiring testers for my team in Ads and Analytics – ping me for more info), and two days in San Diego (where I crashed a hack week to meet more people from my team). Combine that with the work-at-home-snow-day that was Monday, and tomorrow will be just my fourth day at my desk (but I think I’m in a non-traveling state for at least another week).

My new team is spread across offices in Bellevue, Austin, Helsinki, Odessa, Copenhagen, and Montreal. I’ve been to half of those cities before, and look forward to visiting folks in the others (odds are that Helsinki is up next).

Too early to share Unity testing stories, but I’m sure those (or stories inspired by Unity) will be on the blog soon. Until then, wish me luck on trying to get a reasonable drink of water from the current firehose of learning.

Unity

Posted on February 5, 2017 by Alan Page

For those of you who missed it on Twitter or in the AB Testing Podcast, my unemployment lasted (as planned) just a bit over a week. On January 30, I started work at Unity, heading up quality for their various services.

I spent most of last week trying to learn as much as I can and trying to meet and get to know as many people as I can. I expect the next few weeks will be similar.

I’m massively excited to join Unity, and know I’m going to learn a lot. It’s different than the big M, but so refreshing in so many ways.

More updates here, as they happen.

The Breakup

Posted on January 19, 2017 (updated January 27, 2021) by Alan Page

This one goes out to the one I love. This one goes out to the one I left behind
-REM

Relationships are both challenging and rewarding. As adults, we’ve all gone through dozens of relationships – some short, some long, some very long. They all have their ups and downs; their ebbs and flows; and their joy and pain. This is an open letter reflecting on the longest relationship I’ve ever had.

It’s been over 20 years (22, actually) since we first crossed paths. In the early days, I was so excited to be with you. I knew of you, of course, long before our paths crossed – but couldn’t believe we were together – you were completely out of my league. There was no reason someone like you and someone like me should have been together. But it worked for both of us. I think it worked better than either of us thought it would. For many reasons, you brought out the best in me. In some ways, you still do.

We both grew a lot in those early years. I gave you everything I had, and you were there to support me and grow with me. I was proud and excited to be your partner.

As with most relationships, there were challenges from time to time. I was mad at you sometimes – other times even livid. Sometimes I didn’t respect your decisions – sometimes I didn’t feel like you respected mine – but as we grew older together, we both began to change and grow. I guess that’s normal in any relationship.

The problem is that we’ve been growing in different directions for many years now. I give you full credit for the effort you’ve put into changing and growing these past few years, but for me, it’s just not fast enough. I know you want to change, but I feel like you just have too much baggage from your history to ever change enough where I can be happy with you. I want to race through life like a cheetah – and while you’re no longer the tortoise you were for much of our relationship, in your best moments, you’re still just a bear lumbering through the forest.

I need more.

What’s happening now isn’t your fault – and it’s not mine either. We grew in different directions, and now it’s just too much for us to recover. At least it’s too much for me.

I’ve been thinking of leaving for at least five years. It’s been easy to stay – you’re comfortable, and I feel safe with you.

But I’m not happy. I need more – I need to be challenged more; I need to grow more; I need to be around those who won’t be afraid to take risks and fail with me as I learn.

I took some time over the last year or so to try even harder to make-it-work. In the end, I just don’t feel like I’m getting back from you what I put into this relationship. I don’t feel like you truly care about me and value the things I bring to the table.

It’s just not working.

Could I have tried harder – of course. Would trying differently have made a difference – likely… but in the end, there are just too many incompatibilities for me to get over. It’s sad and scary, but I need to move on. I need to give myself a chance to have a life without you and see if I’m up for the challenge.

I’ll miss you. I appreciate everything you’ve done for me, but our time is over. It’s time for me to move on and see what I can do without you.

My last day at Microsoft – the company where I’ve spent the last 22 years, is Friday, January 20.