{"id":34,"date":"2009-11-15T16:47:31","date_gmt":"2009-11-15T23:47:31","guid":{"rendered":"http:\/\/angryweasel.com\/blog\/?p=34"},"modified":"2009-11-16T18:40:59","modified_gmt":"2009-11-17T02:40:59","slug":"conflicting-results","status":"publish","type":"post","link":"https:\/\/angryweasel.com\/blog\/conflicting-results\/","title":{"rendered":"Conflicting Results"},"content":{"rendered":"<p>I\u2019m a huge soccer fan, and I\u2019m happily following the MLS Cup even though the local team was eliminated last week. Last night\u2019s match between Real Salt Lake (RSL) and the Chicago Fire went to penalty kicks before one team finally prevailed. After the game ended, I went to mlsnet.com to watch the highlights and check out some of the stats. When I got there, the front page had this headline and teaser:<\/p>\n<p><a href=\"http:\/\/angryweasel.com\/blog\/wp-content\/uploads\/2009\/11\/image.png\"><img loading=\"lazy\" decoding=\"async\" style=\"border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px\" title=\"image\" src=\"http:\/\/angryweasel.com\/blog\/wp-content\/uploads\/2009\/11\/image_thumb.png\" border=\"0\" alt=\"image\" width=\"644\" height=\"89\" \/><\/a><\/p>\n<p>Quick \u2013 which team won? Did the Fire edge Real Salt Lake, or did RSL outlast the Fire?<\/p>\n<p>If you read a bit more, you\u2019ll see that \u201cRSL will face the Galaxy in the 2009 MLS Cup\u201d, so if you go with majority rules you\u2019ll be correct, since RSL did indeed edge the Fire last night. Headline errors aren\u2019t all that uncommon (e.g. <a href=\"http:\/\/en.wikipedia.org\/wiki\/Dewey_Defeats_Truman\">Dewey Defeats Truman<\/a>), so I don\u2019t fault the news site at all. 
Unfortunately, a very close relative of this kind of error, the <a href=\"http:\/\/en.wikipedia.org\/wiki\/False_positive#Type_I_error\">false positive<\/a>, has been bugging the crap out of me lately, and this headline reminded me that it\u2019s past time to share my thoughts.<\/p>\n<p>Let\u2019s say you have 10,000 automated tests (or checks for those of you who speak Boltonese). We had a million or so on a medium-sized project I was involved with once, so 10k seems like a fair enough sample size for this example. For the purpose of this example, let\u2019s say that 98% of the tests are currently passing, and 2% (or 200 tests) are failing. This, of course, doesn\u2019t mean you have 200 product bugs. Chances are that many of these failures are caused by the same product bug (and hopefully you have a way of discovering this automatically, because investigating even 200 failures manually is about as exciting as picking lint off of astroturf). Buried in those 200 failures are false positives \u2013 tests that fail due to bugs in the test rather than bugs in the product. I\u2019ll be nice and say that 5% of the failures are false positives (you\u2019re welcome to do your own math on this one). Now we\u2019re down to <strong>10 failures<\/strong> <strong>that aren\u2019t really failures.<\/strong> You may be thinking that\u2019s not too big of a deal \u2013 it\u2019s only 0.1% of the total tests, and looking at 10 tests a bit closer to see what\u2019s going on is definitely worth the overall sacrifice in test code quality. Testers in this situation either just ignore these test results or quickly patch them without too much further thought.<\/p>\n<p>This worries me to no end. 
If 5% of your failing tests aren\u2019t really failing, <strong><em>I think it\u2019s fair to say that 5% of your passing tests aren\u2019t really passing.<\/em><\/strong>\u00a0 I doubt that you (or the rest of the testers on your team) are capable of <em>only<\/em> making mistakes in the failing tests \u2013 you have crappy test code everywhere. A minute ago, you may have been OK with only 10 false positives out of 10k tests, but I also think that <strong><em>490 of your \u201cpassing\u201d tests are doing so even though they should be failing.<\/em><\/strong> Now feel free to add zeroes if you have more automated tests. I also challenge you to <em>examine all 9800 tests to see which 490 are the \u201cbroken\u201d tests.<\/em><\/p>\n<p>Yet we (testers) continue to write fragile automation. I\u2019ve heard quotes like, \u201cIt\u2019s not product code, why should it be good\u201d, or \u201cWe don\u2019t have time to write good tests\u201d, or \u201cWe don\u2019t <em>ship<\/em> tests, we can\u2019t make it as high quality as shipping code\u201d. So, we deal with false positives, ignore the inverse problem, and bury our heads in the sand rather than write quality tests in the first place.\u00a0 In my opinion, it\u2019s beyond idiotic \u2013 we\u2019re wasting time, we\u2019re wasting money, and we\u2019re breeding the wrong habits in every tester who thinks of writing automation.<\/p>\n<p>But I remain curious. Are my observations consistent with what you see? Please convince me that I shouldn\u2019t be as worried (and angry) as I am about this.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I\u2019m a huge soccer fan, and I\u2019m happily following the MLS Cup even though the local team was eliminated last week. Last night\u2019s match between Real Salt Lake (RSL) and the Chicago Fire went to penalty kicks before one team finally prevailed. 
After the game ended, I went to mlsnet.com to watch the highlights and&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_kad_post_transparent":"","_kad_post_title":"","_kad_post_layout":"","_kad_post_sidebar_id":"","_kad_post_content_style":"","_kad_post_vertical_padding":"","_kad_post_feature":"","_kad_post_feature_position":"","_kad_post_header":false,"_kad_post_footer":false,"_kad_post_classname":"","_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[1],"tags":[],"class_list":["post-34","post","type-post","status-publish","format-standard","hentry","category-allposts"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_likes_enabled":true,"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/angryweasel.com\/blog\/wp-json\/wp\/v2\/posts\/34","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/angryweasel.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/angryweasel.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/angryweasel.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/angryweasel.com\/blog\/wp-json\/wp\/v2\/comments?post=34"}],"version-history":[{"count":0,"href":"https:\/\/angryweasel.com\/blog\/wp-json\/wp\/v2\/posts\/34\/revisions"}],"wp:attachment":[{"href":"https:\/\/angryweasel.com\/blog\/wp-json\/wp\/v2\/media?parent=
34"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/angryweasel.com\/blog\/wp-json\/wp\/v2\/categories?post=34"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/angryweasel.com\/blog\/wp-json\/wp\/v2\/tags?post=34"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}