The wrath of the mighty metric

Reasons why some software delivery teams don’t give a damn about their customers

It feels like a century ago, but once upon a time, less than a century ago, I was leading a traditional test team in an organization where 3 separate teams of Business Analysts, Developers and Testers were delivering software in an incremental, iterative, death-march style. Each of the 3 teams had its own leads and managers, and each was measured by specifically tailored metrics. My team’s efficiency was measured by the mighty DDI (Defect Detection Index), calculated as DDI = (Number of Defects detected during testing / Total number of Defects detected, including production defects) * 100. The DDI had to be greater than 90%, otherwise our team would be deemed inefficient, bonuses would be dropped and the test team itself would be branded as a bunch of losers.
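To make the arithmetic concrete: if our team logged 45 defects during testing and customers later found 5 more in production, DDI = (45 / 50) * 100 = 90%, right on the cliff edge; a single extra production defect would have pushed us below the line and made us officially inefficient.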

Yes, you guessed right: the other 2 teams were measured in a similar way; their efficiency was also based on the number of defects, the lower the better.

God, I am glad this is only in the past. Even remembering it makes me sick to my stomach. Sick like every time a production defect was detected, sick like every time a defect our team had detected was rejected, sick like every time I had to go to the triage meetings and inevitably argue with either the BA lead or the DEV lead, because the defect we had found was seen not as a possible improvement to the product but as a threat to some team’s metric. I’m not even going to describe the awful discussions that followed the acceptance of a defect as valid, when a decision had to be made on whether the defect was due to bad requirements or bad code.

The funny thing was that no matter which team came out as efficient and which as inefficient, the software delivered was the same, no change whatsoever, and the customers were constantly quite unhappy. The real value the metrics gave to the department was the ability to point fingers based on numbers. They say numbers never lie; maybe numbers don’t lie, but how many lies can we tell to fabricate numbers?

Since then many things have changed in my professional life, and today I don’t have to fight stupid battles to fabricate numbers in order to define efficiency, so I can, funnily enough, use my time more efficiently.

Why doesn’t calculating confrontational metrics work? The problem lies in the fact that we are humans: if you attach prestige and monetary value to a metric, the metric becomes the goal of the team, and the battle can begin. The test team doesn’t care how useful the delivered product is; all they care about is opening as many defects as possible so that the mighty DDI doesn’t drop under 90%, and if that means opening defects that do absolutely no harm to the customer, only to the development team and the schedule, so be it. The same logic applies to the development and BA teams, which will spend their time obfuscating their requirements and defending their code from the stupid defects opened by the test team. All this creates a climate of tension, distrust and hostility. Nobody really cares whether the customers are happy, as long as the individual teams’ metrics solemnly declare their efficiency and fingers can be rightly pointed :-(.


The funny thing is that it is very easy to resolve this problem and put the focus back on the customer. Create a cross-functional, self-organising team able to analyse, develop, test and deliver a complex software project, and judge the team on how well it satisfies the customer’s needs. The team lives as one, produces quality as one, delivers customer value as one, succeeds as one or fails as one. The goal of the team matches the goal of the company, and the failure or success of the team determines the failure or success of the company. It’s called an agile team; try it out!

How to avoid the very dangerous ALWAYS-GREEN test

When a test passes the first time it’s ever run, a developer’s reaction is “Great! Let’s move on!”. Well, this can be a dangerous practice, as I discovered one cold rainy day.

It was a cold rainy day (kind of common in Dublin). I was happy enough with my test results being all shiny green when I decided to do some exploratory testing. To my surprise, I discovered that an element on a web page that had always been there before was gone, departed, vanished!

My first reaction was to say: where the hell is it? I ran some investigation and saw the cause of it; no worries, it got knocked out by the last change, an easy fix. The worst feeling had yet to come: when I went to write a test for that scenario, I saw that there was already an existing one checking for exactly that specific element’s existence… WHAT? The damn test had passed and was staring at me in its shiny green suit!

When we write an automated test, be it a unit test, an acceptance test or any other type of test, it is extremely important that we make it FAIL at least ONCE.

In fact, until you make a test FAIL, you will never know whether the damn bastard passes because the code under test is correct or because the implementation of the test itself is wrong.

A test that never fails is worse than having no test at all, because it gives the false confidence that some code is tested and clean, while it might be completely wrong now, or on any other cold rainy day in Dublin after a refactor or a new push, and we will never know, because IT WILL NEVER FAIL.

If you don’t follow what I’m talking about, have a look at this example:

Take a web app, and say I want to verify that a field is visible in the UI at a certain stage.

What I do is build automation that performs a series of actions and, at the end, verifies whether I can see that field.

To do this I will create a method isFieldVisible() that returns true or false depending on whether the field is visible, so that I can write:

    assertTrue(isFieldVisible(myField));
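For a web app, that method might look something like this rough sketch; I’m assuming Selenium WebDriver and JUnit here, and the locator is purely illustrative:

    import org.junit.Test;
    import org.openqa.selenium.By;
    import org.openqa.selenium.WebDriver;
    import org.openqa.selenium.WebElement;
    import org.openqa.selenium.firefox.FirefoxDriver;

    import java.util.List;

    import static org.junit.Assert.assertTrue;

    public class FieldVisibilityTest {

        private final WebDriver driver = new FirefoxDriver();
        private final By myField = By.id("company-search-box"); // illustrative locator

        // True only if the element exists in the DOM and is actually displayed.
        private boolean isFieldVisible(By field) {
            List<WebElement> matches = driver.findElements(field);
            return !matches.isEmpty() && matches.get(0).isDisplayed();
        }

        @Test
        public void fieldIsVisibleAtTheRightStage() {
            // ... automation steps that bring the app to the stage under test ...
            assertTrue(isFieldVisible(myField));
        }
    }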

When this test passes I am only halfway there, because I need to demonstrate that when the field is not visible, isFieldVisible() does return false; otherwise my test might never fail.

To do this I write a temporary extra step in the automation that hides the field, and then run the same assertion again:

    assertTrue(isFieldVisible(myField));

At this point I expect the assertion to fail; if it doesn’t, it means that I just wrote a very dangerous ALWAYS-GREEN test.
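Building on the sketch above, the temporary step could simply hide the element via JavaScript before re-running the assertion (JavascriptExecutor is just one way to do it; clicking an existing “hide” control works too):

    // TEMPORARY step: forcibly hide the field, then re-run the same assertion.
    // (Needs import org.openqa.selenium.JavascriptExecutor.)
    WebElement field = driver.findElement(myField);
    ((JavascriptExecutor) driver).executeScript(
            "arguments[0].style.display = 'none';", field);

    assertTrue(isFieldVisible(myField)); // expected to FAIL now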

What if I did write a very dangerous ALWAYS-GREEN test? What do I do now?

I must change the code (the test code, not the code of the app under test) until the test FAILS. When it fails for the first time while the original test is still green, I can be sure that the test can fail, and will fail in the future after a refactor or any other change that introduces a regression, rainy day or not.

At this point you might argue that, rather than simply changing the test to make it fail and then reverting it to the original, we should write the negative test and execute it as part of the automation.

It is an interesting point, and the answer depends on the specific situation. In some cases a negative test can be as important as the original test and is necessary to cover a different path in the code, but this is not always the case, and we will have to make an informed call every time.

Example 1 – When writing a negative test makes sense:

I want to verify that when I hit the “Customer Feedback link”, my “Company Search box” can still be seen by the user.

I write the following test:
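Reusing the isFieldVisible() helper from the sketch earlier, and with illustrative locators, it could look roughly like this:

    private final By customerFeedbackLink = By.linkText("Customer Feedback"); // illustrative
    private final By companySearchBox = By.id("company-search-box");          // illustrative

    @Test
    public void searchBoxVisibleAfterCustomerFeedbackLink() {
        driver.findElement(customerFeedbackLink).click();
        assertTrue(isFieldVisible(companySearchBox));
    }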

To make it fail I will add an extra temporary step:
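Sketched with a hypothetical locator for the “Hide Search Box link”:

    private final By hideSearchBoxLink = By.linkText("Hide Search Box"); // illustrative

    @Test
    public void searchBoxVisibleAfterCustomerFeedbackLink() {
        driver.findElement(customerFeedbackLink).click();
        driver.findElement(hideSearchBoxLink).click(); // TEMPORARY extra step
        assertTrue(isFieldVisible(companySearchBox));  // expected to FAIL now
    }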

If the original test was green, then this test MUST FAIL (otherwise we have written the very dangerous ALWAYS-GREEN test).

At this point I notice that this is a valid scenario and I can write a test for it (if I don’t have one already).

The test, sketched in the same style, will be:
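    @Test
    public void searchBoxHiddenByHideSearchBoxLink() {
        driver.findElement(customerFeedbackLink).click();
        driver.findElement(hideSearchBoxLink).click();
        // Negative assertion (needs static import org.junit.Assert.assertFalse).
        assertFalse(isFieldVisible(companySearchBox));
    }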

I have positive and negative scenarios covered. The negative scenario verifies that the “Hide Search Box link” functionality works as expected.

Example 2 – When writing a negative test does not make sense:

I want to verify that after performing a search for a company and getting search results back, the value of the latest search is persisted in the search box.

I write the following test:
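Roughly, with an illustrative URL and search-button locator:

    private final By searchButton = By.id("search-button"); // illustrative

    @Test
    public void latestSearchValuePersistsInSearchBox() {
        driver.get("https://example.com/companies");                  // step 1: open the search page
        driver.findElement(companySearchBox)
              .sendKeys("Jackie Treehorn corp.");                     // step 2: type the company name
        driver.findElement(searchButton).click();                     // step 3: search, results come back
        // Needs static import org.junit.Assert.assertEquals.
        assertEquals("Jackie Treehorn corp.",
                driver.findElement(companySearchBox).getAttribute("value"));
    }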


To make it fail I remove the second and third steps:
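Which leaves only the first step and the assertion:

    @Test
    public void latestSearchValuePersistsInSearchBox() {
        driver.get("https://example.com/companies");                  // step 1: open the search page
        assertEquals("Jackie Treehorn corp.",
                driver.findElement(companySearchBox).getAttribute("value")); // expected to FAIL now
    }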


If the original test was green, then this test MUST FAIL (otherwise we have written the very dangerous ALWAYS-GREEN test).

At this point I look at the tests and realise that there is no point in adding a negative test like the one above (the one with 2 steps only): if we don’t actually type “Jackie Treehorn corp.” in the field, it is very unlikely that Jackie Treehorn, Jeff Lebowski or any other cool character will suddenly and magically appear in the search box, so I decide that a negative test is not required.

To recap:

1. When you write a test, you MUST be able to make it fail, to demonstrate that its implementation is valid, in particular if it is a cold rainy day.

2. If, while making the test fail, you realise that this represents a new valid scenario to be tested, then write the scenario and a separate test with the negative assertion; it might come in useful on a hot sunny day.