9 steps to effective metrics

My doctor is pretty good, so I wasn’t surprised when she noticed I was fat. She backed up her observation with a metric: my weight. She also noted that the trend was not good. (Thank you, Google, for all those brownies.) When you consider blood pressure, pulse, oxygen level, cell counts, etc., maintaining human health is all about the numbers.

Not so with software development, where demonstrably useful numbers are rarely seen in normal practice. Here are nine tips for using metrics to improve process health:

1. Don’t measure what you won’t use

Metrics are expensive and tedious to gather. Unless they’ll drive a decision, don’t collect them. 

2. Embrace the limitations of your numbers

One of my hackers challenged code coverage as an inaccurate measure of test effectiveness. He was correct, but it was irrelevant. The role of a metric is to reduce uncertainty, not eliminate it. Life is not a math quiz where only perfect answers count.

3. Your metrics will be wrong; you need to know how

Niederman and Boyum[1] note:

  • There is always more than one way to measure something.
  • Measurements are error prone.
  • Even when dead-on, measurements are often just an approximation for what you really want to know.

That last point is critical. When we use weight to track our health, all of these measurement problems apply. Weight is merely a proxy for fitness, not a direct measure of it. And fitness may itself be a proxy for another goal:

Lester Burnham: I figured you guys might be able to give me some pointers. I need to shape up. Fast. 

Jim Olmeyer: Are you just looking to lose weight, or do you want increased strength and flexibility as well? 

Lester Burnham: I want to look good naked! [2]

An objective measurement for looking good naked is beyond the scope of this post, but be aware of the difference between what you want to know and what you can measure.

4. Simplicity trumps theory

In theory, measuring calories consumed and burned has several advantages over just weighing yourself:

  1. It tells you in advance if you’ll lose weight (it’s a leading indicator; weight is a lagging indicator).
  2. Both variables are under your direct control.
  3. It avoids confounding issues like the impact of added muscle on your weight.
  4. It doesn’t depend on inaccurate home scales.

But caloric intake and use are difficult to measure with any accuracy. Weighing yourself is flawed, but easy and ‘good enough’. Rough, frequent measures are often better than theoretically correct ones.

5. Metrics without a model teach us nothing

A few years ago I was serious about losing weight. I didn’t count calories, but ate less and exercised more. I recorded my exercise. I weighed myself regularly on the more accurate gym scale. I charted my weight on a timeline: If progress stalled, I exercised more. 
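As a rough illustration of that feedback loop, here is a minimal Python sketch that fits a trend line to weekly weigh-ins and flags a stall. The function names, the three-week window, the half-pound threshold, and the sample weights are all assumptions made up for this example; they are not the actual chart or data.

```python
from statistics import mean


def weekly_trend(weights):
    """Least-squares slope of weight against week index, in pounds per week."""
    n = len(weights)
    if n < 2:
        raise ValueError("need at least two weigh-ins")
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(weights)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, weights))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


def has_stalled(weights, window=3, threshold=-0.5):
    """True if the trend over the last `window` weigh-ins is flatter than
    `threshold` pounds per week (both defaults are arbitrary assumptions)."""
    return weekly_trend(weights[-window:]) > threshold


if __name__ == "__main__":
    # Hypothetical weekly weigh-ins, in pounds.
    weigh_ins = [230, 227, 224, 221, 221, 221]
    print(f"overall trend: {weekly_trend(weigh_ins):.1f} lb/week")
    if has_stalled(weigh_ins):
        print("progress has stalled: exercise more")
```

The point is not the arithmetic. A crude slope over a few noisy readings is enough to trigger a decision.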

I lost about three pounds per week for four months. Measurements were no problem. They were flawed, but useful because I knew how they were flawed and had a good model of the underlying process:

  • Food provides energy.  What isn’t needed is converted to fat.
  • Exercise uses energy and reduces fat if calories used exceed calories taken in.
  • Exercise increases muscle mass, which increases weight.
  • Increasing muscle mass makes calories burn faster, but that effect lags.

And so on. This is neither complete nor fully accurate, but it worked. With a good model, flawed metrics can be effective. Without one, perfect measurements won’t help. Your model of software development must be clear. This is a corollary to Deming’s rule that experience without theory teaches you nothing.

6. Respect the difference between critical variables and indicator variables

Critical variables interact with the largest number of other variables. Control those and you can exert great influence on the system.

Indicator variables depend on other variables, but have little impact. Your weight and a project’s schedule are indicator variables. Trying to manage them directly is like breaking the glass on your dashboard and moving the dials with your fingers. If you want results you have to get under the hood. Good models include both indicator and critical variables, and know which is which.

7. Put raw numbers in context

Joan Magretta[3] uses weight loss to convey some additional insights:

“If we learn that Tyler weighs 145 pounds we know something objective, but it isn’t, to use the managerial term, ‘actionable.’ If we learn next that Tyler is a six-foot-tall man the data begins to tell one story. If Tyler is a five-foot-tall woman, it’s quite another story. Now add one more piece of context. Suppose we know that three months ago, Tyler weighed over 200 pounds. That gives us not just a story, but a call for urgent intervention.”

She makes data meaningful by taking a measure and putting it in context. First, she converts it to a ratio by comparing it to height. (This is what the Body Mass Index (BMI) does.) Then she puts it in a historical context by stating what it was three months ago.

The BMI ratio provides more information than weight alone, but ratios can do more than that. Height doesn’t change, so over time a chart of BMI and a chart of weight will look similar. When both numbers in a ratio change over time, a chart of the ratio reveals information hidden in the raw numbers.
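To make the Tyler example concrete, here is a small Python sketch that puts the same 145 pounds into both kinds of context. It uses the standard imperial BMI formula (703 × pounds ÷ inches squared). The helper names are hypothetical, and the starting weight of exactly 200 pounds is an assumption, since the quote only says "over 200".

```python
def bmi(weight_lb: float, height_in: float) -> float:
    """Body Mass Index from pounds and inches (703 converts to kg/m^2 units)."""
    return 703 * weight_lb / height_in ** 2


def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, as a percentage."""
    return 100 * (after - before) / before


if __name__ == "__main__":
    weight = 145  # pounds: the raw, context-free number

    # Context 1: a ratio against height.
    print(f"six-foot Tyler:  BMI {bmi(weight, 72):.1f}")   # BMI 19.7
    print(f"five-foot Tyler: BMI {bmi(weight, 60):.1f}")   # BMI 28.3

    # Context 2: history. 200 lb is an assumed floor for "over 200 pounds".
    print(f"change in three months: {percent_change(200, weight):.1f}%")  # -27.5%
```

The raw number never changes, but each added piece of context changes what it means.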

As Tufte writes: “Nearly all the interesting worlds we seek to understand are multivariate in nature.”[4] Three of his six principles of analytic design (comparison, multivariate, and integration of evidence) are based on visually relating and combining metrics to produce new information. This is particularly true in complex systems like software projects, which are characterized by many causally interrelated variables and dominated by feedback loops.

8. Combine metrics for greater insight

Don’t calculate lots of unrelated metrics. Pareto tells us only a few really matter, so use additional metrics to bolster the key ones. Current Open Bug Count says more about quality when backed by test coverage. Current Total Work Remaining is misleading without the percentage of tasks yet to be estimated. In both cases, the second metric tells us about the accuracy of the first. 
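One way to keep a supporting metric attached to its key metric is to report them as a pair. The Python sketch below does this for work remaining; the class, its field names, and the sample figures are hypothetical and not taken from any particular tracking tool.

```python
from dataclasses import dataclass


@dataclass
class WorkRemaining:
    """Total Work Remaining, paired with a measure of how far to trust it."""
    estimated_hours: float   # sum of estimates for tasks that have one
    tasks_remaining: int     # all open tasks
    tasks_unestimated: int   # open tasks with no estimate yet

    @property
    def unestimated_pct(self) -> float:
        """Share of open tasks that are missing from the hours total."""
        if self.tasks_remaining == 0:
            return 0.0
        return 100 * self.tasks_unestimated / self.tasks_remaining

    def report(self) -> str:
        return (f"{self.estimated_hours:.0f} hours remaining "
                f"({self.unestimated_pct:.0f}% of tasks not yet estimated)")


if __name__ == "__main__":
    # Hypothetical snapshot of a project.
    print(WorkRemaining(estimated_hours=120, tasks_remaining=40,
                        tasks_unestimated=10).report())
```

Reporting "120 hours" alone invites false confidence. Reporting it next to the unestimated share tells the reader how much the first number is likely to grow.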

9. Use your program to help the team improve

Once you have everything set up, you can begin the hard and creative work of helping your team improve, but as Magretta cautions: “don’t lose sight of the underlying human behavior.” Trends tell us how things change, but not why, and not what to do about it. Using your program to reward or, especially, punish individuals is a sure-fire way to encourage cheating or counterproductive behavior.

Conclusion

As my doctor will attest, I regained all the weight I once lost. No charts illustrated the decline. “Numbers are essential to organizational performance,” writes Magretta. “Doing the numbers begins with the simple act of measurement. If you want to know, objectively, how much you weigh, you have to get on the scale.” In other words, without measurement you can’t manage, and without management (the process, not the ‘suits’) you can’t succeed.

Here, summarized, are the 9 points:

  1. Don’t measure what you won’t use
  2. Embrace the limitations of your numbers
  3. Your metrics will be wrong; you need to know how
  4. Simplicity trumps theory
  5. Metrics without a model teach us nothing
  6. Respect the difference between critical variables and indicator variables
  7. Put raw numbers in context
  8. Combine metrics for greater insight
  9. Use your program to help the team improve

Footnotes:

  1. Derrick Niederman, David Boyum: What the Numbers Say: A Field Guide to Mastering Our Numerical World
  2. American Beauty
  3. Joan Magretta: What Management Is: How It Works and Why It’s Everyone’s Business
  4. Edward Tufte: Beautiful Evidence
About

Deathray Research

Deathray Research is Larry White's software engineering blog. Larry is an engineering manager and hacker at Google, and lives in Beverly, MA. He's been managing large software projects for years and finally thinks he knows what he's doing.* The opinions expressed here are his own.

*Actually, he thought he knew what he was doing the whole time.

PS - I bought the domain deathrayresearch.com years ago thinking I would use it for a startup. Or a blog, maybe.