Don’t Fly Blind

Very few professionals would consider running a business without reliable revenue and cost data.  Yet many software development teams forget that data for monitoring the business of software development is just as critical.  Of course it’s hard to measure revenue, but we can definitely measure costs and gather data that indicates how efficiently a software project is running.

Why is having data so important?  Here are a few reasons:

  1. You can’t hide from facts.  Numbers can be misinterpreted (more on that below) but they don’t lie.  Having a gut feeling that a project is proceeding well isn’t good enough.
  2. Data often provides an early warning that can be used to get a project back on track with minimal effort.
  3. Data focuses minds and sparks conversations.
  4. If it isn’t visible it isn’t important and it’s usually just talk.  If you want your team to think something like quality is important, and you want to reward them for it, it must be visible.
  5. Published data ensures transparency.  Transparency inspires confidence, from executives as well as from customers and end users.
  6. Data helps with decision-making and is essential for critical decisions such as whether to cancel a project.

The Danger of Taking Data Too Seriously

There’s no question that having current data for analysis and discussion is critical to success, but you have to watch out for the dark side. Unlike revenue and costs for a business, much of the data you can collect for software projects is open to interpretation and prone to over-analysis.

For example, I know of one group that decided to achieve 100% test coverage. The team’s manager introduced bonus incentives for everyone on the team. They hit the target, but their product didn’t ship because it was less stable than previous versions. Why? Anyone who has written a lot of automated tests knows that it is really hard to test that last 10% of branch conditions: the special cases that aren’t supposed to happen but often do. The solution this team reached was to delete those branch conditions. So, for example, instead of returning from a function when a pointer was null, the function would continue execution until it crashed.
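To make that concrete, here is a minimal sketch of the kind of change I mean. It is written in Python rather than the team's language, and the function names are hypothetical, invented purely for illustration. The first version has a guard branch that is awkward to cover with a test; the second shows what you get when the guard is deleted to lift the coverage number.

```python
from typing import List, Optional


def average_latency(samples: Optional[List[float]]) -> float:
    """Guarded version: the 'should never happen' case is handled explicitly."""
    if not samples:  # the hard-to-cover branch: None or an empty list
        return 0.0
    return sum(samples) / len(samples)


def average_latency_unguarded(samples: Optional[List[float]]) -> float:
    """Same logic with the guard deleted: every line is now trivially covered."""
    # Raises TypeError when samples is None and ZeroDivisionError when it is empty.
    return sum(samples) / len(samples)
```

A single happy-path test gives the second function 100% coverage, yet it is the one that fails in production. The coverage number improved while the product got worse.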

The key point is that you need to keep your brain on when analyzing data. Use the data to make rational decisions, change conversations, and guide your business. Every number has a story behind it: concentrate on the stories not the numbers.

Essential Metrics

There are a few pieces of data that I think every software team should have. These are:

  • Historical bug count trends. Every team should have a weekly (automated) report mailed to the team and management that shows the bug trends, by severity, of at least the previous six months to the current day. Why? You need to keep discussions about quality front and center and the bug count is the best indicator that is not open to interpretation. If a course correction is needed, it’s much easier to do it when you’re dealing with an increase of 10 bugs versus 100 accumulated over a few weeks.
  • Test code coverage. A monthly report that shows the historical trend on the percentage of source code covered by automated tests for each area of your project helps spark discussions about areas of code that need more coverage. Automated testing is the best way to keep your team productive and away from boring manual tests that can best be done by a computer. I call this the ruthless testing mind-set in my book.
  • Lines of source code versus test code. A monthly report that shows the historical trend of total lines of code for source and automated test code for each area of your project, minus comments and blank lines, is a useful report to compare with the test coverage. All you really want to verify with this report is that your team continues to add test code at an appropriate rate for source code growth. If you have an area where code is being added without tests you can catch it quickly and stop adding technical debt to your project.
  • Source and test code modifications per developer. This one has a big-brother potential to it so be careful! However, I have found it useful to get a monthly report that shows how much source and test code was added, modified, and deleted by each developer in my team. It’s not that I don’t trust them, but rather that I use this as a good way to ask questions. Quite often I’ve found cases where someone is struggling with a problem and they’ve been afraid to speak up. The solution is often simple and I can help get them back on track. Of course, I’ve also found some rare cases where someone needs a reminder that they need to contribute, but I’d rather have that conversation after a month of relative inactivity than a few months…!

The Future: Live Telemetry

A couple of years ago I read an article about Rolls Royce’s jet engine division that has inspired me. Rolls Royce collects live data via satellite from every one of their engines world-wide that is in service. They are able to detect and diagnose problems in real-time, send notices to airlines about servicing requirements for engines, and even in rare situations tell an airline that a plane in flight needs to shut down an engine. This is an amazing level of service that has led to increasing revenues for Rolls Royce and a new business model.

The main point of Rolls Royce’s engine monitoring is that it lets them provide a service that customers value highly. Google likewise collects vast amounts of data on its users, which lets it offer features such as type-ahead for common searches in the Google Toolbar. Users value these features, and that keeps them using Google’s services.

It is my belief that there is a huge opportunity in the software industry with the collection and analysis of live telemetry data. We of course have an obligation to end users to ensure that sensitive data is kept private, but if we can do that then there are opportunities to create new value and revenue streams, as Google and Rolls Royce have discovered.

One of the biggest challenges that I see with live telemetry data is analyzing the data. For example, in my current team we have a huge database that is updated every time a user starts or quits our application, plus all reported errors and warnings along with stack traces. It’s a vast amount of data and I wish that we could analyze more of it. What I’d really like is some kind of dashboard to be able to watch and analyze the data and only dive into the details when I want. We have to invest in that next, but talk about a huge opportunity! We’ve already had a few cases where we’ve seen users get an error repeatedly and sent them a fix without their asking. I want more of that!
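The "same user, same error, over and over" case is one of the easier things to pull out of this kind of data. Here is a minimal sketch, assuming each error report has already been reduced to a user identifier and an error signature (for example, the top frame of the stack trace). The `ErrorEvent` type, the `repeated_errors` function, and the threshold of three are all hypothetical and do not describe our actual database.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple, Tuple


class ErrorEvent(NamedTuple):
    user_id: str
    error_signature: str  # assumed: a stable key derived from the stack trace


def repeated_errors(events: Iterable[ErrorEvent],
                    threshold: int = 3) -> Dict[Tuple[str, str], int]:
    """Return (user, error) pairs seen at least `threshold` times."""
    counts = Counter((e.user_id, e.error_signature) for e in events)
    return {key: n for key, n in counts.items() if n >= threshold}


# Hypothetical sample data.
events = [
    ErrorEvent("u1", "NullRef at save"),
    ErrorEvent("u1", "NullRef at save"),
    ErrorEvent("u2", "Timeout at sync"),
    ErrorEvent("u1", "NullRef at save"),
]
flagged = repeated_errors(events)
```

A dashboard could start from a list like `flagged` and let you drill into the individual stack traces only when something looks worth investigating.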
