jvdb.org Blog

The Chaos of Code Development

on Saturday 17 March 2007 @ 11:18 in Software Maintenance

In my experience, people tend to see metrics as some kind of gimmick: fun to let loose on your software, but not something that is actively used to make estimates, track progress or assess situations. It’s a shame, because a typical project (especially once it’s been around for a while and a couple of releases have been made) is a big source of measurable information waiting to be extracted and used. The trick, of course, is knowing what to measure and how to interpret the results.

One of the more useful metrics I’ve been using recently is code development complexity. This metric doesn’t look at the actual code itself, but at the distribution of modifications across the system, to predict how complex it is to modify. The paper that proposed and first described this metric is "Studying the Chaos of Code Development" by Ahmed Hassan and Richard Holt. It basically states that two factors can be used to measure the complexity of a piece of software: the number of modifications that have to be made for each piece of functionality that is added, and the distribution of these changes across all parts of the system.
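As a rough illustration of the "distribution of changes" idea, the sketch below computes a normalized Shannon entropy over the files modified in one period, which is the kind of information-theoretic measure the paper builds on. The function name, the `total_files` parameter and the sample file names are made up for this example; treat it as one possible reading of the metric, not the paper's exact formula.

```python
import math
from collections import Counter


def change_entropy(modified_files, total_files=None):
    """Normalized Shannon entropy of modifications across files in one period.

    modified_files: iterable of file names, one entry per modification.
    total_files: number of files in the system (assumption of this sketch);
                 defaults to the number of distinct files that were modified.
    Returns a value between 0.0 (changes concentrated) and 1.0 (evenly spread).
    """
    counts = Counter(modified_files)
    total = sum(counts.values())
    n = total_files if total_files is not None else len(counts)
    if total == 0 or n < 2:
        return 0.0
    # Shannon entropy in bits: sum of p * log2(1 / p) over the modified files.
    entropy = sum((c / total) * math.log2(total / c) for c in counts.values())
    # Divide by the maximum possible entropy so periods are comparable.
    return entropy / math.log2(n)


# Hypothetical data: nine changes to one class and one to another ...
focused = ["Parser.java"] * 9 + ["Main.java"]
# ... versus one change to each of four classes.
scattered = ["Ui.java", "Logic.java", "Storage.java", "Main.java"]

print(change_entropy(focused))
print(change_entropy(scattered))
```

The focused period scores well below the scattered one, which matches the intuition: the more evenly new functionality smears out over the system, the higher the value.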

This means that if you’ve added a set of new functions to a large system and all you’ve done is modify the same one or two classes for each change, the system is probably not very complex, because most of it has never changed. Also, the design is apparently appropriate given the types of changes that have to be made. An example of this is a system that uses the strategy pattern and that basically adds new implementations of an algorithm: the system itself doesn’t change, it just gets a new class every now and then. However, if the strategy pattern is not used, new algorithms would have to be added to the user interface (for selection), to the business logic (the actual algorithm implementation) and maybe even to the storage (to determine how it deals with storing results).
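The strategy example can be sketched in a few lines. This is a minimal illustration that uses a function registry rather than the classic class hierarchy; every name in it (`STRATEGIES`, `register`, `run`, `available`) is invented for the example.

```python
from typing import Callable, Dict, List

# Registry mapping a strategy name to its implementation.
STRATEGIES: Dict[str, Callable[[List[int]], List[int]]] = {}


def register(name: str):
    """Decorator that adds an algorithm to the registry under the given name."""
    def decorator(func):
        STRATEGIES[name] = func
        return func
    return decorator


@register("ascending")
def ascending(values: List[int]) -> List[int]:
    return sorted(values)


# Adding a new algorithm touches only this one new function.
@register("descending")
def descending(values: List[int]) -> List[int]:
    return sorted(values, reverse=True)


def available() -> List[str]:
    """What a user interface would list for selection; never edited by hand."""
    return sorted(STRATEGIES)


def run(name: str, values: List[int]) -> List[int]:
    """Business logic entry point; unchanged when strategies are added."""
    return STRATEGIES[name](values)


print(available())
print(run("descending", [2, 3, 1]))
```

Here `available` and `run` stay untouched when an algorithm is added, so every feature change lands in a single new spot instead of spreading over the interface, logic and storage layers.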

Code development complexity is a great metric for seeing whether the system’s design fits the way it is required to evolve over time. Furthermore, in the cases where it doesn’t, the actual data used to calculate the metric (which files are modified) shows exactly where you have to look to get the complexity down: if a couple of classes are modified every time something new is implemented, it may be wise to investigate what the cause is. Some things may be hardcoded there that don’t have to be.

Unfortunately I haven’t found any tools that automate the calculation yet; all I have is a couple of scripts I’ve written that are tied to Subversion and to a custom log message format that I use on projects. To calculate it, you basically need the following:

  • A repository containing as much of the project’s history as possible, preferably everything since initial development started.
  • A method to determine whether a change to a file in the repository was to fix a bug, do some general maintenance or add functionality (because changes of the last type are all you want to measure, initially).
  • A coding standard and/or development environment that forces files to contain logical parts of a system. Java requires each public top-level class to live in its own file, and Visual Studio automatically generates a file when you add a class, both of which help.
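The second requirement is the one that depends most on project conventions. As a sketch only: assuming log messages start with a tag such as `[feature]`, `[bugfix]` or `[maintenance]` (a convention invented for this example, not the format used by the scripts mentioned above), the feature changes could be filtered out like this:

```python
import re

# Hypothetical convention: every log message starts with one of these tags.
TAG_PATTERN = re.compile(r"^\[(feature|bugfix|maintenance)\]", re.IGNORECASE)


def classify(message):
    """Return the change type encoded in a log message, or 'unknown'."""
    match = TAG_PATTERN.match(message.strip())
    return match.group(1).lower() if match else "unknown"


def feature_changes(commits):
    """Keep only the commits tagged as new functionality.

    commits: iterable of (log_message, list_of_changed_files) tuples.
    """
    return [(msg, files) for msg, files in commits if classify(msg) == "feature"]


# Made-up commit history for illustration.
history = [
    ("[feature] add CSV export", ["Export.java", "Ui.java"]),
    ("[bugfix] off-by-one in pager", ["Pager.java"]),
    ("tidy up", ["Main.java"]),
]

for message, files in feature_changes(history):
    print(message, files)
```

Untagged messages end up as `unknown` and are left out, so the quality of the metric depends directly on how consistently the team tags its commits.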

Then you have to extract all the information from the repository, choose good parameters for the algorithm (such as how long each iteration lasts; I usually pick release dates), calculate the metric for each iteration and plot the results. Voilà: an overview of the evolution of your project’s complexity over time. I’ve done this on a couple of projects now and am really happy with the results: two of the systems were initially developed by me, and the results very much reflected my own opinion of where the bottlenecks in modifying the system were. In one case where other developers had modified the system, they also provided a good view of what had been going on.
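The bucketing step can be sketched as follows, assuming the repository history has already been extracted into `(date, file)` pairs for feature changes and that release dates mark the end of each iteration. The function names and sample data are hypothetical, and plain (unnormalized) Shannon entropy stands in for the metric.

```python
import math
from bisect import bisect_left
from collections import Counter
from datetime import date


def entropy(files):
    """Shannon entropy (in bits) of how modifications are spread over files."""
    counts = Counter(files)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def entropy_per_release(changes, release_dates):
    """Group changes by release period and compute the entropy of each period.

    changes: iterable of (date, file_name) pairs, one per modification.
    release_dates: sorted list of dates; period i covers changes made after
                   release i-1 up to and including release i. Changes made
                   after the last release are ignored.
    """
    buckets = [[] for _ in release_dates]
    for changed_on, path in changes:
        index = bisect_left(release_dates, changed_on)
        if index < len(release_dates):
            buckets[index].append(path)
    return [(rel, entropy(files)) for rel, files in zip(release_dates, buckets)]


# Made-up history: two releases, with the second period more scattered.
releases = [date(2007, 1, 31), date(2007, 2, 28)]
changes = [
    (date(2007, 1, 10), "Ui.java"),
    (date(2007, 1, 12), "Ui.java"),
    (date(2007, 2, 5), "Ui.java"),
    (date(2007, 2, 6), "Logic.java"),
    (date(2007, 3, 1), "Late.java"),  # after the last release, so skipped
]

for release, value in entropy_per_release(changes, releases):
    print(release, value)
```

The resulting list of (release, value) pairs is what you would feed into a plot to see the trend over time.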

It makes me wonder why none of the source control vendors implement this kind of thing, especially now that they’re all integrating their bug trackers anyway. It just shows how little people really care about metrics.
