Evaluating software architecture documentation
A little under a month ago, a kick-off event was held to mark the official start of the Netherlands Institute for Research on ICT (NIRICT). It included a talk by David Parnas (best known for introducing information hiding, one of the foundations of object orientation) called Software Documentation: The Research Topic that Computer Science has Neglected. He argues that in order to make any real progress in the area of software quality, software documentation must improve radically. One of the issues he mentions is that reviewing, testing and inspection can only be performed reliably if you have accurate and thorough documentation. (Read the presentation for some of the other issues; it’s quite interesting.)
At the same time, most agile methodologies have been saying that documentation is a thing of the past. In Extreme Programming Explained, Kent Beck states that XP relies on “oral communication, tests, and source code to communicate system structure and intent”, instead of on design documentation. This approach has some problems, however, as Lionel Briand pointed out in his 2003 keynote at the European Conference on Software Maintenance and Reengineering: how do you know that the tests you have are complete? How do you know whether everyone working on the project has intellectual control over the technical design? With documentation and reviewing you can find these things out much more easily.
In software maintenance, good documentation is invaluable when you need to understand a system that was developed by someone else. The most important part for maintenance is of course the architecture, so having good architecture documentation is essential. To evaluate the documentation you are given, consider that good software architecture documentation has the following properties:
- Up-to-date: obvious perhaps, but something that must always be considered. As a survey by Andrew Forward and Timothy Lethbridge shows, documentation is rarely updated. Technical documents that are not up-to-date are basically exceptionally boring works of fiction; in other words, useless.
- Written by the right people: sometimes organizations let interns or people from outside the project write documentation because the developers didn’t want to. This doesn’t work because, as far as architecture is concerned, only the developers that came up with it can properly recognize and understand what is part of it and what is not (as I wrote before in You can’t measure architecture). In a 2003 IEEE Software article called Who Needs an Architect?, Martin Fowler cites Ralph Johnson as writing “In most successful software projects, the expert developers working on that project have a shared understanding of the system design. This shared understanding is called ‘architecture.’” So in order to have any chance of getting the message across, those expert developers should have written the documentation.
- Complete: you can usually only hope that this follows from the documentation being both up-to-date and written by the right people, since completeness is not easily determined if you’re new to a system that needs to be maintained. Personally, I expect several things to be present in an architecture document for it to be at all usable in facilitating an initial understanding of a system. First is a high-level view of the entire system showing all components (usually a diagram of some kind). In addition, the following questions must be answered for every component: What are its responsibilities? Why does it exist (in other words, why isn’t it part of another component)? What is its interface? What does this component depend on? These are very broad questions, but they must all be answered in detail for the document to be worth anything in trying to understand the system it documents.
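The per-component questions above lend themselves to a mechanical first pass. What follows is a minimal Python sketch, not an established tool: the `ComponentDoc` record, its field names and the `review` function are hypothetical names made up for illustration. It only flags questions that were left blank and dependencies that point at components the document never describes; judging whether the answers are any good still takes a human reviewer.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ComponentDoc:
    """One component's entry in an architecture document (hypothetical schema)."""
    name: str
    responsibilities: str = ""   # What are its responsibilities?
    rationale: str = ""          # Why is it not part of another component?
    interface: str = ""          # What is its interface?
    # None means "not documented"; an empty list means "documented: no dependencies".
    dependencies: Optional[List[str]] = None


TEXT_QUESTIONS = ("responsibilities", "rationale", "interface")


def unanswered(doc: ComponentDoc) -> List[str]:
    """Return the names of the questions this entry leaves blank."""
    missing = [q for q in TEXT_QUESTIONS if not getattr(doc, q).strip()]
    if doc.dependencies is None:
        missing.append("dependencies")
    return missing


def review(docs: List[ComponentDoc]) -> Dict[str, List[str]]:
    """Map each incomplete component to the list of problems found in its entry."""
    known = {d.name for d in docs}
    report: Dict[str, List[str]] = {}
    for d in docs:
        problems = unanswered(d)
        # A dependency on a component that has no entry of its own is a gap too.
        for dep in d.dependencies or []:
            if dep not in known:
                problems.append(f"undocumented dependency: {dep}")
        if problems:
            report[d.name] = problems
    return report


if __name__ == "__main__":
    docs = [
        ComponentDoc("parser", "Turns text into a syntax tree",
                     "Kept separate so the grammar can change on its own",
                     "parse(text)", []),
        ComponentDoc("evaluator", "Runs the syntax tree", "",
                     "evaluate(tree)", ["parser", "runtime"]),
    ]
    for name, problems in review(docs).items():
        print(name, problems)
```

An empty report does not mean the document is complete, only that nothing is obviously missing. Distinguishing `None` from an empty dependency list is deliberate: "depends on nothing" is an answer, while silence is not.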
You could argue that the source code can satisfy these requirements, and the first two properties are in fact always satisfied by the source code. Unfortunately, no matter how good your naming convention or directory layout is, the third property is never satisfied by the source code for non-trivial, real-world systems. So architecture documentation is a necessity, at least if you expect maintenance to be required at some point. As for the odds of that happening, it’s easy to reference Meir Lehman’s First Law of Software Evolution: “[A real-world] program that is used must be continually adapted else it becomes progressively less satisfactory.”
Very well said: documentation is the stepchild of software development. And we don’t teach reading software, just writing it. (Although software has to be read, it’s mostly read by its producer. However, the producer does not really read it. He mostly remembers it and is guided by the source through his memory.)
But then, what is documentation anyway? Although you say it should at least be up-to-date, written by the right people and complete, you leave out the essential nature of documentation: documentation is a map! It’s supposed to guide someone through a terrain. That’s also where XP fails when it says source code is documentation.
So documentation needs to be on a different level of abstraction than source code. Tests are on a somewhat different level, but are still very low level. Class diagrams are not really on a higher level either. So wherever I’m presented with class diagrams as “that’s our architecture”, I’m not satisfied.
Documentation needs to show a much bigger picture of software than source code and tests.
-Ralf
Good points, Ralf. I agree that a class diagram, like source code, is useless when trying to understand a system: it just shows the result of the design process and the physical structure of the system. Both are very important, but they don’t facilitate an understanding of how it turned out this way in the first place, which is crucial.
Some of the other diagram types can be of more help though. Component diagrams in particular at least show the division between subsystems, which is essential when determining how to go about performing maintenance. And sequence diagrams give a good overview of how the system actually operates, on a higher level than source code, since you don’t have to wade through the plumbing. But I rarely come across people that put sequence diagrams into documents, unfortunately.
I think there are two essential views of every system that we cannot do without: the structural or responsibility view and the collaborative view.
The first one answers questions like: Who is doing what? What are the locations for code? Which responsibilities or roles constitute the system and contribute to the overall solution?
Very importantly, the structural view can and should exist on any (!) number of levels of abstraction, not just two like class and component. Also, the structural view should not be limited to delineating roles according to physical containers for code like class or component.
The second view answers questions like: Who needs whom? What’s the chain of responsibilities to produce a certain feature or fulfill a requirement? What exactly should an element of the structural view implement in terms of functionality?
UML collaboration diagrams are one expression of the collaborative view. But they can easily become too complex, and they certainly cannot be drawn up from scratch. They need to be developed step by step and synthesized from other diagrams, which UML lacks and which I call value stream diagrams.
Structural and collaborative diagrams should exist on several levels of abstraction. Only then do they provide a map on different scales of a software system.
Sequence diagrams sure are helpful too, if the interactions between structural elements are complicated. But I don’t deem them as indispensable as the structural and collaborative views.
I guess I’m getting carried away now. I could go on about how much is lacking in software system (!) design in terms of “system thinking” ;-)
It’s good to know, though, that we seem to be on the same page.
-Ralf