Talk is cheap. Show me the code. (Linus Torvalds)
When developers compare open source with proprietary software, what should be a civilized debate often degenerates into a flame war. This need not be so, because there is plenty of room for a cool-headed, objective comparison.
Researchers examine the efficacy of open source development processes through various complementary approaches.
One method involves looking at the code's internal quality attributes, such as the density of comments or the use of global variables [Stamelos et al. 2002].
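To make the first of these approaches concrete, the following sketch computes one such internal attribute, comment density, for C source read from its standard input. This is an illustration I provide here, not the measurement tool used in this chapter; it handles only the two basic C comment forms and will miscount comment-like text appearing inside string literals.

    #include <stdio.h>

    /*
     * Report the fraction of input lines that contain a comment.
     * A sketch: string literals containing comment-like text are
     * miscounted, and a final line lacking a newline is ignored.
     */
    int
    main(void)
    {
        int c, prev = 0;
        int in_block = 0;       /* inside a block comment */
        int has_comment = 0;    /* current line touched a comment */
        long lines = 0, comment_lines = 0;

        while ((c = getchar()) != EOF) {
            if (in_block) {
                has_comment = 1;
                if (prev == '*' && c == '/') {
                    in_block = 0;
                    c = 0;      /* don't reuse the closing slash */
                }
            } else if (prev == '/' && c == '*') {
                in_block = has_comment = 1;
                c = 0;
            } else if (prev == '/' && c == '/') {
                has_comment = 1;
                while ((c = getchar()) != EOF && c != '\n')
                    ;           /* line comment: skip to end of line */
            }
            if (c == '\n') {
                lines++;
                comment_lines += has_comment || in_block;
                has_comment = 0;
            }
            prev = c;
        }
        if (lines > 0)
            printf("comment density: %.2f (%ld of %ld lines)\n",
                (double)comment_lines / lines, comment_lines, lines);
        return 0;
    }

Running a source file through this filter yields the proportion of lines carrying a comment, a number that can then be compared across code bases.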
Another approach involves examining the software's external quality attributes, which reflect how the software appears to its end users [Kuan 2003].
Then, instead of the product, one can look at the process: examine measures related to the code's construction and maintenance, such as how much code is being added each week or how swiftly bugs are closed [Paulson et al. 2004].
Yet another approach involves discussing specific scenarios. For instance, Hoepman and Jacobs [Hoepman and Jacobs 2007] examine the security of open source software by looking at how leaked source code from Windows NT and Diebold voting machines led to attacks, and how open source practices lead to cleaner code and allow the use of security-verification tools.
Finally, a number of arguments are based on plain hand-waving: more than a decade ago, Bob Glass [Glass 1999] identified this trend in the hype associated with the emergence of Linux in the IT industry.
Although many researchers over the years have examined open source artifacts and processes [Fitzgerald and Feller 2002], [Spinellis and Szyperski 2004], [Feller 2005], [Feller et al. 2005], [von Krogh and von Hippel 2006], [Capiluppi and Robles 2007], [Sowe et al. 2007], [Stol et al. 2009], the direct comparison of open source systems with corresponding proprietary products has remained an elusive goal. The reason for this is that it used to be difficult to find a proprietary product comparable to an open source equivalent, and then convince the proprietary product's owner to provide its source code for an objective comparison. However, the open-sourcing of Sun's Solaris kernel and the distribution of large parts of the Windows kernel source code to research institutions provided me with a window of opportunity to perform a comparative evaluation between the open source code and the code of systems developed as proprietary software.
Here I report on code quality metrics (measures) I collected from four large industrial-scale operating systems: FreeBSD, Linux, OpenSolaris, and the Windows Research Kernel (WRK). This chapter is not a crime mystery, so I'm revealing my main finding right up front: there are no significant across-the-board code quality differences between these four systems. Now that you know the ending, let me suggest that you keep on reading, because in the following sections you'll find not only how I arrived at this finding, but also numerous code quality metrics for objectively evaluating software written in C, which you can also apply to your own code. Although some of these metrics have not been empirically validated, they are based on generally accepted coding guidelines, and they therefore represent the rough consensus of developers concerning desirable code attributes. I first reported these findings at the 2008 International Conference on Software Engineering [Spinellis 2008]; this chapter contains many additional details.
The very ink with which all history is written is merely fluid prejudice. (Mark Twain)
Researchers have been studying the quality attributes of operating system code for more than two decades [Henry and Kafura 1981], [Yu et al. 2004]. Particularly close to the work you're reading here are comparative studies of open source operating systems [Yu et al. 2006], [Izurieta and Bieman 2006], and studies comparing open and closed source systems [Stamelos et al. 2002], [Paulson et al. 2004], [Samoladas et al. 2004].
A comparison of maintainability attributes between Linux and various Berkeley Software Distribution (BSD) operating systems [Yu et al. 2006] found that Linux contained more instances of module communication through global variables (known as common coupling) than the BSD variants. The results I report here corroborate this finding for file-scoped identifiers, but not for global identifiers (see Figure 1.11, “Common coupling at file (left) and global (right) scope”). Furthermore, an evaluation of the growth dynamics of the FreeBSD and Linux operating systems found that both grow at a linear rate, and that claims of open source systems growing faster than commercial systems are unfounded [Izurieta and Bieman 2006].
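To illustrate the distinction, here is a made-up C fragment of my own (the identifiers are hypothetical and come from none of the systems studied). A globally visible variable couples every module that references it, whereas a file-scoped (static) variable confines the coupling to a single file:

    /* In a hypothetical sched.c */
    int sched_quantum = 100;        /* global scope: visible system-wide */
    static int run_queue_length;    /* file scope: visible in this file only */

    void
    enqueue_task(void)
    {
        run_queue_length++;         /* common coupling confined to sched.c */
    }

    /* In a hypothetical timer.c */
    extern int sched_quantum;       /* couples timer.c to sched.c */

    void
    timer_tick(void)
    {
        if (--sched_quantum == 0)
            sched_quantum = 100;    /* any module can mutate the shared state */
    }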
A study by Paulson and his colleagues [Paulson et al. 2004] compares evolutionary patterns between three open source projects (Linux, GCC, and Apache) and three undisclosed commercial ones. They found a faster rate of bug fixing and feature addition in the open source projects, which is something we would expect for very popular projects like the ones they examine. In another study, focusing on the quality of the code (its internal quality attributes) [Stamelos et al. 2002], the authors used a commercial tool to evaluate 100 open source applications with metrics similar to those reported here, but measured on a scale ranging from “accept” to “rewrite”. They then compared the results against benchmarks supplied by the tool's vendor for commercial projects, and found that only half of the modules they examined would be considered acceptable by software organizations applying programming standards based on software metrics. A related study by the same group [Samoladas et al. 2004] examined the evolution of a measure called the maintainability index [Coleman et al. 1994] in an open source application and its (semi)proprietary forks. They concluded that all the projects suffered from a similar deterioration of the maintainability index over time.
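For reference, the maintainability index that Coleman and his colleagues proposed combines four per-module averages into a single figure. A transcription into C, using the commonly cited coefficients (which you should verify against the original paper before relying on them), looks as follows:

    #include <math.h>

    /*
     * Four-metric maintainability index, after Coleman et al. [1994],
     * with the commonly cited coefficients.  Inputs are per-module
     * averages; per_cm is the percentage of comment lines.
     */
    double
    maintainability_index(double ave_volume, double ave_complexity,
        double ave_loc, double per_cm)
    {
        return 171.0
            - 5.2 * log(ave_volume)            /* Halstead volume */
            - 0.23 * ave_complexity            /* extended cyclomatic complexity */
            - 16.2 * log(ave_loc)              /* lines of code */
            + 50.0 * sin(sqrt(2.4 * per_cm));  /* comment percentage */
    }

Higher values indicate more maintainable code; the deterioration that Samoladas and his colleagues observed corresponds to this figure drifting downward from one release to the next.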