http://www.spinellis.gr/pubs/jrnl/2005-IEEESW-TotT/html/v22n5.html This is an HTML rendering of a working paper draft that led to a publication. The publication should always be cited in preference to this draft using the following reference:
Version Control Systems
A source code control system [is] a giant UNDO key—a
project wide time machine.
— A. Hunt and D. Thomas
Diomidis Spinellis
Sane programmers don’t write production code without the help of an editor and an interpreter or a compiler, yet I’ve seen many software projects limping along without a version control system (VCS). We can explain this contrast in terms of the increased startup costs and the delayed gratification associated with adopting a VCS. We humans typically discount the future, and therefore implementing version control in a project appears to be a fight against human nature. It is true that you can’t beat the productivity boost that compilers and editors have provided us, but four decades after punched-card programming in assembly language went out of fashion we must look elsewhere to reap our next gains in efficiency. If you or your project is not using a VCS, adopting one may well be the single most important improvement you can undertake.
Acquiring a VCS need not be expensive; depending on the operating system you are using, you may in fact find that one (probably CVS or RCS) is already installed and ready to run. If not, you have the luxury of choosing a system based on your budget. If you’re on a shoestring budget you can safely pick a free, open source system: some of these systems have been used for multi-million line projects for over a decade. If you can shell out some cash, you will find that several commercial systems offer additional features and a more polished interface. Installing the VCS typically also involves setting up the repository, the location where the definitive version of your source code and its changes will reside. Be sure to include the repository in your scheduled backups.
Normal software development with a VCS is only marginally more complicated than without it. Initially you start a new project, or import your existing project into the VCS. From then on, to work, you check out a version of the project into a private directory. Every time you are happy with a change you’ve made (a bug fix, say, or the addition of a new feature) you commit your change to the repository, accompanied by an explanatory message. Also, whenever you feel in the mood for some excitement, you synchronize or update your private version of the software with the changes committed by your colleagues. This action will provide you with endless hours of fun as you battle against your colleagues’ mistakes, but it also ensures that you’re all working on roughly the same source code base. Finally, when you roll out a release, you label or tag all files with the release’s name. And this is basically it.
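To make the cycle above concrete, here is a minimal sketch of it as a toy in-memory model. It is not how CVS, RCS, or any real VCS is implemented; the class `ToyRepository`, its method names, the file `hello.py`, and the tag `release-1.0` are all hypothetical names chosen for illustration. Each commit simply stores a full snapshot of the files together with its message.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRepository:
    """A toy in-memory repository: a list of snapshots plus named tags."""
    commits: list = field(default_factory=list)  # each entry: (message, files)
    tags: dict = field(default_factory=dict)     # tag name -> revision number

    def commit(self, files, message):
        """Store a snapshot of the working copy and return its revision number."""
        self.commits.append((message, dict(files)))
        return len(self.commits) - 1

    def checkout(self, revision=None):
        """Return a private copy of a revision (the latest one by default).

        The revision may be a number or the name of a tag.
        """
        if revision is None:
            revision = len(self.commits) - 1
        if isinstance(revision, str):
            revision = self.tags[revision]
        return dict(self.commits[revision][1])

    def tag(self, name, revision=None):
        """Attach a release name to a revision (the latest one by default)."""
        self.tags[name] = len(self.commits) - 1 if revision is None else revision


# Import an existing (hypothetical) project into the repository.
repo = ToyRepository()
repo.commit({"hello.py": "print('hello')\n"}, "Import existing project")

# Check out a private working copy, change it, and commit with a message.
work = repo.checkout()
work["hello.py"] = "print('hello, world')\n"
repo.commit(work, "Greet the whole world")

# Label the release, then carry on developing.
repo.tag("release-1.0")
work["hello.py"] = "print('goodbye')\n"
repo.commit(work, "Change greeting to a farewell")

# The tag still retrieves the exact file set that was released.
release = repo.checkout("release-1.0")
```

Storing whole snapshots keeps the sketch short; real systems typically store differences between versions and must also handle concurrent updates from several developers, which this model leaves out.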
Now that you are convinced that adopting a VCS isn’t a Herculean task, let us briefly see some of the benefits you will reap. First of all, if you are working in a team, you will stop stepping on each other’s toes by writing over other people’s code. If both you and Mary change the same file, the system will either unobtrusively merge your changes or warn you that they conflict. In addition, every time you commit a change, you create a new version of the corresponding files. With the version information that the VCS stores you can access each file’s history of changes, and you can see who changed which lines when. Because you can always go back to a specific version of a file, you don’t need to comment out code blocks “in case they are needed in the future”: your older version of the code is safely stored in the VCS repository. You can also see the differences between versions of the same file, and in many VCS implementations you can get an annotated listing of the file indicating the author and date of each line’s most recent change.
The repository also acts as the source of truth regarding the files stored in it. Source code distribution simply involves obtaining or updating a private workspace from the VCS repository. Once you label a project’s files for a given release, you can use the release’s name to obtain an exact copy of that historic file set. Furthermore, you can split development into different branches, with each branch tracking, for example, the fixes associated with a given software release. You can then easily obtain the file versions associated with a given branch, and apply the same fix to multiple branches. Finally, with all the project’s history neatly stored in the repository, you can mine the VCS data to see how you’re doing: How many lines were changed for version 3.1? Which are the most and least productive days of the week? Which developers work on the same files?
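The mining questions above can be sketched in a few lines, assuming the commit history has already been exported into simple records. The record layout, the developer names, the file names, and the numbers below are all made up for illustration; a real VCS would supply this data through its own log command, whose output you would first have to parse.

```python
from collections import Counter, defaultdict
from datetime import date
from itertools import combinations

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical commit records: (author, date, files touched, lines changed).
LOG = [
    ("mary", date(2005, 7, 4), {"parser.c", "lexer.c"}, 120),
    ("john", date(2005, 7, 4), {"parser.c"}, 15),
    ("mary", date(2005, 7, 5), {"main.c"}, 40),
    ("anna", date(2005, 7, 8), {"lexer.c", "main.c"}, 60),
]


def commits_per_weekday(log):
    """Count commits by the weekday on which they were made."""
    return Counter(DAY_NAMES[day.weekday()] for _, day, _, _ in log)


def shared_files(log):
    """Map each pair of developers to the files that both have changed."""
    touched = defaultdict(set)
    for author, _, files, _ in log:
        touched[author] |= files
    return {
        (a, b): touched[a] & touched[b]
        for a, b in combinations(sorted(touched), 2)
        if touched[a] & touched[b]
    }


weekdays = commits_per_weekday(LOG)
pairs = shared_files(LOG)
total_lines = sum(lines for _, _, _, lines in LOG)

print(weekdays.most_common())  # busiest weekdays first
print(pairs)                   # developer pairs and the files they share
print(total_lines)             # lines changed across the whole log
```

Counting commits is only a rough proxy for productivity, so treat the weekday figures as a starting point for questions rather than as a measurement.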
Even if you’ve been using a VCS for some time, you may be able to squeeze more juice out of it. Here are some ideas.
Put everything under version control. Version control is not only for the source code; use it for your build scripts, help files, design notes, documentation, translated messages, GUI elements: everything that comprises your project.
Use a VCS on your personal projects. You don’t have to work on a team to adopt a VCS. Consider using a VCS for your personal files, like your hobby projects, your web page, or your phone book. Some developers even use a VCS to synchronize their home directories among different hosts.
Think carefully about file names and organization. Some VCSs get confused when a file name changes: you face the unattractive choice between losing the file’s revision history and losing the ability to retrieve older versions of the software with the correct file name. Therefore, it makes sense to adopt, from the beginning of the project, file names and a directory organization that will remain relatively stable throughout the project’s life.
Perform one separate commit for every change. Do not lump multiple changes into a single commit. Separating changes allows you to see precisely which lines were affected by the change, and apply the change selectively to other branches. This rule is especially important if a change involves global stylistic changes, which will affect thousands of code lines.
Label all releases. Whenever you release the software (even to the testing group next door), label it. This provides everyone with a concrete name to associate with bug reports and their fixes.
Establish and follow policies and procedures. VCS actions can affect all developers. You will therefore benefit from clear policies covering developer etiquette or the content of commit messages, and procedures covering heavy operations, such as branching and releases.
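The advice on performing one separate commit for every change can be illustrated with a diff between two versions of a file. The sketch below uses Python’s standard `difflib` module as a stand-in for a VCS diff command; the file name `shapes.py`, the revision labels, and the file contents are invented for the example.

```python
import difflib

# Two versions of a hypothetical file; the commit fixes a single bug.
before = [
    "def area(w, h):\n",
    "    return w + h\n",
    "\n",
    "def perimeter(w, h):\n",
    "    return 2 * (w + h)\n",
]
after = [
    "def area(w, h):\n",
    "    return w * h\n",
    "\n",
    "def perimeter(w, h):\n",
    "    return 2 * (w + h)\n",
]

diff = list(difflib.unified_diff(
    before, after,
    fromfile="shapes.py (r1)", tofile="shapes.py (r2)",
))

# Skip the two header lines, then keep only added and removed lines.
changed = [line for line in diff[2:] if line.startswith(("+", "-"))]

print("".join(diff))
print(len(changed), "changed lines")
```

Because the commit contains only the bug fix, the diff shows exactly one removed and one added line. Had the same commit also reindented the whole file, the fix would be buried among unrelated changes and much harder to apply to another branch.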
$Id: tot-5.doc 1.4 2005/07/05 15:49:38 dds Exp dds $
Diomidis Spinellis is an associate professor in the Department of Management Science and Technology at the Athens University of Economics and Business and the author of Code Reading: The Open Source Perspective (Addison-Wesley, 2003). Contact him at dds@aueb.gr.