http://www.spinellis.gr/pubs/jrnl/2005-IEEESW-TotT/html/v23n4.html This is an HTML rendering of a working paper draft that led to a publication. The publication should always be cited in preference to this draft using the following reference:
Tools of the Trade
Choosing a Programming Language
A language that doesn’t have everything is actually
easier to program in than some that do.
— Dennis M. Ritchie
Diomidis Spinellis
Computer languages fascinate me. Like a living person, each one has its own history, personality, interests, and quirks. Once you’ve learned one, you can use it again after years of neglect, and it’s like reconnecting with an old friend: you can continue discussions from the point they were broken off years before. For a task I recently faced, I adopted a language I hadn’t used for 15 years, and felt enlightened.
Let me start by stressing that I don’t think there’s one language suitable for all tasks, and probably there won’t ever be one. In a typical workweek I seldom program in fewer than three different languages. The most difficult question I face when starting a new project is what language to use. Factors I balance when choosing a programming language are programmer productivity, maintainability, efficiency, portability, tool support, and software and hardware interfaces.
Often a single one of these factors is decisive and leaves little room for choice. If you have to squeeze your interrupt-driven code into a microcontroller’s 1024 bytes of memory, assembly language or maybe C is the only game in town. If you’re going to interface with a Java-based application server, then you write in Java. Sometimes tradition plays an important role. Systems code, like operating systems, device drivers, and utility programs, is typically written in C. Following this tradition means that the code will mesh well with its surrounding environment and won’t impose on it onerous requirements for libraries and runtime environments.
At other times the choice of the programming language is a fine balancing act. I find the power of C++ and its standard template library amazing: the combination provides me with extreme efficiency and expressiveness. At a price. The language is large and complex; after 15 years of programming in C++ I’m still often puzzled by the compiler’s error messages, and I routinely program with a couple of reference books at my side. Time I spend looking up an incantation is time not spent programming. Modern object-oriented languages like Java and C# are more orthogonal and hide fewer surprises for the programmer, although the inevitable accumulation of features makes this statement less true with each new version of each language. It looks like Lehman’s laws of software evolution (“as a program is evolved its complexity increases”) haunt us on every front. On the other hand, sometimes you just can’t afford Java’s space overhead. I recently wrote a program that manipulated half a billion objects. Its C++ implementation required 3 GB of real memory to run; a Java implementation would easily need that amount of memory just for storing the objects’ housekeeping data. I could not afford the additional memory space, and I’m sure even our more generously funded colleagues at CERN, facing a one-petabyte-per-second data stream in their Large Hadron Collider experiment, feel the same way.
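The memory argument can be checked with back-of-the-envelope arithmetic. The sketch below takes the object count and the 3 GB footprint from the text; the per-object header sizes of 8 to 16 bytes are an assumption (figures typical of mainstream Java virtual machines), not measurements of the program described here.

```python
# Back-of-the-envelope check of the per-object overhead argument.
# The header sizes are assumptions (typical JVM object headers),
# not measurements from the program described in the text.

OBJECTS = 500_000_000          # "half a billion objects"
CPP_FOOTPRINT = 3 * 2**30      # 3 GB used by the C++ implementation


def header_overhead(objects: int, header_bytes: int) -> int:
    """Bytes spent on per-object housekeeping data alone."""
    return objects * header_bytes


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / 2**30


# Average number of bytes per object in the C++ version.
bytes_per_object = CPP_FOOTPRINT / OBJECTS

if __name__ == "__main__":
    print(f"C++ bytes per object: {bytes_per_object:.1f}")
    for header in (8, 12, 16):  # assumed header sizes in bytes
        gib = to_gib(header_overhead(OBJECTS, header))
        print(f"{header}-byte headers: {gib:.1f} GiB of housekeeping data")
```

Under these assumptions even the smallest header, at roughly 3.7 GiB in total, already exceeds what the whole C++ program needed, before a single byte of payload is stored.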
The situations I described, however, are outliers. In many more cases I find myself choosing a programming language based on its surrounding ecosystem. If I’m targeting a Windows audience, the default availability of the .NET framework on all modern Windows systems makes the platform an attractive choice. Conversely, if the application will ever be required to run on any other system, then using the .NET framework will make porting it a nightmare. Third-party libraries also play an important role here. Nowadays many applications are built by gluing together many other libraries. I recently calculated that each of the 20,000 applications that have been ported to the FreeBSD system depends on average on 1.7 third-party libraries that are not available in the system’s default installation; one application depends on 38 different libraries. Thus, for example, if your application requires support for 3D rendering, Bluetooth communications, the creation of PDF documents, an interface to a particular RDBMS, and public key cryptography, you may find that these facilities are only available for a particular language.
When efficiency, portability, and library availability don’t force a language on me, the next decisive factor is programmer productivity. Interestingly, I’ve found that here the same language features can promote or reduce productivity depending on the work’s scope. For small tool-type programs I write in the course of my work, I prefer a language that sustains programmer abuse without complaint. When I want to put together a program or a one-line command in a hurry, I appreciate that Perl and the Unix shell scripting facilities don’t require me to declare types and split my code into functions and modules. Other programmers use Python and Ruby in the same way.
However, when the program is going to grow large, will be maintained by a team, or will be used in a context where errors matter a lot, I want a language that enforces programming discipline. One feature I particularly appreciate is strict static typing. Type errors that the compiler catches are bugs my users won’t face. Language support for splitting programs into modules and hiding implementation details is also important. If the language (or the culture of developing in that language) enforces these development traits, so much the better. Thus, although I realize one can write well-structured hundred-thousand-line programs in both Perl and Java, I feel that the discipline required to get this right in Perl is an order of magnitude greater than that required for Java, where even rookie programmers routinely split their code into classes and packages.
A language’s supporting environment is also important here. Nowadays, a programmer’s productivity in a given language is often coupled with the use of an IDE. Some tasks, like developing a program’s GUI layout, are painful without an appropriate IDE, and some colleagues have become attached to a particular IDE in the same way I’m clinically dependent on the vi editor. Thus choosing a language often involves selecting one of those that a particular IDE supports.
There are also cases where a program’s application domain will favor the expressive style of a specific language. The three approaches here involve using an existing domain-specific language, building a new one, or adopting a general-purpose declarative language.
If you want to get some figures from a database, you might write SQL queries; if you want to convert an XML document into a report, you should try out XSLT. Building a special-purpose language may sound daunting, but is actually not that difficult if one takes the appropriate shortcuts. Such an approach can be a tremendous productivity booster. Fifteen years ago I designed a simple line-oriented DSL to specify the parameters of a CAD system’s objects. Instead of designing an input window layout for each input group, one simply specifies declaratively what the user should see and manipulate. Thus, the system’s initial 150 parameters have effortlessly swelled over the years to 2,400, surviving intact a port to a different GUI platform.
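To give a feel for how little machinery a line-oriented DSL needs, here is a minimal sketch of a parser for one. The line format, the `Parameter` record, and the beam example are all hypothetical; they are not the CAD system’s actual language, only an illustration of the declarative idea.

```python
from dataclasses import dataclass

# Hypothetical line format (an assumption, not the CAD system's real one):
#   <name> <type> <default> <label shown to the user>
# Blank lines and lines starting with '#' are ignored.

CONVERTERS = {"int": int, "float": float, "str": str}


@dataclass(frozen=True)
class Parameter:
    name: str
    type_name: str
    default: object
    label: str


def parse_spec(text: str) -> list[Parameter]:
    """Turn a line-oriented specification into a list of parameters."""
    params = []
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split into three whitespace-separated fields plus a free-text label.
        fields = line.split(None, 3)
        if len(fields) != 4:
            raise ValueError(f"line {number}: expected 4 fields, got {len(fields)}")
        name, type_name, default, label = fields
        if type_name not in CONVERTERS:
            raise ValueError(f"line {number}: unknown type {type_name!r}")
        value = CONVERTERS[type_name](default)
        params.append(Parameter(name, type_name, value, label))
    return params


SPEC = """
# Parameters of a hypothetical beam object
width   float  1.5  Beam width (m)
bolts   int    4    Number of bolts
"""

if __name__ == "__main__":
    # A GUI layer could generate one input field per parameter from this list.
    for p in parse_spec(SPEC):
        print(f"{p.label}: [{p.default}]")
```

In a scheme like this, adding a parameter means adding one line to the specification; the code that builds the input window never changes, which is one plausible reason such a design scales to thousands of parameters.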
When I recently set out to design a way of specifying complex financial instruments, my first attempt was to design a DSL. However, the more I worked on the problem, the more I realized that many of the features I wanted, like the manipulation of lists and trees, were already available in declarative languages like Prolog, Lisp, ML, and Haskell. After expressing a small subset of the problem in a number of these languages I singled out Haskell, a language I had to learn when writing a compiler for it as an undergraduate student. It seemed to offer a concise way to express everything I wanted and a no-frills but remarkably effective development environment.
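As an illustration of the list and tree manipulation involved, here is a minimal sketch of an instrument described as a tree of immutable values and priced by a side-effect-free recursive function. It is written in Python rather than Haskell, and every name in it (`Payment`, `Scaled`, `Portfolio`, `present_value`) is hypothetical; it shows the general technique, not the actual instrument specification.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# All class and function names here are hypothetical illustrations.


@dataclass(frozen=True)
class Payment:
    """A single cash flow of `amount`, received after `years`."""
    amount: float
    years: float


@dataclass(frozen=True)
class Scaled:
    """Another instrument multiplied by a constant factor."""
    factor: float
    part: "Instrument"


@dataclass(frozen=True)
class Portfolio:
    """A collection of instruments held together."""
    parts: Tuple["Instrument", ...]


Instrument = Union[Payment, Scaled, Portfolio]


def present_value(inst: Instrument, rate: float) -> float:
    """Discount an instrument tree at a flat annual rate, without side effects."""
    if isinstance(inst, Payment):
        return inst.amount / (1 + rate) ** inst.years
    if isinstance(inst, Scaled):
        return inst.factor * present_value(inst.part, rate)
    if isinstance(inst, Portfolio):
        return sum(present_value(p, rate) for p in inst.parts)
    raise TypeError(f"not an instrument: {inst!r}")


# A two-year bond paying a 5% coupon on a face value of 100.
bond = Portfolio((Payment(5.0, 1), Payment(105.0, 2)))

if __name__ == "__main__":
    print(round(present_value(bond, 0.05), 6))
    print(round(present_value(Scaled(2.0, bond), 0.05), 6))
```

Because each node is an immutable value and the pricing function depends only on its arguments, every case can be checked in isolation, for example by confirming that a 5% coupon bond discounted at 5% is worth its face value.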
My biggest surprise came when I started testing the code I wrote. Most programs worked correctly the first time. I can attribute this to three factors. Haskell’s strong typing filtered out most errors when I compiled my code. Furthermore, the language’s powerful abstractions allowed me to concisely express what I wanted, limiting the scope for errors (research has shown that the number of errors in a program is roughly proportional to its size). Finally, Haskell, as a pure functional language, doesn’t allow expressions to have side effects and thus forced me to split my program into many simple, easy-to-verify functions. Over the years many friends and books have prompted me to evaluate the use of a functional language for implementing domain-specific functionality; as I continue to add Haskell functions to my program I can see that the choice of the appropriate programming language can make or break a project.
Diomidis Spinellis is an associate professor in the Department of Management Science and Technology at the Athens University of Economics and Business and the author of Code Quality: The Open Source Perspective (Addison-Wesley, 2006). Contact him at dds@aueb.gr.