http://www.spinellis.gr/pubs/jrnl/2005-IEEESW-TotT/html/v23n3.html This is an HTML rendering of a working paper draft that led to a publication. The publication should always be cited in preference to this draft using the following reference:
Tools of the Trade
Debuggers and Logging Frameworks
Diomidis Spinellis
As soon as we started programming, we found to our surprise that it wasn’t as easy to get programs right as we had thought. Debugging had to be discovered.
— Maurice Wilkes discovers debugging, 1949
The testing, diagnostic, and repair equipment of many professions is horrendously expensive. Think of logic analyzers, CAT scanners, and dry docks. For us the cost of debuggers and logging frameworks is minimal; some of them are even free. All we need to become productive is to invest some time and effort in learning to use these tools efficiently and effectively.
Assuming that the bug-finding systems we discussed in our previous column have given our program’s code a clean bill of health, using a debugger or logging instrumentation is the most productive way to pinpoint errors that have managed to creep into our code. With these tools we can often get a starting point for locating a bug, and then also verify our hypotheses about what is going wrong. As one would expect, adopting an appropriate strategy and mastering the corresponding techniques are the important factors in making the most of these tools.
The most efficient debugging strategy is a bottom-up one: we start from the symptom and look for the cause. The symptom can be a memory access violation (for example, the dereferencing of a NULL pointer), an endless loop, or an uncaught exception. A debugger will typically allow us to get a snapshot of the program at the point where the symptom occurred. From that snapshot we can examine the program’s call stack: the sequence of function or method invocations that led to the execution of the problematic code. At the very least we thus obtain an accurate picture of our program’s runtime behavior. Even better, we can also examine the values of variables in each stack frame to really understand what brought our program belly-up.
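As a rough illustration of the bottom-up idea outside an interactive debugger, the following Python sketch takes a snapshot at the point where an uncaught exception surfaces and walks the call stack from the outermost invocation to the failing one, collecting each frame's local variables. The functions `parse_price`, `total`, `snapshot`, and `run` are hypothetical names made up for this example; only `traceback.walk_tb` comes from Python's standard library.

```python
import traceback


def parse_price(text):
    # Hypothetical function that fails on malformed input.
    return float(text)


def total(prices):
    running = 0.0
    for item in prices:
        running += parse_price(item)
    return running


def snapshot(exc):
    """Return (function name, local variables) per frame, outermost first."""
    frames = []
    for frame, _lineno in traceback.walk_tb(exc.__traceback__):
        frames.append((frame.f_code.co_name, dict(frame.f_locals)))
    return frames


def run():
    # The symptom: an exception raised three calls deep.
    try:
        return total(["1.50", "2.25", "oops"])
    except ValueError as exc:
        return snapshot(exc)


if __name__ == "__main__":
    for name, local_vars in run():
        print(name, local_vars)
```

Reading the result from the last entry backward mirrors what we do in a debugger: the innermost frame shows the offending value, and the frames above it show how the program arrived there.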
Unfortunately, there are times when we can’t adopt a bottom-up strategy. This situation crops up when the bug’s symptom can’t be precisely tied to a debugger event. Our program may cause a problem in another application, or the contents of a variable may be wrong for reasons we can’t explain. In such cases top-down is the name of the game. Debuggers allow us to step through the code, stepping over or into functions and methods. When we debug in a top-down fashion we initially step over bodies of code we consider irrelevant, narrowing down our search as we come nearer to the problem’s manifestation. This strategy requires patience and persistence. Often we step over a crucial function and find ourselves having to repeat the search, aiming to step into that function the next time round. This process can be tiring, but sooner or later it will produce results.
There are also cases where we may have to debug a program at the level of assembly code: either because we don’t trust the compiler, or because we don’t have access to the program’s source code. What I’ve found over the years is that assembly code is a lot less intimidating than it appears. Even if we don’t know the processor’s architecture, a few educated guesses and a bit of luck often allow us to decipher the instructions needed to pinpoint the problem.
Stack trace printouts and stepping commands are the basic and indispensable debugging tools, but there are more powerful commands that can often help us locate a tricky problem. A code breakpoint allows us to stop the program’s execution at a specific line. We often use these to expedite a top-down bug search, by placing a breakpoint just before the point where we think the problem lies. In such cases we use the breakpoint as a bookmark for the location where we want to look at the program’s operation in more detail.
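Most debuggers set breakpoints through their own commands, but some languages also let us place one in the code itself. The sketch below uses Python's built-in `breakpoint()` (available since Python 3.7), which by default drops into the `pdb` debugger at that line. The function `normalize` is a hypothetical example; the breakpoint is made conditional so that execution stops only in the case we find suspicious, just before the statement we want to examine.

```python
def normalize(readings):
    """Scale readings so that they sum to 1 (hypothetical function under suspicion)."""
    total = sum(readings)
    if total == 0:
        # Bookmark: stop here only in the suspicious case,
        # just before the division we want to inspect.
        breakpoint()
    return [r / total for r in readings]
```

Calling `normalize([1, 1, 2])` runs straight through, whereas `normalize([0, 0])` stops at the bookmark, where we can inspect `readings` and `total` before stepping into the failing division. Setting the environment variable `PYTHONBREAKPOINT=0` disables such calls without editing the code.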
Less known, but no less valuable, are data breakpoints, also known as watchpoints. Many modern processors provide hardware support that will interrupt a program’s execution when the code accesses the contents of some specified memory locations. Data breakpoints leverage this support, allowing us to specify that the program’s execution will stop when its code reads or writes a variable, an array, or an object. Note that debuggers that implement such commands without hardware support slow the program’s execution to a crawl, rendering the command almost useless (Java tool builders take note).
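To make the idea concrete, here is a small software emulation of a write watchpoint in Python. It is only a sketch of the concept, not how a debugger implements hardware-backed data breakpoints: it intercepts attribute assignment on one class and records who wrote what, instead of stopping execution. The class `WatchedAccount` and the functions `apply_fee` and `buggy_refund` are hypothetical, and the approach carries exactly the kind of per-write overhead that hardware support avoids.

```python
import traceback


class WatchedAccount:
    """Hypothetical class whose 'balance' attribute is under watch."""

    WATCHED = {"balance"}

    def __init__(self, balance):
        self.writes = []  # log of (attribute, old value, new value, writer)
        self.balance = balance

    def __setattr__(self, name, value):
        if name in self.WATCHED:
            old = getattr(self, name, None)
            # Of the two innermost stack entries, entry 0 is the function
            # that performed the write and entry 1 is this method.
            writer = traceback.extract_stack(limit=2)[0].name
            self.writes.append((name, old, value, writer))
        super().__setattr__(name, value)


def apply_fee(account):
    account.balance -= 5


def buggy_refund(account):
    account.balance = -account.balance  # the write we are hunting for


if __name__ == "__main__":
    acct = WatchedAccount(100)
    apply_fee(acct)
    buggy_refund(acct)
    for entry in acct.writes:
        print(entry)
```

Only writes are observed here; watching reads as well would require intercepting attribute access too, at a further cost in speed.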
Although the typical set-up involves us starting the misbehaving program under a debugger, there are also other debugging options that can often help us escape a tight corner.
Consider non-reproducible bugs, also known as Heisenbugs, because they make our program appear as if it’s operating under the spell of Heisenberg’s uncertainty principle. We can often pinpoint those by debugging a program after it has crashed. On typical Unix systems crashed programs will leave behind an image of their memory, the core dump. By running a debugger on this core dump we get a snapshot of the program’s state at the point of the crash. Windows, on the other hand, offers us the possibility to launch a debugger immediately after a program has crashed. In both situations we can then look at the location of the crash and examine the values the variables had at the time. If the program hasn’t crashed but is acting weirdly, we can attach a debugger to the running process and examine its operation from that point onward using the debugger’s commands.
Another class of applications that are difficult to debug are those with an interface that’s incompatible with the debugger’s. Embedded systems, operating system kernels, games, and applications with a cranky GUI fall into this category. Here the solution is remote debugging. We run the process under a debugger, but interact with the debugger’s interface on another system, connected through the network or a serial interface. This leaves the target system almost undisturbed, but still allows us to issue debugging commands and view their output from our debugging console.
Instructions in the program’s code that generate logging and debug messages allow us to inspect a program’s behavior without using a debugger. Some believe that logging statements are only employed by those who don’t know how to use a debugger. There may be cases where this is true, but it turns out that logging statements offer a number of advantages over a debugger session, and therefore the two approaches are complementary.
First of all, the location and output of a logging statement are program-specific. Therefore, it can be permanently placed at a strategic location, and will output exactly the data we require. A debugger, as a general-purpose tool, requires us to follow the program’s control flow and manually unravel complex data structures.
Furthermore, the work we invest in a debugging session only has ephemeral benefits. Even if we save our set-up for printing a complex data structure in a debugger script file, it would still not be visible or easily accessible to other people maintaining the code. I have yet to encounter a project that distributes debugger scripts together with its source code. On the other hand, because logging statements are permanent, we can invest more effort than we could justify in a fleeting debugging session to format their output in a way that will increase our understanding of the program’s operation and, therefore, our debugging productivity.
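As one possible illustration of this investment, the Python sketch below keeps the formatting of a linked data structure in the code, next to the logging statement that uses it, so every maintainer benefits from it. The names `Node`, `ChainView`, `process`, and the logger name `inventory` are hypothetical. It relies on the standard `logging` module's deferred `%`-style formatting: the list is traversed only if the message is actually emitted.

```python
import logging

log = logging.getLogger("inventory")  # hypothetical logger name


class Node:
    """Minimal singly linked list node, standing in for a complex structure."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


class ChainView:
    """Renders a linked list as 'a -> b -> c' when converted to a string."""

    def __init__(self, head):
        self.head = head

    def __str__(self):
        values = []
        node = self.head
        while node is not None:
            values.append(str(node.value))
            node = node.next
        return " -> ".join(values) or "(empty)"


def process(head):
    # The view is formatted lazily: if DEBUG output is disabled for this
    # logger, __str__ is never called and the list is never traversed.
    log.debug("processing chain: %s", ChainView(head))
```

The same formatter can then be reused by every logging statement that touches the structure, something a private debugger script rarely achieves.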
Finally, logging statements are inherently filterable. Many logging environments, such as the Unix syslog library, Java’s java.util.logging framework, and the log4j Apache logging services (http://logging.apache.org/), offer facilities for identifying the importance and the domain of a given log message. More impressively, Apple’s OS X logging facility stores log results in a database and allows us to run sophisticated queries on them. We can thus filter messages at runtime to see exactly those that interest us. Of course, all these benefits are only available to us when we correctly use an organized logging framework, not simple println statements.
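The frameworks named above each have their own API; as a comparable sketch, Python's standard `logging` module expresses importance through levels and domain through hierarchical logger names. In the example below, the application namespace `shop`, its sub-domains `shop.net` and `shop.db`, and the in-memory `ListHandler` are all hypothetical choices made for illustration. Only warnings and errors are kept by default, while one domain is switched to verbose output at runtime.

```python
import logging


class ListHandler(logging.Handler):
    """Collects messages in memory so the effect of filtering is visible."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(
            f"{record.name} {record.levelname} {record.getMessage()}"
        )


def demo():
    handler = ListHandler()
    app = logging.getLogger("shop")       # hypothetical application namespace
    app.handlers.clear()
    app.addHandler(handler)
    app.propagate = False                 # keep the demo output self-contained
    app.setLevel(logging.WARNING)         # default: warnings and errors only
    logging.getLogger("shop.db").setLevel(logging.DEBUG)  # verbose for one domain

    net = logging.getLogger("shop.net")
    db = logging.getLogger("shop.db")
    net.debug("connection opened")        # filtered out: below WARNING
    db.debug("query took %d ms", 12)      # kept: shop.db is set to DEBUG
    net.error("timeout after %d s", 30)   # kept: ERROR is above WARNING
    return handler.messages


if __name__ == "__main__":
    for line in demo():
        print(line)
```

Changing a single `setLevel` call is enough to widen or narrow what we see, which is precisely what scattered print statements cannot offer.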
As you can see, our tool-bag is full of useful debugging tools. Being an expert user of a debugger and a logging framework is a sign of professional maturity. So, next time you encounter a bug, select the appropriate tool, go out, and slay it.
Diomidis Spinellis is an associate professor in the Department of Management Science and Technology at the Athens University of Economics and Business and the author of Code Quality: The Open Source Perspective (Addison-Wesley, 2006). Contact him at dds@aueb.gr.