An exception handling revelation
I’ve been working with exceptions offered by languages,
such as Java and Python,
for more than 20 years, invariably as their consumer:
catching them when raised by an API and then doing my thing.
For the systems I worked on, exception handling mostly involved
either quitting the program with an error or re-prompting the user
to fix some input.
Consequently, my view of them was as a fancy error handling mechanism:
syntactic sugar and static enforcement for checking a function’s
successful completion.
Recently,
I refactored
the error handling in
Alexandria3k,
a library and a command-line tool providing efficient relational
query access to diverse publication open data sets.
Through this the full power of exceptions clicked for me.
I suspect that others may share my previously limited appreciation
of exception handling,
so here is a brief description of the refactoring.
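As a rough illustration of the contrast described above, here is a minimal sketch. It is not Alexandria3k's actual code, and every name in it (`DataSourceError`, `parse_record`, `load_records`, `main`) is hypothetical. The lowest layer raises a domain-specific exception, the middle layer carries no error plumbing at all, and only the top layer decides what a failure means.

```python
class DataSourceError(Exception):
    """Hypothetical library-level error; not part of Alexandria3k."""


def parse_record(line):
    """Lowest layer: raise instead of returning an error code."""
    fields = line.split(",")
    if len(fields) != 2:
        raise DataSourceError(f"malformed record: {line!r}")
    return fields[0], int(fields[1])


def load_records(lines):
    """Middle layer: no error checks; exceptions pass straight through."""
    return [parse_record(line) for line in lines]


def main(lines):
    """Top layer (for example a CLI): the single place that handles failure."""
    try:
        records = load_records(lines)
    except DataSourceError as exc:
        return f"error: {exc}"
    return f"loaded {len(records)} records"
```

The point of the sketch is that `load_records` stays free of error-handling code, however many layers sit between the failure and the place that can sensibly react to it.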
Last modified: Monday, February 5, 2024 5:48 pm
Extending the life of TomTom wearables
TomTom recently announced
it would stop operating their supporting infrastructure by the end of September
following its earlier decision
to exit the wearables market.
This means that its products, such as sports watches, will become effectively
useless, as they will no longer be able to export their activities and
sync them with tracker sites.
Throwing away an otherwise fine watch only because its maker decided to
shut down its proprietary infrastructure seems like a sad waste.
Here is how you can download the watch’s data and
upload it to Strava, a popular activity tracker,
using open source software.
Last modified: Friday, November 3, 2023 5:57 pm
Convert file I/O into pipe I/O with /dev/fd
Some Unix commands read data from files or write data to files,
without offering an obvious way to use them as part of a pipeline.
How can you write a program to interact with such a command
in a streaming fashion?
This would allow your program and the command to run concurrently,
without the storage and I/O overhead of a temporary file.
You could create and use a named pipe, but this is a clunky solution,
requiring you to create and destroy a unique underlying file name.
Here’s a better approach.
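One way the idea named in the title might look in practice is sketched below; it is an assumption about the general technique, not a reproduction of the full post. The sketch hands a pipe's read end to a child process and passes `/dev/fd/<n>` as the file name argument, so a command that insists on a file name ends up reading from the pipe. The function name `count_bytes_via_dev_fd` and the choice of `wc -c` as the file-reading command are illustrative only, and the code assumes a Unix-like system that provides `/dev/fd`.

```python
import os
import subprocess


def count_bytes_via_dev_fd(data: bytes) -> int:
    """Stream data to `wc -c`, which is given a *file name* argument."""
    read_fd, write_fd = os.pipe()
    # The child inherits read_fd under the same number, so /dev/fd/<n>
    # names the pipe's read end inside the child process.
    proc = subprocess.Popen(
        ["wc", "-c", f"/dev/fd/{read_fd}"],
        pass_fds=(read_fd,),
        stdout=subprocess.PIPE,
    )
    os.close(read_fd)  # the parent only writes to the pipe
    with os.fdopen(write_fd, "wb") as writer:
        writer.write(data)  # closing the writer signals end-of-file
    out, _ = proc.communicate()
    return int(out.split()[0])
```

No temporary file or named pipe is created; the pipe disappears when both ends are closed.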
Last modified: Saturday, December 14, 2019 3:19 pm
Debugging had to be discovered!
I start my Communications of the ACM article titled
Modern debugging techniques: The art of finding a needle in a haystack
(accessible from this page without a paywall)
with the following remarkable quote.
“As soon as we started programming, […] we found to our surprise that
it wasn’t as easy to get programs right as we had thought it would be.
[…] Debugging had to be discovered.
I can remember the exact instant […] when I realized that a large part of
my life from then on was going to be spent in finding mistakes
in my own programs.”
A Google search for this phrase
returns close to 3000 results, but most of them are cryptically
attributed as
“Maurice Wilkes, discovers debugging, 1949”.
For a scholarly article I knew I had to do better than that.
Last modified: Friday, November 16, 2018 5:38 pm
An Embarrassing Failure
My colleague Georgios Gousios and I are
studying the impact of software engineering research in practice.
As part of our research, we identified award-winning and highly-cited
papers, and asked their authors to complete an online survey.
Each survey was personalized with the author’s name and the
paper’s title and publication venue.
After completing a trial and a pilot run, I decided to contact the
large number of remaining authors.
This is when things started going horribly wrong.
Last modified: Tuesday, October 3, 2017 6:11 pm
Of BOOL and stdbool
The C99 standard has added to the C programming language a
Boolean type, _Bool
and the bool
alias for it.
How well does this type interoperate with the Windows SDK BOOL
type?
The answer is, not at all well,
and here’s the complete story.
Last modified: Tuesday, September 5, 2017 7:50 pm
Debugging in Practice: dgsh Issue 85
Fixing an insidious bug in the new Unix directed graph shell
dgsh
allowed me to demonstrate in practice 10 of the 66
principles, techniques, and tools
I describe in the book Effective Debugging.
Almost all steps are documented in the corresponding
issue and
commits.
Here’s a detailed retrospective.
Last modified: Tuesday, August 15, 2017 12:12 am
Modular SQL Queries with Unit Tests
I’m sure I’m not the only person on earth facing a
complex and expensive analytical processing task.
The one I’ve been working on for the past couple of years
runs on the GHTorrent 98.5 GB data set of
GitHub process data.
It comprises 99 SQL queries (2599 lines of SQL code in total)
and takes more than 20 hours to run on a hefty server.
To make the job’s parts run efficiently and reliably I implemented
simple-rolap,
a bare-bones relational online analytical processing tool suite.
To ensure the queries produce correct results,
I wrote RDBUnit,
a unit testing framework for relational database queries.
Here is a quick overview on how to use the two.
Last modified: Sunday, August 5, 2018 2:01 pm
The Road to Debugging Success
A colleague recently asked me how to debug a Linux embedded system that
crashed in the Unix shell (and only there),
when its memory got filled through the buffer cache.
He added that when he emptied the buffer cache the crash no longer occurred.
Last modified: Thursday, February 16, 2017 10:55 am
First, Do No Harm
Let’s face it: not all software developers are superstar programmers (and, trust me, not all luminary developers program in a sane way). This means that when we maintain existing code, we must be very careful to avoid breaking or degrading the system we work on. Why? Because a failure of a running system can affect operations, people, profits, property, and sometimes even lives. Here are the rules.
Last modified: Thursday, September 25, 2014 10:32 am
The Frictionless Development Environment Scorecard
The environment we work in as developers can make a tremendous difference to our productivity and well-being. I’ve often seen myself get trapped in an unproductive setup through a combination of inertia, sloth, and entropy. Sometimes I put off investing in new, better tools, at other times I avoid the work required to automate a time-consuming process, and, as time goes by, changes in my environment blunt the edge of my setup. I thus occasionally enter into a state where my productivity suffers death by a thousand cuts. I’ve also seen the same situation when working with colleagues: cases where to achieve a simple task they waste considerable time and energy jumping through multiple hoops.
Last modified: Friday, December 6, 2013 2:49 pm
Differential Debugging
If estimating the time needed for implementing some software is difficult, coming up with a figure for the time required to debug it is nigh on impossible. Bugs can lurk in the most obscure corners of the system, or even in the crevices of third-party libraries and components. Ask some developers for a time estimate, and don’t be surprised if an experienced one snaps back, “I’ve found the bug when I’ve found the bug.” Thankfully, there are some tools that allow methodical debugging, thereby giving you a sense of progress and a visible target. A method I’ve come to appreciate over the past few months is differential debugging. Under it, you compare a known good system with the buggy one, working toward the problem source.
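A minimal sketch of the comparison step, assuming both systems can produce a line-by-line trace (for instance, logged calls): the hypothetical helper `first_divergence` reports the first point at which the buggy trace departs from the known good one, which is where the search for the cause can start.

```python
from itertools import zip_longest


def first_divergence(good_trace, bad_trace):
    """Return (index, good_entry, bad_entry) at the first difference, or None.

    A missing entry (one trace is shorter) is reported as None.
    """
    pairs = zip_longest(good_trace, bad_trace)
    for index, (good, bad) in enumerate(pairs):
        if good != bad:
            return index, good, bad
    return None
```

For example, comparing `["open", "read", "close"]` with `["open", "read", "fail"]` points at index 2, narrowing the search to what happens after the read.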
Last modified: Wednesday, September 11, 2013 0:12 am
Portability: Goodies vs. the hair shirt
“I don’t know what the language of the year 2000 will look like, but I know it will be called Fortran”
— Tony Hoare
Last modified: Thursday, July 25, 2013 1:00 pm
Systems Software
Systems software is the low-level infrastructure that applications run on: the operating systems, language runtimes, libraries, databases, application servers, and many other components that churn our bits 24/7. It’s the mother of all code. In contrast to application software, which is constructed to meet specific use cases and business objectives, systems software should be able to serve correctly any reasonable workload. Consequently, it must be extremely reliable and efficient. When it works like that, it’s a mighty tool that lets applications concentrate on meeting their users’ needs. When it doesn’t, the failures are often spectacular. Let’s see how we go about creating such software.
Last modified: Sunday, August 10, 2014 3:32 pm
Systems Code
If I program in many high and low-level languages, but don’t write systems code, I am a quiche programmer or a code monkey. And if my code runs without errors, and I know the complexity of all algorithms; and if my servers have hundreds of cores and gigabytes of RAM, but don’t write systems code, I am nothing. And if I run the hippest kernel, and install the neatest apps, but don’t write systems code, it profiteth me nothing.
Last modified: Thursday, February 21, 2013 4:04 pm
The Importance of Being Declarative
A declarative programming style focuses on what you want your program to do rather than how to perform the task. Through diverse programming techniques, libraries, and specialized languages, you end up with code that sidesteps nitty-gritty implementation details, dealing instead with a task’s big picture.
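To make the what-versus-how distinction concrete, here is a small sketch with made-up data (the `orders` list and both function names are illustrative). The first function spells out how to accumulate a total step by step; the second states what the total is.

```python
# Hypothetical sample data: (item, quantity, unit price)
orders = [("apples", 3, 0.5), ("pears", 0, 0.8), ("plums", 10, 0.2)]


def total_imperative(orders):
    """How: loop, test, and update an accumulator."""
    total = 0.0
    for _name, quantity, price in orders:
        if quantity > 0:
            total += quantity * price
    return total


def total_declarative(orders):
    """What: the sum of quantity times price over the non-empty orders."""
    return sum(quantity * price
               for _name, quantity, price in orders
               if quantity > 0)
```

Both return the same value; the declarative version leaves the iteration details to the language.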
Last modified: Wednesday, January 23, 2013 5:27 pm
APIs, Libraries, and Code
Let’s say you want to display a JPEG-compressed image, calculate Pearson’s correlation coefficient, parse an XML file, or create a key-value store. You can often choose between using the functionality of the application’s platform (Java EE or .NET), calling one of several available external libraries, or writing the code on your own. It isn’t an easy choice because you have many factors to consider. Specifically, you must take into account the task’s complexity, as well as the licensing, quality, and support of competing alternatives. See how you can narrow down your choice by eliminating alternatives at the earliest possible decision point.
Last modified: Wednesday, December 19, 2012 11:44 am
Programming Languages vs. Fat Fingers
A substitution of a comma with a period in project Mercury's working Fortran code compromised the accuracy of the results, rendering them unsuitable for longer orbital missions.
How probable are such events and how does a programming language's design affect their likelihood and severity?
In a paper I recently presented at the
4th Annual International Workshop on Evaluation and Usability of Programming Languages and Tools
I showed results obtained by randomly perturbing similar programs written in
diverse languages to see whether the compiler or run-time system
would detect those changes as errors,
or whether these would end up generating incorrect output.
Last modified: Wednesday, December 5, 2012 10:40 am
How to Calculate an Operation's Memory Consumption
How can you determine how much memory is consumed by a specific
operation of a Unix program?
Valgrind's Massif subsystem could help you in this regard,
but it can be difficult to isolate a specific operation from
Massif's output.
Here is another, simpler way.
Last modified: Saturday, September 22, 2012 5:46 pm
Refactoring on the Cheap
The refactorings that a good integrated development environment can perform are impressive. Yet, there are many reasons to master some cheap-and-cheerful alternative approaches. First, there will always be refactorings that your IDE won’t support. Also, although your IDE might offer excellent refactoring support for some programming languages, it could fall short on others. Modern projects increasingly mix and match implementation languages, and switching to a specialized IDE for each language is burdensome and inefficient. Finally, IDE-provided refactorings resemble an intellectual straitjacket. If you only know how to use the ready-made refactorings, you’ll miss out on opportunities for other code improvements.
Last modified: Wednesday, January 11, 2012 5:23 pm
Faking it
This column is about a tool we no longer have: the continuous rise of the CPU clock frequency. We were enjoying this trend for decades, but in the past few years, progress stalled. CPUs are no longer getting faster because their makers can’t handle the heat of faster-switching transistors. Furthermore, increasing the CPU’s sophistication to execute our instructions more cleverly has hit the law of diminishing returns. Consequently, CPU manufacturers now package the constantly increasing number of transistors they can fit onto a chip into multiple cores—processing elements—and then ask us developers to put the cores to good use.
Last modified: Sunday, August 5, 2018 2:04 pm
How I Dealt with Student Plagiarism
Panos Ipeirotis,
a colleague at the
NYU Stern School of Business,
received considerable media attention when,
in a blog post he subsequently removed,
he discussed how his aggressive use of plagiarism detection software
on student assignments poisoned the classroom atmosphere and
tanked his teaching evaluations.
As detailed in
a story posted on the Chronicle of Higher Education blog,
Mr. Ipeirotis proposes instead that professors should design assignments that
cannot be plagiarized.
Along these lines here are two methods I've used in the past.
Last modified: Saturday, July 23, 2011 6:35 pm
Code Verification Scripts
Which of my classes contain instance variables?
Which classes call the method userGet,
but don't call the method userRegister?
These and similar questions often come up when you want to verify
that your code is free from some errors.
For example, instance variables can be a problem in servlet classes.
Or you may have found a bug related to the
userGet and userRegister methods,
and you want to look for other places where this occurs.
Your IDE is unlikely to answer such questions,
and this is where a few lines in the Unix shell can save
you hours of frustration.
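The post itself relies on the Unix shell. As a stand-in, the following Python sketch answers the second question under a simplifying assumption: a class "calls" a method if its source text contains the method name followed by an opening parenthesis. The helper names `calls` and `get_without_register`, and the idea of passing sources as a dictionary, are illustrative.

```python
import re


def calls(source, method):
    """True if the source text appears to call the given method."""
    pattern = r"\b" + re.escape(method) + r"\s*\("
    return re.search(pattern, source) is not None


def get_without_register(sources):
    """Names of files that call userGet but never call userRegister.

    `sources` maps a file name to that file's source text.
    """
    return sorted(
        name
        for name, text in sources.items()
        if calls(text, "userGet") and not calls(text, "userRegister")
    )
```

A textual match like this can be fooled by comments and strings, which is usually an acceptable trade-off for a quick verification pass.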
Last modified: Saturday, May 21, 2011 9:40 pm
Choosing and Using Open Source Components
The developers of the SQLite open source database engine estimate that it’s deployed in roughly half a billion systems around the world (users include Airbus, Google, and Skype). Think of the hundreds of thousands of open source components, just one click away from you. If you know how to choose and use them effectively, your project can benefit mightily.
Last modified: Sunday, May 1, 2011 10:05 pm
elytS edoC
Sure, you can write English right to left. You can also write software code to look like a disc or even a train (see www.ioccc.org/1988/westley.c and 1986/marshall.c). However, you can’t then complain when you have to fight with your magazine’s editor or production staff about accepting your column’s title for publication, or if your colleagues refuse to touch your code with a 10-foot pole. Writing code in a readable and consistent style is difficult, uninteresting, tedious, underappreciated, and extremely important.
Last modified: Sunday, February 27, 2011 7:49 pm
Farewell to Disks
A classic web-comic illustrates how idle Wikipedia browsing can lead us from the Tacoma Narrows Bridge to Fatal hilarity (and worse). The comic doesn’t show the path leading from A to B, and finding it is an interesting challenge—think how you would engineer a system that could answer such questions. I believe that this problem and a solution I’ll present demonstrate some programming tools and techniques that will become increasingly important in the years to come.
Last modified: Saturday, October 30, 2010 8:37 pm
Sane vim Editing of Unicode Files
Being able to use plain alphabetic keys as editing commands
is for many of us a great strength of the vi editor.
It allows us to edit without hunting for the placement of
the various movement keys on each particular keyboard,
and, most of the time,
without having to juggle in order to combine particular keys with
ctrl or alt.
However, this advantage can turn into a curse when editing files
using a non-ASCII keyboard layout.
When the keyboard input method is switched to another script
(Greek in my case, or, say, Cyrillic for others)
vi will stop responding to its normal commands, because it will
encounter unknown characters.
Here is how I've dealt with this problem.
Last modified: Tuesday, August 24, 2010 1:24 am
Code Documentation
Technical prose is almost immortal.
— Frederick P. Brooks, Jr.
Last modified: Sunday, July 11, 2010 1:32 pm
Software Tracks
A generous car reviewer might praise a vehicle’s handling by writing that it turns as if it’s running on railroad tracks. Indeed, tracks offer guidance and support. When you run on tracks you can carry more weight, you can run faster, and you can’t get lost. That’s why engineers, from early childhood to old age, get hooked on trains. Can we get our software to run on tracks?
Last modified: Thursday, March 4, 2010 12:48 am
Applied Code Reading: Debugging FreeBSD Regex
When the code we're trying to
read is inscrutable,
inserting print statements and running various test cases can be
two invaluable tools.
Earlier today I fixed
a tricky problem in the FreeBSD regular expression library.
The code,
originally written by Henry Spencer in the early 1990s,
is by far the most complex I've ever encountered.
It implements sophisticated algorithms with minimal commenting.
Also, to avoid code repetition and increase efficiency,
the 1200 line long main part of the regular expression execution engine is
included in the compiled C code
three times after modifying various macros to adjust the code's behavior:
the first time the code targets small expressions and operates
with bit masks on long integers,
the second time the code handles larger expressions
by storing its data in arrays,
and the third time the code is also adjusted to handle multibyte characters.
Here is how I used test data and print statements to locate and fix the problem.
Last modified: Wednesday, September 16, 2009 9:44 am
Job Security
My colleague, who works for a major equipment vendor, was discussing how his employer was planning to lay off hundreds of developers over the coming months. “But I’m safe,” he said, “as I’m one of the two people in our group who really understand the code.” It seems that writing code that nobody else can comprehend can be a significant job security booster. Here’s some advice.
Last modified: Wednesday, September 2, 2009 3:35 pm
Applied Code Reading: GNU Plotutils
Robert, a UMLGraph user, sent me an email
describing a problem with the
GNU plotutils
SVG output on Firefox.
I firmly believe that
code reading is a lot
easier than many think:
one can easily fix most software problems without detailed knowledge
of the underlying system.
I therefore decided to practice what I preach.
Last modified: Tuesday, August 11, 2009 4:40 pm
A Tiny Review of Scala
Earlier today I finished reading the
Programming in Scala book.
My review of the book should appear soon in the
reviews.com site and the
ACM Computing Reviews.
Here I outline briefly my view of the
Scala language.
Last modified: Wednesday, July 22, 2009 7:32 pm
Fixing the Orientation of JPEG Photographs
I used to fix the orientation of my photographs through an application
that would transpose the compressed JPEG blocks.
This had the advantage of avoiding the image degradation of a
decompression and a subsequent compression.
Last modified: Sunday, June 14, 2009 8:20 pm
A Tiling Demo
Over the past (too many) days I've been preparing my presentation for the
ACCU 2009
conference.
At one point I wanted to show how loop tiling increases locality of reference
and therefore cache hits.
Surprisingly, I could not find a demo on the web, so I built one from scratch.
Here are two applets demonstrating the memory accesses made while raising
a matrix to the power of two.
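For readers unfamiliar with the technique, here is a minimal sketch of loop tiling applied to squaring a matrix. It is not the code behind the applets; the function names and the default tile size are illustrative. Python lists will not show the cache effect, but the loop structure makes the access pattern visible: the tiled version finishes all work on one small block of the matrix before moving to the next, so the same elements are reused while they are still likely to be cached.

```python
def square_naive(a):
    """Plain triple loop: walks whole rows and columns for every element."""
    n = len(a)
    return [[sum(a[i][k] * a[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def square_tiled(a, tile=2):
    """Same product, computed one tile-by-tile block at a time."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for kk in range(0, n, tile):
            for jj in range(0, n, tile):
                # Work confined to a small block of a and c.
                for i in range(ii, min(ii + tile, n)):
                    for k in range(kk, min(kk + tile, n)):
                        a_ik = a[i][k]
                        for j in range(jj, min(jj + tile, n)):
                            c[i][j] += a_ik * a[k][j]
    return c
```

Both functions produce the same result; only the order of memory accesses differs.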
Last modified: Tuesday, April 21, 2009 5:39 pm
Precision in Comments
As I was writing some code for the
CScout
refactoring browser today,
I reflected on the importance of writing precise and clear comments.
Last modified: Wednesday, April 8, 2009 12:18 am
Start With the Most Difficult Part
There’s not a lot you can change in the process of constructing a building. You must lay the foundation before you erect the upper floors, and you can’t paint without having the walls in place. In software, we’re blessed with more freedom.
Last modified: Wednesday, February 25, 2009 1:58 pm
The Information Train
The Information Train is a scientific
experiment that I presented at the
Wizards of Science 2009 contest over the past weekend.
The entry demonstrates how computers communicate with each other by
setting up a network in which a model train transfers a picture's pixels
from one computer to the other.
You can find
a video of the experiment
on YouTube, and, if you're interested, you can also download
the corresponding software and schematics from
this web page.
Last modified: Wednesday, February 18, 2009 3:21 pm
Beautiful Architecture
What are the ingredients of robust, elegant, flexible, and maintainable software architecture?
Over the past couple of years, my colleague
Georgios Gousios
and I worked
on answering this question through a collection of intriguing essays
from more than a dozen of today's leading software designers and architects.
Last modified: Wednesday, February 4, 2009 12:48 am
The World's Smallest Domain-Specific Language
Domain-specific languages, also known as little languages, allow us
to express knowledge in a form close to the problem at hand.
In contrast to general-purpose languages, like Java or C++,
they are specialized for a narrow domain.
Earlier today I wanted to initialize a rectangular array of Boolean
values to represent the stick figure of a human.
For that I devised a tiny domain-specific language (DSL) consisting of
two symbols (representing an on and an off pixel) and wrote its
commensurably simple interpreter.
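A sketch of what such a two-symbol language and its interpreter might look like follows. The symbols `X` and `.`, the function name `parse_bitmap`, and the particular figure are assumptions made for illustration, not the post's actual choices.

```python
def parse_bitmap(rows, on="X", off="."):
    """Interpret rows of on/off symbols as a rectangular Boolean array."""
    width = len(rows[0])
    bitmap = []
    for row in rows:
        if len(row) != width:
            raise ValueError("all rows must have the same length")
        if any(symbol not in (on, off) for symbol in row):
            raise ValueError(f"unexpected symbol in row {row!r}")
        bitmap.append([symbol == on for symbol in row])
    return bitmap


# The "program": a picture of the data it describes.
STICK_FIGURE = parse_bitmap([
    "..X..",
    ".XXX.",
    "..X..",
    ".X.X.",
])
```

The appeal of the approach is that the source text looks like the figure, so mistakes in the data are visible at a glance.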
Last modified: Tuesday, February 3, 2009 12:04 am
A Well-Tempered Pipeline
I am studying the use of open source software in industry.
One way to obtain empirical data is to look at the operating systems and
browsers used by the Fortune 1000 companies by examining browser logs.
I obtained a list of the Fortune 1000 domains and wrote a pipeline
to summarize results by going through this site's access logs.
Last modified: Sunday, January 25, 2009 7:01 pm
The Value of Computing Paradigm Diversity
Today I wrote a combinatorial optimization algorithm to match members of
pair programming
teams according to the psychological traits of each pair's members.
The program appeared to rearrange the initial random allocation of pairs
in a way that might match my specifications.
However, as I'll use this allocation for an experiment that I'll be able
to perform only once, I realized that I wanted to carefully verify the results.
How does one verify the operation of such a program?
Last modified: Friday, November 7, 2008 5:03 pm
A Look at Zero-Defect Code
The US
National Security Agency
has released a case study showing how to
develop zero-defect code in a cost-effective manner.
The researchers of the project conclude that,
if adopted widely, the practices advocated in the case study
could help make commercial software programs more reliable and less vulnerable.
I examined a small part of the case study's code, and was not impressed.
Last modified: Saturday, October 18, 2008 1:39 pm
Suspend Windows from the Command Line
I used to leave my computer up all night, but I've come to realize that this
is ecologically unsound.
Now I suspend it before going to sleep, but this meant missing
a daily job that used to run at 3:00 am.
The job marks my students' exercises and sends me email with the next day's
appointments.
I thus decided to schedule the task to wake up my computer at 3:00 am,
run the job, and then suspend it again.
The Windows scheduler allows you to specify a wakeup option,
but not a subsequent suspend.
Furthermore, it seems that Windows lacks a way to suspend from the
command line (while maintaining the ability to hibernate), and the
only free tools on the web are distributed in executable form,
so I ended up writing a small tool myself.
Last modified: Monday, October 6, 2008 7:25 pm
Web Services Come of Age
For years I've reacted to the hype surrounding web services with skepticism.
I found SOAP, WSDL, and UDDI to be too complex and brittle for wide deployment,
and I also wondered what types of services could be better provided over the
web rather than locally.
A new excellent developer site,
Stack Overflow,
answers both of my concerns.
Last modified: Wednesday, September 24, 2008 10:47 am
Saving the Editor's History
I recently spent a few days writing some tricky bit-twiddling code to
implement a radix tree.
I found myself making many programming mistakes, and I thought it would be
interesting to study them, examine their contributing factors, and
think how each of them could be prevented.
Last modified: Monday, August 25, 2008 4:32 pm
The Way We Program
If the code and the comments disagree, then both are probably wrong.
Last modified: Thursday, June 26, 2008 12:40 am
Software Builders
The tools and processes we use to transform our system’s source code into an application we can deploy or ship were always important, but nowadays they can mean the difference between success and failure. The reasons are simple: larger code bodies, teams that are bigger, more fluid, and more widely distributed, richer interactions with other code, and sophisticated tool chains. All these mean that a slapdash software build process will be an endless drain on productivity and an embarrassing source of bugs, while a high-quality one will give us developers more time and traction to build better software.
Last modified: Tuesday, May 6, 2008 12:04 am
Assigning Responsibility
Over the past few days I worked over a large code body correcting various
accumulated errors and style digressions.
When I finished I wanted to see who wrote the original lines.
(It turned out I was not entirely innocent.)
Last modified: Sunday, April 20, 2008 9:34 pm
A Minute Minute Minder
Today I delivered the opening
keynote address
at the 4th Panhellenic Conference on Computer Science Education.
For a number of reasons (more on that later) I wanted to keep track of
my progress during the presentation.
For this I put together a minute minder that displayed the
time from the presentation's start and the slide I should be in.
I could thus adjust my pace to finish as planned.
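The core calculation of such a minder can be sketched as follows, under the simplifying assumption (mine, not necessarily the original tool's) that every slide gets an equal share of the talk. The function name `expected_slide` is hypothetical.

```python
def expected_slide(elapsed_seconds, total_seconds, total_slides):
    """Slide number (1-based) the speaker should be on at the given time.

    Assumes each slide is allotted the same amount of time.
    """
    if elapsed_seconds <= 0:
        return 1
    if elapsed_seconds >= total_seconds:
        return total_slides
    return int(elapsed_seconds * total_slides / total_seconds) + 1
```

Comparing this number with the slide actually on screen tells the speaker whether to speed up or slow down.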
Last modified: Saturday, March 29, 2008 6:22 am
Using and Abusing XML
Words are like leaves; and where they most abound,
Much fruit of sense beneath is rarely found.
Last modified: Friday, May 2, 2008 11:11 am
The Mysterious TreeMap Type Signature
For my lecture notes on file handling
I wrote a small Java program to display the number of characters
that fall in each
Unicode block,
and got bitten by an unexpected
runtime error.
Angelika Langer,
a wizard of Java Generics, kindly provided me with an explanation
of the JDK design,
which I'd like to share.
Last modified: Wednesday, January 23, 2008 11:34 am
Rational Metaprogramming
Metaprogramming, using programs to manipulate other programs, is as old as programming. From self-modifying machine code in early computers to expressions involving partially applied functions in modern functional-programming languages, metaprogramming is an essential part of an advanced programmer’s arsenal.
Last modified: Sunday, January 13, 2008 10:52 am
The Relativity of Performance Improvements
Today, after receiving a 1.7MB daily security log message containing
thousands of ssh failed login attempts from bots around the
world, I decided I had had enough.
I enabled IPFW on a FreeBSD system I maintain, and added a script
to find and block the offending IP addresses.
In the process I improved the script's performance.
The results of the improvement were unintuitive.
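The "find" half of such a script could be sketched as below. This is not the script from the post: the function name, the threshold, and the assumed log line shape (`Failed password for ... from <IPv4 address> port ...`) are illustrative, and feeding the resulting addresses to the firewall is left out.

```python
import re
from collections import Counter

# Assumed shape of an sshd failure line; adjust to the actual log format.
FAILED_LOGIN = re.compile(
    r"Failed password for .* from (\d+\.\d+\.\d+\.\d+) port")


def offending_ips(log_lines, threshold=3):
    """IPv4 addresses with at least `threshold` failed ssh logins."""
    counts = Counter()
    for line in log_lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match.group(1)] += 1
    return sorted(ip for ip, n in counts.items() if n >= threshold)
```

A single pass with a counter keeps the work proportional to the size of the log, whatever the number of distinct addresses.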
Last modified: Monday, January 7, 2008 10:58 am
Curing MIDlet Bluetooth Disconnects
Over the last few days I've been writing a
MIDlet
to collect GPS coordinates and cell identifiers.
I'm doing this
in an effort to look at what algorithms might be needed
in order to implement something similar to Google's
My Location
service.
Here is a Google Earth example of the data I'm collecting.
Yesterday,
I reached a point where I was collecting all the information I needed,
but the program was often plagued by random disconnections of the Bluetooth
link to the GPS.
Last modified: Friday, January 4, 2008 1:13 pm
Many Ways to Skin a Window
Every couple of years,
users of a Microsoft Windows application I wrote a long time ago
start complaining that the application crashes when they exit from it.
Every time it turns out that the reason is a Windows message that tells
the application's main window to close
in a way that was not originally foreseen.
Last modified: Thursday, December 13, 2007 9:15 pm
On Paper
A box of crayons and a big sheet of paper provides a more expressive medium for kids than computerized paint programs.
— Clifford Stoll
Last modified: Saturday, November 10, 2007 8:10 pm
A Programmer's Bookshelf
A first year student at a nearby university wrote to me asking for
advice on becoming a hacker
(according to ESR's
definition, he clarified).
He sent me a laundry-list of 18 programming languages he aimed to learn
by the time he graduated, and asked for other recommendations.
I've learned a lot from reading books,
so I compiled two reading lists for him.
Continue reading "A Programmer's Bookshelf"Last modified: Thursday, September 27, 2007 10:49 am
Abstraction and Variation
“Master, a friend told me today that I should never use the editor’s copy-paste functions when programming,” said the young apprentice. “I thought the whole point of programming tools was to make our lives easier,” he continued.
The Master stroked his long grey beard and pressed the busy button on his phone. This was going to be one of those long, important discussions.
Continue reading "Abstraction and Variation"Last modified: Sunday, September 2, 2007 12:02 am
Palindromic Palindrome Checking
Stan Kelly-Bootle's column in the April 2007
ACM Queue, titled
Ode or Code? — Programmers Be Mused!,
was as always very enjoyable.
However, I found its ending,
a C function that returns true when given a palindromic string
(e.g. ABCCBA), anticlimactic.
The function given is recursive; I was expecting it to be palindromic.
How difficult can it be to write such a function?
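For reference, a conventional checker can be sketched in C as follows. This is neither the column's recursive function nor a palindromic one; it is just a plain two-pointer baseline, and the name `is_palindrome` is an assumption made for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if s reads the same forwards and backwards. */
bool is_palindrome(const char *s)
{
    size_t len = strlen(s);

    if (len == 0)
        return true;            /* The empty string is trivially palindromic */

    const char *left = s;
    const char *right = s + len - 1;

    /* Walk inwards from both ends until the pointers meet */
    while (left < right) {
        if (*left++ != *right--)
            return false;
    }
    return true;
}
```

Calling `is_palindrome("ABCCBA")` returns true, while a string such as `"ABCCBX"` yields false.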
Continue reading "Palindromic Palindrome Checking"Last modified: Wednesday, June 6, 2007 6:43 pm
Using the Open-Sourced Java Platform
Having access to a system's source code is liberating.
I've felt this since I first laid my eyes on the source code of the
9th Edition Unix in 1988, and I saw this again as I used the freshly
open-sourced Java platform
to implement a UMLGraph
feature that has been bugging me for more than a month.
Continue reading "Using the Open-Sourced Java Platform"Last modified: Thursday, May 10, 2007 5:23 pm
I Spy
Knowledge is power.
—Sir Francis Bacon
Continue reading "I Spy"Last modified: Monday, April 9, 2007 9:54 pm
Software Development Productivity Award
Yesterday, at the
17th annual Jolt Product Excellence and Productivity Awards
my book
Code Quality: The Open Source Perspective won a Software Development Productivity Award
in the Technical Books category.
Continue reading "Software Development Productivity Award"Last modified: Friday, March 23, 2007 11:13 am
Software Rejuvenation is Counterproductive
In the February issue of the Computer magazine
Grottke and Trivedi propose four strategies for
fighting bugs that are difficult to detect and reproduce.
Retrying an
operation and replicating software are indeed time-honored and practical
solutions. When coupled with appropriate logging, they may allow an
application to continue functioning, while also alerting its maintainers
that something is amiss. On the other hand, the proposal to restart
applications at regular intervals (rejuvenation as the authors call
it), doesn't allow us to find latent bugs, sweeping them instead under
the carpet. This lowers the bar on the quality we expect from software,
and will doubtless result in a higher density of bugs and increasingly
complicated failure modes.
Continue reading "Software Rejuvenation is Counterproductive"Last modified: Friday, March 9, 2007 2:37 pm
A Peek at Beautiful Code
An exciting new book is about to hit the shelves,
and I consider myself very lucky to be among its contributors.
Beautiful Code,
subtitled "leading programmers explain how they think",
contains 33 chapters where contributors describe some code
they consider noteworthy.
Although I don't consider myself worthy of the book's subtitle,
I love coding, and
I'm extremely happy that code is taking the leading role among such an
illustrious cast.
Here is the complete table of the book's contents.
Continue reading "A Peek at Beautiful Code"Last modified: Tuesday, February 27, 2007 7:45 pm
The Escape of a Small Program
C. A. R. Hoare's
Law of Large Programs states that
inside every large program is a small program struggling to get out.
The parking receipt I got yesterday returning from a
SQO-OSS meeting proves this fact.
Continue reading "The Escape of a Small Program"Last modified: Thursday, December 21, 2006 9:59 am
The Return of Performance Engineering and Trendy Programmers
In the 1950s, when processor cycle times were measured in microseconds,
algorithm design and clever programming could make or break an application.
These fields continued to be popular in the 1960s and 1970s, because
widespread computers were used to attack ever larger problems.
Programming was a hip and trendy occupation.
Today's $500 computers operating on GHz clocks allow anybody who has
(just about) mastered the syntax of a programming language to write
code that drives dynamic web sites serving hundreds of transactions each
minute.
Managers consider code a commodity, and enrollments to computer science
degrees are dwindling.
However, change is in the air.
Continue reading "The Return of Performance Engineering and Trendy Programmers"Last modified: Friday, November 3, 2006 8:43 pm
Research in Domain Specific Languages
My research colleague
Vassilis Karakoidas
is working on better programming support for domain specific languages (DSLs).
Today he claimed that DSLs were hyped during 1998-2002,
and now interest has waned.
Continue reading "Research in Domain Specific Languages"Last modified: Friday, October 13, 2006 1:32 pm
Code Finessing
When I set out to apply CScout
on the Linux kernel source code, I
discovered that it failed to correctly expand a couple of C macros,
causing the analysis to fail. This prompted me to reimplement CScout's
macro expansion using a
precise functional specification,
then optimize
the code's severe degradation in time performance, and finally tidy up
the optimized code mess.
Continue reading "Code Finessing"Last modified: Friday, October 6, 2006 9:44 am
Cross Compiling
Cross compiling software on a host platform to run on a different
target used to be an exotic stunt to be performed by
the brave and desperate.
One had first to configure and build the compiler, assembler, archiver,
and linker for the different architecture, then cross-build the other
architecture's libraries, and finally the software.
This week, while preparing a new release of the
CScout refactoring browser
I realized that what was once a feat is nowadays a routine operation.
Continue reading "Cross Compiling"Last modified: Saturday, September 30, 2006 10:32 pm
Choosing a Collection: A Discussion with Kent Beck
Recently I reviewed the manuscript of Kent Beck's upcoming
book Implementation Patterns.
I will certainly put it in the list of books any professional programmer
should read.
When discussing collections (containers in C++ STL parlance),
Kent mentions that
his overall strategy for performance coding with collections is to use the
simplest possible implementation at first and pick a more specialized collection
class when it becomes necessary.
My view is that
we should choose the most efficient implementation from the start.
With prepackaged collections this doesn't have any cost associated with
it, and it avoids nasty surprises when a dataset increases beyond the
size the programmer envisaged.
I added a comment to that effect in my review, and later I sent him
an email with a supporting citation, which
kindled an interesting exchange.
I reproduce our email exchange here, with his permission.
Continue reading "Choosing a Collection: A Discussion with Kent Beck"Last modified: Wednesday, September 27, 2006 3:36 pm
The Verbosity of Object-Oriented Code
As I refactored a piece of code from an imperative to an
object-oriented style I increased its clarity and reusability,
but I also tripled its size.
This worries me.
Continue reading "The Verbosity of Object-Oriented Code"Last modified: Monday, September 25, 2006 0:32 am
UML Class Diagrams from C++ Code
I needed a UML class diagram of the classes I use in the implementation of
the CScout refactoring browser.
I drew the last such diagram on paper about four years ago, so it was
definitely out of date.
I always say that whenever possible documentation should be automatically
generated from the code, so I decided to automate the task.
Continue reading "UML Class Diagrams from C++ Code"Last modified: Thursday, September 21, 2006 11:49 am
Open Source and Professional Advancement
Doing really first-class work, and knowing it, is as good as wine, women (or men) and song put together.
— Richard Hamming
Continue reading "Open Source and Professional Advancement"Last modified: Friday, December 15, 2006 11:32 am
Choosing a Programming Language
A language that doesn't have everything is actually easier to program in than some that do.
— Dennis M. Ritchie
Continue reading "Choosing a Programming Language"Last modified: Friday, December 15, 2006 11:32 am
Debuggers and Logging Frameworks
As soon as we started programming, we found to our surprise that it wasn't as easy to get programs right as we had thought. Debugging had to be discovered.
— Maurice Wilkes discovers debugging, 1949
Continue reading "Debuggers and Logging Frameworks"Last modified: Friday, December 15, 2006 9:15 am
Xerces v Flex
What is the fastest way to process an XML file?
I was faced with this question when I recently wanted to
process a 452GiB XML file; for this amount of data speed matters.
Some obvious choices were XML libraries, hand-crafted code, and
lexical analyzer generators.
Continue reading "Xerces v Flex"Last modified: Thursday, September 22, 2016 9:48 am
Code Quality: The Open Source Perspective
My new book
Code Quality: The Open Source Perspective
got published,
three years after I started writing it.
The book owes more to open source software than any of the books
dealing with Linux, PHP, Apache, Perl or any other book covering
a specific technology.
Continue reading "Code Quality: The Open Source Perspective"Last modified: Wednesday, April 12, 2006 12:05 am
Efficiency Will Always Matter
Many claim that today's fast CPUs and large memory capacities make
time-proven technologies that efficiently harness a computer's power irrelevant.
I beg to differ, and my experience in the last three days demonstrated
that technologies that originated in the 70s still have their place today.
Continue reading "Efficiency Will Always Matter"Last modified: Monday, April 3, 2006 0:42 am
Bug Busters
Although only a few may originate a policy, we are all able to judge it.
— Pericles of Athens
Continue reading "Bug Busters"Last modified: Friday, December 15, 2006 11:32 am
A General-Purpose Swap Macro
A couple of days ago I came up with a general-purpose macro for swapping
values in C programs.
My colleague Panagiotis Louridas suggested an improvement, and
this prompted me to see how the two macros got compiled.
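As an illustration of what such a macro can look like, here is a minimal sketch that swaps two lvalues of the same type through a temporary byte buffer. It is an assumption made for illustration, not necessarily either of the two macros discussed in the post; the names `SWAP` and `swap_tmp` are hypothetical.

```c
#include <string.h>

/*
 * Swap two lvalues of the same type by copying their bytes
 * through a temporary buffer.
 * Hypothetical sketch: it assumes sizeof(a) == sizeof(b) and
 * that neither argument is named swap_tmp.
 */
#define SWAP(a, b) do {                     \
    unsigned char swap_tmp[sizeof(a)];      \
    memcpy(swap_tmp, &(a), sizeof(a));      \
    memcpy(&(a), &(b), sizeof(a));          \
    memcpy(&(b), swap_tmp, sizeof(a));      \
} while (0)
```

The `do { ... } while (0)` wrapper lets the macro be used like a single statement, for example as the body of an `if` without braces.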
Continue reading "A General-Purpose Swap Macro"Last modified: Monday, January 30, 2006 10:37 am
If STL Had Been Designed by a Committee
I've been reading on XML schema, and it's embarrassingly obvious
that it has been designed by a committee.
Continue reading "If STL Had Been Designed by a Committee"Last modified: Wednesday, December 7, 2005 2:40 pm
How to Sort Three Numbers
Quick: how do you sort three numbers in ascending order?
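One possible answer, sketched in C under the assumption that the numbers are integers, uses three compare-and-exchange steps. The function names `order` and `sort3` are made up for this illustration and this is not necessarily the approach the post settles on.

```c
/* Exchange *a and *b when they are out of order. */
static void order(int *a, int *b)
{
    if (*a > *b) {
        int tmp = *a;
        *a = *b;
        *b = tmp;
    }
}

/* Sort three integers in ascending order using three comparisons. */
void sort3(int *a, int *b, int *c)
{
    order(a, b);    /* Now *a <= *b */
    order(b, c);    /* Now *c holds the largest value */
    order(a, b);    /* Restore *a <= *b among the remaining two */
}
```

After `sort3(&x, &y, &z)` the three variables satisfy `x <= y <= z`.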
Continue reading "How to Sort Three Numbers"Last modified: Thursday, November 17, 2005 10:01 am
Supporting Java's Foreach Construct
Java 1.5 supports a new
foreach
construct for iterating over collections.
The construct can be used on arrays and on all classes in Java's Collection
framework.
I searched the internet for an example on how to make my own
classes iterable with this construct, but could not find an example.
Continue reading "Supporting Java's Foreach Construct"Last modified: Sunday, November 13, 2005 10:27 pm
C++0X Enhancement: Rational Metaprogramming
In a recent article
Bjarne Stroustrup
presented the evolution of C++ toward the 0X standard, and asked the C++
community for ideas regarding C++ enhancements.
This is a proposal to add to C++ support for rational metaprogramming.
Continue reading "C++0X Enhancement: Rational Metaprogramming"Last modified: Wednesday, July 20, 2005 1:19 pm
GCC Obfuscated Code
For years I've struggled to understand the
GNU compiler collection internals,
I am ashamed to say, without much success.
I always thought that the subject was intrinsically too complicated
for me, but after struggling to understand a two line gcc
code snippet of a fairly simple operation for more than two minutes,
I realized that the code style may have something to do with my problems.
Continue reading "GCC Obfuscated Code"Last modified: Sunday, July 17, 2005 1:11 pm
C++0X Enhancement: Packaged Libraries
In a recent article
Bjarne Stroustrup
presented the evolution of C++ toward the 0X standard, and asked the C++
community for ideas regarding C++ enhancements.
This is a proposal to add to C++ support for using packaged libraries,
and standardizing a library distribution format.
Continue reading "C++0X Enhancement: Packaged Libraries"Last modified: Wednesday, July 20, 2005 1:19 pm
Tool Writing: A Forgotten Art?
Merely adding features does not make it easier for users to do things—it just makes the manual thicker. The right solution in the right place is always more effective than haphazard hacking.
— Brian W. Kernighan and Rob Pike
Continue reading "Tool Writing: A Forgotten Art?"Last modified: Tuesday, December 12, 2006 8:20 pm
XML Abstraction at the Wrong Level
Over the last month I've encountered two applications
that use XML at the wrong level of abstraction.
Instead of tailoring the schema to their needs, they
use a very abstract schema, and encode their elements
at a meta level within the XML data.
This approach hinders the verification and manipulation of the corresponding
XML files.
Continue reading "XML Abstraction at the Wrong Level"Last modified: Thursday, June 23, 2005 11:52 am
Today's Dynamic is Tomorrow's Static
Today at the IEEE Software's
editorial and advisory board
meeting, the issue of service-oriented architectures came up.
Robert Glass wondered whether this was the upcoming fad,
following structured programming and object-oriented programming,
to which Stan Rifkin replied that service-oriented architectures
are a lot more dynamic.
Interestingly, the previous approaches, which we today consider as
static, were also thought of as dynamic in their day.
Continue reading "Today's Dynamic is Tomorrow's Static"Last modified: Thursday, May 26, 2005 8:04 am
Warum einfach, wenns auch kompliziert geht?
(Why make it simple, when you can also make it complicated?)
Consider the task of associating code with specific data
values.
Using a multi-way conditional can be error-prone, because
the data values become separated by the code.
It can also be inefficient in the cases where we have to use cascading
else if statements, instead of a switch,
which the compiler can optimize into a hash table.
In C I would use an array containing values and function pointers.
My understanding is that the Java approach involves using the
Strategy pattern: a separate class for each case,
and an interface "to rule them all".
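The C approach mentioned above can be sketched as follows. The handlers, the keys, and the name `dispatch_value` are hypothetical, chosen only to show each data value sitting next to the code associated with it.

```c
#include <stddef.h>

/* Hypothetical handlers, one per data value. */
static int handle_add(int x) { return x + 1; }
static int handle_double(int x) { return x * 2; }
static int handle_negate(int x) { return -x; }

/* Each row keeps a data value next to the code associated with it. */
static const struct {
    char key;
    int (*handler)(int);
} dispatch[] = {
    { '+', handle_add },
    { '*', handle_double },
    { '-', handle_negate },
};

/* Look up key and run its handler on x; return fallback if key is unknown. */
int dispatch_value(char key, int x, int fallback)
{
    for (size_t i = 0; i < sizeof(dispatch) / sizeof(dispatch[0]); i++)
        if (dispatch[i].key == key)
            return dispatch[i].handler(x);
    return fallback;
}
```

Adding a new case means adding one row to the table, so the value and its code cannot drift apart. The linear search shown here is fine for a handful of entries; a larger table could be sorted and searched with `bsearch`.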
Continue reading "Warum einfach, wenns auch kompliziert geht?"Last modified: Friday, May 13, 2005 9:54 am
Ordnung muss sein
A free-form translation of the above German phrase (orderliness must exist)
would be that orderliness is not negotiable.
In the domain of information technology I find this motto particularly
pertinent.
Continue reading "Ordnung muss sein"Last modified: Wednesday, May 11, 2005 4:25 pm
Java Makes Scripting Languages Irrelevant?
Simplicity does not precede complexity, but follows it.
— Alan J. Perlis
Continue reading "Java Makes Scripting Languages Irrelevant?"Last modified: Tuesday, December 12, 2006 8:20 pm
The Efficiency of Java and C++, Revisited
A number of people worked on replicating the results and optimizing
the programs I listed in my earlier blog entry.
Continue reading "The Efficiency of Java and C++, Revisited"Last modified: Tuesday, February 15, 2005 8:42 am
Macro-based Substitutions in Source Code
A friend asks:
"How can one easily replace a method call (which can contain
arguments with brackets in its invocation code) with a simple
field access?"
Continue reading "Macro-based Substitutions in Source Code"Last modified: Tuesday, February 8, 2005 12:50 am
Measuring the Effect of Shared Objects
For the Code Quality
book I am writing I wanted to measure the memory savings of
shared libraries.
On a lightly loaded web server these amounted to 80MB,
on a more heavily loaded shell access machine these amounted
to 300MB.
Continue reading "Measuring the Effect of Shared Objects"Last modified: Saturday, December 11, 2004 8:54 pm
Code Reading Example: the Linux Kernel Load Calculation
A colleague's Linux machine was exhibiting a very high load value,
for no obvious reason.
I wanted to have him point the kernel debugger at the routine calculating
the load.
It has been more than 7 years since the last time I worked on a Linux
kernel,
so I had to find my way around from first principles.
This is an annotated and slightly edited version of what I did.
Continue reading "Code Reading Example: the Linux Kernel Load Calculation"Last modified: Thursday, November 25, 2004 9:40 am
Book Review: C++ Coding Standards
A number of years ago, reading Koenig's and Moo's
Ruminations on C++ [1] I made a wish for more of the
same, updated to reflect current C++ practice.
My wish has come true.
The book
C++ Coding Standards: 101 Rules, Guidelines, and Best Practices
by Herb Sutter and Andrei Alexandrescu [2]
is an indispensable book for all serious C++ programmers.
Continue reading "Book Review: C++ Coding Standards"Last modified: Sunday, November 14, 2004 4:41 pm
Cracker Code Review
According to a popular myth, crackers are computer whiz kids:
brilliant software developers who run circles around their
"peers" in the corporate world.
When my undergraduate student Achilleas Anagnostopoulos sent me a
pointer
to the source code of the
Microsoft GDIPlus.DLL JPEG Parsing Engine Buffer Overflow
exploit, I decided to test the myth
by performing a code review of the exploit's source code.
The results are not flattering for the exploit's developers:
no self-respecting professional would ever write production code of
such an abysmally low quality.
Sorry M4Z3R.
Continue reading "Cracker Code Review"Last modified: Tuesday, October 5, 2004 10:47 pm
A Survey of Language Popularity
My PhD student
Vassilios Karakoidas
pointed me to an on-line
language popularity survey.
Continue reading "A Survey of Language Popularity"Last modified: Saturday, September 25, 2004 7:59 pm
Digital Data Makes Anything Possible
Once data becomes digital anything and everything becomes possible.
Consider arranging the books on your bookshelf by the color of
their book cover.
Continue reading "Digital Data Makes Anything Possible"Last modified: Tuesday, August 31, 2004 2:59 pm
Continous Bookmarking
When editing documents or code, my not-so-agile fingers often trigger
a movement or search command that accidentally throws me to a random
location in the text I am editing.
How can I return back?
Amazingly, I noticed I am using exactly the same trick for returning back
on both the vim editor I use for most of
my editing tasks, and Microsoft Word I use for collaborating with many
colleagues.
Continue reading "Continous Bookmarking"Last modified: Wednesday, August 25, 2004 9:57 am
The hypot() Mystery
I was writing a section for the
Code Reading
followup volume, and wanted to demonstrate the pitfalls of
using homebrewed mathematical functions instead of the library
ones.
As an example, I chose to compare the C library
hypot(x, y) function against sqrt(x * x + y * y).
I created a plot of "unit in last place" (ulp) error values between
the two functions, which demonstrated how the error increased for larger
values of y.
Continue reading "The hypot() Mystery"Last modified: Monday, August 16, 2004 7:05 pm
Patching Framework III
Time warp.
I needed to read some old files I wrote in 1992 using the Ashton-Tate
Framework III program.
Unfortunately, trying to run the program under Windows XP resulted in a
"Divide overflow" error.
A bit of searching on the web revealed that the problem was related
to the system's speed (1.6GHz).
Apparently, Framework tries to calculate the speed of the machine
by dividing a fixed number by a loop counter;
on modern machines this results in the overflow.
Continue reading "Patching Framework III"Last modified: Thursday, August 12, 2004 9:47 am
Optimizing ppp and Code Quality
The Problem
While debugging a problem of my ppp connection I noticed that
ppp was apparently doing a protocol lookup (with a file open,
read, close sequence) for every packet it read.
This is an excerpt from the strace log, one of my
favourite debugging tools.
Continue reading "Optimizing ppp and Code Quality"Last modified: Sunday, May 16, 2004 1:09 pm
Computer Languages Form an Ecosystem
(This is a copy of an
article I posted on
slashdot on March 15th,
in response to a discussion titled
C Alive and Well Thanks to Portable.NET.
Many posters argued that the C language is dead.
I add my response here, because one month after its original slashdot submission,
I am still getting web site hits from it.)
Continue reading "Computer Languages Form an Ecosystem"Last modified: Sunday, April 18, 2004 1:10 pm
Binary File Similarity Checking
How can one determine whether two binary files
(for example, executable images) are somehow similar?
I started writing a program to perform this task.
Such a program could be useful for determining
whether a vendor had included GNU General
Public License (GPL)
code in a proprietary product, violating the GPL license.
After writing about 20 lines, I realized that I needed a more accurate
definition of similarity than the vague
"the two files contain a number of identical subsequences"
I had in mind.
Continue reading "Binary File Similarity Checking"Last modified: Friday, March 19, 2004 2:15 pm
A Unix-based Logic Analyzer
A circuit I was designing was behaving in unexpected ways:
the output of a wireless serial receiver based on Infineon's TDA5200
was refusing to drive an LS TTL load.
To debug the problem I needed an oscilloscope or a logic analyzer,
but I had none.
I searched the web and located
software to convert the PC's parallel port to a logic analyzer.
I downloaded the 900K program, but that was not the end.
Unfortunately the design of Windows 2000 does not allow direct access
to the I/O ports, so I also downloaded
a parallel port device driver and a program to give the appropriate privileges to other
programs.
Finally, I also downloaded from a third site the Borland runtime libraries
required by the logic analyzer.
Needless to say, the combination refused to work.
Continue reading "A Unix-based Logic Analyzer"Last modified: Sunday, October 26, 2003 10:52 pm
Well-behaved Web Applications
Very few web-based applications are designed to match the
web metaphor.
As a result they are often irritating, counterproductive,
or simply unusable.
During the last two months I've been working on an
IEEE Software theme issue titled "developing with
open source software".
Most of my work is performed over the
IEEE Computer Society
Manuscript Central
web application.
The application is an almost perfect example of everything that
is often wrong with such interfaces.
Continue reading "Well-behaved Web Applications"Last modified: Friday, September 26, 2003 9:17 pm
Code Reading: The Open Source Perspective
In July 2000, while working on a paper on the use of slicing for
choosing parts of an application to develop in a scripting language
(don't ask), I found myself searching open-source programs for
motivating examples, and experimenting with a tool for annotating the
corresponding source code. At some point, a loud click sound in my mind
brought to my attention the fact that although most books and courses
teach us how to program, we actually spend most of our time reading code
others have written. I reasoned that by applying my annotation tool on
open source software I could write a book to present the ideas,
techniques, and tools that go behind code reading.
Continue reading "Code Reading: The Open Source Perspective"Last modified: Friday, October 3, 2003 6:12 pm