On Tuesday March 17th 2020 my free online massive open online course (MOOC)
on the use of Unix command line tools
for data, software, and production engineering
goes live on the edX platform.
Already more than one thousand participants from around the world
have registered for it;
you should still be able to enroll through
this link.
In response to the course’s announcement
seasoned researchers from around the world have commented that this is an
indispensable course
and that it is
very hard to beat the ROI of acquiring this skillset, both for academia and industry.
In an age of shiny IDEs and cool GUI tools, what are the reasons for
the enduring utility and popularity of the Unix command line tools?
Here’s my take.
Continue reading "Seven reasons to add Unix command line expertise to your tool chest"Last modified: Monday, March 16, 2020 0:34 am
Applied Code Reading: Debugging FreeBSD Regex
When the code we're trying to
read is inscrutable,
inserting print statements and running various test cases can be
two invaluable tools.
Earlier today I fixed
a tricky problem in the FreeBSD regular expression library.
The code,
originally written by Henry Spencer in the early 1990s,
is by far the most complex I've ever encountered.
It implements sophisticated algorithms with minimal commenting.
Also, to avoid code repetition and increase efficiency,
the 1200 line long main part of the regular expression execution engine is
included in the compiled C code
three times after modifying various macros to adjust the code's behavior:
the first time the code targets small expressions and operates
with bit masks on long integers,
the second time the code handles larger expressions
by storing its data in arrays,
and the third time the code is also adjusted to handle multibyte characters.
Here is how I used test data and print statements to locate and fix the problem.
Continue reading "Applied Code Reading: Debugging FreeBSD Regex"Last modified: Wednesday, September 16, 2009 9:44 am
Monitor Process Progress on Unix
I often run file-processing commands that take many hours to
finish, and I therefore need a way to monitor their progress.
The Perkin-Elmer/Concurrent OS32 system I worked-on for a couple
of years back in 1993 (don't ask)
had a facility that displayed for any executing
command the percentage of work that was completed.
When I first saw this facility working on the programs I maintained,
I couldn't believe my eyes, because I was sure that those rusty
Cobol programs didn't contain any functionality to monitor their progress.
Continue reading "Monitor Process Progress on Unix"Last modified: Monday, October 27, 2008 1:34 pm
Open and Closed Source Kernels Go Head to Head
Earlier today I presented at the
30th International Conference on Software Engineering a
research paper comparing the
code quality of Linux, Windows (its
research kernel distribution),
OpenSolaris, and
FreeBSD.
For the comparison I parsed multiple configurations of these systems (more than ten million lines), and stored the results in four databases, where I could run SQL queries on them. This amounted to 8GB of data, 160 million records.
(I’ve made the databases and the SQL queries available
online.)
The areas I examined were file organization, code structure, code style, preprocessing, and data organization.
To my surprise there was no clear winner or looser, but there were interesting differences in specific areas.
Continue reading "Open and Closed Source Kernels Go Head to Head"Last modified: Friday, May 16, 2008 1:44 am
The Power of an Integrated Platform
FreeBSD, unlike Linux, is not a kernel, but a complete
operating system.
This allows a much smoother integration of its components,
which is a real boon when you try to locate and fix a problem.
The source code for all the parts is all ordered in a single
directory tree for you to examine and experiment with.
Continue reading "The Power of an Integrated Platform"Last modified: Thursday, February 21, 2008 8:15 am
The Relativity of Performance Improvements
Today, after receiving a 1.7MB daily security log message containing
thousands of ssh failed login attempts from bots around the
world, I decided I had enough.
I enabled IPFW to a FreeBSD system I maintain, and added a script
to find and block the offending IP addresses.
In the process I improved the script's performance.
The results of the improvement were unintuitive.
Continue reading "The Relativity of Performance Improvements"Last modified: Monday, January 7, 2008 10:58 am
International BSD Conference in Turkey
I'm on my way back from the
International BSD Conference in Turkey,
which a group of enthusiastic members of our community organized on
Friday and Saturday.
Continue reading "International BSD Conference in Turkey"Last modified: Sunday, October 21, 2007 4:15 pm
The Memory Savings of Shared Libraries
A recent thread in the
FreeBSD ports mailing list
discusses the benefits and drawbacks of static builds.
How can we measure the memory savings of shared libraries?
Continue reading "The Memory Savings of Shared Libraries"Last modified: Saturday, October 6, 2007 9:16 pm
The Tools we Use
It is impossible to sharpen a pencil with a blunt ax. It is equally vain to try to do it with ten blunt axes instead.
— Edsger W. Dijkstra
Continue reading "The Tools we Use"Last modified: Sunday, September 2, 2007 12:06 am
A Humbling Upgrade
Yesterday I upgraded one of the servers I maintain from
FreeBSD 4.11, which had reached its
end of life, into the latest
production release 6.2.
It was a humbling experience.
Continue reading "A Humbling Upgrade"Last modified: Wednesday, April 4, 2007 1:48 pm
Cross Compiling
Cross compiling software on a host platform to run on a different
target used to be an exotic stunt to be performed by
the brave and desperate.
One had first to configure and build the compiler, assembler, archiver,
and linker for the different architecture, then cross-build the other
architecture's libraries, and finally the software.
This week, while preparing a new release of the
CScout refactoring browser
I realized that what was once a feat is nowadays a routine operation.
Continue reading "Cross Compiling"Last modified: Saturday, September 30, 2006 10:32 pm
NASSCOM Quality Summit 2006
Last week I attended NASSCOM's 2006 Quality Summit in Bangalore, India.
There I gave a tutorial on tooling with open source software, and
delivered a talk on Global Software Development in the FreeBSD Project.
It was an edifying trip.
Continue reading "NASSCOM Quality Summit 2006"Last modified: Sunday, September 17, 2006 10:35 pm
Hardware and Software Debugging
Debuggging day.
The MySQL
5.0 server I tried to run as part of a
MediaWiki installation under
FreeBSD,
crashed during initialization, and a Tomy Walkabout
digital baby monitor started emitting a low beeping sound.
I solved both cases through educated guesses.
Continue reading "Hardware and Software Debugging"Last modified: Saturday, September 2, 2006 11:47 pm
Surprising Findings on Software Reuse
Kevin DeSouza and his colleagues in a recent
article in the
Communications of the ACM published some surprising
findings regarding software reuse:
reuse happens more by novices rather than by experts,
within projects rather than across them, and in
transient teams rather than permanent ones.
The statement regarding the higher propensity of rookies to reuse
compared to older professionals rang particularly true to my ears.
Continue reading "Surprising Findings on Software Reuse"Last modified: Sunday, May 7, 2006 10:01 pm
Public Bookmarking
You're searching the internet to answer a question you have,
and after some painstaking detective work you locate the answer.
Where do you store the answer for future reference?
Continue reading "Public Bookmarking"Last modified: Monday, April 24, 2006 3:42 pm
A Tree of Mentors
In the FreeBSD project, new
committers are assigned a mentor who overlooks their work, until they
are judged to be confident enough to work on their own.
As lots of things in the open-source landscape, having a mentor is a loan,
which we should pay back by mentoring somebody else.
Continue reading "A Tree of Mentors"Last modified: Saturday, January 7, 2006 4:05 pm
A Pipe Namespace in the Portal Filesystem
The portal filesystem allows a daemon running as a userland program
to pass descriptors to processes that open files belonging to its
namespace.
It has been part of the *BSD operating systems since 4.4 BSD.
I recently added a pipe namespace to its FreeBSD implementation.
This allows us to
perform scatter gather operations without using temporary files,
create non-linear pipelines, and
implement file views using symbolic links.
Continue reading "A Pipe Namespace in the Portal Filesystem"Last modified: Wednesday, September 21, 2016 10:38 pm
Maintainability of the FreeBSD System
Last November Ioannis Samoladas and his colleagues published an article
in the Communications of the ACM [1] that compared the maintainability
of open-source versus-closed source projects.
I applied the maintainability index [2] they used on the FreeBSD source
repository following the code's maintainability over time, and comparing
the maintainability of different modules.
Here are the results.
Continue reading "Maintainability of the FreeBSD System"Last modified: Friday, February 4, 2005 5:50 pm
Measuring the Effect of Shared Objects
For the Code Quality
book I am writing I wanted to measure the memory savings of
shared libraries.
On a lightly loaded web server these amounted to 80MB,
on a more heavilly loaded shell access machine these ammounted
to 300MB.
Continue reading "Measuring the Effect of Shared Objects"Last modified: Saturday, December 11, 2004 8:54 pm
System administration stories: The Revolt
Can a small embedded system the size of a paperback
lead a group of machines into revolt?
Apparently yes.
Continue reading "System administration stories: The Revolt"Last modified: Saturday, September 11, 2004 10:47 pm
Detective Work and Dropped TCP Connections
I had problems with TCP connections (mostly long-lasting ssh sessions)
getting dropped on my ADSL line.
In the end, I found that the problem had two different roots.
The detective work behind establishing them is, I believe, interesting.
It also shows how accessible source code, and the will to use it,
can be a tremendous boost to difficult system administration problems.
Continue reading "Detective Work and Dropped TCP Connections"Last modified: Monday, August 23, 2004 11:12 am
The hypot() Mystery
I was writing a section for the
Code Reading
followup volume, and wanted to demonstrate the pitfalls of
using homebrewn mathematical functions instead of the library
ones.
As an example, I chose to compare the C library
hypot(x, y)
function,
against
sqrt(x * x, y * y)
.
I created a plot of "unit in last place" (ulp) error values between
the two functions, which demonstrated how the error increased for larger
values of y.
Continue reading "The hypot() Mystery"Last modified: Monday, August 16, 2004 7:05 pm
Optimizing ppp and Code Quality
The Problem
While debugging a problem of my ppp connection I noticed that
ppp was apparently doing a protocol lookup (with a file open,
read, close sequence) for every packet it read.
This is an excerpt from the strace log, one of my
favourite debugging tools.
Continue reading "Optimizing ppp and Code Quality"Last modified: Sunday, May 16, 2004 1:09 pm
Binary File Similarity Checking
How can one determine whether two binary files
(for example, executable images) are somehow similar?
I started writing a program to perform this task.
Such a program could be useful for determing
whether a vendor had included GNU
Public License (GPL)
code in a propriatary product, violating the GPL license.
After writing about 20 lines, I realized that I needed an accurate
definition of similarity than the vague
"the two files contain a number of identical subsequences"
I had in mind.
Continue reading "Binary File Similarity Checking"Last modified: Friday, March 19, 2004 2:15 pm
Software Complexity: Open Source vs Microsoft
In a readable and interesting paper titled
CyberInsecurity: the cost of a monopoly
seven notable security experts argue that the Microsoft's near monopoly
in the desktop operating system and office productivity markets is creating
a dangerous monoculture that exacerbates the effect of security vulnerabilities.
Continue reading "Software Complexity: Open Source vs Microsoft"Last modified: Friday, October 3, 2003 7:33 pm
FreeBSD Committer
I became a FreeBSD committer.
I've been using BSD Unix systems
since 1986 starting with 4.3 BSD on a pair of VAX 780 machines. In
1992, as a bored PhD student, I reimplemented sed(1) and contributed it
the unencumbered BSD version that was then being put together; it is now
part of the *BSD family. I crossed again paths with BSD software when
the prize of the 2000 Usenix technical conference ``win a pet Shark
contest'', Digital's Network Appliance Reference Design-DNARD, came with
a NetBSD boot image. I used that code for drawing about 500 examples
for my book
Code Reading: The Open Source Perspective (Addison-Wesley
2003), detailing how to read software code others have written
.
Since 2001 I 've been using
FreeBSD to control my home's security, communications, and entertainment
systems as described in a
SANE conference paper
and a
recent article
in Personal and Ubiquitous Computing
(as an academic I have to live by the "publish or perish" motto).
Continue reading "FreeBSD Committer"Last modified: Friday, October 3, 2003 7:34 pm