Unix make vs Apache Airflow

 

In an IEEE Software “Adventures in Code” column titled Modular Data Analytics, I describe the benefits and use of simple-rolap, a tool suite for relational online analytical processing. I built simple-rolap on top of the Unix make tool and a few shell scripts. With make approaching its 50th birthday, before writing the column I looked for more modern and better alternatives that I might be overlooking.

When I asked on Mastodon and X for recommendations regarding a general-purpose, domain-agnostic tool for executing tasks based on their dependencies, the most common answer was make. Another suggested tool that matched my requirements was Apache Airflow, which I decided to investigate further.
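To make the requirement concrete, here is a minimal Python sketch of what "executing tasks based on their dependencies" means. It is neither how make nor how Airflow is implemented: it uses the standard library's graphlib module, and the task names (import, aggregate, report) and the run_in_dependency_order helper are hypothetical, chosen only for illustration. It also omits make's check of whether a target is already up to date.

```python
from graphlib import TopologicalSorter


def run_in_dependency_order(dependencies, actions):
    """Run every task after all of its prerequisites; return the order used."""
    order = []
    # static_order() yields each task only after the tasks it depends on
    for task in TopologicalSorter(dependencies).static_order():
        actions[task]()
        order.append(task)
    return order


# Hypothetical three-step analytics pipeline: each key depends on its values
dependencies = {
    "report": {"aggregate"},
    "aggregate": {"import"},
    "import": set(),
}

log = []
# Stand-in actions that merely record that the task ran
actions = {name: (lambda name=name: log.append(name)) for name in dependencies}

order = run_in_dependency_order(dependencies, actions)
print(order)  # ['import', 'aggregate', 'report']
```

A Makefile expresses the same graph declaratively, with each target listing its prerequisites, and additionally skips targets whose outputs are newer than their inputs.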

I was disappointed by what I found. Airflow takes the unfortunate kitchen-sink approach to software development. Rather than focusing on doing one thing well, it has accreted features attempting to cover every imaginable need. It includes a scheduler, executors, event sensors, a web interface, and email notifiers. As these elements are not Airflow’s primary function, they are likely to be substandard. This design decision also makes it awkward to integrate Airflow into environments that already provide such features, because the overlapping functionality invites incompatibilities and feature clashes.

As a consequence of the feature bloat, I also witnessed severe dependency bloat. In all, Airflow’s installation comprised 138 packages, which occupied 244 MB. Contrast this with the 235 kB executable size of make and the 2.2 MB of its libraries: a footprint roughly a hundred times smaller.

Finally, I was also shocked by a bizarre API design decision in Airflow’s database operators, where the sql argument can be either a string to execute or a file path. Which of the two it is gets decided at runtime by trying to access the string as a file. This design’s inelegance and potential unreliability caused me to stop investigating further.
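The hazard of such an argument can be illustrated with a small Python sketch. This is not Airflow's code: resolve_sql and the cleanup.sql file name are hypothetical, and the function merely mimics the behavior described above, assuming the decision is made by checking whether the string names an existing file.

```python
import os
import tempfile


def resolve_sql(sql):
    """Hypothetical resolver: if `sql` names an existing file, return the
    file's contents; otherwise return the string itself as the statement."""
    if os.path.isfile(sql):
        with open(sql, encoding="utf-8") as f:
            return f.read()
    return sql


with tempfile.TemporaryDirectory() as workdir:
    arg = os.path.join(workdir, "cleanup.sql")

    # No such file yet: the path itself comes back as if it were SQL
    before = resolve_sql(arg)

    with open(arg, "w", encoding="utf-8") as f:
        f.write("DELETE FROM staging;")

    # Same argument, different meaning: now the file's contents come back
    after = resolve_sql(arg)

print(before != after)  # True
```

The same call yields different statements depending on the state of the file system, so a mistyped or missing path is silently treated as SQL. An interface with two separate, explicitly named parameters (one for the statement, one for the file) would avoid the ambiguity.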

Last modified: Tuesday, October 15, 2024 2:19 pm

Unless otherwise expressly stated, all original material on this page created by Diomidis Spinellis is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.