The early Research Edition Unix versions featured a program that would turn a stream of ASCII text into utterances that could be played by a voice synthesizer. The source code of this program was lost for years. Here’s the story of how I brought it back to life.
The (early 1973) Third Research Edition of Unix documented a program that would take ASCII text as input and convert it into phonemes that could then be played by a Votrax voice synthesizer made by the Vocal Interface Division of Federal Screw Works. The program was written by M. D. McIlroy, who documented its operation in a detailed technical report.
Although the program appeared in the Unix manual pages up to the 1975 Sixth Research Edition, its source code was missing from the archives that had survived. Even its author lacked a copy.
Fortunately, in 2011, Jonathan Gevaryahu found most parts of the program’s source code in unallocated space of a Sixth Research Edition disk dump. (This means that the code was once stored on disk, but was later deleted, and the parts where it resided were never allocated to other uses.) Even better, he could reconstruct a single block that was missing from the program’s compiled version, which was also available. Based on these findings, I added the speak source code and the speech rules to the GitHub repository of Unix history I am maintaining.
To see how the program worked, I experimented with compiling and running it. As the program was written in an ancient dialect of C and was also unlikely to be portable, I first tried to make it work on a Sixth Edition Unix running on a SIMH PDP-11 emulator. This attempt quickly failed, because the console wasn’t reliable enough to allow me to transfer the code via copy-paste.
I then ran the PDP-11 2.11 BSD Unix on the same emulator, which offers rudimentary internet connection capabilities. After configuring a .rhosts file to allow remote copying (to obtain remote access, you simply add your remote host and user name), I was able to move the code to that machine.
However, compiling the code wasn’t immediately possible. To make it compile I converted the old =+ and =^ assignment operators to the modern += and ^= operators, added the missing equals sign to variable initializers (for example, "int tflag 0;" became "int tflag = 0;", a form I didn’t even know ever existed), changed seek to lseek, and made a change involving exit.
At that stage the program could compile, but was crashing when I tried to run it. Given that 2.11 BSD lacked gdb and was generally slow and difficult to use, I decided to port the program to modern Unix/Linux. I also added more declarations, including full function prototypes, to find other problems. (In early versions of C you didn’t need to declare a function before using it.) I then methodically removed all compiler warnings, which allowed me to pinpoint a variable that was declared as a pointer but used as an integer. By correcting its declaration I fixed the initial crash.
Now I had a program that compiled and ran, but was still crashing in some cases, and also wasn’t producing correct output. For this, further changes were needed: one involving a char array, and another that replaced a multi-character constant ('u1') with a macro that initialized the value in an endian-neutral manner.
After these changes the program was able to compile the rules file and produce Votrax phoneme codes.
Votrax voice synthesizers and their descendant chips (which appear to use similar phoneme codes) are no longer marketed. In order to listen to the generated voice I needed a workaround. My first attempt was to use samples from the votrax-speak GitHub repository. Converting the Votrax phoneme codes into their mnemonic names, and passing the corresponding sample files as arguments to SoX, allowed me to create a sound file consisting of the phonemes played one after the other. However, the generated sound file was almost unintelligible. As I read later, a great advantage of Votrax synthesizers was how they merged the phonemes together into continuous speech, which was not the case with my approach.
My second attempt involved using the phoneme output functionality of the espeak-ng program. For this I created a map between the Votrax phoneme codes and the corresponding espeak phonemes, which I then coded into a sed script that would feed espeak with the output of Unix speak. Through this method I was finally able to produce somewhat intelligible speech with a pipeline such as the following.
echo Hello world |
speak speak.m |
LC_ALL=C ./votrax-espeak.sed |
espeak
The revived source code is available in this GitHub repository.
Last modified: Saturday, January 2, 2021 4:49 pm
Unless otherwise expressly stated, all original material on this page created by Diomidis Spinellis is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.