I decided to do a little bit of something useful with Erlang, both to have some code to show, and to get some sense of what it’s like writing something beyond a completely trivial example. Because the capabilities of Erlang shine when you get into low-level bit oriented things, I thought that writing a bit of data compression code would be make for a good example. I’m going to present it in two parts: first a simple but stupid version of the algorithm; and then the encoding part, which into bit twiddling, and potentially gets more interesting.
I’m going to use a simple version of Lempel-Ziv compression. Before I get into the detail
of how LZ compression works, I’ve got an interesting story about how I learned about it. My first summer in grad school, I was looking for a job. One of my college friends worked for a subsidiary of Travellers insurance, and he got me an internship with them.
Our group (3 of us) worked on programs that ran on salepeoples’ laptops. Since this
was 1991, laptops were still pretty primitive. We had to run in 128K of memory, with the best machines having 10 megabytes of hard disk, and the worst machines having two floppies. So memory use was always a huge issue.
Being an insurance company, naturally things were pretty bureaucratic. They hired me to write a prototype of a new program. I don’t remember the details of what I was supposed to write. But the way things worked, they wanted to build this new tool for the salespeople. But the company wouldn’t let them propose a new project without having staffing for it. So they hired me to work on it. But because they hadn’t proposed the project before they hired me,
I had nothing to do while they proposed it, and worked out their requirements. So I worked for them for a total of about 12 weeks; and it took them about 9 1/2 weeks to get to the point where they had an approved project for me to work on. So I had over two months with nothing
to do. So I kept pestering the other two guys about what I could do to help them.
One of the things they wanted was a way of doing a splash screen. They had a GIF image of the company logo, and they thought it would be nice to be able to splash it on the screen whenever the sales app loaded. But they didn’t have an image viewer that they could call from inside their program. So they asked me to write one. GIF images are encoded using LZ. So I coded that up the LZ decoder to get the bitmap out of a GIF in C, and they were happy. Then I decided, as long as I had an LZ decompressor, I should write an LZ compressor, and then we’d have some generic data compression code. So I went ahead and did that, and added a
nice, effective set of data compression routines to our libraries. But the manager was actually pissed at me: I’d added a feature to our library – data compression – without getting permission. The right way of doing things was to write a proposal, and pass it around the various levels of petty bureaucrats for two months while I sat and twiddled my thumbs on the payroll.
Anyway… Back to compression.