Category Archives: Programming

Binary Heaps

One of the most neglected data structures in the CS repertoire is
the heap. Unfortunately, the jargon is overloaded, so “heap” can mean
two different things – but there is a connection, which I’ll explain
in a little while.

In many programming languages, when you can dynamically create
objects in memory, the space from which you allocate the memory to
create them is called the heap. That is not what I’m talking about.

What I am talking about is an extremely simple but powerful structure used for maintaining a collection of values from which you want to be able to quickly remove the largest one. This may sound trivial, but it turns out to be an incredibly useful structure. You can use this basic structure to implement a very fast sorting algorithm; to maintain a prioritized set of values/tasks; to manage chunks of memory; etc. (The prioritized set is the most common case in my experience, but that’s probably because I’ve spent so much of my career working on distributed systems.)
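To give you a taste of how simple this can be, here’s a minimal sketch of a purely functional max-heap in Erlang (the language several of the posts below use). Fair warning: this is just an illustration made up for this page, not the code from the post below – and it’s the skew-heap variant, a heap-ordered binary tree that stays balanced on average by swapping children on every merge, rather than the classic array-backed binary heap the post describes:

    -module(heap).
    -export([new/0, insert/2, find_max/1, delete_max/1]).

    %% A heap is either the atom 'empty', or {Value, Left, Right}, where
    %% Value is >= every value in the Left and Right subheaps.
    new() -> empty.

    %% Skew-heap merge: the larger root wins, and we swap its children
    %% to keep the tree from degenerating into a list.
    merge(empty, H) -> H;
    merge(H, empty) -> H;
    merge({X, L1, R1}, {Y, _, _} = H2) when X >= Y ->
        {X, merge(R1, H2), L1};
    merge(H1, H2) ->
        merge(H2, H1).

    insert(X, H) -> merge({X, empty, empty}, H).

    find_max({X, _, _}) -> X.

    %% Removing the max is just merging the two subtrees.
    delete_max({_, L, R}) -> merge(L, R).

With just that, heapsort falls out for free: insert everything, then repeatedly take find_max and delete_max until the heap is empty.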

Continue reading

Meta out the wazoo: Monads and Monoids

Since I mentioned the idea of monoids as a formal model of computation, John Armstrong made the natural leap ahead, to the connection between monoids and monads – which are a common feature in programming language semantics, and a prominent language feature in Haskell, one of my favorite programming languages.

Continue reading

Databases are hammers; MapReduce is a screwdriver.

A bunch of people have sent me links to an article about MapReduce. I’ve hesitated to write about it, because the currently hyped MapReduce stuff was developed, and is extensively used, by Google, my employer. But the article is really annoying, and deserves a response. So I’m going to be absolutely clear. I am not commenting on this in my capacity as a Google employee. (In fact, I’ve never actually used MapReduce at work!) This is strictly on my own time, and it’s purely my own opinion. If it’s the dumbest thing you’ve ever read, that’s my fault, not Google’s. If it’s the most brilliant thing you’ve ever read, that’s my credit, not Google’s. I wasn’t asked to write this by Google, and I didn’t ask their permission, either. This is just me, the annoying geek behind this blog, writing solely
on my own behalf, speaking for no one but me. Got it?

The reason that I’m interested in this is because it’s related to my PhD work. Back in
grad school, what I worked on was applying parallelism to structured data in
non-scientific applications. That’s pretty much what MapReduce does. And the solution
that I proposed was a kind of hierarchical scatter/gather operation – which is, very nearly, the
way that MapReduce works. The big difference? MapReduce beats what I did. The guys who designed MapReduce noticed something that I didn’t, and took advantage of it, which made M/R programs a lot cleaner and easier to write. The article that I’m going to discuss criticizes M/R for exactly that key idea.
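If you’ve never seen the model, here’s roughly the shape of a map/reduce computation. This is a toy, sequential sketch of my own in Erlang – emphatically not Google’s implementation, and not the code from my thesis – just the map-emit/group-by-key/reduce pipeline, with word counting as the example:

    -module(mini_mr).
    -export([mapreduce/3, wordcount/1]).

    %% MapF turns one input into a list of {Key, Value} pairs; we group
    %% the pairs by key; ReduceF folds each group down to one result.
    mapreduce(MapF, ReduceF, Inputs) ->
        Pairs = lists:flatmap(MapF, Inputs),
        Grouped = lists:foldl(
            fun({K, V}, Acc) ->
                maps:update_with(K, fun(Vs) -> [V | Vs] end, [V], Acc)
            end,
            #{}, Pairs),
        maps:map(fun(_K, Vs) -> ReduceF(Vs) end, Grouped).

    %% Word counting: the "hello world" of MapReduce.
    %% wordcount(["the cat", "the hat"]) => #{"cat" => 1, "hat" => 1, "the" => 2}
    wordcount(Lines) ->
        mapreduce(
            fun(Line) -> [{W, 1} || W <- string:tokens(Line, " ")] end,
            fun(Counts) -> lists:sum(Counts) end,
            Lines).

The real thing distributes the map calls and the per-key reduce calls across machines; the point here is just how little structure the programmer has to supply.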

Continue reading

The Genius of Donald Knuth: Typesetting with Boxes and Glue

Today is the 70th birthday of Donald Knuth.

If you don’t know who Knuth is, then you’re not a programmer. If you’re a programmer and you don’t know who Knuth is, well… I have no idea what rock you’ve been hiding under, but you should probably be fired.

Knuth is one of the most famous and accomplished people in the computer science community. He’s done all sorts of great research, published a set of definitive textbooks on computer algorithms, and of particular interest to me, implemented a brilliant, hideous, beautiful, godawful piece of software called TeX.

When I went to grad school, instead of being a teaching assistant, I worked for the department as a sysadmin. I ended up spending close to six years doing almost nothing but technical support for TeX, which explains my mixed emotions about it in the description above.

Continue reading

Simple Lempel-Ziv Compression in Erlang

I decided to do a little bit of something useful with Erlang, both to have some code to show, and to get some sense of what it’s like writing something beyond a completely trivial example. Because the capabilities of Erlang shine when you get into low-level bit oriented things, I thought that writing a bit of data compression code would make for a good example. I’m going to present it in two parts: first a simple but stupid version of the algorithm; and then the encoding part, which gets into bit twiddling, and potentially gets more interesting.

I’m going to use a simple version of Lempel-Ziv compression. Before I get into the detail
of how LZ compression works, I’ve got an interesting story about how I learned about it. My first summer in grad school, I was looking for a job. One of my college friends worked for a subsidiary of Travellers insurance, and he got me an internship with them.

Our group (3 of us) worked on programs that ran on salespeople’s laptops. Since this
was 1991, laptops were still pretty primitive. We had to run in 128K of memory, with the best machines having 10 megabytes of hard disk, and the worst machines having two floppies. So memory use was always a huge issue.

Being an insurance company, naturally things were pretty bureaucratic. They hired me to write a prototype of a new program. I don’t remember the details of what I was supposed to write. But the way things worked, they wanted to build this new tool for the salespeople. But the company wouldn’t let them propose a new project without having staffing for it. So they hired me to work on it. But because they hadn’t proposed the project before they hired me,
I had nothing to do while they proposed it, and worked out their requirements. So I worked for them for a total of about 12 weeks; and it took them about 9 1/2 weeks to get to the point where they had an approved project for me to work on. So I had over two months with nothing
to do. So I kept pestering the other two guys about what I could do to help them.

One of the things they wanted was a way of doing a splash screen. They had a GIF image of the company logo, and they thought it would be nice to be able to splash it on the screen whenever the sales app loaded. But they didn’t have an image viewer that they could call from inside their program. So they asked me to write one. GIF images are encoded using LZ. So I coded up the LZ decoder in C to get the bitmap out of a GIF, and they were happy. Then I decided, as long as I had an LZ decompressor, I should write an LZ compressor, and then we’d have some generic data compression code. So I went ahead and did that, and added a
nice, effective set of data compression routines to our libraries. But the manager was actually pissed at me: I’d added a feature to our library – data compression – without getting permission. The right way of doing things was to write a proposal, and pass it around the various levels of petty bureaucrats for two months while I sat and twiddled my thumbs on the payroll.

Anyway… Back to compression.
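Since the bit-packing half comes later, here’s the flavor of the first half: a deliberately naive sketch of LZW (the dictionary-based LZ variant that GIF uses) in Erlang. This is a warm-up illustration of my own, not the code from the post below, and it emits a list of integer codes rather than a packed bit string – packing is exactly the bit-twiddling part:

    -module(lzw).
    -export([compress/1]).

    %% Compress a list of bytes into a list of integer codes. The
    %% dictionary starts with all 256 single-byte phrases; every time we
    %% fall off the end of a known phrase, we emit its code and add the
    %% extended phrase under the next free code.
    compress(Bytes) ->
        Dict = maps:from_list([{[B], B} || B <- lists:seq(0, 255)]),
        compress(Bytes, [], Dict, 256, []).

    %% W is the current phrase, kept reversed for cheap extension.
    compress([], [], _Dict, _Next, Out) ->
        lists:reverse(Out);
    compress([], W, Dict, _Next, Out) ->
        lists:reverse([maps:get(lists:reverse(W), Dict) | Out]);
    compress([B | Rest], W, Dict, Next, Out) ->
        W1 = [B | W],
        case maps:is_key(lists:reverse(W1), Dict) of
            true ->
                compress(Rest, W1, Dict, Next, Out);
            false ->
                Code = maps:get(lists:reverse(W), Dict),
                Dict1 = maps:put(lists:reverse(W1), Next, Dict),
                compress(Rest, [B], Dict1, Next + 1, [Code | Out])
        end.

Re-reversing the phrase on every lookup makes this quadratic in the worst case – fine for a sketch, unacceptable for real code.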

Continue reading

Macros: why they're evil

I’ve gotten both some comments and some e-mail from people in response to my mini-rant about Erlang’s macros. I started replying in comments, but it got long enough that I thought it made sense to promote it up to a top-level post. It’s not really an Erlang issue, but a general language issue, and my opinions concerning it end up getting into some programming language design and philosophy issues. If, like me, you think that things like that are fun, then read on!

Continue reading

Records in Erlang

One of the things I discovered since writing part one of my Erlang introduction is that Erlang has grown a lot over the last few years. For example, the idiom of the tagged tuple as a way of creating a record-like structure has been coded into the language. There is, unfortunately, a bit of a catch. Records aren’t really added to the language. Instead, there’s a pre-processor in Erlang, and records are defined by translation in the pre-processor. This, to me, typifies one of the less attractive attributes of Erlang: much of Erlang has a very ad-hoc flavor to it. There are important high-level features – like record data and modules – which aren’t really entirely part of the language. Instead, they’re just sort of spliced in however it was easiest for the implementors to cram ’em in. And there are other things that were added to later versions of the language that, while first class, are awkward – like the “fun” prefix for first-class functions.

The effect of that kind of thing is mainly syntactic. The implementation is good enough that
even though things like records are really pre-processor constructs, they feel almost as
if they’re built-ins.
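To make the “pre-processor construct” point concrete, here’s a tiny sketch (my own, not from the post) of what a record declaration actually expands into:

    -module(rec_demo).
    -export([demo/0]).

    %% The record is expanded away by the pre-processor; the compiler
    %% never sees anything but a plain tagged tuple.
    -record(person, {name, age = 0}).

    demo() ->
        P = #person{name = "Alice", age = 30},
        %% After pre-processing, P is exactly this tagged tuple
        %% (the match below succeeds):
        P = {person, "Alice", 30},
        %% Field access compiles down to element/2 on the tuple;
        %% P#person.name is element(2, P), since element 1 is the tag.
        {P#person.name, P#person.age}.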

Anyway – let’s take a look at records.

Continue reading

Erlang: a Language for Functional Concurrency (Updated!)

Several commenters pointed out that I made several mistakes in my description of Erlang. It turns out that the main reference source I used for this post, an excerpt from a textbook on Erlang available on the Erlang website, is quite out of date. I didn’t realize this; I assumed that if they pointed towards it as a tutorial, it would represent the current state of the language. My bad. As a result, several things that I said about Erlang – including some negative comments – were inaccurate. I’ve updated the article, and the changes are marked with comments.

As long-time readers will recall, one of my greatest interests is programming languages. A while back, I wrote a tutorial on Haskell, which is one of the most influential languages currently in the programming language research community. Haskell is a pure, lazy, functional language which has gained a lot of attention in recent times for being incredibly expressive, while maintaining a solid theoretical basis with well-defined semantics. What makes Haskell unusual is that it’s a completely pure functional language – meaning no true state at all: not for I/O, not for assignment, not for mutable data. But through the use of a clever construct called a monad, you can create the effect of state without disturbing the functional semantics of the language. It’s quite a nice idea, although I must admit that I remain somewhat skeptical about how scalable it might be.

One of the competitors of Haskell for mindshare in the community of people who are interested in functional programming languages is a language called Erlang. In many ways, Erlang is a polar opposite to Haskell. Erlang is, basically, a functional language, but its designers didn’t object to tossing in a bit of state when it made things easier. It’s dynamically typed, and even for a dynamically typed language, its support for typing is weak. It’s gotten its attention for a couple of big reasons:

  1. Erlang was designed by Ericsson for implementing the software in their switches and routers. It’s the first functional language designed by a company for building production software for extremely complex, performance-critical, low-level applications like switching and routing.
  2. Erlang is specifically designed for building distributed systems. As I’ve mentioned before, programming for distributed systems is incredibly difficult, and most programming languages are terrible at it.

Concurrency and distributed systems are a big deal. I’d argue that programming for concurrency and distribution are the biggest problems facing software developers today. Pretty much every computer that you can buy has multiple processors, and to take full advantage of their power, code needs to use concurrency. And the network is an unavoidable part of our environment: many of the applications that we’re writing today need to be aware of the internet, and need to interact with other systems. These are just simple facts of the modern computing world – and software developers need tools to deal with them.
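To give a hint of what the primitives look like, here’s the canonical tiny example – my own sketch, not code from the tutorial below – of essentially Erlang’s entire concurrency surface: spawn a process, send it a message with !, and wait with receive:

    -module(echo).
    -export([start/0]).

    %% Spawn an echo process, send it a message, and wait for the reply.
    start() ->
        Pid = spawn(fun loop/0),
        Pid ! {self(), "hello"},
        receive
            {echoed, Msg} -> Msg
        after 1000 ->
            timeout
        end.

    %% The echo process: receive a message, send it back, recurse.
    loop() ->
        receive
            {From, Msg} ->
                From ! {echoed, Msg},
                loop()
        end.

Processes share nothing and communicate only by messages, which is exactly what makes the same model stretch across machine boundaries.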

Continue reading

Making Graph Algorithms Fast, using Strongly Connected Components

One of the problems with many of the interesting graph algorithms is that they’re complex. As graphs get large, computations over them can become extremely slow. For example, graph coloring is NP-complete – so the running time of every known algorithm for true optimal coloring of an arbitrary graph grows exponentially in the size of the graph!

So, quite naturally, we look for ways to make it faster. We’ve already talked about using
heuristics to get fast approximations. But what if we really need the true optimum? The other approach to making it faster is parallelism: finding some way to subdivide
the problem into smaller pieces which can be executed at the same time. This is obviously a limited
technique – you can’t stop the exponential growth in execution time by adding a specific finite
number of processors. But for particular problems, parallelism can make the difference between
being able to do something fast enough to be useful, and not being able to. For a non-graph
theoretic but compelling example, there’s been work in something called microforecasting, which is precise weather forecasting for extremely small areas. It’s useful for predicting things like severe lightning storms in local areas during events where there are large numbers of people. (For example, it was used at the Atlanta Olympics.) Microforecasting inputs a huge amount of information about local conditions – temperature, air pressure, wind direction, wind velocity, humidity, and such – and computes a short-term forecast for that area. Microforecasting is completely useless if it takes longer to compute the forecast than it takes for the actual weather to develop. If you can find a way to split the computation among a hundred processors, and get a forecast in 5 minutes, you’ve got something really useful. If you have to run it on a single processor, and it takes 2 hours – well, by then, any storm
that the forecast predicts is already over.

For graphs, there’s a very natural way of decomposing problems for parallelization. Many complex graphs for real problems can be divided into a set of subgraphs, which can be handled in parallel. In
fact, this can happen on multiple levels: graphs can be divided into subgraphs, which can be divided
into further subgraphs. But for now, we’ll mainly focus on doing it once – one decomposition of
the graph into subgraphs for parallelization.
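As it happens, you don’t have to write the decomposition yourself to play with it. To stick with this archive’s running language: Erlang’s standard library ships a digraph module that computes strongly connected components. A quick sketch of my own, just for experimentation – the post below develops the theory, not this code:

    -module(scc_demo).
    -export([components/1]).

    %% Given a list of {From, To} edges, return the strongly connected
    %% components, using the digraph library bundled with Erlang/OTP.
    components(Edges) ->
        G = digraph:new(),
        lists:foreach(
            fun({V1, V2}) ->
                digraph:add_vertex(G, V1),
                digraph:add_vertex(G, V2),
                digraph:add_edge(G, V1, V2)
            end,
            Edges),
        Comps = digraph_utils:strong_components(G),
        digraph:delete(G),    % digraphs are backed by ETS tables; clean up
        Comps.

    %% components([{a,b}, {b,a}, {b,c}, {c,d}, {d,c}])
    %% => [[a,b], [c,d]]   (ordering may vary)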

Continue reading

Colored Petri Nets

The big step in Petri nets – the one that really takes them from a theoretical toy to a
serious tool used by protocol developers – is the extension to colored Petri nets (CPNs). Calling them “colored” is a bit of a misnomer; the original idea was to assign colors to tokens, and allow color-based expressions on the elements of the net. But study of analysis models quickly showed
that you could go much further than that without losing the basic analyzability properties
that are so valuable. In the full development of CPNs, you can associate essentially arbitrary
collections of data with Petri net tokens, so long as you use a strong type system, and keep
appropriate restrictions on the expressions used in the net. The colors thus become
data types, describing the types of data that can be carried by tokens, and the kinds of tokens that
can be located in a place in the net.

So that’s the fundamental idea of CPNs: tokens have types, and each token type has some data value associated with it. Below the fold, we’ll look at how we do that,
and what it means.
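Just to foreshadow, here’s one way the “typed tokens carrying data” idea can look in code. This is a toy sketch of my own in Erlang, not a real CPN tool: the tuple tag plays the role of the color, a place is just a multiset of tokens, and a transition consumes and produces tokens subject to a guard:

    -module(cpn_demo).
    -export([fire/1]).

    %% Places are lists of tokens; a token's tag is its color (type).
    %% The transition below consumes a {packet, Seq, Data} token from
    %% the 'in' place and produces an {ack, Seq} token in the 'out'
    %% place, but only when the guard Seq > 0 holds.
    fire(#{in := In, out := Out} = Places) ->
        case [T || {packet, Seq, _} = T <- In, Seq > 0] of
            [] ->
                Places;    % no enabled binding: the transition can't fire
            [{packet, Seq, _Data} = T | _] ->
                Places#{in := In -- [T], out := [{ack, Seq} | Out]}
        end.

    %% fire(#{in => [{packet, 1, <<"hi">>}], out => []})
    %% => #{in => [], out => [{ack, 1}]}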

Continue reading