Category Archives: Good Math

Category Diagrams

One of the most unusual things about category theory that I really love is category diagrams. In category theory, many things that would be expressed as equations or complex formulae in most mathematical formalisms can be presented as diagrams in category theory. If you are, like me, a very visual thinker, than category diagrams can present information in a remarkably clear form – and the categorical form of many statements of proofs can be much clearer than the alternatives because it can be presented in diagram form.

A category diagram is a directed graph, where the nodes are objects from a category, and the edges are morphisms. Category theorists say that a graph commutes if, for any two paths through arrows in the diagram from node A to node B, the composition of all edges from the first path is equal to the composition of all edges from the second path. (But it’s important to note that you do need to be careful here: merely because you can draw a diagram doesn’t mean that it necessarily commutes, just like being able to write an equation doesn’t mean that the equation is true! You do need to show that your diagram is correct and commutes.)

As usual, an example will make that clearer.

cat-assoc.jpg

This diagram is a way of expressing the associativy property of morphisms: f circ (g circ h) = (f circ g) circ h. The way that the diagram illustrates this is: g circ h is the morphism from A to C. When we compose that with f, we wind up at D. Alternatively, f circ g is the arrow from B to D; if we compose that with h, we wind up at D. The two paths: f circ (A rightarrow C) and (B  rightarrow D) circ H are both paths from A to D, therefore if the diagram commutes, they must be equal. And the arrows on the diagram are all valid arrows, arranged in connections that do fit the rules of composition correctly, so the diagram does commute.

Let’s look at one more diagram, which we’ll use to define an interesting concept, the principal morphism between two objects. The principle morphism is a single arrow from A to B such that any composition of morphisms that goes from A to B will end up being equivalent to it.

In diagram form, a morphism m is principle if forall x : A rightarrow A, forall y: A rightarrow B, the following diagram commutes:

cat-principal.jpg

In words, this says that f is a principal morphism if for every endomorphic arrow x, and for every arrow y from A to B, f is is the result of composing x and y. There’s also something interesting about this diagram that you should notice: A appears twice in the diagram! It’s the same object; we just draw it in two places to make the commutation pattern easier to see. A single object can appear in a diagram as many times as you want to to make the pattern of commutation easy to see. When you’re looking at a diagram, you need to be a bit careful to read the labels to make sure you know what it means.

One more definition by diagram: (x, y) is a a retraction pair, and A is a retract of B (written A < B) if the following diagram commutes:

cat-retract.jpg

That is, x : A rightarrow B, y: B rightarrow A are a retraction pair if y circ x = 1_A.

Category Intuition by Example

The good thing about category theory is that it abstracts everything down to a couple of very simple, bare-bones concepts, and lets us play with those concepts in really fascinating ways. But the bad thing about category theory is that it abstracts everything down so much that it can be difficult to have any intuitive sense about what things actually mean.

We said that a category is, basically, a collection of objects connected by arrows, and a composition operation over the arrows. But what the heck is an object, and what is an arrow?

In some sense, the point of category theory is that it doesn’t matter. There’s a set of fundamental concepts that underly all composable mapping-like things, and category theory focuses in on those fundamental concepts, and what they mean.

But still, to understand category theory, you need to be able to get some kind of intuitive handle on what’s going on. And so, we’ll take a look at a few examples.

The Category Set

One of the easiest examples is the category Set. The objects of the category are, obviously, sets; the morphisms are functions between sets; and composition is (obviously) function composition.

That much is easy. Now, what about the special arrows? What do all of the epi, iso, endo, and auto arrows in the category of sets?

An monomorphism over the category of sets is an injective function – that is, a function which maps each value in the domain to a distinct value in the range. So you can always reverse the function: given a value in the range of the function, you can always say, specifically, exactly what value in the domain maps to it. In a monomorphism in the category set, there may be values in the target of the morphism that aren’t mapped to by any element of the range; but if a value of the range is mapped to, you know which value in the domain mapped to it.

The key property of the monomorphism is that it’s left-cancellative – that is, if f is a monomorphism, then we know that f circ g == f circ h is only true if g = h. We can cancel the left side of the composion. Why? Because we know that f maps each value in its domain to a unique value in its range. So f circ g and f circ h can only be the same if they map all of the same values in their domain to the same values in their range – that is, if they’re the same function.

An epimorphism is an onto function – that is, a mapping f from a set X to a set Y in which for every value y in Y, there’s some value in x in X such that f(x)=y. It’s a dual of the notion of a monomorphism; for each value in the range, there is at least one value in the domain that maps to it. But it’s not necessary that the domain values be unique – there can be multiple values in the domain that map onto a particular value in the range, but for every element of the domain, there’s always at least one value that maps to it. It’s right-cancellative in the same way that the monomorphism is left-cancellative.

What about iso-morphism? If you combine the ideas of mono-morphism and epi-morphism, what you end up with is a function where every member of the domain is mapped onto a unique value in the range, and every value in the range is mapped onto by at least one value: it’s a one-to-one function.

The Category Poset

Poset is the category of partially ordered sets. A partially ordered set is a set with a binary relation, le, which satisfies the following properties:

  1. Reflexivity: forall a: a le a
  2. Antisymmetry: if a le b land b le a then a = b.
  3. Transitivity: if a le b land b le c then a le c

In the category of partially ordered sets, the arrows are monotonic functions – that is, functions that preserve the partial ordering. So if x le y, then if f is monotonic, f(x) le f(y).

The arrow composition operation, circ, is still just function composition. In terms of meaning, it’s pretty much the same as it was for simple sets: a monomorphism is still a surjective function; etc. But in Poset it needs to be a surjective function that preserves the ordering relation.

We know that monotonic functions fit the associative and identity properties of category arrows – they’re still just functions, and monotonicity has no effect on those. But for posets, the composition is a bit tricky – we need to ensure that composition maintains the partial order. Fortunately, that’s easy.

  • Suppose we have arrows f : A rightarrow B, g : B rightarrow C. We know that f and g are monotonic functions.
  • So for any pair of x and y in the domain of f, we know that if x le y, then f(x) le f(y).
  • Likewise, for any pair s,t in the domain of g, we know that if s le t, then g(s) le g(t).
  • So we just put those together: if x le y, then f(x) le f(y). f(x) and f(y) are in the domain of g, so if f(x) le f(y) then we know g(f(x)) le g(f(y))

The category Grp

Our last example is the category of groups, which is called Grp. In this category, the objects are (obviously) groups. What are arrows? Group homomorphisms – that is, essentially, functions between sets that preserve the group structure.

What’s a homomorphism in this category? That’s a bit confusing, because we get stuck with some overloaded terminology. But it’s basically just a symmetry-preserving surjection – that is, it’s a reversable mapping from a group into a second group, where the group symmetry of the first group is mapped onto the group symmetry of the second by the arrow. You should be able to follow the meanings of the other kinds of morphisms by using similar reasoning.

These are all, of course, very simple examples. They’re all concrete categories, where the objects are (speaking very loosely) sets. There are categories – many very interesting ones – where the objects aren’t sets or even particularly set-like – but we’ll get to those later. For now, the ideas of objects as sets gives you an intuitive handle to grab onto.

Leading up to Topoi: Getting Back to Categories

As I mentioned a few posts ago, I recently changed jobs. I left Google, and I’m now working for foursquare. Now that I’m done with the whole job-hunting thing, and things are becoming relatively saner and settling down, I’m trying to get back to blogging.

One thing that I’ve been wanting to spend some time learning about is Topoi theory. Topoi theory is an approach to building logic and mathematics using category theory as a fundamental basis instead of set theory. I’ve been reading the textbook Topoi: The Categorial Analysis of Logic (Dover Books on Mathematics), and I’ll be blogging my way through it. But before I get started in that, I thought it would be a good idea to revise and rerun my old posts on category theory.

To get started, what is category theory?

Back in grad school, I spent some time working with a thoroughly insane guy named John Case who was the new department chair. When he came to the university, he brought a couple of people with him, to take temporary positions. One of them was a category theorist whose name I have unfortunately forgotten. That was the first I’d ever heard of cat theory. So I asked John what the heck this category theory stuff was. His response was “abstract nonsense”. I was astonished; a guy as wacky and out of touch with reality as John called something abstract nonsense? It turned out to be a classic quotation, attributed to one of the founders of category theory, Norman Steenrod. It’s silly and sarcastic, but it’s also not an entirely bad description. Category theory takes abstraction to an extreme level.

Category theory is one of those fields of mathematics that fascinates me: where you take some fundamental concept, and abstract it down to its bare essentials in order to understand just what it really is, what it really means. Just like group theory takes the idea of an algebraic operation, strip it down to the bare minimum, and discovering the meaning of symmetry; category theory looks at what happens if you take the concept of a function as a mapping from one thing to another, and strip it down to the bare minimum, and see what you can discover?

The fundamental thing in category theory is an arrow, also called a morphism. A morphism is an abstraction of the concept of homomorphism, which I talked about a bit when I was writing about group theory. Category theory takes the concept function mapping from one set of values to another, and strips it down to itsbarest essentials: the basic concept of something that maps from one thing to some other thing.

The obvious starting point for our exploration of category theory is: what the heck is a category?

To be formal, a category C is a tuple: (O, M, circ), where:

  1. O (or Obj(C)) is a set of objects. Objects can be anything, so long as they’re distinct, and we can tell them apart. All that we’re going to do is talk about mappings between them – so as long as we can identify them, it doesn’t matter what they really are. We’ll look at categories of sets, of numbers, of topological spaces, and even categories of categories.

  2. M (or Mor(C)) is a set of morphisms, also called arrows. Each morphism is a mapping from an object in O called its source, to an object in O called its target. Given two objects a and b in O, we’ll write Mor(a,b) for the set of morphisms from a to b. To talk about a specific morphism f from a to b, we’ll write it as f : a rightarrow b.
  3. circ is the composition operator: dot is a binary operation that is the abstraction of function composition; circ; given an arrow f in Mor(a,b), and an arrow g in Mor(b,c), f circ g in Mor(a,c). It’s got the basic properties of function composition:

    1. Associativity: forall f : a rightarrow b, g : b rightarrow c, h : c =rightarrow d) h circ (g circ f) = (h circ g) circ f.
    2. Identity: forall a,b in O(C): exists 1_a, 1_b in Mor(C): forall f : a rightarrow b: 1_b circ f = f = f circ 1_a.

One neat little trick to simplify things is that we can actually throw away Obj(C), and replace it with the set of identity morphisms: since there is exactly one identity morphism per object, there’s no real need to distinguish between the identity morphism and the object. It’s a nice trick because it means that we have nothing but morphisms all the way down; but it’s really just a trick. We can talk about Obj(C); or Id(C); but we still need to be able to talk about the objects in some way, whether they’re just a subset of the morphisms, or something distinct.

Now, we get to something about category theory that I really don’t like. Category theory is front-loaded with rather a lot of definitions about different kinds of morphisms and different kinds of categories. The problem with that is that these definitions are very important, but we don’t have enough of the theory under our belts to be able to get much of a sense for why we should care about them, or what their intuitive meaning is. But that’s the way it goes sometimes; you have to start somewhere. It will make sense eventually, and you’ll see why these definitions matter.

There are a lot of special types of morphisms, defined by properties. Here’s the basic list:

  • A monomorphism (aka a monic arrow ) is an arrow f : a rightarrow b such that forall g_1, g_2: x rightarrow a: f circ g_1 = f circ g_2 Leftrightarrow g_1 = g_2. That is, f is monic if and only if, when composed with other arrows, it always produces different results for different arrows.
  • An epic (or epimorphism) is an arrow f : a rightarrow b such that forall g_1, g_2: b rightarrow x):  g_1 circ f = g_2 circ f Leftrightarrow  g_1 = g_2. This is almost the same as a monic, but it’s from the other side of the composition; instead of f circ g_i in the definition, it’s g_i circ f; so an arrow is epic if when another arrow is composed with f, it always produces different results for different arrows.
  • An isomorphism is an arrow f : a rightarrow b such that exists g: b rightarrow a: f circ g = 1_b land g circ f = 1_a – an isomorphism is, basically, a reversible arrow: there’s a morphism that always reverses the action of an iso arrow.
  • An endomorphism is an arrow f : a rightarrow  b where a = b. It’s sort of like a weak identity arrow.
  • An automorphism is an arrow that is both an endmorphism and an isomorphism.

One last definition, just because it gives me a chance to point out something useful about category theory. A functor is a morphism in the category of all categories. What that means is that it’s a structure-preserving mapping between categories. It’s neat in a theoretical way, because it demonstrates that we’re already at a point where we’re seeing how category theory can make it easier to talk about something complicated: we’re using it to describe itself! But the concept of functor also has a lot of applications; in particular, the module system of my favorite programming language makes extensive use of functors.

In Ocaml, a module is something called a structure, which is a set of definitions with constrained types. One of the things you often want to be able to do is to write a piece of code in a way that allows you to make it parametric on some other structure. The way you do that is to write a functor: a “function” from a structure to a structure. For example, to implement a generic binary tree, you need a type of values that you’ll put in the tree; and an operation to compare values. The way you do that is to write a functor which takes a structure defining a type and a comparison operator, and mapping it to a structure which is an implementation of a binary tree for that type and comparison.

The Ocaml functor is a category theoretic functor: category theory provides an easy way to talk about the concept of the compile-time “function” from structure to structure.

A Taste of Specification with Alloy

In my last post (which was, alas, a stupidly long time ago!), I talked a bit about software specification, and promised to talk about my favorite dedicated specification tool, Alloy. Alloy is a very cool system, designed at MIT by Daniel Jackson and his students.

Alloy is a language for specification, along with an environment which allows you to test your specifications. In a lot of ways, it looks like a programming language – but it’s not. You can’t write programs in Alloy. What you can do is write concise, clear, and specific descriptions of how something else works.

I’m not going to try to really teach you Alloy. All that I’m going to do is give you a quick walk-though, to try to show you why it’s worth the trouble of learning. If you want to learn it, the Alloy group’s website has a really good the official Alloy tutorial. which you should walk through.

Continue reading

The Value of Tests: It's more than just testing!

Since I have some free time, I’ve been catching up on some of the stuff
I’ve been meaning to read. I’ve got a reading list of stuff that I’ve wanted
to look at that were written by other authors with my publisher. Yesterday, I started looking at Cucumber, which is an interesting behavior-driven development tool. This post isn’t really about Cucumber, but about something that Cucumber reminded me of.

When a competent programmer builds software, they write tests. That’s just
a given. But why do we do it? It seems like the answer is obvious: to make sure that our software works. But I’d argue that there’s another reason, which in the long run is as important as the functional one. It’s to describe what the software does. A well-written test doesn’t just make sure that the software does the right thing – it tells other programmers what the code is supposed to do.

A test is an executable specification. Specifications are a really good thing; executable specifications are even better.

Continue reading

Stuff Everyone Should Do (part 2): Coding Standards

Another thing that we did at Google that I thought was surprisingly effective and useful was strict coding standards.

Before my time at Google, I was sure that coding standards were pointless. I had absolutely no doubt that they were the kind of thing that petty bureaucrats waste time writing and then use to hassle people who are actually productive.

I was seriously wrong.

At Google, I could look at any piece of code, anywhere in Google’s codebase, and I could read it. The fact that I was allowed to do that was pretty unusual in itself. But what was surprising to me was just how much the standardization of style – indents, names, file structures, and comment conventions – made it dramatically easier to look at a piece of unfamiliar code and understand it. This is still surprising to me – because those are all trivial things. They shouldn’t have much impact – but they do. It’s absolutely shocking to realize how much of the time you spend reading code is just looking for the basic syntactic structure!

There’s a suite of common objections to this, all of which I used to believe.

It wastes time!
I’m a good coder, and I don’t want to waste time on stupidity. I’m good enough that when I write code, it’s clear and easy to understand. Why should I waste my time on some stupid standard? The answer is: because there is a value in uniformity. As I alluded to earlier – the fact that every piece of code that you look at — whether it was written by you, by one of your closest coworkers, or by someone 11 timezones away — will always demarcate structures in the same way, will always use the same naming conventions – it really, genuinely makes a big difference. You need so much less effort to read code that you haven’t looked at in a while (or at all), because you can immediately recognize the structure.
I’m an artist!
This is phrased facetiously, but it does reflect a common complaint. We programmers have a lot of pride in our personal style. The code that I write really does reflect something about me and how my mind works. It’s a reflection of my skill and my creativity. If I’m forced into some stupid standard, it seems like it’s stifling my creativity. The thing is, the important parts of your style, the important reflections of your mind and your creativity aren’t in trivial syntactic things. (If it is, then you’re a pretty crappy programmer.) The standard actually makes it easier for other people to see your creativity – because they can actually see what you’re doing, without being distracted by the unfamiliar syntactic quirks.
One size fits all actually fits none!
If you have a coding standard that wasn’t designed specifically for your project, then it’s probably non-optimal for your project. That’s fine. Again, it’s just syntax: non-optimal doesn’t mean bad. The fact that it’s not ideal for your project doesn’t mean that it’s not worth doing. Yeah, sure, it does reduce the magnitude of the benefit for your project, but at the same time, it increases the magnitude of the benefit for the larger organization. In addition, it frequently makes sense to have project-specific code styles. There’s nothing wrong with having a project-specific coding standard. In fact, in my experience, the best thing is to have a very general coding standard for the larger organization, and then project-specific extensions of that for the project-specific idioms and structures.
I’m too good for that!
This is actually the most common objection. It’s sort-of a combination of the others, but it gets at an underlying attitude in a direct way. This is the belief on the part of the complainer that they’re a better programmer than whoever wrote the standard, and lowering themselves to following the standard written by the inferior author will reduce the quality of the code. This is, to put it mildly, bullshit. It’s pure arrogance, and it’s ridiculous. The fact of the matter is that no one is so good that any change to their coding style will damage the code. If you can’t write good code to any reasonable coding standard, you’re a crappy programmer.

When you’re coding against a standard, there are inevitably going to be places where you disagree with the standard. There will be places where your personal style is better than the standard. But that doesn’t matter. There will, probably, also be places where the standard is better than your style. But that doesn’t matter easier. As long as the standard isn’t totally ridiculous, the comprehension benefits are significant enough to more than compensate for that.

But what if the coding standard is totally ridiculous?

Well, then, it’s rough to be you: you’re screwed. But that’s not really because of the ridiculous coding standard. It’s because you’re working for idiots. Screwing up a coding standard enough to really prevent good programmers from writing good code is hard. It requires a sort of dedicated, hard-headed stupidity. If you’re working for people who are such idiots that they’d impose a broken coding standard, then they’re going to do plenty of other stupid stuff, too. If you’re working for idiots, you’re pretty much screwed no matter what you do, coding standard or no. (And I don’t mean to suggest that software businesses run by idiots are rare; it’s an unfortunate fact, but there’s no shortage of idiots, and there are plenty of them that have their own businesses.)

Things Everyone Should Do: Code Review

As I alluded to in my last post (which I will be correcting shortly), I no longer work for Google. I still haven’t decided quite where I’m going to wind up – I’ve got a couple of excellent offers to choose between. But in the interim, since I’m not technically employed by anyone, I thought I’d do a bit of writing about some professional things that are interesting, but that might have caused tension with coworkers or management.

Google is a really cool company. And they’ve done some really amazing things – both outside the company, where users can see it, and inside the company. There are a couple of things about the inside that aren’t confidential, but which also haven’t been discussed all that widely on the outside. That’s what I want to talk about.

The biggest thing that makes Google’s code so good is simple: code review. That’s not specific to Google – it’s widely recognized as a good idea, and a lot of people do it. But I’ve never seen another large company where it was such a universal. At Google, no code, for any product, for any project, gets checked in until it gets a positive review.

Everyone should do this. And I don’t just mean informally: this should really be a universal rule of serious software development. Not just product code – everything. It’s not that much work, and it makes a huge difference.

What do you get out of code review?

There’s the obvious: having a second set of eyes look over code before it gets checked in catches bugs. This is the most widely cited, widely recognized benefit of code review. But in my experience, it’s the least valuable one. People do find bugs in code review. But the overwhelming majority of bugs that are caught in code review are, frankly, trivial bugs which would have taken the author a couple of minutes to find. The bugs that actually take time to find don’t get caught in review.

The biggest advantage of code review is purely social. If you’re programming and you know that your coworkers are going to look at your code, you program differently. You’ll write code that’s neater, better documented, and better organized — because you’ll know that people who’s opinions you care about will be looking at your code. Without review, you know that people will look at code eventually. But because it’s not immediate, it doesn’t have the same sense of urgency, and it doesn’t have the same feeling of personal judgement.

There’s one more big benefit. Code reviews spread knowledge. In a lot of development groups, each person has a core component that they’re responsible for, and each person is very focused on their own component. As long as their coworkers components don’t break their code, they don’t look at it. The effect of this is that for each component, only one person has any familiarity with the code. If that person takes time off or – god forbid – leaves the company, no one knows anything about it. With code review, you have at least two people who are familiar with code – the author, and the reviewer. The reviewer doesn’t know as much about the code as the author – but they’re familiar with the design and the structure of it, which is incredibly valuable.

Of course, nothing is every completely simple. From my experience, it takes some time before you get good at reviewing code. There are some pitfalls that I’ve seen that cause a lot of trouble – and since they come up particularly frequently among inexperienced reviewers, they give people trying code reviews a bad experience, and so become a major barrier to adopting code review as a practice.

The biggest rule is that the point of code review is to find problems in code before it gets committed – what you’re looking for is correctness. The most common mistake in code review – the mistake that everyone makes when they’re new to it – is judging code by whether it’s what the reviewer would have written.

Given a problem, there are usually a dozen different ways to solve it. Andgiven a solution, there’s a million ways to render it as code. As a reviewer, your job isn’t to make sure that the code is what you would have written – because it won’t be. Your job as a reviewer of a piece of code is to make sure that the code as written by its author is correct. When this rule gets broken, you end up with hard feelings and frustration all around – which isn’t a good thing.

The thing is, this is such a thoroughly natural mistake to make. If you’re a programmer, when you look at a problem, you can see a solution – and you think of what you’ve seen as the solution. But it isn’t – and to be a good reviewer, you need to get that.

The second major pitfall of review is that people feel obligated to say something. You know that the author spent a lot of time and effort working on the code – shouldn’t you say something?

No, you shouldn’t.

There is never anything wrong with just saying “Yup, looks good”. If you constantly go hunting to try to find something to criticize, then all that you accomplish is to wreck your own credibility. When you repeatedly make things to criticize just to find something to say, then the people who’s code you review will learn that when you say something, that you’re just saying it to fill the silence. Your comments won’t be taken seriously.

Third is speed. You shouldn’t rush through a code review – but also, you need to do it promptly. Your coworkers are waiting for you. If you and your coworkers aren’t willing to take the time to get reviews done, and done quickly, then people are going to get frustrated, and code review is just going to cause frustration. It may seem like it’s an interruption to drop things to do a review. It shouldn’t be. You don’t need to drop everything the moment someone asks you to do a review. But within a couple of hours, you will take a break from what you’re doing – to get a drink, to go to the bathroom, to talk a walk. When you get back from that, you can do the review and get it done. If you do, then no one will every be left hanging for a long time waiting on you.

Topoi Prerequisites: an Intro to Pre-Sheafs

I’m in the process of changing jobs. As a result of that, I’ve actually got some time between leaving the old, and starting the new. So I’ve been trying to look into Topoi. Topoi are, basically, an alternative formulation of mathematical logic. In most common presentations of logic, set theory is used as the underlying mathematical basis – set theory and a mathematical logic built alongside it provide a complete foundational structure for mathematics.

Topoi is a different approach. Instead of starting with set theory and a logic with set theoretic semantics, Topoi starts with categories. (I’ve done a bunch of writing about categories before: see the archives for my category theory posts.)

Reading about Topoi is rough going. The references I’ve found so far are seriously rough going. So instead of diving right in, I’m going to take a couple of steps back, to some of the foundational material that I think helps make it easier to see where the category theory is coming from. (As a general statement, I find that category theory is fascinating, but it’s so abstract that you really need to do some work to ground it in a way that makes sense. Even then, it’s not easy to grasp, but it’s worth the effort!)

A lot of category theoretic concepts originated in algebraic topology. Topoi follows that – one of its foundational concepts is related to the topological idea of a sheaf. So we’re going to start by looking at what a sheaf is.

Continue reading

The Perils of Premature Optimization

A reader who’s known me for a while just sent me a note asking me to say something about premature optimization. The reason that I mention that he’s known me for a while (in fact, someone who used to be a coworker) is because this is a topic that I’m prone to rant about in person, but I’ve never mentioned on the blog. So he knows that it’s something that I’ve got a strong opinion about. Basically, he’s having trouble dealing with an annoying coworker doing this, and he wants a rant-by-proxy. I’m OK with that. :-).

When you’re writing a new piece of software, particularly on a modern computer, one of the unintuitive things that frequently happens is that the performance of your system is really quite different from what you’d expect.And even when everything is as expected, most people don’t have a particularly good sense of tradeoffs – if I make this thing faster, what effect will it have on that? and if it does really improve the performance, how much will it improve it, and at what cost? If you can’t answer that – and answer it precisely, with supporting evidence – then you have no business optimizing.

So when you sit down to write some new code, what you should really do is write code that’s algorithmically efficient – that is, you pick an algorithm that has good performance in asymptotic time – and implement it in a straightforward way.

But what many people do – in fact, what pretty much all of us do at least some of the time – is try to be clever. We try to find opportunities to change the code to make it faster. Doing that before you understand where the computer actually spends its time when it’s running the program is what we call premature optimization.

Why not?

Continue reading

What if it's not Regular? Pump it!

At this point, we’ve seen a fair bit about regular languages, and we’ve gone through the introduction to context free languages. We know one way of showing that a language is regular or context free: if you can write a (regular/context free) grammar for a language, then that language is necessarily (regular/context free). But… what if we have a language that we suspect is not regular?

Continue reading