Combinator Parsing, part 1

My professional work is focused on making the lives of engineers building software better. I got into that area because I was fascinated – or even obsessed – with programming languages. That led me to the more general area of engineering tools of all types. But programming languages and how they’re implemented remain one of my great fascinations.

If you’re trying to implement a new programming language (or re-implement an old one), the first thing you need to do is called parsing. Parsing consists of reading in a file containing a bunch of text, making sure that it’s correctly structured, and then using its structure to translate it into some kind of data structure.

For example, I implemented a silly programming language called Cruise. In Cruise, a snippet from a program looks like:

actor !Adder {
  behavior :Add() {
    on Plus(Z(),$x, $target) { send $x to $target }
    on Plus(I($x), $y, $target) { send Plus($x,I($y), $target) to $self }
  }
  initial :Add
}

In the Cruise implementation, the parser reads that input and translates it into a structure called an abstract syntax tree. For the code in the example above, part of the syntax tree looks something like the following:

Actor[
  name="actor"
  behaviors=[
    Behavior[
      pattern=Tuple[
        name="Plus",
        Tuple[name="Z"],
        Var[name="x"],
        Var[name="target"]],
      action=[Send[Var["x"], Var["target"]]]],
    Behavior[
      pattern=Tuple[
        name="Plus",
        Tuple[name="I", Var["x"]],
        Var["y"],
        Var["target"]],
      action=[
        Send[
          Tuple[name="Plus",
            Var["x"],
            Tuple[name="I", Var["y"]]],
          Var["target"]]]]]]

When you want to write a parser, you need to start by describing
what you want to parse. Most of the time, we do that using
a context free grammar (CFG). A CFG is a set of rules that
describe how to produce texts with a particular structure. For
example, suppose you wanted to implement a simple calculator (one
of the canonical examples used when teaching parsing!). You’d
need to describe what the expressions you want to parse
look like. A pretty typical version of this would be:

expr := add_expr
add_expr := mult_expr ( ( '+' | '-' ) mult_expr)*
mult_expr := unary_expr ( ('*' | '/') unary_expr)*
unary_expr := '-' simple | simple
simple := number | '(' expr ')'
number := digit+

This grammar is a way of expressing a system for producing valid expressions. The easiest way to see how it works is to look at an example. We know that something like “3 + 4 * (-5 + 6)” is an expression. So how would we produce that using the grammar?

  1. Start with expr. That always expands to add_expr.
  2. Expand add_expr to mult_expr ‘+’ mult_expr.
  3. Expand the first instance of mult_expr to unary_expr, giving us “unary_expr+mult_expr”.
  4. Expand the first unary_expr to simple: “simple+mult_expr”.
  5. Expand simple to number: “number+mult_expr”.
  6. Expand number to the number 3: “3+mult_expr”.
  7. Expand the mult_expr to “unary_expr*unary_expr”: “3+unary_expr*unary_expr”.
  8. Expand the first unary_expr to simple, and simple to number, and number to “4”: “3+4*unary_expr”.
  9. Expand the remaining unary_expr to simple, and simple to a parenthesized expression: “3+4*(expr)”.
  10. And so on, until we end up with “3+4*(-5+6)”.

When I started learning to write compilers, there were two ways that people wrote parsers. They either implemented them manually, using a technique called recursive descent, or they used a table-driven parser generator like yacc, where you wrote the grammar using a special syntax, and then a tool translated it into the implementation of a parser.

Writing a parser using recursive descent was a huge pain. It’s very tedious and repetitive, and like any meticulous process, it’s very error-prone. In short, it’s no damn fun at all.

Writing a parser using a generator was usually much easier – but if something didn’t work, it would be a huge pain. The problem is that the code you write doesn’t end up looking very much like the code you run. For example, to write our expression language, in a typical parser generator, you’d write something like the following. (The grammar gets re-arranged a bit, to fit the particular underlying parser algorithm that the generator uses.)

%start expr
expr: add_expr;
add_expr:
    add_expr '+' mult_expr
  | add_expr '-' mult_expr
  | mult_expr
  ;

mult_expr:
    mult_expr '*' unary_expr
  | mult_expr '/' unary_expr
  | unary_expr
  ;

unary_expr:
    '-' simple
  | simple
  ;

simple:
    number
  | '(' expr ')'
  ;

The parser generator reads the grammar that we wrote, and translates it into a set of lookup tables, which get interpreted by semi-generic parser code provided by the generator toolkit. What that means is that when your parser actually executes, there’s nothing called unary_expr. There’s some set of states encoded into magic variables with names like yytable and yytrans. If you got something wrong in your parser, debugging it is painful at best, because the grammar is gone: it’s been translated into something completely incomprehensible to human eyes.

Not all parser generators generate code that’s this extreme when it comes to readability. But that’s what the output of the most widely used one, called YACC, looks like; YACC has been ported to and used with C/C++, Caml, Go, Fortran, Pascal, D, SML, and more.

But even if you’re not using YACC, it’s not a good scene. Even with ANTLR, my favorite parser generator, which makes a heroic effort to generate efficient but human-readable parsers, the fact remains that the code that you’re running and trying to debug isn’t familiar to you: it’s not what you wrote. You’re still stuck trying to work through an unfamiliar translation.

Honestly, both approaches – the manual parser, and the parser generator – suck an awful lot of the time.

Then, a few years ago, a gang of functional programmers came up with a truly brilliant scheme. You can describe a parser as a special kind of function from inputs to outputs. If you do that, you can also combine parsers using higher-order functions. Using those functions, all you need to do is write tiny parsers for the simplest elements of your language, and then you can build the parser for your language by just combining your simple parsers in different ways. When you implement a parser this way, what you get is very close to the original grammar of the language, but it’s an actual executable program: the code that you need to read and debug is exactly the code that you wrote.

For example, suppose I wanted to parse that expression language. Using a Python version of parser combinators, I can write:

digit = Parser.match(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'])
number = digit.many(1)
parens = Parser.match(['(']) & Reference('expr') & Parser.match([')'])
simple = number | parens
unary_expr = Parser.match(['-']).opt() & simple
mult_expr = unary_expr &  (Parser.match(['*', '/']) & unary_expr).many()
add_expr = mult_expr & (Parser.match(['-', '+']) & mult_expr).many()
expr = add_expr
Reference.register_named_parser('expr', add_expr)

That’s not a contrived example. That’s actually a real, honest-to-god parser for our expression language! We’ll look at it in more detail later. But first, we need to understand what parsers and parser combinators really are, and how they work.

At this point, it’s time to start getting more concrete. Just what is a parser, precisely?

In keeping with the fact that this approach came from the functional programming community, it makes sense to describe a parser as a function. If you think of it this way, a parser is a special kind of function:

  • It takes a sequence of inputs;
  • It can either succeed or fail;
  • If it succeeds, it consumes part of its input, and produces an output.

That means that we can come up with a sort-of type signature pretty easily. It’s a function from an input to a result value. The result value can be two types: either a success, or a failure. If it’s a success, it contains both the output value produced by the parser and the unconsumed part of the input. If it fails, the failure contains nothing.

In Python, that’s easy to represent in several different ways. We’ll do it in the one that’s closest to a typed approach: an abstract class ParseResult, with subclasses for success or failure.

class ParseResult(object):
  def succeeded(self):
    return False


class Success(ParseResult):
  """A successful parse result, containing the value produced by a parser,
  and the unconsumed part of its input.
  """
  def __init__(self, output, rest):
    self.output = output
    self.rest = rest

  def succeeded(self):
    return True


class Failure(ParseResult):
  def __init__(self):
    pass

The inputs are just a sequence of values. The parsers can look at one element of the input at a time. In Python, we’ll represent them using a class.

from abc import abstractmethod

class ParserInput(object):
  @abstractmethod
  def at_end(self):
    """Return True if there's no remaining input."""

  @abstractmethod
  def first(self):
    """Return the first element of this input."""

  @abstractmethod
  def rest(self):
    """Return a ParserInput consisting of the remainder after the first
    element has been consumed.
    """

With those definitions, a parser becomes an object with one method, which takes a ParserInput, and produces a ParseResult.

class Parser(object):
  def parse(self, inp):
    """inp should be an instance of ParserInput; returns an instance of ParseResult"""

Before we get to the combinators, let’s look at an example of a simple parser. Suppose we want to accept a digit. We can implement that as:

class DigitParser(Parser):
  def parse(self, inp):
    # Check for end-of-input before looking at the first character.
    if not inp.at_end() and inp.first().isdigit():
      return Success(inp.first(), inp.rest())
    else:
      return Failure()

This looks at the first character of the input stream. This is mostly functional, so it does this without altering the input stream. If the first character is a digit, then it succeeds, returning the digit character and the remaining input; otherwise, it fails.

Simple code like that little snippet is all you need to write yourself. It’s so incredibly simple that, looking at it, it seems almost too simple. How on earth can anything this trivially simple actually be the basis of something as powerful as the infamous YACC?

The answer is combinators: functions that combine parsers together. This simple, basic, functional mechanism makes it possible to combine simple parsers using simple functions – producing a limitless variety of parsers, which can parse pretty much anything you choose to throw at them. So what kinds of combinators do we need? There are four basics that I used in my implementations, and one that’s an artifact of the implementation.

  1. Sequence: Parse an “A” followed by a “B”. In my Python implementation, this is written A & B. It succeeds only if both “A” and “B” succeed in sequence. The output of this is a list containing the outputs from “A”, “B”, and whatever else is in the sequence with them.
  2. Choice: Parse either an “A” or if that fails, parse a “B”. That’s written A | B. The output is whichever one succeeds, or else fails.
  3. Option: Parse either an “A”, or nothing, written a.opt(). The output is either the output from parsing an “A”, or None.
  4. Repetition: Parse a sequence of repetitions of “A” (at least min of them), written a.many(min). The result is a list of the outputs of the invocations of “A”.
  5. Reference: I’m going to tell you what kind of thing to parse here later. For now, it’s just named “A”. The output is, of course, whatever the referenced parser outputs.

The last one definitely needs some extra explanation. Most interesting grammars are, ultimately, recursive. In our calculator’s expression grammar, the rule expr contains an add_expr, which contains a mult_expr, which contains a unary_expr, which contains a simple, which could expand to something that contains an expr. That leads to a problem constructing our parsers: we can’t actually build an instance of an expr, because it needs to contain itself. The Reference combinator is the solution to this: it lets you say Call the parser named “A”; I’ll tell you what parser that is later. If you look at the parser implementation I showed earlier, you can see this:

parens = Parser.match(['(']) & Reference('expr') & Parser.match([')'])
...
Reference.register_named_parser('expr', add_expr)

In the definition of the parens rule, we used Reference('expr'), which says “call whatever parser is registered with the name ‘expr'”. Then later on, we registered the add_expr parser with the name ‘expr’. Voila! Recursion.
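
None of the snippets above show how the combinators themselves are implemented, but each one is only a few lines of code. Here’s a rough sketch of one way the sequence, choice, and reference combinators could be written; the class names and the exact shape of the outputs are illustrative guesses, and the real versions in the repository differ in their details:

class AndParser(Parser):
  """Sequence: parse each sub-parser in order; succeed only if all of them do.
  The output is a flat list of the outputs of the sub-parsers.
  """
  def __init__(self, *parsers):
    self.parsers = parsers

  def __and__(self, other):
    # Chaining more &s extends the same sequence, so the output stays flat.
    return AndParser(*(self.parsers + (other,)))

  def parse(self, inp):
    outputs = []
    for p in self.parsers:
      result = p.parse(inp)
      if not result.succeeded():
        return Failure()
      outputs.append(result.output)
      inp = result.rest
    return Success(outputs, inp)


class OrParser(Parser):
  """Choice: try the first parser; if it fails, try the second on the same input."""
  def __init__(self, first, second):
    self.first = first
    self.second = second

  def parse(self, inp):
    result = self.first.parse(inp)
    if result.succeeded():
      return result
    return self.second.parse(inp)


class Reference(Parser):
  """A placeholder that looks up a named parser at parse time."""
  _registry = {}

  def __init__(self, name):
    self.name = name

  @classmethod
  def register_named_parser(cls, name, parser):
    cls._registry[name] = parser

  def parse(self, inp):
    return self._registry[self.name].parse(inp)


# Wire the operators used in the examples onto the base class.
Parser.__and__ = lambda self, other: AndParser(self, other)
Parser.__or__ = lambda self, other: OrParser(self, other)

The opt() and many() combinators follow the same pattern: each wraps an underlying parser in a new Parser whose parse method either tolerates a failure (producing None as its output) or loops, collecting outputs, until the wrapped parser stops succeeding.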

There’s one more thing that I used in the calculator example: Parser.match(). That’s just a helpful little piece of code that I wrote to create parsers that look for one character from a fixed set. It’s implemented with a class named SetParser. Parser.match(['(']) will only match a “(” in the input; Parser.match(['a', 'b', 'c']) will match either an “a”, a “b”, or a “c”. Implementing it is as easy as you’d guess at this point:

class SetParser(Parser):
  def __init__(self, chars):
    self.chars = chars

  def pr(self):
    return "SetParser%s" % self.chars

  def parse(self, inp):
    # Check for end-of-input before looking at the first character.
    if not inp.at_end() and inp.first() in self.chars:
      return Success(inp.first(), inp.rest())
    else:
      return Failure()
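
The Parser.match() helper used throughout the examples is presumably nothing more than a tiny factory that wraps SetParser; something along these lines:

def match(chars):
  """Build a parser that accepts any single character from the given set."""
  return SetParser(chars)

# Attach it to the base class so that it can be called as Parser.match([...]).
Parser.match = staticmethod(match)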

With these, you should now be able to completely read and understand the implementation of our expression parser. Let’s walk through it.

digit = Parser.match(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']) 
number = digit.many(1)
parens = Parser.match(['(']) & Reference('expr') & Parser.match([')'])
simple = number | parens
unary_expr = Parser.match(['-']).opt() & simple
mult_expr = unary_expr &  (Parser.match(['*', '/']) & unary_expr).many()
add_expr = mult_expr & (Parser.match(['-', '+']) & mult_expr).many()
expr = add_expr
Reference.register_named_parser('expr', add_expr)

  • digit: this is a SetParser that reads a single digit.
  • number: a number is just a sequence of at least one digit.
  • parens: this is a parenthesized expression: an open paren followed by an expression, followed by a close paren.
  • simple: a simple expression is either a number, or a parenthesized expression.
  • unary_expr: a unary expression is an optional “-”, followed by a simple.
  • mult_expr: a multiplication expression is a unary_expr, followed by a collection of multiplication operators and operands.
  • add_expr: the same thing as a mult_expr, except that it uses add operations. By defining them separately, with multiplication embedded inside addition, we gave multiplication a higher precedence.
  • expr is just an alias for add_expr.
  • Reference: finally, we bind the reference from the parenthesized expression.

Now that we have a parser, we can run it!
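
One piece that hasn’t appeared yet is StringParserInput: a concrete ParserInput that walks over a string. A minimal sketch of it could look like this:

class StringParserInput(ParserInput):
  """A ParserInput over a string, tracking the current position in it."""
  def __init__(self, s, pos=0):
    self.s = s
    self.pos = pos

  def at_end(self):
    return self.pos >= len(self.s)

  def first(self):
    return self.s[self.pos]

  def rest(self):
    return StringParserInput(self.s, self.pos + 1)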

>>> inp = StringParserInput("1+2*(3+5*4)*(6+7)")
>>> print(expr.parse(inp).output)
[None, ['1'], [], [['+', [None, ['2'], [['*', [None, ['(', [None, ['3'], [], [['+', [None, ['5'],
 [['*', [None, ['4']]]]]]]], ')']]], ['*', [None, ['(', [None, ['6'], [], [['+', [None, ['7'],
  []]]]], ')']]]]]]]]

It works, but wow, that output is horrible! It’s basically a tree, formed from lists, that describes the parse process. It’s got all of the information we’d need to evaluate the expressions parsed by our grammar, only it’s in a terrible form.

That brings us to one last thing to add: Actions. In terms of parser combinators, an action is a parser that takes another parser, and a function. The output of the action is the result of applying the function to the embedded parser:

class Action(Parser):
  def __init__(self, parser, act):
    self.parser = parser
    self.action = act

  def pr(self):
    return "Action[%s]" % self.parser.pr()

  def parse(self, inp):
    result = self.parser.parse(inp)
    if result.succeeded():
      return Success(self.action(result.output), result.rest)
    else:
      return Failure()

Using that, we can turn our expression parser into a calculator!

def digits_to_number(digits, running=0):
  """Convert a list of digits to an integer"""
  if len(digits) == 0:
    return running
  else:
    r = (running * 10) + int(digits[0])
    return digits_to_number(digits[1:], r)

digit = Parser.match(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'])
number = Action(digit.many(1), digits_to_number)

In this version, number now actually outputs an integer, which is the number represented by the list of digits produced by the parser.
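
For example, feeding it a string input like the StringParserInput sketched earlier, the number parser should now produce an actual integer:

>>> number.parse(StringParserInput("123+456")).output
123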

We can attach similar actions to all of the other rules. For example, the new version of our addition expression is:

def eval_add(lst):
  """Evaluate an addition expression. For addition rules, the parser will return
  [number, [[op, number], [op, number], ...]]
  To evaluate that, we start with the first element of the list as result value,
  and then we iterate over the pairs that make up the rest of the list, adding
  or subtracting depending on the operator.
  """
  first = lst[0]
  result = first
  for n in lst[1]:
    if n[0] == '+':
      result += n[1]
    else:
      result -= n[1]
  return result

add_expr = Action(mult_expr & (Parser.match(['-', '+']) & mult_expr).many(), eval_add)
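
A multiplication action looks almost identical. Here’s a sketch of what it could be; the real versions of all of the actions are in the repository linked below:

def eval_mult(lst):
  """Evaluate a multiplication expression. The parser output has the same
  shape as for addition: [number, [[op, number], [op, number], ...]]
  """
  result = lst[0]
  for n in lst[1]:
    if n[0] == '*':
      result *= n[1]
    else:
      result /= n[1]
  return result

# In the real code, the actions get attached when each rule is first defined,
# so mult_expr already has its action by the time add_expr is built from it.
mult_expr = Action(unary_expr & (Parser.match(['*', '/']) & unary_expr).many(), eval_mult)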

With similar actions attached to the other rules, it all works. (The code for the other actions is in the github repository, linked below):

>>> inp = StringParserInput("1+2*(3+5*4)*(6+7)")
>>> print(expr.parse(inp).output)
599

That’s about it for doing this in Python. The code is all here on Github, in the “python” subdirectory. The rest of the github project is the Java version. It’s more complicated to do all of this in Java, because of the type system. We’ll take a look at that in the next post. The syntax is more verbose, and the types make things a bit more complicated, but it’s still amazingly easy to build a useful parser this way.

This implementation isn’t quite what I’d call industrial strength:

  1. It’s not super-efficient; particularly when it’s parsing a multiple-precedence expression, it can do a whole lot of backtracking, which makes things very slow. To fix this, the best thing would be to add something called operator precedence parsing for handling infix expressions. Parsec, the Haskell library that introduced me to combinator parsing, does this, and it works beautifully. It would be nice to add that, but for my purposes (i.e., writing this blog post, plus a bit of language parsing for my Apex project), it’s not necessary.
  2. It has terrible error handling. In fact, it doesn’t have error handling at all. If there’s a syntax error in your input, all that will happen is that the parser will fail. To fix this, I’ll need to add a third kind of return value to the basic parser function: a parser should be able to succeed, fail without error, and fail with an error. This is something I really need to add if I ever want to use this code for anything even semi-real.
  3. It doesn’t track positions. In other words, when I get it to the point where it can actually produce a syntax error, it can’t say where the error is. I’d really like to be able to say “There’s an error at line 3 column 7”, but right now, there’s no way to do that. This one is relatively easy to add: the parser input just needs to be able to report a line and column number. I’m definitely planning on adding that.
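
For what it’s worth, that change is a small one. Building on the StringParserInput sketched earlier, a position-tracking input could look something like this:

class TrackedStringParserInput(StringParserInput):
  """A string input that also tracks line and column, so that an eventual
  error message could say where parsing stopped.
  """
  def __init__(self, s, pos=0, line=1, col=1):
    StringParserInput.__init__(self, s, pos)
    self.line = line
    self.col = col

  def rest(self):
    if self.first() == '\n':
      return TrackedStringParserInput(self.s, self.pos + 1, self.line + 1, 1)
    return TrackedStringParserInput(self.s, self.pos + 1, self.line, self.col + 1)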

The Horror! The Horror! How dare we discriminate against men, by listening to women?

For those of us who’ve learned to actually be aware of sexism and racism, it’s incredibly frustrating how the same stupid pathetic arguments about sexism keep getting regurgitated, over and over again, by clueless guys. It’s exhausting and frustrating to constantly answer the same stupid, bullshit arguments. But if it’s frustrating to a white guy like me, just imagine what it’s like for the people who are actually affected by it!

This rant is brought on by the fact that lately, there’s been a movement in the tech/engineering community to try to actually do something about the amount of sexism in the community, by trying to push conference organizers to include speakers outside the usual group of guys.

You see, it’s a sad fact that engineering, as a field, is incredibly sexist. We don’t like to admit it. Tons of people constantly deny it, and make excuses for it, and refuse to try to do anything about it. Many people in this field really, genuinely believe that technology is a true meritocracy: those of us who succeed do so because we deserve to succeed. We see ourselves as self-made: we’ve earned what we’ve got. Anyone else – anyone else – who worked as hard as we do, who’s as good as we are, would succeed as much as we have.

Unfortunately, that’s not reality. Women and minorities of all sorts have a much harder time in this community than the typical guy. But when you try to do anything about this, the meritocrats throw a tantrum. They get actively angry and self-righteous when anyone actually tries to do anything about it. You can’t actively try to hire women: that’s discriminating against the guys! You’re deliberately trying to hire inferior people! It’s not fair!

I am not exaggerating this. Here’s an example.

Summing it up, turning away qualified male candidates and accepting potentially less qualified female candidates just to meet a quota is not only sexist, but is a horrible face to put on the problem. My wife is a nurse. A female dominated field. I also own a construction company. A male dominated field. In neither field would you see a company or an organization suggest hiring one gender over the other to satisfy a ratio.

What exactly is being accomplished by limiting the speakers at an event, or your employee base to an equal male / female ratio? Turning away qualified candidates and hiring purely based on gender, or religion, or race? What does having a program like Women Who Code, PyLadies, Girls Who Code, Black Girls Code, etc… accomplish? The short answer; very little if anything. In fact, I believe groups like this have the potential actually do more harm than good. But I’m not here to debate the minutia of any of those groups so I’ll summarize;…

Seriously, I don’t think I’ve ever heard anyone seriously suggest that you should turn away candidates based purely on gender, or race, or whatever.

What we do advocate is recognizing that there are people other than white guys in the world, and maybe we should actually include them once in a while.

The majority of people running tech conferences are white guys. When they think about inviting people to come speak, they’re naturally going to start off with a list of “Who do I know who’d be a good speaker?”. The nature of the tech community is that the vast majority of people that they immediately think of are going to be men – because the majority of people they’ve worked with, and the majority of people they’ve seen speak at other conferences, were men.

People love to handwave this, talking about how it’s just that the field is so male dominated. It’s true that it is, but there are many problems with that as an excuse:

  1. The male domination of the field isn’t meritocratic. We discriminate against people who aren’t like us at every level – from elementary school teachers discouraging little girls from being interested in math, to college classmates harassing women in their classes, to people conducting job interviews and making hiring decisions.
  2. The actual fraction of women who work in tech is much larger than the fraction of women in leadership, or who are invited to give talks at conferences.
  3. There are a shocking number of women who are driven out of technology and engineering by harassment by their male coworkers.

People get really angry when we say things like this. Part of it is that old meritocracy thing: you’re saying that I didn’t earn my success.

Guess what? You didn’t. Not entirely. No one succeeds solely on the basis of their own merit. There’s always a huge amount of luck involved – being in the right place, having the right body, having the right parents, having the right opportunities. Yes, hard work, talent, and skill helped you get to where you are. It’s a very large, very important factor in your success. But if you were born to a different family, with exactly the same abilities, you might not have ever had any chance at succeeding, despite equally hard work.

The other side of this is actually a sign of progress. We’ve come to accept that racism and sexism are bad. That’s a really good thing. But in our black-and-white way of seeing the world, we know that sexism is bad: therefore, if I’m sexist, I’m a bad person.

That’s not true. Sexism is a deeply ingrained attribute in our culture. It’s pretty much impossible to grow up in the US or in Europe, or in China, or in India, or in Africa, without being constantly immersed in sexist attitudes. You’re going to be exposed – and not just exposed, but educated in a way that teaches you to have the attitudes that come from your culture.

Recognizing that you’ve grown up in a sexist culture, and acknowledging that this had an effect on you and your attitudes doesn’t make you a bad person. It makes you human. Refusing to admit that your culture had any influence on you, and continuing to discriminate against people because you can’t admit that you might do something wrong? That’s what makes you a bad person.

I’ve told this story before, but it’s a damned good story, and it’s true, and it’s not hearsay: it’s my personal first-hand experience. It’s part of my own awakening to just how pervasive gender bias is.

A long time ago now, I worked for IBM Research. My third year working there, I volunteered to be the summer student coordinator for my department. The previous year, IBM Research had hired around 100 summer students, and exactly one of them was a woman. The vast majority were white guys, with a sizable minority of Chinese and Indian guys. The pool of candidates was nowhere near that skewed. It’s definitely true that men outnumber women in computer science by a sizable factor, but not 99 to 1. Our candidate pool was more like 5 men to 1 woman.

So, that year, the powers that be at the company decided that we needed to do something about it. What they decided to do was allocate a reasonable number of summer student slots to each department based on the department’s budget, and they could use those slots to hire anyone they wanted. If they hired a candidate who was a woman or a minority, that hire didn’t count against the budget. (Note that they did not reduce the number of students we were allowed to hire: they allocated based on the usual budget. They set up an additional budget for the extra students.)

My department was one of the smaller ones. We were allocated 5 slots for summer students. The day we started allowing people to request students, all 5 were gone within a couple of hours. The next day, the guy across the hall from me came to my office with a resume for a student he wanted to hire. Of course, it was a guy.

I told him that we couldn’t hire him – our budget was gone. But if he could find a woman, we didn’t need budget to hire her.

He threw a fit. It was the angriest I ever saw him. (Most of the time, he was a really nice, mellow guy, so he was really upset about this!) It was discrimination! Sexism! Unfair! He carefully went through the resume database looking for the best candidate, not the best male candidate. We were refusing to hire the most qualified candidate! On, and on, and on. I finally got rid of him, after pointing out at least a dozen times that I was just a lowly junior engineer, not someone who made the policy.

The next day, he was back in my office. He was practically bouncing off the walls: he’d gone back to the resume database, and he’d found a woman who was even better than the guy he’d wanted to hire.

This is the point of the whole story. He wasn’t some nasty, spiteful, misogynistic twit. He wasn’t being deliberately discriminatory. He wasn’t consciously screening out women’s resumes. But the fact is, when he went through the resume database without being forced to consider women, he’d eliminated the resumes of every single woman. He was going through a database of 1000s of resumes, and in that process of quickly skimming, he skipped over a more qualified candidate, because she had a woman’s name.

This is what happens in the real world. We don’t deliberately try to be sexists. We don’t act in a deliberately sexist or discriminatory way. But we’re part of a culture that has deeply ingrained sexist attitudes. We’re taught, by the way teachers treat boys and girls differently in school. We’re taught, by the way that society treats us differently. We absorb the message that when it comes to things like engineering, women are inferior. Most of the time, we don’t even really notice that we’ve absorbed that. But we have. It’s been hammered into us so many times, in so many ways, in so many settings – it would be shocking if we didn’t pick it up.

I’m using my experience at IBM as an example, partly because it’s such a vivid demonstration, and partly because it’s impossible to figure out the real names of anyone involved. But I’ve seen the same kind of thing happen in every job I’ve had where I’ve been involved with hiring. It’s not usually deliberate, but it’s very real.

The point of things like the pledges to not attend conferences that don’t have women and minorities as speakers and participants isn’t because we want to exclude the most qualified speakers. It isn’t because we want to force conference planners to include less qualified speakers. It’s because we know that it’s easy, without trying, to exclude some of the most qualified speakers, because the people running the conference don’t notice them.

They’re just like my friend at IBM: they’re not deliberately trying to exclude women. But if they don’t actively try to think about people outside the usual pool of guys like them, they won’t include any. And if they don’t, then they’re priming the next round of conference planners to do the same: if everyone you’ve seen give a great talk at a conference is a guy, then when you’re planning a conference and you try to think of some great speakers to invite, then who’s going to come to mind?

I’m particularly annoyed at the snipe that the author of the quote up above takes at “Girls Who Code”. GWC is a great organization. If you actually take the time to listen to the people who run it, you’ll hear some appalling true stories, about things like young women who go to college to study computer science, and on their first day in class, have classmates telling them that they’re in the wrong classroom: this is a programming class, not a class for chicks.

We have a community where we treat women like that. And then we rant and rave about how horribly unfair it is to do anything about it.

Roasted Vegetable Recipes

One more post of recipes, and then I’ll get back to math, I promise!

One of my proudest accomplishments is that somehow, I successfully taught my children to not just eat, but to love vegetables. I think part of that is genetic – neither of them is a supertaster. But the other part of it is a combination of training and cooking.

The training side is, I think, simple. Most adults are convinced that vegetables are icky but necessary. They’re wrong. But they actively teach that to their children. They make eating vegetables, even when they’re delicious, into a chore.

The other side is that because most adults think that veggies are icky, they cook them in ways that don’t taste good.

Take one of my favorite vegetables as an example: brussel sprouts. My children will actually fight over who gets the last bite of brussel sprouts. When we’re talking about what to make for dinner, they beg me to make them! But when I mention this to most people, they act like I’m absolutely insane: brussel sprouts are gross!

If you take a bunch of brussel sprouts and throw them into boiling water for 20 minutes, what you get is a bunch of stinky, pungent, bitter, mushy, revolting little balls of green awfulness. But, if you slice them thin, and saute them in a smoking hot pan with olive oil, salt and pepper, and a bit of garlic, until the edges start to turn brown, what you get is absolutely amazing: sweet, crisp, and wonderful.

So, what I’m going to share here is a couple of vegetable side dishes I made in the last week, which were fantastic. All of them are roasted vegetables – for some reason, people don’t think about roasting veggies, but it’s often one of the easiest and tastiest ways to cook them.

First: simple roasted brussel sprouts.

  1. Preheat your oven to 450 degrees.
  2. Take a pound of brussel sprouts, and cut them into quarters.
  3. Toss them with enough olive oil to coat them, but not drench them.
  4. Sprinkle generously with salt and pepper.
  5. Spread them out over a baking sheet.
  6. Put them into the hot oven, for around 10 minutes. After ten minutes, take them out and turn them. If they look really well done and brown on the edges, then they’re done; if not, put them in for up to another 10 minutes.
  7. Take them out, and toss them with a teaspoon or two of balsamic vinegar.

That’s it – and they’re amazing.

Next: Chili Glazed Roasted Sweet Potatoes

This one I’m particularly proud of. I absolutely love sweet potatoes. But normally, my wife won’t touch them – she thinks they’re gross. But this recipe, she actually voluntarily had multiple helpings! It’s sweet, salty, and spicy all at the same time, in a wonderful balance.

  1. Take a couple of pounds of sweet potatoes, peel them, and cut them into cubes about 2 inches on a side.
  2. Toss them with olive oil to coat.
  3. Mix together 1 teaspoon of salt, 1/4 cup of brown sugar, and one generous teaspoon of kochukaru (korean chili powder).
  4. Sprinkle it over the oiled sweet potatoes, and toss them so they’re all coated.
  5. Spread onto a baking sheet, and cook at 350 for about 30 minutes, turning them at least once. They’re done when the outside is nicely browned, and they’ve gotten soft.

Finally: roasted cauliflower

  1. Preheat your oven to 450.
  2. Take a whole head of cauliflower, and break it into small florets. Put them into a bowl.
  3. Take a half of an onion, and slice it thin. Toss the onions with the cauliflower.
  4. Coat the cauliflower and onions with olive oil – don’t drench them, but make sure that they’ve got a nice coat.
  5. Sprinkle generously with salt and pepper.
  6. Spread onto a baking sheet, and into the oven.
  7. After 10 minutes, take them out, and turn them, then back in for another 10 minutes.

All three of these got eaten not just by adults, but by kids. The brussel sprouts and sweet potatoes were eaten not just by my kids, but by the kids of other people too, so it’s not just the crazy Chu-Carrolls who thought they were delicious!

A Recipe for Gefilte Fish

My mom died last friday night, a little bit after midnight. I’ll probably write something about her, when I’m feeling up to it, but not yet. Being a jewish dad, when I’m depressed, what do I do? I cook.

Last year, for the first time ever, I made homemade gefilte fish for Pesach. If you didn’t grow up Jewish, odds are you’re not familiar with gefilte fish. It’s a traditional ashkenazi (that is, eastern european jewish) dish. It was originally a ground fish mixture cooked inside the body cavity of a fish. It evolved into just the stuffing mixture, simmered in a fish stock. These days, most people just buy it in a jar. If you grew up with it, even out of the jar, it’s a treat; if you didn’t, and you’ve been exposed to it, it looks and smells like dog food.

Honestly, I love the stuff. In general, I’m not a big fan of most traditional Jewish foods. But there’s something about gefilte fish. But even as I enjoy it, I can see the gross side. It’s crazy overprocessed – I mean, come on – it’s fish that will keep, unrefrigerated, for years!

But made fresh, it’s a whole different kettle of fish. This stuff is really good. You’ll definitely recognize the flavor of this as gefilte fish, but it’s a much cleaner flavor. It tastes like fish, not like stale overprocessed fish guts.

So this year, I’m depressed over my mom; after the funeral, I sent my wife out to buy me a bunch of fish, and I made up a batch. This time, I kept notes on how I did it – and it turned out even better than last year.

It’s got a bit of a twist in the recipe. I’m married to a chinese woman, so when the Jewish holidays roll around, I always try to find some way of putting an asian spin on the food, to reflect the nature of our family. So when I cooked the gefilte fish, instead of cooking it in the traditional simple fish broth, I cooked it in dashi. It’s not chinese, but it’s got a lot of flavors that are homey for a chinese person.

So… here’s the recipe for Mark’s homemade salmon dashi gefilte fish!

Ingredients

  • 2 whole pike, gutted and cleaned, but with skin, head, and bones
  • 2 whole red snapper, gutted and cleaned, but with skin, head and bones
  • 2 pounds salmon filet
  • 3/4 to 1 cup matzoh meal
  • 3 eggs
  • salt (to taste)
  • 2 sheets of konbu (japanese dried kelp)
  • 2 handfuls dried shaved bonito
  • 4 or 5 slices of fresh ginger, crushed
  • 2 onions
  • 2 large carrots

(For the fish for this, you really want the bones, the skins, and the head. If you’ve got a fish market that will fillet it for you, and then give you all of the parts, have them do that. Otherwise, do it yourself. Don’t worry about how well you can fillet it – it’s going to get ground up, so if you do a messy job, it’s not a problem.)

Instructions

  1. First thing, you need to make the stock that you’ll eventually cook the gefilte fish in:
    1. If the fish store didn’t fillet the fish for you, you need to remove the filets from the fish, and then remove the skin from the filets.
    2. Put all of the bones, skin, and head into a stock pot.
    3. Cover the fish bones with water.
    4. Add one onion, and all of the garlic and ginger to the pot.
    5. Heat to a boil, and then simmer for two hours.
    6. Strain out all of the bones, and put the stock back into the pot and bring to a boil.
    7. Add the kombu to the stock, and let it simmer for 30 minutes.
    8. Remove from the heat, and strain out the kombu.
    9. Add the bonito (off the heat), and let it sit for 15 minutes.
    10. Strain out the bonito and any remaining solids.
    11. Add salt to taste.
  2. While the stock is simmering, you can get started on the fish:
    1. Cut all of the fish into chunks, and put them through a meat grinder with a coarse blade (or grind them coarsely, in batches, in the best food processor you can get your hands on).
    2. Cut the onion and carrots into chunks, and put them through the grinder as well.
    3. Beat the eggs. Fold the eggs and the salt into the ground fish mixture.
    4. Add in matzoh meal gradually, until the mixture holds together.
    5. Refrigerate for two hours.
  3. Now you’re ready to cook the gefilte fish!
    1. Heat the stock up to a gentle simmer.
    2. Scoop up about two tablespoons of the fish mixture at a time, and roll it into balls.
    3. Add the fish balls into the simmering stock. Don’t overcrowd the pot – add no more than can fit into the pot in a single layer.
    4. Simmer for 10-15 minutes until the fish balls are cooked through.
    5. Remove the balls from the simmering liquid. Repeat until all of the fish is cooked.
    6. Put all the cooked fish balls back into the stock, and refrigerate.

The Heartbleed Bug

There’s a lot of panic going around on the internet today, about something called the Heartbleed bug. I’ve gotten questions, so I’m giving answers.

I’ve heard lots of hype. Is this really a big deal?

Damn right it is!

It’s pretty hard to wrap your head around just how bad this actually is. It’s probably even more of a big deal than the hype has made it out to be!

This bug affects around 90% of all sites on the internet that use secure connections. Seriously: if you’re using the internet, you’re affected by this. It doesn’t matter how reputable or secure the sites you connect to have been in the past: the majority of them are probably vulnerable to this, and some number of them have, in all likelihood, been compromised! Pretty much any website running on linux or netbsd, using Apache or NGINX as its webserver, is vulnerable. That means just about every major site on the net.

The problem is a bug in a commonly used version of SSL/TLS. So, before I explain what the bug is, I’ll run through a quick background.

What is SSL/TLS?

When you’re using the internet in the simplest mode, you’re using a simple set of communication protocols called TCP/IP. In basic TCP/IP protocols, when you connect to another computer on the network, the data that gets sent back and forth is not encrypted or obscured – it’s just sent in the clear. That means that it’s easy for anyone who’s on the same network cable as you to look at your connection, and see the data.

For lots of things, that’s fine. For example, if you want to read this blog, there’s nothing confidential about it. Everyone who reads the blog sees the same content. No one is going to see anything private.

But for a lot of other things, that’s not true. You probably don’t want someone to be able to see your email. You definitely don’t want anyone else to be able to see the credit card number you use to order things from Amazon!

To protect communications, there’s another protocol called SSL, the Secure Sockets Layer. When you connect to another site that’s got a URL starting with https:, the two computers establish an encrypted connection. Once an SSL connection is established between two computers, all communication between them is encrypted.

Actually, on most modern systems, you’re not really using SSL. You’re using a successor to the original SSL protocol called TLS, which stands for transport layer security. Pretty much everyone is now using TLS, but many people still just say SSL, and in fact the most commonly used implementation of it is in a package called OpenSSL.

So SSL/TLS is the basic protocol that we use on the internet for secure communications. If you use SSL/TLS correctly, then the information that you send and receive can only be accessed by you and the computer that you’re talking to.

Note the qualifier: if you use SSL correctly!

SSL is built on public key cryptography. What that means is that a website identifies itself using a pair of keys. There’s one key, called a public key, that it gives away to everyone; and there’s a second key, called a private key, that it keeps secret. Anything that you encrypt with the public key can only be decrypted using the private key; anything encrypted with the private key can only be decrypted using the public key. That means that if you get a message that can be decrypted using the sites public key, you know that no one except the site could have encrypted it! And if you use the public key to encrypt something, you know that no one except that site will be able to decrypt it.
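
To make the key pair idea concrete, here’s a toy illustration using RSA with the classic textbook numbers. Real keys are thousands of bits long; these tiny values are only good for showing the mechanics:

# n = 61 * 53; (n, e) is the public key, d is the private key.
n, e, d = 3233, 17, 2753

message = 65
ciphertext = pow(message, e, n)   # anyone can encrypt using the public key
print(ciphertext)                 # 2790
print(pow(ciphertext, d, n))      # 65: only the private key can decrypt it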

Public key cryptography is an absolutely brilliant idea. But it relies on the fact that the private key is absolutely private! If anyone else can get a copy of the private key, then all bets are off: you can no longer rely on anything about that key. You couldn’t be sure that messages came from the right source; and you couldn’t be sure that your messages could only be read by an authorized person.

So what’s the bug?

The SSL protocol includes something called a heartbeat. It’s a periodic exchange between the two sides of a connection, to let them know that the other side is still alive and listening on the connection.

One of the options in the heartbeat is an echo request, which is illustrated below. Computer A wants to know if B is listening. So A sends a message to B saying “Here’s X bytes of data. Send them back to me.” Then A waits. If it gets a message back from B containing the same X bytes of data, it knows B was listening. That’s all there is to it: the heartbeat is just a simple way to check that the other side is actually listening to what you say.

[Figure: the heartbeat echo exchange]

The bug is really, really simple. The attacker sends a heartbeat message saying “I’m gonna send you a chunk of data containing 64000 bytes”, but then the data only contains one byte.

If the code worked correctly, it would say “Oops, invalid request: you said you were going to send me 64000 bytes of data, but you only sent me one!” But what the buggy version of SSL does is send you that 1 byte, plus 63,999 bytes of whatever happens to be in memory next to wherever it saved the byte.

[Figure: the Heartbleed over-read]

You can’t choose what data you’re going to get in response. It’ll just be a bunch of whatever happened to be in memory. But you can do it repeatedly, and get lots of different memory chunks. If you know how the SSL implementation works, you can scan those memory chunks, and look for particular data structures – like private keys, or cookies. Given a lot of time, and the ability to connect multiple times and send multiple heartbeat requests each time you connect, you can gather a lot of data. Most of it will be crap, but some of it will be the valuable stuff.
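
Here’s a toy model of the bug in Python. It’s nothing like OpenSSL’s actual C code (which does this with a memcpy), but it shows how trusting the claimed length, instead of the length of the payload actually received, leaks neighboring memory:

# Pretend this is the server process's memory, with the received payload
# sitting right next to other sensitive data.
SERVER_MEMORY = bytearray(b"X|...private key material...|...session cookies...")

def buggy_heartbeat_reply(payload, claimed_length):
  start = SERVER_MEMORY.find(payload)
  # Buggy: echo back claimed_length bytes starting at the payload,
  # running right past it into whatever else happens to be in memory.
  return bytes(SERVER_MEMORY[start:start + claimed_length])

def fixed_heartbeat_reply(payload, claimed_length):
  # The fix: reject any request whose claimed length doesn't match the payload.
  if claimed_length != len(payload):
    raise ValueError("invalid heartbeat request")
  return bytes(payload)

# The attacker sends one byte of data, but claims it sent 64000 bytes:
print(buggy_heartbeat_reply(b"X", 64000))   # leaks everything after the "X"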

To make matters worse, the heartbeat is treated as something very low-level which happens very frequently, and which doesn’t transfer meaningful data, so the implementation doesn’t log heartbeats at all. That means there’s no way of even identifying which connections to a server have been used to exploit this. A site that’s running one of the buggy versions of OpenSSL has no way of knowing whether or not it’s been the target of this attack!

See what I mean about it being a big deal?

Why is it so widespread?

When I’ve written about security in the past, one of the things that I’ve said repeatedly is: if you’re thinking about writing your own implementation of a security protocol, STOP. Don’t do it! There are a thousand ways that you can make a tiny, trivial mistake which completely compromises the security of your code. It’s not a matter of whether you’re smart or not; it’s just a simple statement of fact. If you try to do it, you will screw something up. There are just so many subtle ways to get things wrong: it takes a whole team of experts to even have a chance to get it right.

Most engineers who build stuff for the internet understand that, and don’t write their own cryptosystems or cryptographic protocols. What they do is use a common, well-known, public system. They know the system was implemented by experts; they know that there are a lot of responsible, smart people who are doing their best to find and fix any problems that crop up.

Imagine that you’re an engineer picking an implementation of SSL. You know that you want as many people trying to find problems as possible. So which one will you choose? The one that most people are using! Because that’s the one that has the most people working to make sure it doesn’t have any problems, or to fix any problems that get found as quickly as possible.

The most widely used version of SSL is an open-source software package called OpenSSL. And that’s exactly where the bug is: in OpenSSL.

How can it be fixed?

Normally, something like this would be bad. But you’d be able to just update the implementation to a new version without the bug, and that would be the end of the problem. But this case is pretty much the worst possible case: fixing the implementation doesn’t entirely fix the problem. Because even after you’ve fixed the SSL implementation, if someone got hold of your private key, they still have it. And there’s no way to know if anyone stole your key!

To fix this, then, you need to do much more than just update the SSL implementation. You need to cancel all of your keys and replace them with new ones, and you need to get everyone who has a copy of your public key to throw it away and stop using it.

So basically, at the moment, every keypair for nearly every major website in the world that was valid yesterday can no longer be trusted.

Manifold are the Manifolds

In the stuff I’ve been writing about topology so far, I’ve been talking about topologies mostly algebraically. I’m not sure why, but for me, algebraic topology is both the most interesting and easy to understand bit of topology. Most people seem to think I’m crazy; in general, people seem to think that algebraic topology is hard. It must say something about me that I find other things much harder, but I’ll leave it to someone else to figure out what.

Despite the fact that I love the algebraic side, there’s a lot of interesting stuff in topology that you need to talk about, which isn’t purely algebraic. Today I’m going to talk about one of the most important ones: manifolds.

The definition we use for topological spaces is really abstract. A topological space is a set of points with a structural relation. You can define that relation either in terms of neighborhoods, or in terms of open sets – the two end up being equivalent. (You can define the open sets in terms of neighborhoods, or the neighborhoods in terms of open sets; they define the same structure and imply each other.)

That abstract definition is wonderful. It lets you talk about lots of different structures using the language and mechanics of topology. But it is very abstract. When we think about a topological space, we’re usually thinking of something much more specific than what’s implied by that definition. The word space has an intuitive meaning for us. We hear it, and we think of shapes and surfaces. Those are properties of many, but not all topological spaces. For example, there are topological spaces where you can have multiple distinct points with identical neighborhoods. That’s definitely not part of what we expect!

The things that we think of as spaces are really a special class of topological spaces, called manifolds.

Informally, a manifold is a topological space whose neighborhoods form a surface that appears to be euclidean if you look at small sections. All euclidean surfaces are manifolds – a two dimensional plane, defined as a topological space, is a manifold. But there are also manifolds that aren’t really euclidean, like a torus, or the surface of a sphere – and they’re the things that make manifolds interesting.

The formal definition is very much like the informal, with a few additions. But before we get there, we need to brush up on some definitions.

  • A set {\mathbf S} is countable if and only if there is a total, onto, one-to-one function from {\mathbf S} to the natural numbers.
  • Given a topological space (T, \tau), a basis \beta for \tau is a collection of open sets from which every open set in \tau can be generated by taking unions of sets in \beta. What this really means is that the structure of the space is regular – every open set can be built up from a standard stock of simpler pieces.
  • A topological space (T, \tau) is called a Hausdorff space if and only if for any two distinct points p, q \in T, there are open sets o_p and o_q where p \in o_p, q \in o_q, and o_p \cap o_q = \emptyset. (That is, there is an open set around p and an open set around q that don’t overlap.) A Hausdorff space just defines another kind of regularity: distinct points have disjoint neighborhoods.

A topological space (T, \tau) is an n-manifold if/f:

  • \tau has a countable basis.
  • (T, \tau) is a Hausdorff space.
  • Every point in T has a neighborhood homeomorphic to an open euclidean n-ball.

Basically, what this really means is pretty much what I said in the informal definition. In a euclidean n-space, every point has a neighborhood which is shaped like an n-ball, and can be separated from any other point using an n-ball shaped neighborhood of the appropriate size. In a manifold, the neighborhoods around a point look like the euclidean neighborhoods.

If you think of a large enough torus, you can easily imagine that the smaller open 2-balls (disks) around a particular point will look very much like flat disks. In fact, as the torus gets larger, they’ll become virtually indistinguishable from flat euclidean disks. But as you move away from the individual point, and look at the properties of the entire surface, you see that the euclidean properties fail.

Another interesting way of thinking about manifolds is in terms of a construction called charts, and charts will end up being important later.

A chart for a manifold is an invertible map from some euclidean manifold to part of the manifold, which preserves the topological structure. If a manifold isn’t euclidean, then there isn’t a single chart for the entire manifold. But we can find a set of overlapping charts so that every point in the manifold is part of at least one chart, and the edges of all of the charts overlap. A set of overlapping charts like that is called an atlas for the manifold, and we will sometimes say that the atlas defines the manifold. For any given manifold, there are many different atlases that can define it. The union of all possible atlases for a manifold (that is, the set of all charts that can be mapped onto parts of the manifold) is called the maximal atlas for the manifold. The maximal atlas for a manifold is, obviously, unique.
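
To make that concrete, here’s the standard example: the unit circle S^1 is a 1-manifold, but no single chart covers all of it. Two angle-coordinate charts do the job:

\varphi_1(\theta) = (\cos\theta, \sin\theta) \text{ for } \theta \in (0, 2\pi), \qquad \varphi_2(\theta) = (\cos\theta, \sin\theta) \text{ for } \theta \in (-\pi, \pi)

The image of the first chart is everything except the point (1, 0), and the image of the second is everything except (-1, 0), so between them they cover the whole circle. Where their images overlap, the two coordinates either agree or differ by exactly 2\pi, so \{\varphi_1, \varphi_2\} is an atlas for the circle.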

For some manifolds, we can define an atlas consisting of charts with coordinate systems. If we can do that, then we have something wonderful: a topology on which we can do angles, distances, and most importantly, calculus.

Topologists draw a lot of distinctions between different kinds of manifolds; a few interesting examples are:

  • A Riemann manifold is a manifold on which you can meaningfully define angles and distance. (The mechanics of that are complicated and interesting, and I’ll talk about them in a future post.)
  • A differentiable manifold is one on which you can do calculus. (It’s basically a manifold where the atlas has measures, and the measures are compatible in the overlaps.) I probably won’t say much more about them, because the interesting thing about them is analysis, and I stink at analysis.
  • A Lie group is a differentiable manifold with a valid closed product operator between points in the manifold, which is compatible with the smooth structure of the manifold. It’s basically what happens when a differentiable manifold and a group fall in love and have a baby.

We’ll see more about manifolds in future posts!

Two Factor Security

My friend Dr24Hours tweeted something disparaging about two factor security this morning:

I’m a huge fan of two factor, but I’ve definitely noticed that it’s not well understood by a lot of people, and there are a lot of misunderstandings of what it is, how it works, and why it’s good.

Two factor is a mode of authentication. It’s a way of proving that you are who you say you are, in order to be granted access to some secured resource. You don’t want random people to be able to access your email, so your email provider has a way of ensuring that only you can access it. Before you can access it, you have to prove that you’re someone who’s supposed to be able to see it.

So how can you prove who you are? There are fancy terms for different strategies, but fundamentally, it comes down to one of two things: either something you know, or something you have.

If you’re using the internet, you’re guaranteed to be familiar with the “something you know” method: passwords are something you know. For any password protected service, you have a secret that you share with the service. By telling them the secret – something that no one but you should know – you’re proving to them that you’re you.

If you work in an office that uses ID badges, you’re familiar with a “something you have” strategy. The people at the security desk on the ground floor of my office have no idea who I am. They don’t recognize me. Someone could walk into the building, and claim that they work for twitter, and the security people would have no idea if they were telling the truth. But we all have security cards for the building. That card proves that I’m someone who’s supposed to be allowed in.

Passwords – something you know – are used all over the internet. Each of us has a ridiculous number of passwords. Just quickly trying to come up with a list, I’ve got at least 35 different passwords! And I’m probably missing at least another dozen.

Something you know has two big related problems: it’s only a good way of proving who you are if it’s really something that only you know; and if it’s something that someone else won’t be able to guess. If anyone else knows it, or if anyone else can guess it, then it’s no longer useful as a way of proving that you’re you.

Both of those weaknesses have major practical effects.

Lots of people choose really stupid passwords. There’s a huge number of people who use “password” or “1234” as their password. In fact, people who’ve tried to crack passwords to test the security of systems have found that the five most common passwords are “password”, “123456”, “12345678”, “12345”, and “qwerty”. If someone wants to get access to your email, or your amazon account, or anything else, those will be the first thing they try. A frightening amount of the time, that will work.

Even among people who use good passwords, the sheer number of passwords they need to remember has gotten out of hand, and so they reuse passwords. If you reuse your password, then that means that if any of the places you used your password gets cracked, or if any of the places where you used your password are dishonest, your password is no longer a secret that only you know.

What two-factor authentication does is mitigate the risk of bad passwords and password reuse by adding a layer of “something you have” to authentication. First, it checks the shared secret – the thing that you know. If that’s successful, then it also checks for the thing you have.

For example, my work email account uses two-factor authentication. What that means is that when I log in, it first asks me for my password. Once I give it the password, then it asks for a four digit code. I’ve got an app on my phone that generates that code. Only my phone contains the data that’s needed to generate that four digit code – so if I get it right, that means that I have my phone.
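
To make the “something you have” part concrete, here’s a minimal sketch of how a time-based code generator of that kind can work. This is the generic RFC 6238-style scheme, not the specific app I use; the function name and the example secret are made up for illustration:

import hmac, hashlib, struct, time

def one_time_code(secret, digits=6, step=30, at=None):
    # The phone app and the server share `secret`. Both compute an HMAC
    # of the current 30-second time window; matching codes prove that
    # whoever typed the code holds the device that stores the secret.
    counter = int((time.time() if at is None else at) // step)
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                          # dynamic truncation
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % (10 ** digits)).zfill(digits)

# The server checks the password (something you know) first, and only
# then asks for the current code (something you have).
print(one_time_code(b"secret-provisioned-when-you-set-up-the-app"))

Because the code changes every time window, overhearing one code doesn’t do an attacker much good; what they’d really need is the secret stored on the device itself.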

Two factor doesn’t tell anyone additional information about you, beyond the fact that you have whatever device or artifact that you’re using. Using two factor doesn’t tell Google what I’m buying at Amazon, or what I’m tweeting, or who I talk to. Even if I’m using Google 2factor, Google gets no more information about me than it would get by checking my password.

But two factor has a huge benefit for me. It means that it’s much, much harder for someone else to get access to my accounts. I don’t reuse my passwords, but I do use 1password for password management, so if someone were able to crack my 1password data file, they’d have access to my passwords. But with two factor, that’s not enough for them to get access to my account. I wish that I could turn on two factor in more places!

Yet Another Cantor Crank: Size vs Cardinality

Over the weekend, a reader sent me links to not one but two new Cantor cranks!

Sadly, one of them is the incoherent sort – the kind of nutjob who strings together words in meaningless ways. Without a certain minimal rationality, there’s nothing I can say. What I try to do on this blog isn’t just make fun of crackpots – it’s explain what they get wrong! If a crackpot strings together words randomly, and no one can make any sense of just what the heck they’re saying, there’s no way to do that.

On the other hand, the second guy is a whole different matter. He’s making a very common mistake, and he’s making it very clearly. So for him, it’s well worth taking a moment and looking at what he gets wrong.

My mantra on this blog has always been: “the worst math is no math”. This is a perfect example. Here’s the core of his argument, in his own words:

First, I believe that Cantor derived a false conclusion from the diagonal method.

I believe that the primary error in the theory is not with the assertion that the set of Real Numbers is a “different size” than the set of Integers. The problem lies with the assertion that the set of Rational Numbers is the “same size” as the set of Integers. Our finite notion of size just doesn’t extend to infinite sets. Putting numbers in a list (i.e., creating a one-to-one correspondence between infinite sets) does not show that they are the same “size.”

This becomes clear if we do a two step version of the diagonal method.

Step One: Lets start with the claim: “Putting an infinite set in a list shows that it is the same size as the set of Integers.”

Step Two: Claiming to have a complete list of reals, Cantor uses the diagonal method to create a real number not yet in the list.

Please, think about this two step model. The diagonal method does not show that the rational numbers are denumerable while the real numbers are not. The diagonal method shows that the assertion in step one is false. The assertion in step one is as false for rational numbers as it is for real numbers.

The diagonal method calls into question the cross-section proof used to show that the rational numbers are the same size as the integers.

Cantor didn’t talk about size. He never asserted anything about size. He asserted something about cardinality.

That might sound like a silly nitpick: it’s just terminology, right?

Wrong. What does size mean? Size is an informal term. It’s got lots of different potential meanings. There’s a reasonable definition of “size” where the set of natural numbers is larger than the set of even natural numbers. It’s a very simple definition: given two sets of objects A and B, the size of B is larger than the size of A if A is a proper subset of B.

When you say the word “size”, what do you mean? Which definition?

Cantor defined a new way of measuring size. It’s not the only valid measure, but it is a valid measure, and one which is widely useful when you’re doing math. The measure he defined is called cardinality. And cardinality, by definition, says that two sets have the same cardinality if and only if it’s possible to create a one-to-one correspondence between the two sets.
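
Written out formally (this is the standard textbook statement, not anything from the crank’s site), the definition and the classic example look like this:

% Two sets have the same cardinality iff some bijection links them:
\[
  |A| = |B| \iff \text{there exists } f : A \to B \text{ that is one-to-one and onto.}
\]
% Classic example: the even naturals are a proper subset of the naturals,
% so they're "smaller" under the subset notion of size, and yet
\[
  f : \mathbb{N} \to 2\mathbb{N}, \qquad f(n) = 2n
\]
% is a one-to-one correspondence, so the two sets have the same cardinality.

Both statements are true at the same time; they’re answers to two different questions, and that’s exactly the distinction that matters here.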

When our writer said “Our finite notion of size just doesn’t extend to infinite sets”, he was absolutely correct. The problem is that he’s not doing math! The whole point of Cantor’s work on cardinality was precisely that our finite notion of size doesn’t extend to infinite sets. So he didn’t use our finite notion of size. He defined a new mathematical construct that allows us to meaningfully and consistently talk about the size of infinite sets.

Throughout his website, he builds a complex edifice of reasoning on this basis. It’s fascinating, but it’s ultimately all worthless. He’s trying to do math, only without actually doing math. If you want to try to refute something like Cantor’s diagonalization, you can’t do it with informal reasoning using words. You can only do it using math.

This gets back to a very common error that people make, over and over. Math doesn’t use fancy words and weird notations because mathematicians are trying to confuse non-mathematicians. It’s not done out of spite, or out of some desire to exclude non-mathematicians from the club. It’s about precision.

Cantor didn’t talk about the cardinality of infinite sets because he thought “cardinality” sounded cooler and more impressive than “size”. He did it because “size” is an informal concept that doesn’t work when you scale to infinite sets. He created a new concept because the informal concept doesn’t work. If your argument against Cantor is that his concept of cardinality is different from your informal concept of size, you’re completely missing the point.

Recipe: Real Ramen!

Yesterday, my son wasn’t feeling good, and asked for soup. (Poor kid inherited my stomach troubles.) I’ve been dying to try my hand at a real, serious ramen, so I dived in and did this. It turned out amazingly good.

If you’re American, odds are that when you hear “ramen”, you think of those little packets of noodles with a powdered MSG-heavy soup base that you can buy 5 for a dollar. To be honest, I do like those. But they’re a sad imitation of what ramen is supposed to be.

Ramen is, in my opinion, one of the very best soup dishes in the world. A real ramen is a bowl of chewy noodles, served in a rich hearty broth, with some delicious roasted meat and some veggies. Ramen broth isn’t a wimpy soup like American chicken noodle soup – it’s an intense soup. When you eat a bowl of ramen, you’re eating a meal, and you finish it feeling full, and warm, and happy with the world.

This isn’t a simple recipe. It’s a lot of work, but it’s worth it! And most of the components can be prepared in large batches and frozen.

So, here we go. Ramen with Chicken and Shrimp Tare, Watercress, and Roast Pork Tenderloin!

Tare

In ramen, you make the broth relatively unseasoned. Separately, you prepare a tare, which is a seasoning liquid. When you serve the ramen, you start by putting tare in the bottom of the bowl. It’s one of the tricks of ramen – it’s a big part of what makes the broth special. Every ramen cook has their own tare recipe.

Ingredients

  • Shells from 1lb shrimp
  • 8 chicken wings, cut into three pieces each. (Do not throw out the wingtips – for this, they’re the best part!)
  • 1 cup mirin
  • 1 cup sake
  • 1/2 cup soy sauce
  • 1 cup water.

Procedure

  1. Heat some oil in a pan, and saute the shrimp shells until they’re cooked through and pink.
  2. Transfer the cooked shells to a cold saucepan.
  3. Add a bit more oil to the hot pan, and add the wings into the pan where you cooked the shells. Brown them really well on both sides. (I also took the neck from the chicken I used to make the broth, and put it in here.)
  4. Move them into the saucepan with the shells.
  5. Add the mirin, sake, soy, and water into the pan where you sauteed the wings, and scrape up all of the brown bits. Then pour it over the wings and shells.
  6. Simmer for at least 30 minutes, until it reduces by roughly half. Skim out all of the solids.

You should give this a taste. It should be very salty, but also sweet, and intensely flavored from the chicken and shrimp shells.

The Broth

Ingredients

  • 1 whole chicken, cut into parts.
  • A bunch of miscellaneous bones – chicken backs are the best, pork bones will be good too – as long as they aren’t smoked. Even beef soup bones will work.
  • 1 whole onion, cut into large chunks.
  • 1 head of garlic, cut in half.
  • 3 whole star anise

Procedure

  1. Heat up a large stockpot on high heat. Add a little bit of oil.
  2. Throw in the bones, and stir them until they’re browned on all sides.
  3. Add in the chicken parts. No salt, no browning – just throw the chicken in.
  4. Add enough water to cover everything in the pot.
  5. Add the onion, garlic, and anise to the pot.
  6. When it comes to a boil, reduce the heat to low, and let it simmer. Periodically skim the scum that rises to the top.
  7. Simmer for at least 2 hours. You can simmer it overnight in a slow-cooker, and it’ll taste even better, but you’ll need extra water.
  8. Take out the chicken, bones, and spices. Add some salt – but you want to leave the broth a little underseasoned, because you’re going to mix in some tare later!

Roast Pork Tenderloin

Ingredients

  • 1/2 pork tenderloin, cut into two pieces (to make it easier to fit into the pan.)
  • 2 cloves garlic, finely minced.
  • 3 tablespoons soy sauce
  • 3 tablespoons sake
  • 1 teaspoon sugar

Procedure

  1. Take the tenderloin. Remove any silverskin. Poke all over, on all sides, with a fork. (This will help the marinade penetrate.)
  2. Mix together the garlic, soy, sake, and sugar to make a marinade.
  3. Put the pork in, and get it coated. Let it marinate for about an hour, turning it a couple of times.
  4. Heat a cast iron pan on high heat until it’s smoking hot.
  5. Put the tenderloin pieces in the pan. Turn them to brown on all sides.
  6. Remove the pork from the pan, and transfer to a 350 degree oven. Cook until it’s about 140 degrees inside. (This took about 15 minutes in my oven.) This is a bit underdone, but it’s going to cook more in the soup, and you don’t want it to be tough!
  7. Slice the pork into half-inch rounds.
  8. Dip the rounds in the hot tare.

Putting it all together

Ingredients

  • Eggs – one per person.
  • The green parts of a couple of scallions, cut finely.
  • A bunch of watercress.
  • Shichimi togarashi (a prepackaged Japanese spice blend)
  • Sesame oil.
  • Ramen noodles. (If you go to an Asian grocery, you should be able to find fresh ramen noodles, or at least decent quality dried ones.)

Procedure

  1. Boil the eggs for about 5-6 minutes. The whites should be set, the yolks still a bit runny.
  2. In each soup bowl, put:
    • A couple of tablespoons of tare (the exact quantity depends on your taste, and how much you reduced your tare)
    • a bunch of watercress
    • some of the minced scallions
    • A drop of sesame oil
  3. Boil the ramen.
  4. To each bowl, add a big bunch of ramen noodles.
  5. Cover the noodles with the broth.
  6. Add a couple of slices of the roast pork.
  7. Crack the egg, and scoop it out of its shell into the soup.
  8. Sprinkle some shichimi on top.
  9. Eat!

Big Numbers and the Lost Plane

(In the original version of this post, there were two places where I wrote “cubic” instead of “square”. It was a stupid mistake, but just a stupid wording mistake. The calculations are unchanged: I was calculating square units, not cubic – I just typed the wrong thing.)

I haven’t written about this in a while, but for a long time, I’ve been fascinated by just how bad humans are at actually understanding big numbers. When it comes down to it, we’re comfortable with numbers we can count on our fingers. The further you get from our fingers, the worse we get at it.

We see this over and over. I’ve been noticing a new variation on this lately, in discussions about the missing Malaysian airliner. For example, I was at the dentist Tuesday morning for a TMJ mouth guard, because I’ve been told that I grind my teeth at night. They’ve installed TVs in the exam rooms, to give you something to do while you wait for the novocaine to kick in, and they had it turned to CNN. The technician who was setting things up was talking to me about what they were saying, and she kept repeating: “It’s a plane! It’s huge! Why can’t they find it?”

If you think intuitively, it’s a good question!

In fact, if you look at it with basic intuition, it seems insane. A Boeing 777 is a huge plane! The thing is nearly as long as a football field, and its wingspan is just short of 200 feet. Fully loaded with fuel at takeoff, it weighs more than 300 tons! How can you lose something that large? Even if it broke into pieces, the debris field would have to be huge, and that should make it easy to find!

But that’s relying on intuition, and as I said in the introduction, our intuitions scale very poorly. To show what I mean, I’m going to work through a process that will hopefully help make it clear just how badly our intuition fails us here.

We’ll start by coming up with an estimate of the quantity of debris.

Assume that the fuselage is a continuous tube. The official stats say that the diameter of the fuselage is 20 feet, and it’s 242 feet long. So a deliberately overlarge estimate of the total surface area of the fuselage is 242 × 20 × π – or about 15,000 square feet. Assume that the surface of the wings is about the same, and you get a total surface area of about 30,000 square feet. That means that if the entire body of the plane were peeled and laid flat, and it all floated on the water, it would cover 30,000 square feet of the surface. (In reality, much of the plane would sink, but the seats would float; on the whole, it’s probably a wash, and the 30,000 square feet is probably still order-of-magnitude correct for the amount of debris visible on the surface.) Sounds like a whole lot, right?

To get an idea of how easy that is to find, we need to consider how much area we need to search. From what we know about how long the plane could have been in the air, and what direction it could have been moving, we’ve got a search area that’s at least a couple of thousand square miles.

One square mile is 27,878,400 square feet. So assuming that the plane crashed, and its wreckage is distributed over one square mile, that would mean that in the crash area, if every bit of the plane floated, just over one tenth of one percent of that square mile would be covered in wreckage. That’s piling together very unrealistic assumptions – a highly constrained debris field, all of it floating – in order to stack the odds of finding it!
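
If you want to check the back-of-envelope numbers yourself, here it is in a few lines of Python, using the same deliberately generous figures as above:

import math

fuselage_length_ft = 242        # official fuselage length
fuselage_diameter_ft = 20       # official fuselage diameter

# Treat the fuselage as a tube peeled flat: area = length * diameter * pi.
fuselage_area = fuselage_length_ft * fuselage_diameter_ft * math.pi  # ~15,000 sq ft
debris_area = 2 * fuselage_area                                      # wings roughly the same again, ~30,000 sq ft

square_mile_in_sqft = 5280 ** 2                                      # 27,878,400 square feet

coverage = debris_area / square_mile_in_sqft
print(f"debris area: {debris_area:,.0f} square feet")
print(f"fraction of one square mile covered: {coverage:.3%}")        # just over 0.1%

Spread that same debris over a couple of thousand square miles instead of one, and the covered fraction drops by another three orders of magnitude.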

We’re using some optimistic assumptions here – but even so, even with a debris field that’s highly (unrealistically) constrained, it’s still very sparse. (In fact, it’s likely that if the plane crashed in the water, the debris field is spread over more than one square mile.) That’s what we’re looking for: an area where less than one tenth of one percent of the surface is covered, in an area covering several thousand square miles.

A thousand square miles is pretty squarely in the zone of stuff that we don’t really understand very well. Getting an understanding of just how big an area a thousand square miles is… it’s not something that our intuition will help with. But statistically, we’re looking for a patch of debris where the visible artifacts cover less than one ten-millionth of one percent of the area being searched. If the search area were a football field, you’d be looking for a little fleck of tinfoil, 1/32nd of an inch on each side.

Even that is making it too easy: it’s not one piece of tinfoil, 1/32nd of an inch on a side. It’s a fleck of foil which was shredded into a thousand even tinier pieces.

That’s assuming a search area of only one thousand square miles. But the search area is, potentially, quite a lot larger than that. Searching for the tinfoil on a football field is easy in comparison.

Once you understand how small an airplane is in comparison to a small patch of ocean, and you understand just how big a small patch of ocean is, then it’s not at all surprising that we haven’t found it yet.