So, my post on monads apparently set off a bit of a firestorm over my comments about avoiding null pointer exceptions. To show you what I mean, here’s a link to one of the more polite and reasonable posts.
Ok. So. Let me expand a bit.
The reason that I think it’s so great that I don’t get NPEs when I use a language with something like Option isn’t because it makes me a super-programmer who’s better than the lowly slime who deal with NPEs. It’s because it changes how I write code, in a way that helps me avoid making mistakes – mistakes that I make because I’m just a fallible idiot of a human.
There are two fundamental things about an option type, like what we have in Scala, that make a huge difference.
First, it narrows the field of errors. When I’m programming in Java, any call that returns a pointer could return a null. The language makes no distinction between a function/expression that could return a null, and one that can’t. That means that when I get an NPE, the source of that null pointer could be anything in the dynamic slice leading to the error. With an option type, I’ve got two kinds of functions: functions that always return a non-null value, and functions that sometimes return a non-null value, and sometimes return a None. That’s incredibly valuable.
Second, it forces me to explicitly deal with the None case. In Java, programmers constantly build code without null checks, because they know that a function won’t return null. And then it does, and ker-splat. With an option type, I have no choice: I have to explicitly deal with the potential error case. Sure, I can forcibly code around it – in Scala, I can use Option.get, which will turn into an error analogous to an NPE. But it forces me to make that choice, and make it explicitly.
Even if I’m going the stupid, brute-force route and assuming that I know, without fail, that a function is going to return a non-null value… Consider an example:
Java:
T g = f.doSomething();
g.doSomethingElse();
Scala:
val g: Option[T] = f.doSomething()
g.get.doSomethingElse()
The Scala case has to explicitly deal with the fact that it’s working with a potentially empty value, by using a statement that asserts non-emptiness.
But in reality, if you’re a decent programmer, you never use .get to directly access an option. (The only exception is in cases where you call the .get in a context dominated by a non-empty test; but even then, it’s best not to, to avoid errors when the surrounding code is modified.) In real code, you pretty much always explicitly de-option a value using a function like getOrElse:
val f: User = getUser("markcc").getOrElse(new User("markcc"))
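When getOrElse doesn’t fit, there are other explicit ways to handle the None case that still avoid .get. Here’s a minimal sketch – the User type and the lookup function are made up purely for illustration:

case class User(name: String)

// A hypothetical lookup that may or may not find a user.
def getUser(name: String): Option[User] =
  if (name == "markcc") Some(User(name)) else None

// Pattern matching forces me to spell out both cases:
val greeting = getUser("markcc") match {
  case Some(user) => s"hello, ${user.name}"
  case None       => "no such user"
}

// map/getOrElse does the same thing, without a .get anywhere:
val greeting2 = getUser("nobody").map(u => s"hello, ${u.name}").getOrElse("no such user")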
As I hope has become plain, the point of avoiding NPEs through option-like type structures isn’t that it somehow makes the entire category of unexpected-result-value errors disappear. It’s that it changes the way that you code, to distinguish where those errors can occur, and to force you to deal with them.
I think that ultimately, things like this are really just manifestations of the good old static vs. dynamic type wars. Type errors in a dynamically typed language are really just unexpected value errors. Strong typing doesn’t stop you from making those errors. It just takes a bunch of common cases of those errors, and converts them from run-time errors to compile-time errors. Whether you want them to be run-time or compile-time depends on the project you’re working on, on your development team, and on your personal preferences.
I find in practice that I get many fewer errors by being forced to explicitly declare when a value might be null/None, and by being required to explicitly deal with the null/None case when it might occur. I’ve spent much less time debugging that kind of error in my year at foursquare than in the 15 years of professional development that I did before. That’s not because I magically became a better programmer a year ago when I joined foursquare. It’s because I’m using a better tool that helps me avoid mistakes.
Sorry things have been so slow around here. I know I keep promising that I’m going to post more frequently, but it’s hard. Life as an engineer at a startup is exhausting. There’s so much work to do! And the work is so fun – it’s easy to let it eat up all of your time.
Anyway… last good-math post ’round these parts was about monoids and programming. Today, I’m going to talk about monads and programming!
If you recall, monoids are an algebraic/categorical construct that, when implemented in a programming language, captures the abstract notion of foldability. For example, if you’ve got a list of numbers, you can fold that list down to a single number using addition. Folding collections of values is something that comes up in a lot of programming problems – capturing that concept with a programming construct allows you to write code that exploits the general concept of foldability in many different contexts.
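To make that concrete, here’s a tiny Scala sketch of the same fold shape reused with two different monoids – integers under addition, and strings under concatenation:

val numbers = List(3, 1, 4, 1, 5)

// Folding with the (Int, +, 0) monoid collapses the list to a sum.
val sum = numbers.foldLeft(0)(_ + _)                        // 15

// Folding with the (String, concatenation, "") monoid joins the pieces.
val joined = numbers.map(_.toString).foldLeft("")(_ + _)    // "31415"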
Monads are a construct that have become incredibly important in functional programming, and they’re very, very poorly understood by most people. That’s a shame, because the real concept is actually simple: a monad is all about the concept of sequencing. A monad is, basically, a container that you can wrap something in. Once it’s wrapped, you can form a sequence of transformations on it. The result of each step is the input to the next. That’s really what it’s all about. And when you express it that way, you can begin to see why it’s such an important concept.
I think that people are confused by monads for two reasons:
Monads are almost always described in very, very abstract terms. I’ll also get into the abstract details, but I’ll start by elaborating on the simple description I gave above.
Monads in Haskell, which is where most people are introduced to them, are very confusing. The basic monad operations are swamped with tons of funny symbols, and they’re named in incredibly misleading ways. (“return” does almost the exact opposite of what you expect return to do!)
In programming terms, what’s a monad?
Basically, a monadic computation consists of three pieces:
A monadic type, M, which is a parametric type wrapper that can wrap a value of any type.
An operation which can wrap a value in M.
An operation which takes a value wrapped in M and a function from the unwrapped value to another value (possibly of a different type) wrapped in M, and applies that function, producing a new value wrapped in M.
Whenever you describe something very abstractly like this, it can seem rather esoteric. But this is just a slightly more formal way of saying what I said up above: it’s a wrapper for a series of transformations on the wrapped value.
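To make those three pieces a bit more concrete, here’s a minimal Scala sketch. The trait and its method names are mine, purely for illustration – this isn’t a library interface:

import scala.language.higherKinds

// The three pieces: a parametric wrapper M, a way to wrap a value,
// and a way to transform a wrapped value into another wrapped value.
trait Monadish[M[_]] {
  def wrap[T](value: T): M[T]
  def transform[T, U](m: M[T])(op: T => M[U]): M[U]
}

// Option fits the shape:
object OptionIsMonadish extends Monadish[Option] {
  def wrap[T](value: T): Option[T] = Some(value)
  def transform[T, U](m: Option[T])(op: T => Option[U]): Option[U] = m.flatMap(op)
}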
Let me give you an example. At foursquare, we do all of our server programming in Scala. In a year at foursquare, I’ve seen exactly one null pointer exception. That’s amazing – NPEs are ridiculously common in Java programming. But in Scala at foursquare, we don’t allow nulls to be used at all. If you have a value which could either be an instance of A, or no value, we use an option type. An Option[T] can be either Some(t: T) or None.
So far, this is nice, but not really that different from a null. The main difference is that it allows you to say, in the type system, whether or not a given value might be null. But: Option is a monad. In Scala, that means that I can use map on it. (map is one of the transformation functions!)
val t: Option[Int] = ...
val u: Option[String] = t.map( _ + 2 ).map(_.toString)
What this does is: if t is Some(x), then the first map adds two to it, and returns Some(x+2); then the second map takes that result, and converts it to a string, returning an Option[String]. If t is None, then running map on it always returns None. So I can write code which takes care of the null case, without having to write out any conditional tests of nullness – because optionality is a monad.
In a good implementation of a monad, I can do a bit more than that. If I’ve got a Monad[T], I can use a map-like operation with a function that takes a T and returns a Monad[U] – a function whose result is itself wrapped in the monad.
For an example, we can look at lists – because List is a monad:
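The specific example doesn’t matter – any function that returns a list will do. Here’s a minimal, admittedly silly sketch:

val words = List("good", "math")

// flatMap applies a list-returning function to each element, then
// flattens (concatenates) the resulting lists, in order, into one list.
val letters = words.flatMap(w => w.toList)
// letters == List('g', 'o', 'o', 'd', 'm', 'a', 't', 'h')

// Compare with plain map, which leaves the nesting in place:
val nested = words.map(w => w.toList)
// nested == List(List('g', 'o', 'o', 'd'), List('m', 'a', 't', 'h'))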
The monad map operation does a flatten on the map steps. That means a lot of things. You can see one in the rather silly example above.
You can take values, and wrap them as a list. Then you can perform a series of operations on the elements of that list – sequencing over the elements of the list. Each operation, in turn, returns a list; the result of the monadic computation is a single list, concatenating, in order, the lists returned for each element. In Scala, the flatMap operation captures the monadic concept: basically, if you can flatMap something, it’s a monad.
Let’s look at it a bit more specifically.
The monadic type: List[T].
A function to wrap a value into the monad: List’s constructor function, def apply[T](value: T): List[T].
The map operation: def flatMap[U](op: T => List[U]): List[U], defined on List[T].
(In the original version of this post, I put the wrong type in flatMap in the list above. In the explanation demonstrating flatMap, the type is correct. Thanks to John Armstrong for catching that!)
You can build monads around just about any kind of type wrapper where it makes sense to map over the values that it wraps: collections, like lists, maps, and options. Various kinds of state – variable environments (where the wrapped values are, essentially, functions from identifiers to values), or IO state. And plenty of other things. Anything where you perform a sequence of operations over a wrapped value, it’s a monad.
Now that we have some understanding of this thing we’re talking about, what is it in mathematical terms? For that, we turn to category theory.
Fundamentally, in category theory a monad is a category with a particular kind of structure. It’s a category with one object. That category has a collection of arrows which (obviously) are from the single object to itself. That one-object category has a functor from the category to itself. (As a reminder, a functor is an arrow between categories in the category of (small) categories.)
The first trick to the monad, in terms of theory, is that it’s fundamentally about the functor: since the functor maps from a category back to the same category, you can almost ignore the category; it’s implicit in the definition of the functor. So we can almost treat the monad as if it were just the functor – that is, a kind of transition function.
The other big trick is closely related to that. For the programming language application of monads, we can think of the single object in the category as the set of all possible states. So we have a category object, which is essentially the collection of all possible states; and there are arrows between the states representing possible state transitions. So the monad’s functor is really just a mapping from arrows to different arrows – which basically represents the way that changing the state causes a change in the possible transitions to other states.
So what a monad gives us, in terms of category theory, is a conceptual framework that captures the concept of a state transition system, in terms of transition functions that invisibly carry a state. When that’s translated into programming languages, it becomes a value that implicitly takes an input state, possibly updates it, and returns an output state. Sound familiar?
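If that sounds abstract, here’s a tiny Scala sketch of the “function that invisibly carries a state” idea. The StateFn type is a toy of my own, not a library type:

// A computation is a function from an input state S to (output state, result).
case class StateFn[S, A](run: S => (S, A)) {
  def flatMap[B](f: A => StateFn[S, B]): StateFn[S, B] =
    StateFn { s0 =>
      val (s1, a) = run(s0)
      f(a).run(s1)          // thread the updated state into the next step
    }
  def map[B](f: A => B): StateFn[S, B] =
    StateFn { s0 => val (s1, a) = run(s0); (s1, f(a)) }
}

// A counter as the state: "tick" returns the current count and increments it.
val tick: StateFn[Int, Int] = StateFn(n => (n + 1, n))

// Two ticks chained; the state is passed invisibly from one step to the next.
val twoTicks: StateFn[Int, (Int, Int)] =
  tick.flatMap(a => tick.map(b => (a, b)))

// twoTicks.run(0) == (2, (0, 1))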
Let’s take a moment and get formal. As usual for category theory, first there are some preliminary definitions.
Given a category, C, 1_C is the identity functor from C to C.
Given a category C with a functor T : C → C, T² = T ∘ T.
Given a functor T, 1_T : T → T is the identity natural transformation from T to T.
Now, with that out of the way, we can give the complete formal definition of a monad. Given a category C, a monad on C is a triple (T : C → C, η : 1_C → T, μ : T² → T), where T is a functor, and η and μ are natural transformations. The members of the triple must make the following two diagrams commute.
(Diagram: commutativity of composition with μ.)
(Diagram: commutativity of composition with η.)
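In equational form, using the same T, η, and μ: the first diagram says that μ ∘ Tμ = μ ∘ μT (collapsing nested layers of T is associative), and the second says that μ ∘ Tη = μ ∘ ηT = 1_T (wrapping with η and then collapsing with μ is a no-op).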
What these two diagrams mean is that successive applications of the state-transition functor over C behave associatively, and that the monadic structure is always preserved: any sequence of compositions of monadic functors results in a functor that still has the full monadic structure. Together, these mean that any sequence of operations (that is, applications of the monad functor) is itself a monadic functor – so building a sequence of monadic state transformers is guaranteed to behave as a proper monadic state transition. Whatever happens inside of the monadic functor stays inside: the state is consistently passed from application to application with the correct chaining behavior, and to the rest of the universe, the difference between a sequence of operations and a single simple operation is indistinguishable – the entire monadic chain looks like a single atomic monadic operation.
Now, what does this mean in terms of programming? Each element of a monadic sequence in Haskell is an instantiation of the monadic functor – that is, it’s an arrow between states – a function, not a simple value – and that’s the basic trick to monads. They look like a sequence of statements; in fact, each statement in a monad is actually a function from state to state. It looks like we’re writing sequential code – when what we’re actually doing is writing function compositions – so that when we’re done writing a monadic sequence, what we’ve actually done is written a function definition in terms of a sequence of function compositions.
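The same thing is easy to see in Scala, where the sugar is the for-comprehension rather than a do-block: it reads like a sequence of statements, but the compiler rewrites it into nested flatMap/map calls – that is, into function composition. A small sketch:

// Looks like sequential statements...
val result: Option[Int] =
  for {
    a <- Some(10)
    b <- Some(32)
  } yield a + b

// ...but it's just sugar for a chain of function-valued steps:
val desugared: Option[Int] =
  Some(10).flatMap(a => Some(32).map(b => a + b))

// result == desugared == Some(42)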
Understanding that, we can now clearly understand why we need the return function to use a non-monad expression inside of a monadic sequence – because each step in the sequence needs to be an instance of the monadic functor; an expression that isn’t an instance of the monadic functor couldn’t be composed with the functions in the sequence. The return function is really nothing but a function that combines a non-monadic expression with the id functor.
In light of this, let’s go back and look at the definition of Monad in the Haskell standard prelude.
class Functor f where
  fmap :: (a -> b) -> f a -> f b

class Monad m where
  (>>=)  :: m a -> (a -> m b) -> m b
  (>>)   :: m a -> m b -> m b
  return :: a -> m a
  fail   :: String -> m a

  -- Minimal complete definition:
  --   (>>=), return
  m >> k = m >>= \_ -> k
  fail s = error s
The declaration of Monad is connected with the definition of Functor – if you look, you can see the connection. The fundamental operation of Monad is “>>=” – the chaining operation, which is basically the Haskell version of flatMap. Its type, m a -> (a -> m b) -> m b, is deeply connected with Functor’s fmap operation – the a in m a is generally going to be a type which can be a Functor. (Remember what I said about Haskell and monads? I really prefer map and flatMap to >> and >>=.)
So the value type wrapped in the monad is a functor – in fact, the functor from the category definition! And the “>>=” operation is just the functor composition operation from the monad definition.
A proper implementation of a monad needs to follow some fundamental rules – the rules are, basically, just Haskell translations of the structure-preserving rules about functors and natural transformations in the category-theoretic monad. There are two groups of laws – laws about the Functor class, which should hold for the transition function wrapped in the monad class; and laws about the monadic operations in the Monad class. One important thing to realize about the functor and monad laws is that they are not enforced – in fact, cannot be enforced! – but monad-based code using monad implementations that do not follow them may not work correctly. (A compile-time method for correctly verifying the enforcement of these rules can be shown to be equivalent to the halting problem.)
There are two simple laws for Functor, and it’s pretty obvious why they’re fundamentally just structure-preservation requirements. The Functor class only has one operation, called fmap, and the two functor laws are about how it must behave. (There’s a small Scala sketch of both right after the list.)
fmap id = id (Mapping id over any structured value results in an unmodified value.)
fmap (f . g) = (fmap f) . (fmap g) (“.” is the function composition operation; this just says that fmap preserves structure: mapping a composed function is the same as composing the two mapped functions.)
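In Scala terms, with Option’s map standing in for fmap, both laws are easy to check on a concrete value (a minimal sketch, not a proof):

val v: Option[Int] = Some(3)
val f = (n: Int) => n + 1
val g = (n: Int) => n * 2

// fmap id = id
assert(v.map(identity) == v)

// fmap (f . g) = (fmap f) . (fmap g)
assert(v.map(f compose g) == v.map(g).map(f))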
The monad laws are a bit harder, but not much. They mainly govern how monadic operations interact with non-monadic operations, in terms of the “return” and “>>=” operations of the Monad class. (Again, there’s a small Scala sketch of these right after the list.)
return x >>= f = f x (injecting a value into the monad is basically the same as passing it as a parameter down the chain – return is really just the identity functor passing its result on to the next step. I hate the use of “return”. In a state functor, in exactly the right context, it does sort-of look like a return statement in an imperative language. But in pretty much all real code, return is the function that wraps a value into the monad.)
m >>= return = m (If you don’t specify a value for a return, it’s the same as just returning the result of the previous step in the sequence – again, return is just identity, so passing something into return shouldn’t affect it.)
seq >>= return . f = fmap f seq (composing return with a function is equivalent to invoking that function on the result of the monad sequence to that point, and wrapping the result in the monad – in other words, it’s just composition with identity.)
seq >>= (\x -> f x >>= g) = (seq >>= f) >>= g (Your implementation of “>>=” needs to be semantically equivalent to the usual translation; that is, it must behave like a functor composition.)
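And again in Scala terms, with Option as the monad, flatMap standing in for “>>=”, and the Option constructor standing in for “return” – a minimal sketch to show the shape of the laws, not a proof:

def f(x: Int): Option[Int] = Some(x + 1)
def g(x: Int): Option[Int] = Some(x * 2)
val h = (x: Int) => x.toString
val m: Option[Int] = Some(10)

// return x >>= f  ==  f x
assert(Option(3).flatMap(f) == f(3))

// m >>= return  ==  m
assert(m.flatMap(Option(_)) == m)

// m >>= (return . h)  ==  fmap h m
assert(m.flatMap(x => Option(h(x))) == m.map(h))

// (m >>= f) >>= g  ==  m >>= (\x -> f x >>= g)
assert(m.flatMap(f).flatMap(g) == m.flatMap(x => f(x).flatMap(g)))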
I’m not the only one who gets mail from crackpots!
A kind reader forwarded me yet another bit of Cantor crackpottery. It never ceases to amaze me how many people virulently object to Cantor, and how many of them just spew out the same, exact, rubbish, somehow thinking that they’re different than all the others who made the same argument.
This one is yet another variation on the representation scheme. That is, it’s an argument that you can write out all of the real numbers whose decimal forms have one digit after the decimal point; then all of the reals with two digits; then all of them with 3 digits; and so on. This will produce an enumeration; therefore, there’s a one-to-one mapping from the naturals to the reals. Presto, Cantor goes out the window.
Or not.
As usual, the crank starts off with a bit of pomposity:
Dear Colleague,
My mathematic researshes lead me to resolve the continuum theory of Cantor, subject of controversy since a long time.
This mail is made to inform the mathematical community from this work, and share the conclusions.
You will find in attachment extracts from my book “Théorie critique fondamentale des ensembles de Cantor”,
Inviting you to contact me,
Francis Collot,
Member of the American mathematical society
Membre de la société mathématique de France
Member of the Bulletin of symbolic logic
Director of éditions européennes
As a quick aside, I love how he signs his email “Member of the AMS”, as if that were something meaningful. The AMS is a great organization – but anyone can be a member. All you need to do is fill out a form, and write them a check. It’s not something that anyone sane or reasonable brags about, because it doesn’t mean anything.
Anyway, let’s move on. Here’s the entirety of his proof. I’ve reproduced the formatting as well as I could; the original document sent to me was a PDF, so the tables don’t cut-and-paste.
The well-order on the set of real numbers result from this remark that it is possible to build, after the comma, a set where each subset has the same number of ordered elements (as is ordered the subset 2 : 10 …13 … 99).
Each successive integer is able to be followed after the comma (in french the real numbers have one comma after the integer) by an increasing number of figures.
0,0   0,10   0,100
0,1   0,11   0,101
0,2   0,12   0,102
…     …      …
0,9   0,99   0,999
It is the same thing for each successive interger before the comma.
1 2 3
So it is the 2 infinite of real number.
For this we use the binary notation.
But Cantor and his disciples never obtained this simple result.
After that, the theory displays that the infinity is the asymptote of the two branches of the hyperbole thanks to an introduction of trigonometry notions.
The successive numbers which are on cotg (as 1/2, 1/3, 1/4, 1/5) never attain 0 because it would be necessary to write instead (1/2, 1/3, 1/4, 1/4 ).
The 0 of the cotg is also the origin of the asymptote, that is to say infinite.
The beginning is, pretty much, a typical example of the representational crankery. It’s roughly a restatement of, for example, John Gabriel and his decimal trees. The problem with it is simple: this kind of enumeration will enumerate all of the real numbers with finite-length representations. Which means that the total set of values enumerated by this won’t even include all of the rational numbers, much less all of the real numbers. (1/3 = 0.333…, for example, has no finite decimal representation, so it never shows up anywhere in the list.)
(As an interesting aside: you can see a beautiful example of what Mr. Collot missed by looking at Conway’s introduction to the surreal numbers, On Numbers and Games, which I wrote about here. He specifically deals with this problem in terms of “birthdays” and the requirement to include numbers that have an infinite birthday, and thus an infinite representation in the surreal numbers.)
After the enumeration stuff, he really goes off the rails. I have no idea what that asymptote nonsense is supposed to mean. I think part of the problem is that Mr. Collot isn’t very good at English, but the larger part of it is that he’s an incoherent crackpot.