One of the questions that a ton of people sent me when I said I was going to write about category theory was “Oh, good, can you please explain what the heck a monad is?”
The short version is: a monad is a category with a functor to itself. The way that this works in a programming language is that you can view many things in programming languages in terms of monads. In particular, you can take things that involve mutable state, and magically hide the state.
How? Well – the state (the set of bindings of variables to values) is an object in a category, State. The monad is a functor from State → State. Since the functor is a functor from a category to itself, the value of the state is implicit – they’re the object at the start and end points of the functor. From the viewpoint of code outside of the monad functor, the states are indistinguishable – they’re just something in the category. For the functor itself, the value of the state is accessible.
So, in a language like Haskell with a State monad, you can write functions inside the State monad; and they are strictly functions from State to State; or you can write functions outside the state monad, in which case the value inside the state is completely inaccessible. Let’s take a quick look at an example of this in Haskell. (This example came from an excellent online tutorial which, sadly, is no longer available.)
Here’s a quick declaration of a State monad in Haskell:
class MonadState m s | m -> s where get :: m s put :: s -> m () instance MonadState (State s) s where get = State $ s -> (s,s) put s = State $ _ -> ((),s)
This is Haskell syntax saying we’re defining a state as an object which stores one value. It has two functions: get
, which retrieves the value from a state; and put
, which updates the value hidden inside the state.
Now, remember that Haskell has no actual assignment statement: it’s a pure functional language. So what “put” actually does is create a new state with the new value in it.
How can we use it? We can only access the state from a function that’s inside the monad. In the example, they use it for a random number generator; the state stores the value of the last random generated, which will be used as a seed for the next. Here we go:
getAny :: (Random a) => State StdGen a getAny = do g <- get (x,g') <- return $ random g put g' return x
Now – remember that the only functions that exist *inside* the monad are "get" and "put". "do" is a syntactic sugar for inserting a sequence of statements into a monad. What actually happens inside of a do is that *each expression* in the sequence is a functor from a State to State; each expression takes as an input parameter the output from the previous. "getAny" takes a state monad as an input; and then it implicitly passes the state from expression to expression.
"return" is the only way *out* of the monad; it basically says "evaluate this expression outside of the monad". So, "return $ randomR bounds g" is saying, roughly, "evaluate randomR bounds g" outside of the monad; then apply the monad constructor to the result. The return is necessary there because the full expression on the line *must* take and return an instance of the monad; if we just say "(x,g') <- randomR bounds g", we'd get an error, because we're inside of a monad construct: the monad object is going be be inserted as an implicit parameter, unless we prevent it using "return". But the resulting value has to be injected back into the monad – thus the "$", which is a composition operator. (It's basically the categorical º). Finally, "return x" is saying "evaluate "x" outside of the monad – without the "return", it would treat "x" as a functor on the monad.
The really important thing here is to recognize that each line inside of the "do" is a functor from State → State; and since the start and end points of the functor are implicit in the structure of the functor itself, you don't need to write it. So the state is passed down the sequence of instructions – each of which maps State back to State.
Let's get to the formal part of what a monad is. There's a bit of funny notation we need to define for it. (You can't do anything in category theory without that never-ending stream of definitions!)
- Given a category C, 1C is the *identity functor* from C to C.
- For a category C, if T is a functor C → C, then T2 is the TºT. (And so on for tother )
- For a given Functor, T, the natural transformation T → T is written 1T.
Suppose we have a category, C. A *monad on C* is a triple (T,η,μ), where T is a functor from C → C, and η and μ are natural transformations; η: 1C → T, and μ: (TºT) → T. (1C is the identity functor for C in the category of categories.) These must have the following properties:
First, μ º Tμ = μ º μT. Or in diagram form:
Second, μ º Tη = μ º ηT = 1T. In diagram form:
Basically, what these really comes down to is an associative property ensuring that T behaves properly over composition, and that there is an identity transformation that behaves as we would expect. These two properties together add up to mean that any order of applications of T will behave properly, preserving the structure of the category underlying the monad.