Since I mentioned the idea of monoids as a formal models of computations, John Armstrong made the natural leap ahead, to the connection between monoids and monads – which are a common feature in programming language semantics, and a prominent language feature in Haskell, one of my favorite programming languages.
Monads are a category theoretic construction. If you take a monoid, and use some of the constructions we’ve seen in the last few posts, we can move build a meta-monoid; that is, a monoid that’s built from monoid-to-monoid mappings – essentially, the category of small categories. (Small categories are categories whose collection of objects are a set, not a proper class.)
We’re going to look at constructs built using objects in that category. But first (as usual), we need to come up with a bit of notation. Suppose we have a category, C. In the category of categories, there’s an identity morphism (which is also a functor) from C to C. We’ll call that 1C. And given any functor from T:C→C (that is, from C to itself), we’ll say that exponents of that functor are formed by self-compositions of T: T2=TºT; T3=TºTºT, etc. Finally, given a functor T, there’s a natural transformation from T to T, which we’ll call 1T.
So, now, as I said, a monad is a construct in this category of category – that is, a particular category with some additional structure built around it. Given a category, C, a monad on C consists of three parts:
- T:C→C, a functor from C to itself.
- A natural transformation, η:1C→T
- A natural transformation μ:T2→T
C, T, η, and μ must satisfy some coherence conditions, which basically mean that they must make the following two diagrams commute. First, we show a requirement that in terms of natural transformations, μ is commutative in how it maps T2 to T:
And then, a commutativity requirement on μ and η with respect to T (basically making μ and η into a meta-identity for this meta-monoid):
μ and η basically play the role of turning C into a meta-meta-monoid. A monoid is basically a category; then we play with it, and construct the category of categories – the first meta-monoid. Now we’re taking a self-functor of the meta-monoid, and and using natural transformations to build a new meta-meta-monoid around it.
One of the key things to notice here is that we’re building a monoid whose objects are, basically, functions from monoids to monoids. We’ve gone meta out the wazoo – but it’s given us something really interesting.
We start with the category. From the category, we get the functor – a structure preserving map from the category to itself. The monad focuses on the functor – the transition from C to C: using natural transformations, it defines an equivalence – not an equality, but an equivalence – between multiple applications of the functor and a single application.
In terms of programming languages, we can think of C as a state. An application of the functor T is an action – that is, a mapping from state to state. What the monad does is provide a structure for composing actions. We don’t need to write the state – it’s implicit in the definition of the functor/action. The monad says that if we have an action “X” and an action “Y”, we can compose them into an action “X followed by Y”. What the natural transformation says is that “X followed by Y” is an action – we can compose sequences of actions, and the result is always an action – which we can compose further, producing other compound actions.
So at the bottom, we have functions that are state-to-state transformers. But we don’t really need to think much about the complexity of a state-to-state transition. What we can do instead is provide a collection of primitive actions – which are themselves written as state-to-state transitions – and then use those primitives to build imperative code – which remains completely functional under the covers, and yet has all of the properties that we would want from an imperative programming system – ordering, updatable state, etc.
Below is a really simple piece of Haskell code using the IO monad. What the monad does is play with IO states. The category is the set of IO states. Each action is a transformation from state to state. The state is invisible — it’s created at the beginning of the “do”, and each subsequent statement is implicitly converted to a state transition function.
hello :: IO () hello = do print "What is your name?" x <- getLine print (concat ["Hello", x])
So in the code above, “print "What is your name"
” is an action from an IO state to an IO state. It’s composed with x <- getLine
– which is, implicitly, another transition from an IO state to an IO state, which includes an implicit variable definition; and that’s composed with the finat “print
“. The actions are sequenced – they occur in the correct order, and each passes its result state to next action. The monad lets us program completely in terms of the actions, without worrying about how to pass the states.
Mark, good stuff on category theory as usual.
Talking about John Armstrong, I think it would be helpful for other readers if you could add his blog to your blogroll (I think you just overlooked that.)
Now here’s something I didn’t realize until one of the students in my category theory seminar last semester helped tease apart: a Haskell monad is actually not a monad. It’s a closely related structure called at “Kleisli triple”. There’s a way of going back and forth between monads and triples, but the data you give in a Haskell program is actually that of the triple corresponding to the monad we’re thinking of.
> a Haskell monad is actually not a monad
Here is the documentation for Monad in the Haskell libraries:
http://cvs.haskell.org/Hugs/pages/libraries/base/Control-Monad.html
Due to a silly mistake in the library, Monad doesn’t derive from Functor, but everyone who needs it defines a Functor instance for their monads. So we have an endofunctor, and two natural transformations, return and join. return and join are supposed to satisfy the equations satisfied by eta and mu in a monad. So I’m a bit confused about why you say Haskell monads aren’t really monads.
John:
As I understand it, the notions are isomorphic until you get to 2-categories.
However, there is some historical justification, in that Haskell Monads as originally posed used fmap/return/join as the basis functions rather than fmap/return/bind.
The category you discuss here – the category of IO states with arrows being state transitions – is all nice and well, but it’s not the category “normally” considered in Haskell code descriptions; and Haskell Monads are monads in the rather different category Hask of Haskell datatypes with functions as morphisms as well.
Thus, IO is a functor that takes a datatype to a datatype encoding IO actions. And the monad structure on IO basically tells us that once we’ve started allowing changes to the World State, we can re-declare our intent to allow changes to the World State without really changing anything.
Is there a good way to view the state categories we get in IO (and in the State monad) as … collapsed subcategories of Hask, with all objects in IOState collapsed to the datatype classed objects in IO Hask? It seems like there should be potential for a “categorification” process here where a state-ish datatype gets blown up to a large subcategory of state transitions encoded using some sort of virtual type thingies…