Saturday, February 14, 2009

Beyond Monads

The state monad gives an elegant way to thread state information through Haskell code. Unfortunately it has an annoying limitation: the state must have the same type throughout the monadic expression. In this post I want to look at how to fix this. The catch is that fixing State means it's no longer a monad, but we'll discover a new abstraction that replaces monads. And then we can look at what else this abstraction is good for. The cool bit is that we have to write virtually no new code, and we'll even coax the compiler into doing the hard work of figuring out what the new abstraction should be.

This is all based on an idea that has been invented by a bunch of people independently, although in slightly different forms. I'm being chiefly guided by the paper Parameterized Notions of Computation.

The problem with the state monad is that it is defined by

newtype State s a = State { runState :: s -> (a, s) }

The state going into and out of one of these values has the same type, s. We can't vary the type of the state as we pass through our code. But that's really easy to fix. Just define:

> import Prelude hiding (return,(>>=),(>>),(.),id,drop)
> import Control.Category

> newtype State s1 s2 a = State { runState :: s1 -> (a, s2) }

I can now just copy and paste the definitions (with name changes to avoid clashes) out of the ghc prelude source code:

> return' a = State $ \s -> (a, s)
> m >>>= k = State $ \s -> let
>     (a, s') = runState m s
>     in runState (k a) s'

> get = State $ \s -> (s, s)
> put s = State $ \_ -> ((), s)

We don't have to change a thing! The old code exactly matches the new type. We can now write code using the new State:

> test1 = return' 1 >>>= \x ->
>         return' 2 >>>= \y ->
>         get >>>= \z ->
>         put (x+y*z) >>>= \_ ->
>         return' z
> go1 = runState test1 10

But we're now also able to write code like:

> test2 = return' 1 >>>= \x ->
>         return' 2 >>>= \y ->
>         get >>>= \z ->
>         put (show (x+y*z)) >>>= \_ ->
>         return' z
> go2 = runState test2 10

The state starts off as an Integer but ends up as a String.
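
To make the change of state type visible, here is a minimal standalone sketch of the same idea with explicit type signatures. The names IxState, ireturn, ibind, iget and iput are my own, chosen only so the block doesn't clash with the definitions in this post.

```haskell
-- A standalone sketch; all names here are hypothetical renamings.
newtype IxState s1 s2 a = IxState { runIxState :: s1 -> (a, s2) }

ireturn :: a -> IxState s s a
ireturn a = IxState $ \s -> (a, s)

ibind :: IxState s1 s2 a -> (a -> IxState s2 s3 b) -> IxState s1 s3 b
ibind m k = IxState $ \s ->
  let (a, s') = runIxState m s
  in  runIxState (k a) s'

iget :: IxState s s s
iget = IxState $ \s -> (s, s)

iput :: s2 -> IxState s1 s2 ()
iput s = IxState $ \_ -> ((), s)

-- Reads an Int state, then replaces it with a String.
-- The signature records both the incoming and the outgoing state type.
render :: IxState Int String Int
render =
  iget `ibind` \z ->
  iput (show (1 + 2 * z)) `ibind` \_ ->
  ireturn z

example :: (Int, String)
example = runIxState render 10   -- (10, "21")
```

The only thing that distinguishes this from the ordinary state monad is the extra type parameter; the function bodies are the familiar ones.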

Problem solved! Except that this definition of State doesn't give us a monad and so we lose the benefits of having an interface shared by many monads. Is there a new more appropriate abstraction we can use? Rather than scratch our heads over it, we can just ask ghci to tell us what's going on.

*Main> :t return'
return' :: a -> State s1 s1 a
*Main> :t (>>>=)
(>>>=) :: State s1 s11 t -> (t -> State s11 s2 a) -> State s1 s2 a

This immediately suggests a new abstraction:

> class ParameterisedMonad m where
>     return :: a -> m s s a
>     (>>=) :: m s1 s2 t -> (t -> m s2 s3 a) -> m s1 s3 a

> x >> f = x >>= \_ -> f

It's a lot like the usual Monad class except that we're now parameterising uses of this class with a pair of types. Our new >>= operator also has a compatibility condition on it. We can think of an element of m s1 s2 as having a 'tail' and 'head' living in s1 and s2 respectively. In order to use >>= we require the head of the first argument to match the tail given by the second argument.
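
The compatibility condition can be seen in a small standalone sketch. The two steps intToString and stringToLength are hypothetical, invented just for this illustration; the class and instance repeat the definitions from this post so the block compiles on its own.

```haskell
import Prelude hiding (return, (>>=))

class ParameterisedMonad m where
  return :: a -> m s s a
  (>>=)  :: m s1 s2 t -> (t -> m s2 s3 a) -> m s1 s3 a

newtype State s1 s2 a = State { runState :: s1 -> (a, s2) }

instance ParameterisedMonad State where
  return a = State $ \s -> (a, s)
  m >>= k  = State $ \s ->
    let (a, s') = runState m s
    in  runState (k a) s'

-- Hypothetical steps, named only for this illustration.
intToString :: State Int String ()
intToString = State $ \n -> ((), show n)

stringToLength :: State String Int ()
stringToLength = State $ \str -> ((), length str)

-- Accepted: the head of the first step (String) matches the tail of the second.
roundTrip :: State Int Int ()
roundTrip = intToString >>= \_ -> stringToLength

-- Rejected by the type checker if uncommented: the first step ends in a
-- String state, but the second step expects to start from an Int.
-- broken = intToString >>= \_ -> intToString
```

Running `runState roundTrip 12345` gives `((), 5)`: the Int is rendered as a five-character String, whose length becomes the final state.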

Anyway, we have:

> instance ParameterisedMonad State where
>     return = return'
>     (>>=) = (>>>=)

We didn't really design this class, we just used what ghci told us. Will it turn out to be a useful abstraction?

First a category theoretical aside: in an earlier post I talked about how monads were really a kind of abstract monoid. Well, ParameterisedMonad is a kind of abstract category. If we were to implement join for this class it would play a role analogous to composition of arrows in a category. In a monoid you can multiply any old elements together to get a new element. In a category, you can't multiply two arrows together unless the tail of the second matches the head of the first.
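
For concreteness, here is one way join might be written for this class. This is a sketch of my own rather than something taken from the paper; the values nested and flattened are hypothetical examples.

```haskell
import Prelude hiding (return, (>>=))

class ParameterisedMonad m where
  return :: a -> m s s a
  (>>=)  :: m s1 s2 t -> (t -> m s2 s3 a) -> m s1 s3 a

-- join glues an outer s1-to-s2 step onto the inner s2-to-s3 step it returns,
-- much as composition glues two arrows that share an endpoint.
join :: ParameterisedMonad m => m s1 s2 (m s2 s3 a) -> m s1 s3 a
join mm = mm >>= \m -> m

newtype State s1 s2 a = State { runState :: s1 -> (a, s2) }

instance ParameterisedMonad State where
  return a = State $ \s -> (a, s)
  m >>= k  = State $ \s ->
    let (a, s') = runState m s
    in  runState (k a) s'

-- Hypothetical example: the outer step goes Int to String,
-- and the step it returns goes String to Bool.
nested :: State Int String (State String Bool Int)
nested = State $ \n -> (State $ \str -> (length str, null str), show n)

flattened :: State Int Bool Int
flattened = join nested   -- runState flattened 2009 == (4, False)
```

The type of join only lets the two layers be merged when the middle type s2 agrees, which is the composability condition for arrows.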

Now we can generalise the writer monad to a ParameterisedMonad. But there's a twist: every monoid gives rise to a writer. This time we'll find that every category gives rise to a ParameterisedMonad. Here's the definition. Again, it was lifted straight out of the source for the usual Writer monad. The main change is replacing mempty and mappend with id and flip (.).

> data Writer cat s1 s2 a = Writer { runWriter :: (a,cat s1 s2) }
> instance (Category cat) => ParameterisedMonad (Writer cat) where
>     return a = Writer (a,id)
>     m >>= k = Writer $ let
>         (a, w) = runWriter m
>         (b, w') = runWriter (k a)
>         in (b, w' . w)
> tell w = Writer ((),w)
> execWriter m = snd (runWriter m)
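
As a sanity check that nothing has been lost, the ordinary monoid-based writer can be recovered by treating a monoid as a category whose arrows ignore their endpoint types. The sketch below is standalone; MonoidCat, getMonoid, logged and output are hypothetical names of my own, and the rest repeats the definitions from this post.

```haskell
import Prelude hiding (return, (>>=), (>>), (.), id)
import Control.Category

class ParameterisedMonad m where
  return :: a -> m s s a
  (>>=)  :: m s1 s2 t -> (t -> m s2 s3 a) -> m s1 s3 a

(>>) :: ParameterisedMonad m => m s1 s2 t -> m s2 s3 a -> m s1 s3 a
x >> f = x >>= \_ -> f

newtype Writer cat s1 s2 a = Writer { runWriter :: (a, cat s1 s2) }

instance Category cat => ParameterisedMonad (Writer cat) where
  return a = Writer (a, id)
  m >>= k  = Writer $
    let (a, w)  = runWriter m
        (b, w') = runWriter (k a)
    in  (b, w' . w)

tell :: cat s1 s2 -> Writer cat s1 s2 ()
tell w = Writer ((), w)

-- Hypothetical wrapper: a monoid viewed as a category. The two type
-- parameters a and b are phantoms, so any two arrows compose.
newtype MonoidCat m a b = MonoidCat { getMonoid :: m }

instance Monoid m => Category (MonoidCat m) where
  id = MonoidCat mempty
  -- The arrow applied later sits on the left of (.), so append it last.
  MonoidCat later . MonoidCat earlier = MonoidCat (mappend earlier later)

logged :: Writer (MonoidCat String) () () ()
logged = tell (MonoidCat "ab") >> tell (MonoidCat "cd")

output :: String
output = getMonoid (snd (runWriter logged))   -- "abcd"
```

With this instance the parameterised Writer behaves like the familiar one: the written values are appended in the order they were told.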

It's just like the usual Writer monad except that the type of the 'written' data may change. I'll borrow an example (modified a bit) from the paper. Define some type safe stack machine operations that are guaranteed not to blow your stack:

> push n x = (n,x)
> drop (_,x) = x
> dup (n,x) = (n,(n,x))
> add (m,(n,x)) = (m+n,x)
> swap (m,(n,x)) = (n,(m,x))

We can now 'write' the composition of a bunch of these operations as a 'side effect':

> test3 = tell (push 1) >>
>         tell (push 2) >>
>         tell dup >>
>         tell add >>
>         tell swap >>
>         tell drop
> go3 = execWriter test3 ()
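
What the writer accumulates here is just the left-to-right composition of the stack operations. The standalone sketch below spells that out with (>>>) from Control.Category and explicit type signatures; the signatures and the names program and result are my additions.

```haskell
import Prelude hiding (drop)
import Control.Category ((>>>))

-- A stack is a nested pair; the empty stack is ().
push :: n -> x -> (n, x)
push n x = (n, x)

drop :: (n, x) -> x
drop (_, x) = x

dup :: (n, x) -> (n, (n, x))
dup (n, x) = (n, (n, x))

add :: Num n => (n, (n, x)) -> (n, x)
add (m, (n, x)) = (m + n, x)

swap :: (m, (n, x)) -> (n, (m, x))
swap (m, (n, x)) = (n, (m, x))

-- The same sequence of operations as test3, composed directly.
program :: () -> (Int, ())
program = push 1 >>> push 2 >>> dup >>> add >>> swap >>> drop

result :: (Int, ())
result = program ()   -- (4, ())

-- Rejected by the type checker if uncommented: the second drop would
-- pop from an empty stack.
-- underflow = (push 1 >>> drop >>> drop) ()
```

Step by step the stack goes (1,()), (2,(1,())), (2,(2,(1,()))), (4,(1,())), (1,(4,())) and finally (4,()). The shape of the stack is tracked in the type at every step, which is why underflow is a compile-time error.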

I guess there's one last thing I have to find. The mother of all parameterised monads. Again, we lift code from the ghc libraries, this time from Control.Monad.Cont. I just tweak the definition ever so slightly. Normally when you hand a continuation to an element of the Cont type it gives you back an element of the continuation's range. We allow the return of any type. This time the implementations of return and (>>=) remain completely unchanged:

> newtype Cont r1 r2 a = Cont { runCont :: (a -> r2) -> r1 }
> instance ParameterisedMonad Cont where
>     return a = Cont ($ a)
>     m >>= k = Cont $ \c -> runCont m $ \a -> runCont (k a) c

> i x = Cont (\fred -> x >>= fred)
> run m = runCont m return

> test4 = run $ i (tell (push 1)) >>
>               i (tell (push 2)) >>
>               i (tell dup) >>
>               i (tell add) >>
>               i (tell swap) >>
>               i (tell drop)

> go4 = execWriter test4 ()
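
The same embedding works for the parameterised State. The standalone sketch below writes out the types of i and run explicitly (the signatures are what I'd expect the compiler to infer, written out by hand) and pushes a small state computation through Cont and back. The name viaCont is hypothetical.

```haskell
import Prelude hiding (return, (>>=))

class ParameterisedMonad m where
  return :: a -> m s s a
  (>>=)  :: m s1 s2 t -> (t -> m s2 s3 a) -> m s1 s3 a

newtype State s1 s2 a = State { runState :: s1 -> (a, s2) }

instance ParameterisedMonad State where
  return a = State $ \s -> (a, s)
  m >>= k  = State $ \s ->
    let (a, s') = runState m s
    in  runState (k a) s'

get :: State s s s
get = State $ \s -> (s, s)

put :: s2 -> State s1 s2 ()
put s = State $ \_ -> ((), s)

newtype Cont r1 r2 a = Cont { runCont :: (a -> r2) -> r1 }

instance ParameterisedMonad Cont where
  return a = Cont ($ a)
  m >>= k  = Cont $ \c -> runCont m $ \a -> runCont (k a) c

-- Embed a computation from any parameterised monad into Cont.
i :: ParameterisedMonad m => m s1 s2 t -> Cont (m s1 s3 a) (m s2 s3 a) t
i x = Cont (\k -> x >>= k)

-- Recover the underlying computation by supplying return as the continuation.
run :: ParameterisedMonad m => Cont r (m s s a) a -> r
run m = runCont m return

-- Inside run, (>>=) and return are the Cont ones; the result is a State.
viaCont :: State Int String Int
viaCont = run (i get >>= \z -> i (put (show z)) >>= \_ -> return z)
-- runState viaCont 7 == (7, "7")
```

Unfolding the definitions shows that viaCont is the same computation as get >>= \z -> put (show z) >>= \_ -> return z written directly in State.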

So what's going on here? The implementations of these instances require almost trivial changes to the original monads, or in two cases no changes at all apart from the type signature. I have my opinion: Haskell programmers have been using the wrong type class all along. In each case the type signature for return and >>= was too strict and so the functionality was being unnecessarily shackled. By writing the code without a signature, ghci tells us what the correct signature should have been all along. I think it might just possibly be time to consider making ParameterisedMonad as important as Monad to Haskell programming. At the very least, do-notation needs to be adapted to support ParameterisedMonad.

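Update: You *can* use do-notation with ParameterisedMonad if you use the NoImplicitPrelude flag. (In GHC 7.0 and later, the extension that rebinds do-notation to whatever (>>=) is in scope is RebindableSyntax, which implies NoImplicitPrelude.)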
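
Here is a minimal standalone sketch of do-notation over the parameterised State, assuming a GHC recent enough to have the RebindableSyntax extension. The definitions repeat the ones in this post, with type signatures added.

```haskell
{-# LANGUAGE RebindableSyntax #-}
-- RebindableSyntax implies NoImplicitPrelude, so the Prelude is imported
-- explicitly, minus the names that do-notation should pick up from here.
import Prelude hiding (return, (>>=), (>>))

class ParameterisedMonad m where
  return :: a -> m s s a
  (>>=)  :: m s1 s2 t -> (t -> m s2 s3 a) -> m s1 s3 a

(>>) :: ParameterisedMonad m => m s1 s2 t -> m s2 s3 a -> m s1 s3 a
x >> f = x >>= \_ -> f

newtype State s1 s2 a = State { runState :: s1 -> (a, s2) }

instance ParameterisedMonad State where
  return a = State $ \s -> (a, s)
  m >>= k  = State $ \s ->
    let (a, s') = runState m s
    in  runState (k a) s'

get :: State s s s
get = State $ \s -> (s, s)

put :: s2 -> State s1 s2 ()
put s = State $ \_ -> ((), s)

-- The do block desugars to the (>>=) and (>>) defined above,
-- so the state may change type from Integer to String part way through.
test2 :: State Integer String Integer
test2 = do
  x <- return 1
  y <- return 2
  z <- get
  put (show (x + y * z))
  return z
-- runState test2 10 == (10, "21")
```

One thing to watch for: with this extension every do block in the module uses these operators, so ordinary monadic code such as IO actions has to live in a different module or be written without do.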

Update2: Some credits and links:

  1. The Polystate Monad is one of the independent discoveries I mentioned above.
  2. A more general approach to Parameterized Monads in Haskell.
  3. A comment on Parameterized Monads that shows explicitly how to make this work with NoImplicitPrelude.
  4. Oleg's Variable (type)state `monad'.
  5. Wadler discovered this design pattern back in 1993 in Monads and composable continuations.

I didn't contribute anything; this article is just advocacy.