Thursday, November 26, 2009

Programming with impossible functions, or how to get along without monads.

A recent paper by Hyland and Power speculates about an alternative universe in which languages like Haskell develop along a path in which monads don't play a starring role. This post will be about making tentative steps in that direction to see what developer-friendly alternatives might exist to solve problems like writing code using I/O in an elegant and readable manner. I think all of what I'll say was worked out 20 years or more ago but there may be some good reasons to think about this stuff again.

We'll need some Haskell language extensions


> {-# LANGUAGE GADTs, RankNTypes, TypeOperators #-}

> import Prelude hiding (read, IO)
> import Control.Monad.Reader hiding (Reader)


Introduction
Let's start with this problem: We'd like to write pure functional code to read input values of type i sequentially from some source. In an imperative language we could define a function taking no arguments and returning type i. It might look like read :: () -> i, or even just read :: i. But in Haskell such a thing is impossible. Unlike in imperative languages, the value of something of type read :: i is a constant so it can't give a different value each time it is used. How could we implement something like the impossible function read in Haskell?

Here's a general sketch of how I plan to proceed: In some sense we want to add a new language feature to Haskell. So we need a new language that is Haskell plus read. Rather than write a whole new language, we can just define an embedded DSL using Haskell as our host language. When we want to perform read operations we build an abstract syntax tree (AST) for our code in the new language and pass it to an interpreter. Sounds complicated, But the last step will be to show how we can eliminate all of these steps and be left with simple and efficient Haskell code that does exactly what we want. I'll show you what the final code will look like for our read example:


> read_example1 =
> read $ \x ->
> read $ \y ->
> let z = x+y in
> read $ \w ->
> rvalue (z*w)


The same methodology will also work for other non-functional features.

A Reader
We start by defining a type to represent our AST in our read-extended language:


> data Reader i x where


I'm using GADTs because it's useful to make explicit the type of the constructors. The reason for i and x will be explained by the signatures of the constructors. Firstly, we need to represent the value of computations of any type that don't read anything. So we need a constructor like:


> RValue :: x -> Reader i x


Now how should we represent a computation that reads something? Well suppose we're reading something of type i (for input) from some source. We'd like the AST to tell us how such a value would be used once it is read. We express such an idea by writing a function that takes an input value and maps it to an AST. In other words:


> Read :: (i -> Reader i x) -> Reader i x


The argument to Read is an example of a continuation. It describes how to continue after an input is received.

That's it. Now we can write code like


> read_example2 = Read $ \x ->
> Read $ \y ->
> RValue (x+y)


There's just one thing wrong. It doesn't do anything. We have the syntax but no semantics. So we need to write an interpreter. We have some choice here, but I'll do something easy. I'll make our imaginary read function pick off values one by one from a list of inputs. Here is the implementation. We interpret an RValue by simply unwrapping it and ignoring the list of inputs:


> read_interpreter (RValue x) is = x


We handle Read by picking off the first input, applying our continuation to it, and then carrying on by interpreting the result:


> read_interpreter (Read cont) (i:is) = read_interpreter (cont i) is


Now we're ready to go:


> go1 = read_interpreter read_example2 [1,2,3,4,5]


One obvious advantage of leaving things as an AST is that we can apply different interpreters to read from different sources. But if we know what our source is to be we can optimise a bit by completely eliminating the interpreter. We can do this using a completely generic method that I described here. The Reader type is a pretty standard recursive type. In our examples we build a type, but when we interpret we immediately tear it apart again in a reduction operation. The reduction is an example of a generalised fold. Following what I described in that earlier post, the reduction operation looks like this:


> rfold :: (x -> w) -> ((c -> w) -> w) -> Reader c x -> w
> rfold rvalue read (RValue x) = rvalue x
> rfold rvalue read (Read cont) = read (rfold rvalue read . cont)


If you look at read_interpreter you can see there are two parts. There's the part that does the work of picking off inputs, and there's the part that recursively applies read_interpreter. rfold provides the latter so we only have to implement the former. So we now get:


> read_interpreter' = rfold rvalue read where
> rvalue x is = x
> read cont (i:is) = cont i is

> go2 = read_interpreter' read_example2 [1,2,3,4,5]


Now comes the magic step described in that earlier post. Rather than separately build a tree, and then fold it, we can fuse the two operations together by simply replacing the constructors with the arguments to rfold:


> read cont (i:is) = cont i is
> rvalue x is = x


And now we can test the first example:


> go3 = read_example1 [1,2,3,4,5]


All that remains is to work out the details of some more examples.

Writer
An obvious one is a writer. In an imperative language we'd expect a signature like write :: o -> (). Our AST will need a node to represent a write operation. It'll need to contain both the written value, and the continuation. In this case the continuation doesn't actually depend on a hypothetical input, it's just another AST. So we get


> data Writer o x where
> Write :: o -> Writer o x -> Writer o x
> WValue :: x -> Writer o x


Again we need to interpret. We can do this a bit like the Writer monad. The final value will be a pair consisting of a value and a list of outputs.


> write_interpreter (Write o x) = let (v, os) = write_interpreter x in (v, o:os)
> write_interpreter (WValue x) = (x, [])


And now we can implement things like:


> write_example1 from to =
> if from<=to
> then Write from $ write_example1 (from+1) to
> else WValue ()

> go4 = write_interpreter (write_example1 10 20)


Again we can construct a fold for this type:


> wfold :: (x -> w) -> (o -> w -> w) -> Writer o x -> w
> wfold value write (WValue x) = value x
> wfold value write (Write o x) = write o (wfold value write x)

> write_interpreter' = wfold value write where
> value x = (x, [])
> write o (x, os) = (x, o:os)


And eliminating the fold gives us:


> wvalue x = (x, [])
> write o (x, os) = (x, o:os)


Maybe you might have guessed an implementation like that without going through the AST. But it's good to step back and look at the pattern emerging here. For an imperative function of type v1 -> v2 -> ... -> vn -> w we want an AST constructor of type v1 -> v2 -> ... -> vn -> (w -> AST) -> AST, where AST is the type of our tree. Figuring that out gives us a half-way point making the job of writing our interpreter slightly easier. You can sort of read the signature as saying "there's no function mapping mapping v1 -> v2 -> ... -> vn -> v doing what I want, but imagine I had one anyway, what would I do with the result it gave me?".

But, if you look closely at the definition of Writer you might notice that it is, in fact, a type of list structure. It's basically a list of elements of type o terminating in an element of type x. All the interpreter is doing is converting it to a conventional list structure. And if you go back to the definition of Reader you'll see that it too is exactly the right type for the class. It's a sort of dual to a list, one that consumes elements as you work along its length rather than yielding them. We only need an interpreter because we want to apply this object to a specific datatype, a list of inputs.

The fact that we can derive these types automatically from signatures is a really wonderful and deep point that I'm going to have to return to in another post. Usually, when you implement a monad, you have to come up with a type that can express the semantics you want. But that type can (in some sense) be derived automatically from the signatures of the operations you want, and some equations specifying how these operations interact.

IO
We can now try to comine the writer and reader to handle simultaneous input and output:


> data IO i o x where
> IValue :: x -> IO i o x
> IRead :: (i -> IO i o x) -> IO i o x
> IWrite :: o -> IO i o x -> IO i o x


The interpreter is very similar to each of the individual interpreters. It applies our AST to a list of inputs and gives us back a pair consisting of a return value and a list of outputs:


> io_interpreter (IValue x) is = (x, [])
> io_interpreter (IRead cont) (i:is) = io_interpreter (cont i) is
> io_interpreter (IWrite o cont) is = let (x, os) = io_interpreter cont is in (x, o:os)

> io_example1 =
> IRead $ \x ->
> IRead $ \y ->
> IWrite (x+y) $
> IValue ()

> go5 = io_interpreter io_example1 [1..10]


The corresponding fold function is an amalgam of the previous ones:


> ifold :: (x -> w) -> ((i -> w) -> w) -> (o -> w -> w) -> IO i o x -> w
> ifold value read write (IRead cont) = read (ifold value read write . cont)
> ifold value read write (IWrite o cont) = write o (ifold value read write cont)
> ifold value read write (IValue x) = value x

> io_interpreter' = ifold value read write where
> value x is = (x, [])
> read cont (i:is) = cont i is
> write o cont is = let (x, os) = cont is in (x, o:os)

> go6 = io_interpreter' io_example1 [1..10]

> ivalue x is = (x, [])
> iread cont (i:is) = cont i is
> iwrite o cont is = let (x, os) = cont is in (x, o:os)

> io_example2 =
> iread $ \x ->
> iread $ \y ->
> iwrite (x+y) $
> ivalue ()

> go7 = io_example2 [1..10]


Playing with Control Flow
But we're not just limited to these kinds of side-effects. We can also play with control flow. For example, consider an imperative function of tye throw :: e -> () that throws a value out of our computation to the top level. If we followed the description above we'd be tempted to replace that with something of type e -> Exception e x -> Exception e x, just like with write. But that type uses a continuation to say how to proceed after writing. We don't want there to be any continuation, we just want to bomb out with the thrown value. So the type we need is actually e -> Exception e x.


> data Exception e x where
> Throw :: e -> Exception e x
> EValue :: x -> Exception e x


You might recognise that. It's just Either. But we can go through the motions anyway:


> exception_interpret (Throw e) = Left e
> exception_interpret (EValue x) = Right x

> safediv x y = if y==0 then Throw "SIGFPE" else EValue (x `div` y)
> throw_example1 = safediv 1 0
> go8 = exception_interpret throw_example1

> efold :: (e -> w) -> (x -> w) -> Exception e x -> w
> efold throw evalue (Throw e) = throw e
> efold throw evalue (EValue x) = evalue x

> exception_interpret' = efold Left Right

> throw = Left
> evalue = Right


On its own that's not a very useful example. Still, it gives a warmup for a much more interesting control-flow affecting example. Suppose we want to implement the imperative function fork :: [u] -> u that forks one process for each element of the list with each fork getting that element as the return value. So let x = fork [1,2,3] would result in three threads, one with x=1, one with x=2 and one with x=3.


> data ManyWorlds x where
> MValue :: x -> ManyWorlds x


The following line follows directly by applying the principle I described earlier to fork :: [u] -> u:


> Fork :: [u] -> (u -> ManyWorlds x) -> ManyWorlds x


Now we need an interpreter. The state of the system at any time is a list of values, one for each thread. So we can use


> manyworlds_interpret (MValue x) = [x]


When we fork, we want to apply the continuation to each element of the list. We'll then have a list of lists of threads which we collect together into a single list using concat.


> manyworlds_interpret (Fork as cont) = concat (map manyworlds_interpret (map cont as))

> fork_example1 =
> Fork [1,2,3] $ \x ->
> Fork [10,20,30] $ \y ->
> MValue (x+y)

> go9 = manyworlds_interpret fork_example1


Again we can write this as a fold:


> mfold :: (x -> w) -> (forall u. [u] -> (u -> w) -> w) -> ManyWorlds x -> w
> mfold value fork (MValue x) = value x
> mfold value fork (Fork xs u) = fork xs (mfold value fork . u)

> manyworlds_interpret' = mfold value fork where
> value u = [u]
> fork us cont = concatMap cont us


And the interpreter-free version is:


> fork xs cont = concatMap cont xs
> mvalue u = [u]


I hope that looks familiar!

Here's a recursive example to generate the cartesian product of a bunch of lists:


> prod [] = [[]]
> prod (x:xs) = fork x $ \u ->
> fork (prod xs) $ \v ->
> mvalue (u:v)

> fork_example2 = prod [[1,2],[3,4],[10,20]]


Abstract Nonsense
This section is just to give a taste of some deeper stuff going on.

The problem to solve is this: given two side-effectful "extensions" like those above, how to you combine them and still keep functional purity?

As above, given just the signatures of the "functions" in our extensions to Haskell we can build a datatype. These signatures form part of the description of an algebraic theory. The other part is a set of equations, something I skipped above. For example, if I had talked about State then we might have an equation stating that if we do a put, followed by two gets, then the gets would both "get" the same value. With a bunch of signatures and equations, the corresponding type, and even a monad, follow automatically. (You can probably see the monads lurking behind all of my examples above.)

Now, when we write two monads, there is no automatic way to combine them. We have to instead write monad transformers and there is no theory for how to automatically construct them.

On the other hand, there are ways to automatically combine algebraic theories (and hence the monads derived from them). So the latter approach seems like it could be much more powerful. The theory is the primary thing, the monad is just something derived from it.

But the catch is that when I said "automatically" I meant in the sense that there is a precise mathematical description. It's not clear to me yet how to turn the entire process into something that a compiler could perform for you.

And the goal isn't really to eliminate monads. In fact, do-notation is a great way to work and gives nicer looking code than my examples. It's just that maybe we need to be thinking about the theories first and monads second.


You are a protocol droid, are you not?
Anyway, on a tangent, here's a bit of code I spun off from the above that runs as a self-contained example in its own .hs file. I tweaked read and write for the IO case by using the constructor (,) to construct heterogeneous lists instead of using (:) to construct homogeneous lists. I then feed the outputs of a pair of IO expressions into each other as inputs. The result is a type-safe way to implement simple imperative looking coroutines in a couple of lines of code. Good for playing with protocols as a opposed to algorithms:


import Prelude hiding (read)

read f (c,cs) = f c cs
write o r is = let (v, os) = r is in (v, (o,os))
value x is = (x, ())

-- Diffie-Hellman key exchange

p = 23
g = 5

alice =
write "Hello" $
let a = 6 in
write (g^a `mod` p) $
read $ \b ->
let s = b^a `mod` p in
value s

bob =
read $ \message ->
let b = 15 in
write (g^b `mod` p) $
read $ \a ->
let s = a^b `mod` p in
value s

(alice_value, alice_output) = alice bob_output
(bob_value, bob_output ) = bob alice_output

go = print (bob_value,alice_value)




18 comments:

  1. http://www.cs.nott.ac.uk/~gmh/modular.pdf
    You probably would enjoy reading the above paper. It's a bit about how to implement the combination of signatures together.

    It's a bit...ungainly, but seems relevant.

    Now the thing that's really cute, and what ties in to Conal Elliott's Type Class Morphism paper, is that an actual implementation of a signature, a model, corresponds to a functor & morphisms between models correspond to natural transformations. Conal was looking at it more from the functorial point of view, you're starting a bit more directly from models.

    Either way, there's an equivalence between the perspectives (in the categoric sense).

    ReplyDelete
  2. I'll check out that paper, at first glance it looks like exactly what I want. I searched hard for papers with concrete examples but everything I found was pretty abstract.

    ReplyDelete
  3. There's a big caveat, imo, with that paper: you can combine the theories, but relating the monad made from them to the monad you morally _want_ is a little trickier.

    It wasn't clear to me how that gets resolved.

    ReplyDelete
  4. Creighton,

    That sounds a bit like where I got to. I can easily write code that takes pairs of sets of signatures, combines them, and makes you a monad. It works perfectly well with do-notation. But it's "too big" and you need an "interpreter" to cut it back down to size, ie. what you really want is a quotient.

    ReplyDelete
  5. This paper (which also has Ghani as a coauthor) may be relevant, they give a construction of a monad coproduct which is a quotient:

    http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.8.3581

    ReplyDelete
  6. > an alternative universe in which languages like Haskell develop along a path in which monads don't play a starring role.

    http://en.wikipedia.org/wiki/Standard_ML

    ReplyDelete
  7. Ooh, awesome. I actually hadn't read that paper, but when Mr. Sigfpe said "quotient" my first thought was colimit. It looks like the coproduct isn't particularly useful in & of itself, but it's a nice proof of concept that colimits are the notion you want for combining monads.

    Not sure how to "pullback" colimits on free monads of signatures to more primitive operations.

    Also, this paper reminded me that one just shouldn't use the phrase "so called" in an academic work. Despite intentions, it sounds dismissive.

    ReplyDelete
  8. Colimits are the general categorical way of sticking things together.

    ReplyDelete
  9. Chreighton, yes, and "so called" can also come across as "I'm an out of touch fuddy-duddy" - for example, when a judge writes something like "So-called computer "codes" direct the operation of the computer".

    ReplyDelete
  10. Regarding the idea of modelling effects by first constructing ASTs of an embedded language and then writing a small interpreter, you may want to consider this paper:
    http://doi.acm.org/10.1145/1291201.1291206

    I think some of your code appears there almost literally.

    And (to plug my own work :-)) regarding the idea of afterwards eliminating the ASTs and interpreter overhead, you may want to look at this:

    http://dx.doi.org/10.1007/978-3-540-70594-9_20

    (It's a generic treatment, but Section 4 applies it to the "monad specification setting".)

    Finally, regarding how to nicely combine sublanguages for different effects, you may want to look at:

    http://dx.doi.org/10.1017/S0956796808006758

    Happy reading. :-)

    ReplyDelete
  11. Janis,

    Thanks! Those papers look awesome. My next blog post is going to be quite similar to Swiestra's "Monads for Free" section, except I plan to answer the bit where he says "It is not immediately obvious what the type of our algebra should be" slightly differently.

    ReplyDelete
  12. Janis,

    Correction: Swiestra handles the free monad case. I intend to handle the non-free case.

    ReplyDelete
  13. I've read that exceptions and continuations are not algebraic effects. It would explain why Haskell need Monad and Lawvere Theories are not enough.

    But to be honest, I don't yet really understand it. The paper http://homepages.inf.ed.ac.uk/gdp/publications/Effect_Handlers.pdf is the cause of my confusion and I'll have to read it.

    In Plotkin papers, effects are related to constructors and handlers (not algebraic) to deconstructors.

    I wonder if there is a relation with coalgebra. I am just beginning to study effects so I have more questions than answers.

    ReplyDelete
  14. Dear Author blog.sigfpe.com !
    It is possible and necessary :) to discuss infinitely

    ReplyDelete
  15. Just came back as I encountered this talk:

    http://blog.well-typed.com/2009/12/talk-at-the-functional-programming-exchange/

    which also does something very reminiscent of the above.

    Must be something in the air these days.

    ReplyDelete
  16. Great stuff!

    Interestingly, I just also found out about Bart Jacob's co-algebraic view on object-oriented programming. See for example his papers ...

    Objects and classes, co-algebraically

    Introduction to Coalgebra. Towards Mathematics of States and Observations

    ReplyDelete
  17. Your coroutines can be extended with choice (a la session types) with a few more lines of code:

    readOpt f g (Left (c,cs)) = second Left $ f c cs
    readOpt f g (Right (c,cs)) = second Right $ g c cs
    writeRight o r is = let (v, os) = r (fromRight is) in (v, Right (o, os))
    writeLeft o r is = let (v, os) = r (fromLeft is) in (v, Left (o, os))

    ReplyDelete