Friday, January 26, 2007

The Monads Hidden Behind Every Zipper

Uustalu points out that behind every zipper lies a comonad. I used this in my cellular automaton example. What he doesn't mention is that there is also a monad lurking behind every zipper, and that it arises in a natural way. The catch is that it's a monad in the wrong category, making it tricky to express in Haskell.

Firstly, what does "wrong category" mean? In the category of vector spaces, every object is equipped with a commutative binary + operator. We can make no such assumption in Haskell. What's more, Haskell doesn't easily allow us to restrict type constructors to instances of Num, say. So we can't easily implement monads in the category of vector spaces as instances of Haskell's Monad. Nonetheless, we can always write our own bind and return constrained by Num, and that's what I'll do here. I believe there are workarounds that don't involve GADT abuse, but really I just want to illustrate an idea, not write a complete working application.
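
To make "constrained by Num" a little more concrete, such a restricted interface could be written down as a class in its own right, along the lines of the sketch below. The names NumMonad, returnN and bindN are just for illustration and I won't use this class in what follows; I'll stick with standalone functions.

> class NumMonad m where
>     returnN :: Num a => a -> m a
>     bindN :: (Num a, Num b) => (a -> m b) -> m a -> m b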

As I've mentioned on #haskell, convolution is comonadic. A convolution (more specifically, a discrete convolution) is a function that maps a (doubly infinite) sequence of real numbers to another such sequence of real numbers in such a way that each number in the new sequence is a weighted sum of numbers in the old sequence and the weights depend only on the relative positions of the numbers. For example: replacing each number in the sequence with the average of it and its neighbours is a convolution.
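
Restricted to finite lists (and ignoring what happens at the ends), a discrete convolution is just a sliding weighted sum. Here is a throwaway sketch of that idea; convolve and its local helper windows are names I'm making up for illustration and aren't used below:

> convolve :: Num a => [a] -> [a] -> [a]
> convolve ws xs = map (sum . zipWith (*) ws) (windows xs) where
>     n = length ws
>     windows ys = if length ys < n then [] else take n ys : windows (tail ys)

For example, convolve [1,2,1] [0,0,1,2,3,0,0] gives [1,4,8,8,3], which is the same weighting I'm about to use below.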

Here's a simple example. Consider the sequence ...0,0,1,2,3,0,0,... We'll replace each number with the sum of its neighbours and twice itself. So this maps to ...,0,0+0+1,0+2*1+2,1+2*2+3,2+2*3+0,3+2*0+0,0,... The structure of this computation is very similar to the structure of cellular automaton evaluation and we can reuse that code here:


> data Zipper a = Zipper [a] [a] deriving (Eq,Show)

> left (Zipper (a:as) bs) = Zipper as (a:bs)
> right (Zipper as (b:bs)) = Zipper (b:as) bs

> class Comonad w where
>     coreturn :: w a -> a
>     cobind :: (w a -> b) -> w a -> w b

> iterate1 f x = tail $ iterate f x

> instance Comonad Zipper where
>     cobind f a = fmap f $ Zipper (iterate1 left a) (iterate right a)
>     coreturn (Zipper _ (b:_)) = b

> a = Zipper (repeat 0) ([0,1,2,3]++repeat 0)
> f (Zipper (a:_) (b:c:_)) = a+2*b+c

> test = let Zipper u v = cobind f a in take 5 v


When we run test we get [1,4,8,8,3].

Now think about the structure of this computation. We write a function f that gathers up the value at a cell and its neighbours and computes the weighted sum. cobind then applies this function all the way through our sequence to get our result. The important thing is that f pulls in the data it needs, distilling it down to one value, and that's why it has type Zipper a -> a.
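
As a small sanity check, the comonad law coreturn . cobind f = f says exactly that the value at the focus of the new zipper is f applied to the old one. In the names used here (checkFocus is mine, just for this check):

> checkFocus = coreturn (cobind f a) == f a

checkFocus evaluates to True.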

There's another approach to convolution. We could push or scatter data. At each point in the sequence we could take the value and push one copy left, one copy right, and leave double the original value in place. Each cell will end up with three values left in it and all we have to do is sum the values deposited in a cell. What we want therefore is to take a value in a cell, of type a, and generate an entire zipper full of values pushed from it, ie. a Zipper a. We'd then like bind for the monad to sum over the returned zippers, staggering them right and left as appropriate. Here's a table illustrating what I mean:


0 -> 0 0 0 0 0 0
1 -> 0 1 2 1 0 0
2 -> 0 0 2 4 2 0
3 -> 0 0 0 3 6 3
0 -> 0 0 0 0 0 0
----------------
0 1 4 8 8 3


Unfortunately, as I pointed out above, bind for a Haskell Monad isn't allowed to assume a Num constraint on its elements, so it can't perform these summations; instead I have to define my own bind' and return'.



> plus (Zipper a b) (Zipper a' b') = Zipper (plus' a a') (plus' b b') where
>     plus' (a:as) (b:bs) = (a+b) : plus' as bs
>     plus' a [] = a
>     plus' [] a = a


Note that I'm going to assume that elements past the end of a list (so to speak) actually represent zero and define new versions of left and right that produce the required zeros on demand:


> left' (Zipper (a:as) bs) = Zipper as (a:bs)
> left' (Zipper [] bs) = Zipper [] (0:bs)
> right' (Zipper as (b:bs)) = Zipper (b:as) bs
> right' (Zipper as []) = Zipper (0:as) []

> tail' [] = []
> tail' a = tail a
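> -- stagger f [x0,x1,x2,...] = [x0, f x1, f (f x2), ...]: the nth element gets
> -- n applications of f, which is what staggers the zippers left or right.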
> stagger f [] = []
> stagger f (x:xs) = x : map f (stagger f xs)
> stagger1 f x = tail' (stagger f x)

> instance Functor Zipper where
>     fmap f (Zipper a b) = Zipper (map f a) (map f b)


Here's our fake monad:


> return' a = Zipper [] [a]
> bind' f x = let Zipper a b = fmap f x
>             in foldl1 plus (stagger left' b ++ stagger1 right' a)

> a' = Zipper [] [0,1,2,3]
> f' x = Zipper [x] [2*x,x]

> test' = let Zipper u v = bind' f' a' in take 5 v



Run test' and we get the same result as before by a quite different method.
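
It's easy to check that the two strategies really do agree; agree is a name I'm introducing just for this comparison:

> agree = test == test'

Evaluating agree gives True.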

You should be able to see that in some sense comonads describe lazy computations and monads represent strict ones. In strict languages you evaluate your arguments first, and then push them into a function. In lazy languages you call the function first, and it pulls in the values of its arguments as needed. In fact, in Haskell we have a strictness monad, and in OCaml you can implement a lazy comonad. By and large, it's easier to write pull algorithms in Haskell. Push algorithms are much easier to implement in a language with side effects. In fact, one of the things that Haskell beginners need to learn is how to rewrite push algorithms as pull algorithms and eliminate those side effects.
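
By a strictness monad I mean something in the spirit of the sketch below. The names Eval, returnEval and bindEval are made up here rather than taken from any library, and I've written plain functions rather than a Monad instance; the point is only that the value is forced before it is handed on:

> newtype Eval a = Eval { runEval :: a }

> returnEval :: a -> Eval a
> returnEval x = x `seq` Eval x

> bindEval :: Eval a -> (a -> Eval b) -> Eval b
> bindEval (Eval x) k = x `seq` k x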

What I think is interesting here is that this dichotomy between push and pull, scatter and gather, is pervasive all the way through computing, and it's quite pretty that this deep notion can be captured nicely by a(n almost) monad and a comonad. Just to give an illustration of how pervasive it is, consider 3D rendering. We typically want to take a list of primitives and turn it into an array of pixels. If we rasterise, we push the primitives, asking, for each primitive, what pixels it covers. If we ray-trace, we pull on the pixels, asking, for each one, what primitives cover it. Many optimisations to both of these approaches consist of making judicious decisions about which part of the pipeline should be switched from push to pull or vice versa.

BTW I have an algorithm for automatically converting some types of pull algorithm into push algorithms and vice versa. It's another one of those papers that I keep meaning to finish and submit for publication one day...

8 comments:

  1. The deep link between ray tracing and scanline rendering actually goes much deeper than that.

    I'll leave the full details to a paper on the topic, but intuitively, you can think of the 3D scene as a vector space, the vectors of which are an assignment of output irradiance (emitted light) from each point.

    If you think of the "source" lights as a vector x and the "destination" film plane as a vector y, then scanline rendering is an inner product < M x | y > (using Dirac bracket notation), where M is an operator that moves light from the source. So you can think of M transporting light from the light sources, and then the camera just samples.

    Ray tracing, then, is the following inner product: < x | M* y >, where M* is an operator that traces some quantity (usually called "importance") out from the camera in the direction of the light sources.

    Linear algebraists will recognise that in the matrix formulation M* is the adjoint matrix of M. Quantum mechanics/operator calculoids will also recognise the adjoint operator. One of the interesting results of rendering theory is that most of the basic light transport operations are self-adjoint, which is why ray tracing works.

    Now compare this with the category theory adjunction (F and G are adjoint if C[x, F y] = D[G x, y] for all x, y). Is there a deeper connection here? Are the natural monad/comonads here in some way adjoint?

  2. I know the Per Christensen paper well, though I first thought about this stuff in the context of forward/reverse mode automatic differentiation. (Closely related is my practical experiment here based on a SIGGRAPH paper.)

    Adjunctions were loosely 'modelled' on adjoints but I haven't yet thought hard about whether the connection is deeper. But there are distinct push and pull algorithms for matrix multiplication (the difference is important when writing sparse matrix routines) so I expect I could get a monad/comonad distinction out of it.

  3. Here's a hunch into the connection.

    If C is the category of interest for Haskell programs (eg, C is such that we can say the IO monad is a functor: C->C), I have two guesses about what's really going on.

    Guess (A):

    for some nontrivial CD, DC, CD', D'C : C->D, D->C, C->D', and D'->C respectively we actually have, say,

    IO : C -> (D -> C -> D') -> C
    OI : C -> (D' -> C -> D) -> C

    formed by use of the relevant compositions. I'm not sure what this would give us, but am throwing it out for completeness.

    Guess (B):

    At least for some comonads ( OI, maybe others ) we are actually in different categories (ie, OI:D->D but IO:C->C), but one of the mappings F:D->C or G:C->D is so trivial we don't know we're using it; alternatively, both F:D->C and G:C->D are nontrivial but at program launch we're "transparently" applying some (apparently trivial) I:C->D (or I:D->C) and we haven't figured out that we're doing it (or what the equivalent I':D->C (or I':C->D) would be).

    Guess (B) feels correct to me, but a good proof would be needed to convince me.

    What most comes to mind is the suspicion that all of Haskell outside of the monads compiles into a comonad:
    if we make the relabelings:

    extend -> defer
    extract -> request

    then

    defer request == id
    request . (defer f) == f
    (defer f) . (defer g) == defer (f . defer g)

    is almost precisely how programs are lazily evaluated. (And: perhaps that's why naive OI has problems: there's not a natural way to do truly lazy IO in a safe fashion...)

    I'm running out of steam here, but: we already know that most useful Haskell programs are a sequence of evaluations of purely-functional code, with the sequence ordering (and thus the values fed in to be evaluated) coerced by encapsulating it in a monad.

    As currently evaluated the code called between binds and returns is already in a transparent comonad (it's transparent to the programmer, b/c the compiler handles the transformation). What would we gain by bundling the code into explicit comonads?

  4. Thanks for the nice post!

  5. This is a great post.

    Did you ever finish that paper? If not, can I see a draft?

  6. Luke,

    The paper's finished but not submitted. For reasons mentioned above, it's based on C code rather than functional code. My submission for publication is a bit held up because it connects a tiny bit with a patent application my employer wants me to do. That also means there are probably awkward legal implications of showing it to anyone.

    OTOH I think I can argue that the patent doesn't really hinge on what's in this paper. I'll raise that with my manager and if you give me contact info I can send it to you if you're still interested.

  7. Enlightening! This gives me an idea of what comonads are. However, I associated Zipper with things like List.zip and Control.Applicative.ZipList. But instead it represents a list which is infinite to the left and to the right ("doubly infinite" as you call it).

  8. Really cool! One question though, could left' and right' be replaced by something like,

    left' (Zipper a b) = left $ Zipper (a ++ repeat 0) (b ++ repeat 0)

    And would this implementation have any disadvantages over the original?
