Wednesday, December 27, 2006

Tying Knots Generically

A comment by a friend and a puzzle on Lambda the Ultimate made me realise that the function loeb that I defined a few weeks back does have a nice application.


> import Maybe
> import Data.Map

> loeb :: Functor a => a (a x -> x) -> a x
> loeb x = fmap (\a -> a (loeb x)) x


The problem it helps to solve is this: building circular datastructures in Haskell, otherwise known as Tying the Knot. loeb solves this problem in a very generic way that imposes the minimum of restrictions on the user.

Remember how in my previous article I showed how you could view loeb as a kind of spreadsheet evaluator. You have a container of functions and that is converted into another container by applying these functions to the final container. This is a circular definition, we're applying the functions to the final result to get the final result. If we think of the final container as a collection of answers to a bunch of problems, loeb allows each element to be constructed using references to any of the other answers in the container, even if they haven't actually been found yet. This is exactly what we need to make circular references.

Firstly, let's look at doing something similar to the Tying the Knot web page to solve Derek's puzzle. Here potential links are made by using a dictionary structure to allow names of objects to stand for direct references to objects. So define an operator to perform pointwise concatenation of string-valued functions:


> f +++ g = \x -> f x ++ g x


Now define lookup into the dictionary, substituting a default value if needed:

And now we can translate some example equations into Haskell by hand. I'm considering define a x *b and define b y *a. Note how it's trivial to write a parser to make these objects because there's nothing circular going on:


> equations = fromList
> [
> ("a",const "x"+++lookup "b"),
> ("b",const "y"+++lookup "a")
> ] where lookup = Data.Map.findWithDefault ""


We now just apply loeb to get a dictionary containing the actual values:


> values = loeb equations
> a = fromJust $ Data.Map.lookup "a" values


We can use a similar method to make a doubly linked list. Here is "A", "B" and "C" in a list with each letter pointing to its predecessor and successor if it exists:


> data DList a = Nil | Node a (DList a) (DList a) deriving Show
>
> val (Node a _ _) = a
> rt (Node _ _ r) = r
> lft (Node _ l _) = l
>
> nodes = fromList
> [
> ("a", \d -> Node "A" Nil (lookup "b" d)),
> ("b", \d -> Node "B" (lookup "a" d) (lookup "c" d)),
> ("c", \d -> Node "C" (lookup "b" d) Nil)
> ] where lookup = Data.Map.findWithDefault Nil
>
> abc = loeb nodes
> b = fromJust $ Data.Map.lookup "b" abc


b is now a doubly linked list and we can use lft and rt to walk up and down it.

And here's a nice advantage of using loeb. You don't have to restrict yourself to storing information about links in a dictionary. Any functorial container will do. For example here's a doubly linked circular list of the integers from 0 to 99 made without using any names:


> zlist = [\z -> Node i (z!!((i-1) `mod` 100)) (z!!((i+1) `mod` 100)) | i <- [0..99]]
> z = loeb zlist


Note how working your way up and down zlist is a slow process because you need to use the slow (!!) to access elements. But stepping up and down any element of z takes time O(1). Potentially very useful!

But there is a slight problem with loeb. I'd like to be able to handle errors better and the obvious way to do that is by allowing the use of monads to handle exceptions, for example if a name lookup in a dictionary fails. I attempted to do this as follows:


> class Monad m => MFunctor m f where
> fmapM :: (a -> m b) -> f a -> m (f b)

> instance Monad m => MFunctor m [] where
> fmapM = mapM

> loebM :: (Monad m,MFunctor m a) => a (a x -> m x) -> m (a x)
> loebM x = r where
> r = do
> lx <- r
> fmapM (\a -> a lx) x

> u = loebM [const (Just 1),const (Just 2)]


But attempting to evaluate u overfows the stack. It also failed when I tried modifying it to use MonadFix. So here's a puzzle. Can you write a monadic version of loeb? It seems like something that would be very useful for writing interpreters for programming languages that allow mutually recursive definitions.

So my original intuition a few weeks back was right. Taking an interesting axiom of modal logic, interpreting as a type, and them finding an implementation, did lead to something useful.



By the way, while I was thinking about loeb I was led to consider what I call tautological containers. These are containers where each element tells you how to extricate it from the container. For example, these can be thought of as tautological: (fst,snd), [head,head.tail], [(!!n) | n <- [0..]] and [("a",fromJust . lookup "a"),("b",fromJust . lookup "b")]. Can you see any use for these objects? Are they universal for some category diagram? I have partial answers to these questions but maybe you can make more sense of them than me.




Update: Ocaml version by Matias Giovannini.

Tuesday, December 19, 2006

Evaluating cellular automata is comonadic

Paul Potts's post inspired me to say something about cellular automata too.

So here's the deal: whenever you see large datastructures pieced together from lots of small but similar computations there's a good chance that we're dealing with a comonad. In cellular automata we compute the value of each cell in the next generation by performing a local computation based on the neighbourhood of that cell. So cellular automata look like they might form a good candidate for comonadic evaluation.

I want to work on 'universes' that extend to infinity in both directions. And I want this universe to be constructed lazily on demand. One way of doing that is to represent a 'universe' as a centre point, a list of all elements to the left of that centre point and a list of all elements to the right. Here's a suitable
type:


> data U x = U [x] x [x]


For example U [-1,-2..] 0 [1,2..] can be thought of as representing all of the integers in sequence.

But this actually contains slightly more information than a list that extends to infinity both ways. The centre point forms a kind of focus of attention. We could shift that focus of attention left or right. For example consider

U [-2,-3..] (-1) [0,1..]

This represents the same sequence of integers but the focus has been shifted left. So think of the type U x as being a doubly infinite sequence with a cursor. (In fact, this makes it a kind of zipper.)

We can formalise the notion of shifting left and right as follows:


> right (U a b (c:cs)) = U (b:a) c cs
> left (U (a:as) b c) = U as a (b:c)


An object of type U is semantically like a C pointer into a const block of memory. You can increment it, decrement it and dereference it using the function I'll call coreturn below.

As U is a kind of list structure, it needs a map. In fact, we can define fmap for it:


> instance Functor U where
> fmap f (U a b c) = U (map f a) (f b) (map f c)


Now the fun starts. First I'll bemoan the fact that Comonads aren't in the standard Haskell libraries (at least I don't think they are). So I have to define them myself:


> class Functor w => Comonad w where
> (=>>) :: w a -> (w a -> b) -> w b
> coreturn :: w a -> a
> cojoin :: w a -> w (w a)
> x =>> f = fmap f (cojoin x)


cojoin is the dual to the usual join function. I've chosen to do things the category theoretical way and define =>> in terms of cojoin.

And here's why U forms a Comonad:


> instance Comonad U where
> cojoin a = U (tail $ iterate left a) a (tail $ iterate right a)
> coreturn (U _ b _) = b


Look closely at cojoin. It turns a into a 'universe' of 'universes' where each element is a copy of a shifted left or right a number of times. This is where all the work is happening. The reason we want to do this is as follows: we want to write rules that work on the local neighbourhoods of our universe. We can think of a universe with the cursor pointing at a particular element as being an element with a neighbourhood on each side. For example, we can write a cellular automaton rule like this:


> rule (U (a:_) b (c:_)) = not (a && b && not c || (a==b))


In order to apply this everywhere in the universe we need to apply the rule to each possible shift of the universe. And that's what cojoin does, it constructs a universe of all possible shifts of a. Compare with what I said here. So believe it or not, we've already written the code to evaluate cellular automata. u =>> rule applies the rule to u. The rest is just boring IO:


> shift i u = (iterate (if i<0 then left else right) u) !! abs i
>
> toList i j u = take (j-i) $ half $ shift i u where
> half (U _ b c) = [b] ++ c
>
> test = let u = U (repeat False) True (repeat False)
> in putStr $
> unlines $
> take 20 $
> map (map (\x -> if x then '#' else ' ') . toList (-20) 20) $
> iterate (=>> rule) u


Lazy infinite structures, comonads, zippers. I think I'm just beginning to get the hang of this functional programming lark! Over Xmas I might feel ready to try writing a piece of code longer than a dozen or so lines.

Anyway, I must end with a credit. I probably wouldn't have come up with this if I hadn't read this paper by Uustalu and Vene.

Wednesday, December 13, 2006

Why isn't ListT [] a monad?

Well I'm back from my vacation. But this isn't my personal blog so I think it's time to dive right in. Anything to take my mind off these mosquito bites...

So consider the free semiring R generated by some set S. In other words, S consists of finite sums and products of 0, 1 and elements of S and where the only simplification rules allowed are

a+0=a, a+b=b+a, a+(b+c)=(a+b)+c

a1=1, a(bc) = (ab)c

a(b+c) = ab+ac and (a+b)c = ac+bc.



For example, consider a term like ab+c(ef+gh). We can use distributivity to multiply this out to ab+cef+cgh. There's not much else we can do to simplify this. Now notice that if you multiply out all of the brackets wherever you can, ie. use distributivity until you can no longer, you end up with an expression that is a sum of terms and each term is a product of generators from S. (The sum of terms may be empty, in which case it's zero, and each of the terms may be a product of zero generators, in which case it is 1). This allows us to write elements of R in a canonical form: as a set of terms where each term is an ordered list of generators. For example, we could write ab+cef+cgh as {[a,b],[c,e,f],[c,g,h]}. Notice how the commutativity of addition is represented by the fact that we're using a set of terms, but each term is a list of generators because we're making no assumption about the commutativity of multiplication.

In Haskell it's more convenient to work with lists. Se we'll represent our running example as [['a','b'],['c','e','f'],['c','g','h']]. So if S is the Haskell type corresponding to our set of generators, then [[S]] can be thought of as the free semiring generated by elements of S, with the proviso that we consider two elements equal if one can be reordered to the other. Note that []=0 and [[]] = 1.

Now suppose that S is itself a free semiring generated by T. Then R = [[ [[T]] ]], modulo ordering. If you think about it, there's a nice way to use the algebraic structure to 'flatten' an element of [[ [[T]] ]] down to an element of [[T]]. R is freely generated by elements of S, in other words it consists of sums of products of elements of S. But the elements of S are themselves sums and products. So we have sums of products of sums of products. I emphasised two words in that sentence. This is because R contains subparts that are products of sums. If we multiply these out in the usual way, we get sums of sums of products of products. Group the sums and products together, and we get back to sums of products. Here's an example: any element of T trivially gives an element of S. So if a,b,c,e,f,g and h are all in T, then a,b,c and ef+gh are all in S and hence are generators of R, so ab+c(ef+gh) is in R. Multiply out and we obviously have the element ab+cef+cgh of S. It's not hard to see how this generalises to map any element of S back down to T. So we have maps


T -> [[T]] (by trivial inclusion of generators)

[[ [[T]] ]] -> [[T]] (by above argument)



Look to you like a monad? Of course it does! :-) But what monad is it?

First, let's implement the map [[ [[T]] ]] -> [[T]]. After tinkering I came up with


> import Control.Monad.List

> flatten x = concatMap (map concat . sequence) x


We can test it out


> x = [[ [['a']],[['b']] ], [ [['c']],[['e','f'],['g','h']] ]]
> test1 = flatten x


It multiplies out exactly the way we want.

Now compare with running this:


> x' = ListT [[ ListT [['a']],ListT [['b']] ],
> [ ListT [['c']],ListT [['e','f'],['g','h']] ]]
> test2 = runListT $ join x'


In other words, apart from the ListT fluff, join for ListT [] is flatten. So if we consider lists as a mechanism to represent sets, ListT [] is revealed as the monad of semirings. I'm not sure, but I think that historically this is where monads originally came from. Certainly there are many papers on the relationship between monads and algebraic structures like semirings.

And now I can answer my original question. In a semiring, addition is commutative, so the order of terms doesn't matter. But in ListT [], we're using lists, and order does matter in a list. So if we do take order into account, then really ListT [] is the monad of semirings where both addition and multiplication is non-commutative. And here's the problem: in general, there is no such thing as a freely generated semiring with non-commutative addition.

Here's why: consider the expression ((a+b)+(a+b))*((a+b)+(a+b)). Multiply out the inner parentheses first and we get

(a+b+a+b)*(a+b+a+b)

= a*a+a*b+a*a+a*b+…


Now multiply out the outer parentheses first and we get

(a+b)*(a+b)+(a+b)*(a+b)+(a+b)*(a+b)+(a+b)*(a+b)

= a*a+a*b+b*a+b*b+…


The terms are coming out in a different order. Essentially distributivity doesn't work in a semiring with non-commutative addition.

This translates directly into failure of one of the monad laws. First write our expression as an object of type ListT []:

u = a+b

> u = ListT [["a"],["b"]]


v = (a+b)+(a+b)

> v = ListT [[u],[u]]


w = ((a+b)+(a+b))*((a+b)*(a+b))

> w = ListT [[v,v]]


join multiplies out parentheses. So working from the outer parentheses inwards we can use:


> expanded1 = join $ join w
> go1 = runListT expanded1


Working outwards from the inner parentheses we get:

> expanded2 = join $ fmap join w
> go2 = runListT expanded2


(Note how I use fmap to 'duck down' through the outer layer of parentheses so as to multiply out each of the subexpressions first.)

Evaluate go1 and go2 and you'll see that they corresponds to the two ways of multiplying out that I gave above. And more importantly, the values of expanded1 and expanded2 aren't equal, meaning that join . join = join . fmap join isn't satisfied. You may recognise this: it's one of the monad laws. (At least it's one of the monad laws written the way category theorists write them.) So we ain't got no monad.




I think this is now a fairly complete analysis of what ListT [] is all about. So one obvious remaining question is: where do games come into all this? The answer is that games form a semiring in a way that I haven't seen documented anywhere (though is surely common knowledge). I'd explain but I've run out of time...

Note that the reason ListT [] isn't a monad is that [] isn't commutative, in some sense. This has been observed many times in the past. Two papers mentioning this are this and this.

I actually figured out all of this stuff before this. I realised that the trees I was scribbling on the backs of my envelopes to represent elements of semirings could actually be generalised to just about any kind of tree, so I wrote about the general case first.

I prefer my example of ListT [] failing to be a monad to the examples given here. The latter make use of the IO monad so they aren't quite as 'pure'.

Saturday, December 02, 2006

A Yonedic Addendum

Firstly, blogger.com seems to have temporarily lost the preview feature so I'm writing this as blind HTML. I won't know exactly how it looks until I hit 'publish'. (It's not real HTML so it's no good just pasting into a web page.)

I kept meaning to follow up on my earlier post about the Yoneda lemma by working out if each of the three examples I considered were Theorems for Free! But it's tedious work to decrypt the free theorem as it is generated by the procedure in Wadler's paper. But then I suddenly realised that lambdabot could do the work for me. I looked at all three examples that I gave, but I'll just spell out the details for the last one here. Consider machines of type forall b. (A -> b) -> C -> b. What free theorem do we get from that? If you don't have lambdabot you can see here. But first I need to point out either a limitation in that free theorem prover, or a limitation of my understanding of it. It seems to attach a forall for every free variable. But I want to have A and C fixed, but unknown. It makes a difference. In the end a cheated by considering the type


(Float -> b) -> Char -> b


The free theorem is
h1 (f1 x2) = g1 x2) => h1 (t1 f1 x1) = t1 g1 x1

where t1 :: forall b. (Float -> b) -> Char -> b. With a tiny bit of work we can deduce

t1 h1 = h1 . t1 id.



Using my earlier notation, that is essentially check3 . uncheck3 = id. Similar results followed for the other examples. So I conjecture that for functors F, the free theorems for (a -> b) -> F b are just proofs of the Yoneda lemma (well, the less trivial direction at least). I'm guessing this is all blindingly obvious to type theorists but it's all new to me.


Anyway, a couple of thoughts come to mind. This stuff is all about parametricty, and part of what makes this work is that any polymorphic function (with the same restrictions in Theorems for Free!) that looks like a natural transformation is a natural transformation in the category of Haskell types. But Theorems for Free! also talks about functions that don't look like natural transformations. For example consider the type (a->a)->(a->a). The free theorem reflects the fact that any such function must simply raise its argument to some non-negative integer power. But it seems to me that when the free theorem is written in point-free style, then functions (ie. functions that map objects to arrows like the way natural transformations do) in a general category that satisfy this theorem are also in some sense 'natural'. So is there a wider sense of 'natural' in a category that I should know about?


What I find interesting here isn't necessarily the type theory in itself. What I think is interesting is that the type theory provides nice intuition for other applications for the Yoneda lemma, and indeed other parts of category theory. Up to now, my spare time reading of computer science hasn't really fed back back into my understanding of other branches of mathematics. But this time it has. Even though the category of Haskell types and functions looks a lot like Set, it has nice properties of its own that deepen your understanding of categories in general.


Anyway, while I'm on the subject of things Yonedic, here's an application of category theory to the composition of music. The author s claim the Yoneda lemma has applications in this field. Yes, you read that correctly. No, it's not April 1. In fact, I discovered this by doing a google code search on yoneda.


One last thing. I should credit augustss for getting me to think about the mathematical significance of parametricity in the first place. That was 6 months ago. Trying to do mathematics when you only have a couple of hours free a week is slow going.


BTW I'm in Mexico on vacation for the next week and a half. On the plane I'll be reading (at least) the papers on cake cutting and Venn diagrams mentioned here: Ars Mathematica.




Blog Archive