A Neighborhood of Infinity

Saturday, June 14, 2008

Categories of polynomials and comonadic plumbing

Suppose you have a Haskell program and you want to introduce a new global constant into your program. There are at least two approaches you could take:

1. Simply introduce a new global constant. You could name it x and write something like x=1.23456 and refer to x throughout your code. This has the advantage of being easy to implement.
2. Write all of your code in monadic style and make use of the reader monad. This is intrusive in the sense that you may have to make many changes to your code to support it. But it has the advantage that all of your functions now explicitly become functions of your global constant.

Now I’m going to roughly sketch a more categorical view of both of these approaches. So let’s restrict ourselves to the subset of Haskell that corresponds to typed lambda calculus without general recursion so that we know all of our functions will be total and correspond to the mathematical notion of a function. Then all of our functions become arrows in the category that we’ll call Hask.

Firstly consider approach (1). Suppose we want to introduce a new constant, x, of type A. Category theory talks about arrows rather than elements of objects, so instead of introducing x of type A, introduce the function x:1->A where 1 is the terminal object in Hask, normally called (). An element of A is the same thing as an element of 1->A, but in the latter case we have an arrow in the category Hask.

Before continuing, let me digress to talk about polynomials. Suppose we have a ring (with an identity) R. We define R[x], where x is an indeterminate, to be the ring of polynomials in x. Another way to describe that is to say that R[x] is the smallest ring containing R and an indeterminate x, that makes no assumptions about x other than those required to make R[x] a ring. For example we know that (1+x)(1-x)=1-x2, because that must hold in any ring. Given a polynomial p in R[x] we can think of it as a function fp from R to R. fp(a) is the value we get when substituting the value of a for x in p. So a polynomial in R[x] is the same as a function from R to R that can be written in terms of elements of R, multiplication and addition.

We can do the same with category theory. Given a category A we can ask for the smallest category extending A and containing an indeterminate arrow x:1 -> A. Just as with polynomials we have to allow all possible arrows that can be made by composing arrows of A with x. The resulting expressions for arrows will contain x as a free variable, just like the way x appears in polynomials. In fact, by analogy we can call the resulting category, A[x], the category of polynomials in x:1->A. In the special case A=Hask, you can see that Hask[x] is the category of Haskell functions extended by a new constant of type x:1->A but assuming no equations other than those necessary to make Hask[x] a category. Just as an arrow in Hask is a Haskell function, an arrow in Hask[x] is a Haskell function making use of an as yet undefined constant x.

(I've glossed over some subtleties. Just as we need a suitable equivalence relation to ensure that (1+x)(1-x)=1-x2 in R[x], we need suitable equivalence relations in our category. I'll be showing you where to find the missing details later.)

Here's the implementation of a function, h, making use of a constant x:

(Note that I'll be using Edward Kmett's category-extras shortly so I need some imports)
> import Control.Monad.Reader> import Control.Comonad> import Control.Comonad.Reader> x = 1.23456> f a = 2*a+x> g a = x*a> h a = f (g a)> test1 = h 2

Now consider the second approach. The easiest thing is to just give an implementation of the above using the reader monad:
> f' a = do>    x <- ask>    return $2*a+x> g' a = do> x <- ask> return$ x*a> h' a = return a >>= g' >>= f'> test2 = runReader (h' 2) 1.23456

So here's the important point: Hask[x] is the same category as HaskR. In both cases the arrows are things, which when supplied a value of the right type (like 1.23456), give arrows in Hask from their head object to their tail object.

But there's another way to do this. We can use the reader comonad:
> f'' a = 2*extract a+askC a> g'' a = extract a*askC a> h'' a = a =>> g'' =>> f''> test3 = runCoreader (h'' (Coreader 1.23456 2))

In a similar way, we're dealing with arrows of the form wa -> b and we can compose them using =>>. These arrows form the coKleisli category of the reader comonad, S, which we can write HaskS. So we must have

Now some back story. Over 20 years ago I was intrigued by the idea that logic might form a category with logical ‘and’ and ‘or’ forming a product and coproduct. I came across the book Introduction to Higher Order Categorical Logic by Lambek and Scott for ₤30.00. That’s ₤60.00 at today's prices, or about \$120.00. On a student grant? What was I thinking? And as it bore no relation to anything I was studying at the time, I barely understood a word of it. I was probably fairly applied at that point doing courses in stuff like solid state physics and electromagnetism as well as a bit of topology and algebra. I doubt I'd heard of lambda calculus though I could program in BASIC and APL. So there it sat on my bookshelf for 22 years. Periodically I’d look at it, realise that I still didn’t understand enough of the prerequisites, and put it back on the shelf. And then a month or so ago I picked it up again and realised that the first third or so of it could be interpreted as being about almost trivial Haskell programs. For example, on page 62 was

Proposition 7.1
The category A[x] of all polynomials in the indeterminate x:1->A over the cartesian or cartesian closed category A is isomorphic to the Kleisli category AA=ASA of the cotriple (SA,&epsilonAA).

The language is a little different. Lambek and Scott used the term cotriple instead of comonad and Kleisli category where I’d say coKleisli category. δ and ε are cojoin and coreturn. And Lambek and Scott's theorem applies to any cartesian closed category. But after staring at this claim for a while it dawned on me that all it was really saying was this: here are two ways to introduce new constants into a category. But there’s no way I would have seen that without having practical experience of programming with monads. Learning Haskell has finally paid off. It’s given me enough intuition about category theory for me to get some return on my ₤30.00 investment paid to Heffers all those years ago. I expected to take this book to my deathbed, never having read it.

Anyway, for the details I left out above, especially the correct equivalence relation on Hask[x], you'll just have to read the book yourself.

Also, note the similarity to the deduction theorem. This theorem says that if we can prove B, assuming A, then we can deduce A implies B without making any assumptions. It unifies two way to introduce a proposition A, either as a hypothesis, or as an antecedent in an implication. In fact, the above theorem is just a categorical version of the deduction theorem.

Also note the connection with writing pointfree code. In fact, the pointfree lambdabot plugin makes use good use of the reader monad to eliminate named parameters from functions.

I’m amazed by seeing a book from 1986 that describes how to use a comonad to plumb a value through some code. As far as I know, this predates the explicit use of the reader monad in a program, Wadler and Moggi’s papers on monads, and certainly Haskell. Of course monads and comonads existed in category theory well before this date, but not, as far as I know, for plumbing computer programs. I’d love to hear from anyone who knows more about the history these ideas.

Anonymous said...

Nice post. I've also found that haskell can really help in learning category theory.

Btw, what you say about polynomials as functions is incorrect:

"So a polynomial in R[x] is the same as a function from R to R that can be written in terms of elements of R, multiplication and addition."

This is not true in general, as for example x^p - x == 0 as a function in F_p. It is of course true only for infinite fields.

sigfpe said...

Anonymous,

While I tried to make it clear for categories, I should have also made it clearer for ordinary polynomials too - you need to have a suitable equivalence relationship on your polynomials before they are in 1-1 correspondence with functions written in terms of elements of R, multiplication and addition.

Derek Elkins said...

You will probably find this paper and the papers also discussed (at least some of which I'm sure you've already read) interesting.

Decomposing typed lambda calculus into a couple of categorical programming languages

Edward Kmett said...

sigfpe said...

Edward,

And of course the 'three way' version of the isomorphism I give involving monads requires a CCC as it uses currying. I should write out the full details as it's not in the book. When I have a moment...

Mikael Vejdemo-Johansson said...

Edward, does that difference in categories needed dualize neatly? Is there such a thing as a writer comonad, and do you get similar differences in category properties needed between the writer monad and comonad?

sigfpe said...

Hmm...if we dualise this we should get the notion of a coindeterminate, a symbol that sucks the value of subexpressions out of expressions.

Kim-Ee Yeoh said...

Mikael,

There is a monoidal comonad, although I'm not sure you'd want to call it a writer comonad. See Tarmo.

The usage is distinctly different, and I don't believe you get another monad/comonad isomorphism. The one above is the only one I know of, a consequence of product being left-adjoint to exponentiation.

Perhaps sigfpe might want to blog a nice illustration of the monoidal comonad?

sigfpe said...

Thanks Derek, that paper looks does look interesting.

Edward Kmett said...

Mikael,

the dual of the writer monad is takes a comonoidal structure. in other words instead of a binary operation

join :: a * a -> a,

you need a

split :: a -> a * a

The Iavor's implementation of Augustsson, Rittri, and Synek's Supply comonad (included in category-extras) models this approximately, but uses dirty unsafePerformIO tricks to avoid excessive splitting.

I have intended to get around to adding a SupplyT comonad transformer and a pure Supply comonad, but haven't taken the time.

Finally, note that linear implicit parameters in Haskell (plague though they may be on the semantics of Haskell) implement pretty much this very idea and are a useful source of inspiration if you are looking for intuition on the use of this comonad.

Edward Kmett said...

kim-ee:

An implementation of the exponent comonad:

Peter Berry said...

What's wrong with the idea of "Kleisli category of a comonad"? It seems unambiguous to me (even if I don't fully understand comonads). Or is there some difference between "Kleisli" and "coKleisli", e.g. does it make sense to have the coKleisli category of a monad?

Edward Kmett said...

Peter:

The 'Kleisli category of a comonad' is unambiguous. It is also more verbose than saying 'coKleisli category'. Since one often just says the 'Kleisli construction', this enables the use of 'coKleisli construction' without introducing ambiguity into the root term.

That said, one might argue that 'co' is used a little to freely. e.g. Why aren't 'cofree comonads' just free? After all, you have to say 'comonad' in there anyways. On the other hand, you have codensity monads and density comonads, so the co- in the adjective doesn't always follow from the root idea.

In the end, I tend to subscribe to the idea that using the shotgun approach of 'co-' everywhere it is applicable enables easier application and facilitates easier communication than requiring the user of the term to know in every case which form is appropriate.

Peter Berry said...

Hmm, I suppose "Kleisli category of T" for some functor T is ambiguous if T is both a monad and a comonad. Then you could say "Kleisli category of T as a monad" and "Kleisli category of T as a comonad", but since they're pretty verbose you'd rather just say "(co)Kleisli category", understanding that "Kleisli" refers to the monad and "coKleisli" to the comonad.

Anonymous said...

Off-topic, Dan, but would you have any opinion on Dorst/Mann's use of geometric algebra to present a unified view of ray-tracing?

geophf said...

sigfpe, the link to Wikipedia for rings seems to be broken. The link that I got working is the following:

http://en.wikipedia.org/wiki/Ring_(mathematics)

... but the link on the blog entry is currently ...

http://sigfpe.blogspot.com/%E2%80%9Dhttp://en.wikipedia.org/wiki/Ring_%28mathematics%29%E2%80%9D

... which seems not to work

geophf said...

sigfpe, FYI, the link to wikipedia for Ring seems to be broken. The link on your blog currently points to ...

http://sigfpe.blogspot.com/%E2%80%9Dhttp://en.wikipedia.org/wiki/Ring_%28mathematics%29%E2%80%9D

... which doesn't seem to work. The link to Ring in wikipedia that does work is the following:

http://en.wikipedia.org/wiki/Ring_(mathematics)

Piet Delport said...

"[...] introduce the function x:1->A where 1 is the terminal object in Hask, normally called ()."

s/terminal/initial/ ?

sigfpe said...

Piet,

1 is definitely terminal, there's always a map from X to 1 defined (in Haskell) by

const () :: a -> ()

0, or the 'empty' type defined by the (extended) Haskell

data Void

is initial.

'Elements' of objects X can be picked out using maps from the terminal object to X. So in Set maps 1->X are the same as elements of X.

Edward Kmett said...

() has two inhabitants _|_ and (), so there are multiple functions from a -> () the moment you allow recursion, which causes () to fail to be terminal.

Also because Haskell is not total you need to be careful, since

data Void

has a single inhabitant _|_ and satisfies (most of) the conditions for a terminal object.

Of course all of this is a pleasant fiction because in the presence of seq you can even distinguish some inhabitants of a -> _|_. But if we ignore seq we get nice properties.

Unfortunately, since _|_ is an inhabitant of every type in Haskell, there is no initial object in Hask.

One way to see Void is not initial is to see that there is no uniqueness to the function from Void -> a for each a in Haskell. you are free to choose any inhabitant of a, each giving rise to another morphism in the category. The uniqueness constraint is violated. If we were strict then we might be able to say that the only member of that type was a function returning _|_.

Another way to look at it: for Hask to be comonoidal there would have to be a type Void for which 'Either Void a' can only contain 'Right a's since you can have 'Left _|_'s. This prevents the category of Haskell types from being CoCartesian.

This is why in category-extras this constraint is captured by saying Hask is pre-(co-Cartesian closed).

sigfpe said...

Edward,

I did mention that I was dealing with a subset of Haskell that allows only total functions. It's messy otherwise.

Edward Kmett said...

Sorry about that. I forgot the context of the article. It has been long enough that this thread has kind of taken on a life of its own. =)

Piet Delport said...

Dan, Edward, thank you for the explanation; it makes sense now.