Sunday, February 05, 2012

Lawvere theories made a bit easier

Introduction
I still don't think anyone has found a completely satisfactory way to combine effects in Haskell. (That's computational effects, not visual effects.) Monads are great for one kind of effect at a time, but it's not clear how to combine two arbitrary monads. Instead of monads we can work with monad transformers, but they are tricky both to implement and to use.

I want to sketch a different, though incomplete approach to combining effects. There are a bunch of papers that describe this approach, and even some code that implements part of it. Almost everything I say is from a paper by Hyland and Powers that I read a few years ago though I recently ran into this helpful answer by Andrej Bauer on mathoverflow. Even if we don't get code we can immediately use, we still get a good way to think about and analyse effects.

I'll get onto the effects part in another post. This one concentrates on what are known as Lawvere theories.

Monoids

> {-# LANGUAGE MultiParamTypeClasses, FlexibleInstances, FlexibleContexts, ExplicitForAll #-}
> import Data.Monoid

The (finite) lists of integers form a monoid. Here are some functions that operate on such lists:

> f1 xs = map (+1) $ reverse xs
> f2 xs = xs `mappend` xs

They can both be given signature [Integer] -> [Integer]. But there's one very important difference between these functions. f2 has been written using only operations from the type class Monoid. It's a sort of universal function that can be applied to any monoid. On the other hand, the function f1 can only applied to very specific monoids. In fact, the type signature of f2 can be written as:

> f2 :: forall a. Monoid a => a -> a

That type signature essentially says that we're not going to do anything with elements of a except use the interface defined by the Monoid type class.

(Although Haskell type classes can't enforce them, we also assume that any instance of Monoid satisfies the axioms for a monoid.)

We can also define functions on tuples of monoids. Here are some examples:

> g1 :: forall a. Monoid a => (a, a, a) -> (a, a)
> g1 (xs, ys, zs) = (mempty, ys `mappend` xs)

> g2 :: forall a. Monoid a => (a, a) -> (a, a)
> g2 (xs, ys) = (xs `mappend` ys, ys `mappend` xs)

Notice that we can compose these functions. So we have
g2 . g1 :: forall a. Monoid a => (a, a, a) -> (a, a)

We also have have identity functions for tuples. Armed with functions, identities and compositions can now form a category that I'll call . I'll call the (distinct) objects of this category , , and so on. The arrows from to are the total functions of type . So, for example, g1 is an arrow from to . Note that it doesn't matter what the objects are (as long as they're distinct). They're just placeholders between which we can string our arrows. Note how because of the universal quantifier, the functions we use have a type not of the form A -> B. So we can't represent the objects of our category as types in the usual way. We can think of mempty as a 0-ary operator, ie. an element of forall a. Monoid a => () -> a.

But there's one more detail I want. I'll consider two arrows to be equal if they can be proved equal using the axioms for monoids. For example, these two Haskell functions represent the same arrow in because of the associativity law:

> h1 (x, y, z) = (x `mappend` y) `mappend` z
> h2 (x, y, z) = x `mappend` (y `mappend` z)

We now have a bona fide category. Note that contains lots of arrows. Anything you can build using mempty and mappend as well as all of the projections and permutations between tuples. completely captures everything that can be deduced using the axioms for monoids. For example, the associativity law is contained in the fact that h1 and h2 represent the same arrow.

The category also has products. In fact it is given by with the projections back to the factors being represented by the obvious projection functions. serves as the product of no factors.

So captures the properties shared by all monoids. But what is its relationship to actual monoids? It's pretty nice. A monoid is a functor that preserves products.

Let's unpack that. First must take objects in to sets. But we've stipulated that so is completely determined on objects once we know . In fact will be the carrier for our monoid and is its th power. takes arrows in to functions on sets. So, for example, it gives concrete realisations of mempty and mappend. Because, for example, h1 and h2 represent the same arrow in , and must be equal. So associativity must hold for these realisations. The same goes for all of the other laws, and everything we can deduce from them. So the requirement of functoriality makes into a monoid with identity F(mempty) and product F(mappend).

Given an instance of the Monoid type class we can immediately apply an arrow from to it. The functor is applied implicitly by the Haskell compiler. For example h1 can be applied to ("a", "b", "c") :: (String, String, String).

The object in is weird. It's a lot like the universal monoid sharing all of the properties you expect to hold simultaneously in all monoids. Except for one important one: it's not a monoid itself.

Note that in a sense the category isn't anything new. It's just a convenient 'datastructure' into which we can pack everything we can deduce about monoids.

M-sets
Before generalising, let's try a similar treatment for another algebraic structure, the horribly named -set. An -set is a structure on a set that assumes we already have some choice of monoid . It defines an action of on the set. An action is simply a function defined for each element of and which is compatible with the operations on . Here's a suitable type class:

> class Monoid m => MSet m s where
> act :: m -> s -> s

and the laws are
act mempty x == x
act a (act b x) = act (a `mappend` b) x
We're thinking of act not as an operation on two arguments but instead thinking of act a being an operation on s for each element a. Note how that second law is really a lot of laws, one for every pair (a, b) in our monoid.

A simple example is the way scalars act on vectors.

> data Vector a = V a a a deriving Show
> instance Monoid a => MSet a (Vector a) where
>    act w (V x y z) = V (w `mappend` x) (w `mappend` y) (w `mappend` z)

Given any particular monoid (eg. String) we can define functions like:

> j1 :: forall a. MSet String a => (a, a) -> (a, a)
> j1 (x, y) = ("x" `act` x, "y" `act` y)
> j2 :: forall a. MSet String a => (a, a) -> a
> j2 (x, y) = "abc" `act` x

We can form a category in exactly the same way as for monoids. As the objects were just placeholders we may as well reuse them. The arrows from to are the functions on tuples we can make from repeated applications of act, ie. . For each choice of we get a new category encoding everything we want to know about -sets. In much the same way, any functor gives an -set. We have and maps act a to the actual action on the set.

Lawvere theories
So what do these two categories have in common?

Let's start with the simplest possible algebraic structure: a plain old set with no operations on it. The corresponding category will have objects . The arrows from to will be represented by functions of the form . That includes functions like the projections onto elements of tuples or permutations of tuples. This category is called .

Both and defined earlier contain all of the objects and arrows of . But both and also contain arrows corresponding to all of the operations you can perform in their respective algebraic structures. So we can define a Lawvere theory as nothing more than a category that is with extra arrows. The category defined earlier is the "theory of monoids" and is the "theory of -sets". A 'model' of , or a -algebra, is a product-preserving functor .

Product theories
Suppose we have two theories and . There's a straightforward way to combine them into one theory. Form the category that (like all Lawvere theories) shares the same objects as and but has all of the arrows from both. (Obviously we'll have to keep just one identity, not an identity from and another from .) The catch is that if is an arrow in and is an arrow in we'll need the composition too. We simply take the smallest set of arrows that contains all of the arrows from and and contains all of their compositions modulo all of the laws in and .

Thus far we already have a new Lawvere theory built from and . But sometimes it's useful to ensure the operations from are 'compatible', in some sense, with those from . We want them to commute with each other. I'll describe what that commutativity means now:

Suppose we have an arrows and on and respectively. If we have a set that is both a -algebra and a -algebra, then gives an operation and gives an operation . Write out an array of variables:

We can apply across the rows to get the array:

and now we can apply down the columns to get an array.

We could also have done this by applying down the columns first and then across the rows. We now throw in extra commutativity laws stating that these two operations give equal results, whatever the operations and .

For example, if the theory has a binary multiplication operator and the theory has binary multiplication operator then commutativity requires . This is the usual interchange law. In this example the commutativity law is therefore an assertion about the equality of two arrows in .

The result of combining the two theories and , additionally throwing in these commutativity laws, is known as the product theory .

Product theories in Haskell
Although we can't get Haskell to enforce the commutativity rules we can get part way to defining product theories using type classes. We make our type an instance of both classes. For and defined as the theories of monoids and -sets above, a -algebra is given by a type that is an instance of both Monoid and MSet. Unfortunately we can't make Haskell automatically enforce the rule that elements be equal if the laws say they should be.

Coming soon
We've defined Lawvere theories. I've explained previously how many kinds of effects can be modelled by algebraic theories. I've defined a product on Lawvere theories. Which means we now have a way to combine effects. But for that you'll have to wait for the sequel.

8 comments:

  1. Very nice! I think you meant forall a. Monoid a => a^m -> a^n instead of forall a. Monoid m => a^m -> a^n, though.

    ReplyDelete
  2. @pumpkin You spotted that fast. I don't think the pixels were even dry yet. Thanks though!

    ReplyDelete
  3. And just when it started to get intriguing!

    About your "I don't think anyone has found a satisfactory way to combine effects in Haskell" - you might want to check out a mix-arrows package from hackage. I've just released the 1.2 version, which is supposed to be better than older ones, but documentation could be not generated yet. Anyway, it's a neat way to mix two arrows into one, getting the effects of both.

    ReplyDelete
  4. This comment has been removed by the author.

    ReplyDelete
  5. I fear the most interesting types of mixed effects will not be ones that commute with one another, but looking forward to the sequel nonetheless!

    ReplyDelete
  6. @Tom,

    The constructions for maximally non-commuting and maximally commuting effects are pretty similar. But it's easier to write down concrete types for the commuting case. And there are some interesting surprises to be had with commuting effects, as I'll mention.

    ReplyDelete
  7. The object M_1 in T is weird. It's a lot like the universal monoid sharing all of the properties you expect to hold simultaneously in all monoids. Except for one important one: it's not a monoid itself.

    Actually it is a monoid object in the category T!

    ReplyDelete
  8. I haven't seen a type class for Lawvere theories anywhere. What would that look like?

    About combining effects in Haskell, there is the paper "Combining Effects using Coproducts." It's a really simple idea: it's like a free monad, but each layer can have two constructors: Inl and Inr. To run one, you unwrap all the layers with your runXXX functions. They say that successive layers, of for instance State, can't use the same state thread, but they do provide a way of mapping the two monads onto a common monad, which provides that functionality.

    ReplyDelete