I've been enjoying Eric Kidd's articles on probability theory with Haskell. So I thought I'd follow them up with two things: (1) finding the underlying algebraic structure and (2) showing how it's general enough to do more than just probability.
Firstly, Eric found a really neat factoring of the probability monad as a 'product' of two monads: the Perhaps monad and the List monad. This factoring can be interpreted algebraically.
A monoid is defined to be a set m with a binary operator (typically written as abuttal, ie. the product of a and b is ab) that is associative, and with an element, 1, that is an identity for the binary operator. The List monad gives rise to the monoid freely generated by a set; in other words, it defines the smallest monoid containing the set, with no relationships between the elements beyond those forced by the monoid laws. The binary operator for List is called ++, we embed the original set in the monoid using the function \x -> [x], otherwise known as return, and [] is the identity. It's not hard to see that we have associativity as (a ++ b) ++ c == a ++ (b ++ c). If we use Set instead of List then we get the free commutative monoid instead, ie. one where a+b=b+a.
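The free-monoid structure is visible directly in ordinary list code. Here's a minimal sketch using nothing beyond the Prelude (`embed` is just a name I'm giving to List's `return`):

```haskell
-- The embedding of a generator into the free monoid it generates:
embed :: a -> [a]
embed x = [x]                  -- List's 'return'

-- The monoid laws, checked on a few concrete lists:
monoidLawsHold :: Bool
monoidLawsHold =
     (embed 1 ++ embed 2) ++ embed 3 == embed 1 ++ (embed 2 ++ embed 3)
  && [] ++ [1,2] == ([1,2] :: [Int])   -- left identity
  && [1,2] ++ [] == ([1,2] :: [Int])   -- right identity

main :: IO ()
main = print monoidLawsHold    -- True
```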
(Eh? Where's my paragraph on m-sets. Don't tell me I have to format all those HTML subscripts again! I hate it when computers do that. I wrote a paragraph and it just vanished. Oh well, maybe version 2.0 will be better. But without subscripts this time...)
If m is a monoid then an m-set is a set s with an action of m on it. An action of a monoid on a set S is a scheme that converts each element of the monoid, say g, into a map g*:S→S so that 1* is the identity map and g*h*=(gh)*. The type Writer m s corresponds to pairs in s×m, but this also doubles as the free m-set. We simply define f*(x,g)=(x,fg). It's free because f*(x,f')=g*(x',g') only when x=x' and ff'=gg', so we never get any surprising equalities that aren't inherent in the definition of an m-set. Perhaps, which is actually a pseudonym for Writer Float, defines an R-set, where R is the monoid of reals under multiplication.
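In code, with the `mtl` package's `Writer` and the `Product` monoid standing in for the reals under multiplication, the free m-set looks something like this (the names `act` and `example` are my illustration, not Eric's code):

```haskell
import Control.Monad.Writer (Writer, runWriter, censor)
import Data.Monoid (Product(..))

-- Writer (Product Double) plays the role of Perhaps: a value paired
-- with a multiplicative real weight.
type Perhaps a = Writer (Product Double) a

-- The monoid action f*(x,g) = (x,fg): scale the attached weight by f.
act :: Double -> Perhaps a -> Perhaps a
act f = censor (Product f <>)

-- 'return' attaches the identity weight 1; acting by 0.5 scales it.
example :: (Char, Double)
example = let (x, Product w) = runWriter (act 0.5 (return 'a'))
          in (x, w)            -- ('a', 0.5)
```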
In an earlier post I showed how you could layer up algebraic structures. There I showed that you could combine a commutative monoid structure with a monoid structure to get a semiring. So what happens if we layer up a commutative monoid and an m-set? We get something that has a commutative binary operator but that can also be acted on by elements of m. Specialise to the case when m is a field like the real or complex numbers. Does this sound like a familiar structure? I hope so: it's a description of a vector space. What Eric Kidd has shown is that we can build a vector space monad by applying an m-set monad transformer (with m a field) to a commutative monoid monad. (This isn't completely trivial: it depends on the definition of PerhapsT handling distributivity correctly.) I think that's pretty neat. But I'm getting slightly ahead of myself here as I haven't shown what Eric's stuff has to do with vector spaces.
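Here is a minimal sketch of the layered structure written out directly, rather than via monad transformers (the type `V` and its instances are my illustration, not Eric's code): a 'vector' over a basis b is a list of basis/weight pairs, the List layer giving the commutative sum (up to collecting terms) and the weight giving the R-set action. The bind is exactly where distributivity lives:

```haskell
newtype V b = V { runV :: [(b, Double)] } deriving Show

instance Functor V where
  fmap f (V xs) = V [ (f b, w) | (b, w) <- xs ]

instance Applicative V where
  pure b = V [(b, 1)]
  V fs <*> V xs = V [ (f x, w * u) | (f, w) <- fs, (x, u) <- xs ]

instance Monad V where
  -- Bind distributes over the sum: each basis element is replaced by
  -- a whole vector, scaled by the element's original weight.
  V xs >>= f = V [ (c, w * u) | (b, w) <- xs, (c, u) <- runV (f b) ]
```

Collecting together repeated basis elements (summing their weights) is the quotient that makes the underlying list behave as a free commutative monoid rather than a free monoid.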
A probability distribution can be thought of as a vector, with the outcomes forming a basis. A distribution attaches a probability 'weight' to each possible outcome in the same way that a vector can be written as a weighted sum of basis elements. Each time we have a probabilistic transition, we're effectively multiplying our distribution vector by a stochastic matrix. The monad from Eric (and others) allows you to write these matrix multiplications in a very natural way.
A vector space forms a monad in a straightforward way. If V(B) is the vector space generated by basis B, and V(C) is another vector space, then any function B→V(C) can be lifted to a linear map V(B)→V(C). This lift is clearly of type (B→V(C))→(V(B)→V(C)). With a little flip we get V(B)→(B→V(C))→V(C). This is the main step in showing V to be a monad. And what the probability monad allows us to do is write our stochastic matrices in terms of what happens to the individual basis elements (ie. outcomes) instead of having to write out the entire matrix.
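As a self-contained sketch of that lift (again using a bare association-list representation of vectors; the coin-flip `transition` is a made-up example, not from Eric's post):

```haskell
-- A vector over basis b: a formal weighted sum of basis elements.
type Vec b = [(b, Double)]

-- Lift a function on basis elements, B -> V(C), to a linear map
-- V(B) -> V(C).  Linearity forces this definition.
liftV :: (b -> Vec c) -> Vec b -> Vec c
liftV f v = [ (c, w * u) | (b, w) <- v, (c, u) <- f b ]

-- Flipped, it has exactly the shape of (>>=) for this monad:
bind :: Vec b -> (b -> Vec c) -> Vec c
bind = flip liftV

-- A stochastic matrix written column by column, ie. by what happens
-- to each individual outcome:
transition :: Bool -> Vec Bool
transition True  = [(True, 0.9), (False, 0.1)]
transition False = [(True, 0.3), (False, 0.7)]

-- One probabilistic step: multiply the distribution by the matrix.
step :: Vec Bool
step = [(True, 0.5), (False, 0.5)] `bind` transition
-- = [(True,0.45),(False,0.05),(True,0.15),(False,0.35)], with the
-- two True terms (and the two False terms) still to be collected.
```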
Anyway, this is all just waffle and it really needs some code to make it more concrete. I have an ulterior motive here. It's not just probability theory that is described by vector spaces. So is quantum mechanics. And I plan to work my way up to defining quantum computers and implementing a bunch of well known quantum algorithms in Haskell. In the next installment I expect to get at least as far as writing the code to play with the square root of NOT operator.
PS This idea of layering up algebraic structures is one I haven't found in the textbooks yet. I'm guessing it's in Mac Lane, but I haven't yet justified the cost of that book to myself. But I don't actually have a clue what a category theorist would call a monad transformer and don't recall reading about them in any texts that weren't specifically about computer science. Maybe someone else can fill me in. I do know that it's absolutely standard to get free algebraic structures from monads.
Errata: As notfancy points out, I should really be talking about multisets rather than sets, otherwise we have a+a=a. As programmers often use lists to represent both sets and multisets it can sometimes get confusing.
Sunday, February 25, 2007