Saturday, October 25, 2008

Operads and their Monads

Hardly a day goes by at the n-Category Cafe without operads being mentioned. So it's time to write some code illustrating them.


> {-# LANGUAGE FlexibleInstances #-}

> import Data.Monoid
> import Control.Monad.Writer
> import Control.Monad.State


Let's define a simple tree type:


> data Tree a = Leaf a | Tree [Tree a] deriving (Eq,Show)


Sometimes we want to apply a function to every element of the tree. That's provided by the fmap member of the Functor type class.


> instance Functor Tree where
> fmap f (Leaf x) = Leaf (f x)
> fmap f (Tree ts) = Tree $ map (fmap f) ts


But just as we can't use map to apply monadic functions to a list (we'd write mapM print [1,2,3]), we can't use fmap to apply them to our tree. What we need is a monadic version of fmap. Here's a suitable type class:


> class FunctorM c where
> fmapM :: Monad m => (a -> m b) -> c a -> m (c b)


(I could have used Data.Traversable but that entails Data.Foldable and that's too much work.)

And now we can implement a monadic version of Tree's Functor instance:


> instance FunctorM Tree where
> fmapM f (Leaf x) = do
> y <- f x
> return (Leaf y)
> fmapM f (Tree ts) = do
> ts' <- mapM (fmapM f) ts
> return (Tree ts')


We can use fmapM to extract the list of elements of a container type:


> serialise a = runWriter $ fmapM putElement a
> where putElement x = tell [x]


Not only does serialise suck out the elements of a container, it also spits out an empty husk in which all of the elements have been replaced by (). We can think of the latter as the 'shape' of the original structure with the original elements removed. We can formalise this as


> type Shape t = t ()


So we have:


> serialise :: FunctorM t => t a -> (Shape t,[a])


Keeping the shape around allows is to invert serialise to give:


> deserialise :: FunctorM t => Shape t -> [a] -> (t a, [a])
> deserialise t = runState (fmapM getElement t) where
> getElement () = do
> x:xs <- get
> put xs
> return x


(That's a bit like using the supply monad. This function also returns the leftovers.)

We can also write (apologies for the slightly cryptic use of the writer monad):


> size :: FunctorM t => t a -> Int
> size a = getSum $ execWriter $ fmapM incCounter a
> where incCounter _ = tell (Sum 1)


Let's try an example. Here's an empty tree:


> tree1 = Tree [Tree [Leaf (),Leaf ()],Leaf ()]


We can pack a bunch of integers into it:


> ex1 = fst $ deserialise tree1 [1,2,3]


And get them back out again:


> ex2 = serialise ex1


serialise separates the shape from the data, something you can read lots more about at Barry Jay's web site.

Remember that trees are also monads:


> instance Monad Tree where
> return x = Leaf x
> t >>= f = join (fmap f t) where
> join (Leaf t) = t
> join (Tree ts) = Tree (fmap join ts)


The join operation for a tree grafts the elements of a tree of trees back into the tree.

Right, that's enough about trees and shapes for now.

Operads are a bit like the plumbing involved in installing a sprinkler system. Suppose you have a piece, A that splits a single pipe into two:


______
/
/ ____
____/ /
A /
____ \
\ \____
\
\______


And you have two more pipes B and C that look like this:


______
/
/ ____
/ /
____/ \____

B or C
____ ____
\ /
\ \____
\
\______



then you can 'compose' them to make a larger system like this:




______
/
/ ____
/ /
_________/ \____
/ B
/ _______ ____
/ / \ /
/ / \ \____
/ / \
____/ / \______
A /
____ \ ______
\ \ /
\ \ / ____
\ \ / /
\ \_______/ \____
\ C
\_________ ____
\ /
\ \____
\
\______


(Vim rectangular mode FTW!)

The important thing to note here is that as A had two outputs (or inputs, depending on your point of view) you can attach two more pieces, like B and C, directly to it.

Call the number of outlets the 'degree' of the system. If A has degree n then we can attach n more systems, A1...An to it and the resulting system will have degree degree(A1)+...+degree(An). We can write the result as A(A1,...,An).

We also have the 'identity' pipe that looks like this:


_____________

identity
_____________


Formally, an operad is a collection of objects, each of which has a 'degree' that's an integer n, n≥0, depending on your application), an identity element of degree 1, and a composition law:


> class Operad a where
> degree :: a -> Int
> identity :: a
> o :: a -> [a] -> a


o is the composition operation. If f has degree n then we can apply it to a list of n more objects. So we only expect to evaluate f `o` fs successfully if degree f == length fs.

There are many identities we'd expect to hold. For example f `o` [identities,...,identity] == f, because adding plain sections of pipe has no effect. We also expect some associativity conditions coming from the fact that it doesn't matter what order we build a pipe assembly in, it'll still function the same way.

We can follow this pipe metaphor quite closely to define what I think of as the prototype Operad. A Fn a is a function that takes n inputs of type a and returns one of type a. As we can't easily introspect and find out how many arguments such a function expects, we also store the degree of the function with the function:


> data Fn a = F { deg::Int, fn::[a] -> a }

> instance Show a => Show (Fn a) where
> show (F n _) = "<degree " ++ show n ++ " function>"


unconcat is a kind of inverse to concat. You give a list of integers and it chops up a list into pieces with lengths corresponding to your integers. We use this to unpack the arguments to f `o` gs into pieces suitable for the elements of gs to consume.


> unconcat [] [] = []
> unconcat (n:ns) xs = take n xs : unconcat ns (drop n xs)

> instance Operad (Fn a) where
> degree = deg
> f `o` gs = let n = sum (map degree gs)
> in F n (fn f . zipWith fn gs . unconcat (map degree gs))
> identity = F 1 head


Now compute an example, f(f1,f2,f3) applied to [1,2,3]. It should give 1+1+2*3=8.


> ex3 = fn (f `o` [f1,f2]) [1,2,3] where
> f = F 2 (\[x,y] -> x+y)
> f1 = F 1 (\[x] -> x+1)
> f2 = F 2 (\[x,y] -> x*y)


(That's a lot like lambda calculus without names. Operads are a bit like n-ary combinators.)

Now I'm going to introduce a different looking operad. Think of V as representing schemes for dicing the real line. Here are some examples:


|0.25|0.25| 0.5 |

|0.1|0.1| 0.8 |


If A divides up the real line into n pieces then you could divide up each of the n pieces using their own schemes. This means that dicing schemes compose. So if we define A, B and C as:


A = |0.5|0.5|

B = |0.75|0.25|
C = |0.1|0.1|0.8|


Then A(B,C) is:


|0.375|0.125|0.05|0.05|0.4|


We could implement V as a list of real numbers, but it's more fun to generalise to any monoid and not worry about divisions summing to 1:


> data V m = V { unV :: [m] } deriving (Eq,Show)


This becomes an operad by allowing the monoid value in a 'parent' scheme multiply the values in a 'child'.


> instance Monoid m => Operad (V m) where
> degree (V ps) = length ps
> (V as) `o` bs = V $ op as (map unV bs) where
> op [] [] = []
> op (a:as) (b:bs) = map (mappend a) b ++ op as bs
> identity = V [mempty]


For example, if d1 cuts the real line in half, and d2 cuts it into thirds, then d1(d1,d2) will cut it into five pieces of lengths 1/4,1/4,1/6,1/6,1/6:


> ex4 = d1 `o` [d1,d2] where
> d1 = V [Product (1/2),Product (1/2)]
> d2 = V [Product (1/3),Product (1/3),Product (1/3)]


If the elements in a V are non-negative and sum to 1 we can think of them as probability distributions. The composition A(A1,...,An) is the distribution of all possible outcomes you can get by selecting a value i in the range {1..n} using distribution A and then selecting a second
value conditionally from distribution Ai. We connect with the recent n-category post on entropy.

In fact we can compute the entropy of a distrbution as follows:


> h (V ps) = - (sum $ map (\(Product x) -> xlogx x) ps) where
> xlogx 0 = 0
> xlogx x = x*log x/log 2


We can now look at the 'aside' in that post. From an element of V we can produce a function that computes a corresponding linear combination (at least for Num types):


> linear (V ps) xs = sum $ zipWith (*) (map (\(Product x) -> x) ps) xs


We can now compute the entropy of a distribution in two different ways:


> (ex5,ex6) = (h (d1 `o` [d1,d2]),h d1 + linear d1 (map h [d1,d2])) where
> d1 = V [Product 0.5,Product 0.5]
> d2 = V [Product 0.25,Product 0.75]


Now according to this paper on operads we can build a monad from an operad. Here's the construction:


> data MonadWrapper op a = M { shape::op, value::[a] } deriving (Eq,Show)


(The field names aren't from the paper but they do give away what's actually going on...)

The idea is that an element of this construction consists of an element of the operad of degree n, and an n element list. It's a functor in an obvious way:


> instance Functor (MonadWrapper o) where
> fmap f (M o xs) = M o (map f xs)


It's also a FunctorM:


> instance FunctorM (MonadWrapper o) where
> fmapM f (M s c) = do
> c' <- mapM f c
> return $ M s c'


We can make the construction a monad as follows:


> instance Operad o => Monad (MonadWrapper o) where
> return x = M identity [x]
> p >>= f = join (fmap f p) where
> join (M p xs) = M (p `o` map shape xs) (concatMap value xs)


Now for something to be a monad there are various laws that needs to be satisfied. These follow from the rules (which I haven't explicitly stated) for an operad. When I first looked at that paper I was confused - it seemed that the operad part and the list part didn't interact with each other. And then I suddenly realised what was happening. But hang on for a moment...

Tree shapes make nice operads. The composition rule just grafts child trees into the leaves of the parent:


> instance Operad (Tree ()) where
> degree t = length (snd (serialise t))
> identity = Leaf ()
> t `o` ts = let (r,[]) = deserialise t ts in r >>= id


We can write that more generically so it works with more than trees:

> data OperadWrapper m = O { unO::Shape m }


> instance (FunctorM m,Monad m) => Operad (OperadWrapper m) where
> degree (O t) = size t
> identity = O (return ())
> (O t) `o` ts = let (r,[]) = deserialise t (map unO ts) in O (r >>= id)


So let's use the construction above to make a monad. But what actually is this monad? Each element is a pair with (1) a tree shape of degree n and (2) an n-element list. In other words, it's just a serialised tree. We can define these isomorphisms to make that clearer:


> iso1 :: FunctorM t => t x -> MonadWrapper (t ()) x
> iso1 t = uncurry M (serialise t)

> iso2 :: FunctorM t => MonadWrapper (t ()) x -> t x
> iso2 (M shape contents) = let (tree,[]) = deserialise shape contents in tree


So, for example:


> ex7 = iso2 (iso1 tree) where
> tree = Tree [Tree [Leaf "Birch",Leaf "Oak"],Leaf "Cypress",Leaf "Binary"]


That construction won't work for all monads, just those monads that come from operads. I'll leave you to characterise those.

And now we have it: a way to think about operads from a computational perspective. They're the shapes of certain monadic serialisable containers. Operadic composition is the just the same grafting operation used in the join operation, using values () as the graft points.

I have a few moments spare so let's actually do something with an operad. First we need the notion of a free operad. This is basically just a set of 'pipe' parts that we can stick together with no equations holding apart from those inherent in the definition of an operad. This is different from the V operad where many different ways of apply the o operator can result in the same result. We can use any set of parts, as long as we can associate an integer with each part:


> class Graded a where
> grade :: a -> Int


The free operad structure is just a tree:


> data FreeOperad a = I | B a [FreeOperad a] deriving Show


I will be the identity, but it will also serve as a 'terminator' like the way there's always a [] at the end of a list.

An easy way to make a single part an element of an operad:


> b n = let d = grade n in B n (replicate d I)


Here's the instance:


> instance Graded a => Operad (FreeOperad a) where
> degree I = 1
> degree (B _ xs) = sum (map degree xs)
> identity = I
> I `o` [x] = x
> B a bs `o` xs = let arities = map degree bs
> in B a $ zipWith o bs (unconcat arities xs)


Now I'm going to use this to make an operad and then a monad:


> instance Graded [a] where
> grade = length

> type DecisionTree = MonadWrapper (FreeOperad [Float])


What we get is a lot like the probability monad except it doesn't give the final probabilities. Instead, it gives the actual tree of possibilities. (I must also point out this hpaste by wli.)


> test = do
> a <- M (b [Product 0.5,Product 0.5]) [1,2]
> b <- M (b [Product (1/3.0),Product (1/3.0),Product (1/3.0)]) [1,2,3]
> return $ a+b


Now we can 'flattten' this tree so that the leaves have the final probabilities:


> flatten :: Monoid m => FreeOperad [m] -> V m
> flatten I = V [mempty]
> flatten (B ms fs) = V $ concat $ zipWith (map . mappend) ms (map (unV . flatten) fs)


This is a morphism of operads. (You can probably guess the definition of such a thing.) It induces a morphism of monads:


> liftOp :: (Operad a,Operad b) => (a -> b) -> MonadWrapper a x -> MonadWrapper b x
> liftOp f (M shape values) = M (f shape) values


liftOp flatten test will give you the final probabilities.

There may just be a possible application of this stuff. The point of separating shape from data is performance. You can store all of your data in flat arrays and do most of your work there. It means you can write fast tight loops and only rebuild the original datatype if needed at the end. If you're lucky you can precompute the shape of the result, allowing you to preallocate a suitable chunk of memory for your final answer to go into. What the operad does is allow you to extend this idea to monadic computations, for suitable monads. If the 'shape' of the computation is independent of the details of the computation, you can use an operad to compute that shape, and then compute the contents of the corresponding array separately. If you look at the instance for MonadWrapper you'll see that the part of the computation that deals with the data is simply a concatMap.

BTW In some papers the definition restricts the degree to ≥1. But that's less convenient for computer science applications. If it really bothers you then you can limit yourself to thinking about containers that contain at least one element.

Saturday, October 18, 2008

What's the use of a transfinite ordinal?

This post is going to talk about one player games (which I'll call 1PGs, or just games). They seem pretty stupid at first, but it turns out that they have a lot of rich and applicable theory. The idea is that at any position in a 1PG there is a set of new positions that you can move to. You lose when you have no legal moves and the goal is to avoid losing for as long as you can. For convenience, let's identify your position in a 1PG with the set of positions you could move to. A position in a 1PG is just another 1PG. So we can give the following recursive formal definition of a 1PG:

A one player game is either the losing position with no legal moves, denoted by ∅, or a set of one player games.

A sequence of legal moves in a 1PG, say x, is then a sequence of the form x ∋x0 ∋x1 ∋x2 ∋....

Did you notice that I sneaked in a really big restriction on 1PGs in that definition? My definition means that 1PGs are the same thing as sets. And sets are well-founded meaning that any sequence of the form x ∋x_0 ∋x1 ∋x2 ∋... eventually terminates. So my definition implicitly contains the restriction that you always eventually lose. That's fine for my purposes.

Now suppose that x ∋y ∋z is a legal sequence of plays in some 1PG. You might as well assume that z is in x. The reason is that if z were a legal move from x, you wouldn't take it, because in order to delay the inevitable you're going to prefer to go via y. So we'll assume that any game x is "transitively closed", ie. if x∋y and y∋z then x∋z.

Now I want to restrict things even further to the most boring kind of 1PG of all, games where the successor moves are all totally ordered. Intuitively what I mean is that you can think of every move in the game as simply moving you along a 'line'. Given any two positions in the game, x and y, either x∋y, or y∈x, or x=y. In these games there are no branches, the only thing you can do in these games is delay the inevitable. A bit like life really. We'll also use the notation a<b to mean a∈b.

I'll call these games ordinal 1PGs. You may have noticed that this is precisely the same as the ordinary definition of an ordinal.

Let's call the game ∅ by the name 0. 0 is sudden death. Recursively define the game n={0,1,2,3,n-1}. In the game of n you have up to n steps to play before you die (though you could hurry things along if you wanted to).

But what about this game: ω={0,1,2,...}. There's nothing mysterious about it. In your first move you choose an integer, and then you get to delay the end for up to n moves. This is our first transfinite ordinal, but it just describes a 1PG that anyone can play and which is guaranteed to terminate after a finite number of moves. We can't say in advance how many moves it will take, and we can't even put an upper bound on the game length, but we do know it's finite.

Given two games, a and b, we can define the sum a+b. This is simply the game where you play through b first, and then play through a. Because of the transitivity condition above you could simply abandon b at any stage and then jump into a, but that's not a good thing to do if you're trying to stay alive for as long as possible. The first example of such a 1PG is the game ω+1. You get to make one move, then you choose an n, and then you get to play n.

What about 1+ω? Well you get to choose n, play n moves, and then play an extra move. But that's exactly like playing ω and making the move n+1. So 1+ω and ω are the same game. Clearly addition of 1PGs isn't commutative.

We can multiply two games as well. Given games a and b, ab is defined like this: we have game a and game b in front of us at the same time, a on the left and b on the right. At any turn we have a choice: make a move in the game on the left, or make a move in the game on the right, resetting the left game back to a again. It's like playing through b, but playing through a copy of a at each turn. Note how for integers a and b the game ab is just ab, where the former is game multiplication, and the latter is ordinary arithmetical multiplication.

We can also define exponentiation. We can define an, for finite integer n, in the obvious way as ω·ω·...·ω n times. Before getting onto more general exponentiation I need to classify ordinal games. There are three kinds. There's

(1) the zero game 0=∅.
(2) the games where there is one 'largest' next move in the game, the one you should take. These are games of the form a+1 and the next position in the game, if you're playing to survive, will be a. For example, the game 7 is 6+1.
(3) the games where there is no single best choice of next move. These are games where there is no largest next move, but instead an ever increasing sequence of possible next moves and you have to pick one. The first example is ω where you're forced to pick an n. These are called limit ordinals.

We can use this to recursively define ab using the classification of b.

(1) If b=0 then by definition ab=1 and that's that.
(2) if b=c+1, then ab=ac·a.
(3) if b is a limit ordinal then a move in ab works like this: you pick a move in b, say c, and now you get to play ac (or, by transitive closure, any smaller game).

For example, you play ωω like this: in your first move you pick an n, and then you play ωn. Note how the state of play never becomes infinite, you're always playing with finite rules on a finite 'board'. This is an important point. Even though ωω3+22+ω+4 seems like it must be a very big kind of thing, it describes a finite process that always terminates.

So what use are these things?

Suppose you have an algorithm that runs one step at a time, with each step taking a finite time, and where the internal state of the program at step t is st. Suppose that you can find a map f from program states to the ordinals such that f(st+1)<f(st), ie. the program always decreases the ordinal associated to a state. Then the program must eventually terminate.

For example, suppose the program iterates 10 times. We can simply assign f(st)=10-t. Suppose instead your algorithm computes some number n at the first step, though you can't figure out what that number is (or maybe it accepts n as an input from a user), then we can use f(0)=ω and so on. We don't know in advance what n is, but that doesn't matter. Whatever n is entered, the program will terminate.

This technique is particularly useful for proving termination of rewrite rules. For example, with the rewrite rules generated by the Buchberger algorithm we can map the first step to ωn, for some n. And if we don't know what n is, we can just start with ωω. If you take a peek at the papers of Nachum Dershowitz you'll see how he applies this technique to many other types of rewrite rule.

Transfinite ordinals are not as esoteric as they may first appear to be. Using them to prove termination goes back to Turing and Floyd but I learnt about the method from reading Dershowitz. (I think Floyd suggested using the method to prove termination of the standard rewrite rules for computing derivatives of expressions. It's tricky because differentiating a large product, say, can result in many new terms, so there's a race between the derivative 'knocking down' powers, and products causing growth.)

Probably the best example of using ordinals to prove the termination of a process is in the Hydra game. What's amazing about this game is that you need quite powerful axioms of arithmetic to prove termination.

One more thing. All of this theory generalises to two player games.

Update: I think mjd may have been about to write on the same thing. On the other hand, though he mentions program termination, he says of it "But none of that is what I was planning to discuss". So maybe what I've written can be seen as complementary.

Sunday, October 12, 2008

Untangling with Continued Fractions: part 5


> {-# LANGUAGE MultiParamTypeClasses,FunctionalDependencies,FlexibleInstances,GeneralizedNewtypeDeriving #-}

> module Main where

> import qualified Data.Map as M
> import Control.Monad
> import Prelude
> import Ratio hiding (numerator,denominator)

> infixl 5 .+
> infixl 6 .*


So far I've described a way to use monad notation to represent knots, links and tangles, and I've provided a way to implement a monad so that the notation gives rise to the bracket polynomial when applied to a knot. What happens when we do the same with a tangle?

A tangle corresponds to something with this type:


> type Tangle = (Bool,Bool) -> KnotMonad (Bool,Bool)


Here are two examples of tangles, ie. expressions with this type:


> infinity (i,j) = return (i,j)
> zero (i,j) = do
> cup (i,j)
> cap ()


These correspond to these tangles:

And the names of those expressions are the corresponding extended rational values.

Now here's a surprising but nontrivial fact that holds when T is a rational tangle: if we set A2=i (where i2=-1) then the value of the monadic expression corresponding to the tangle is of the form a<[∞]>+b<[0]> where the [r] are the rational tangles above, and <T> is the bracket polynomial (evaluated at A2=i). The rational associated to the original tangle is then given by -ia/b. I believe this discovery is, like many other things I've talked about in this blog, due to Conway.

We now have the makings of a complete algorithm. Using the polynomial monad defined in the previous installment we evaluate our tangle. We then compute the coefficients a and b, which are polynomials in A. Now compute a and b at A2=-i, and then compute -ia/b and use the operations z→z+1, z→z-1 and z→-1/z to reduce this rational to zero, interpreting these operations as twist, antitwist and rotation respectively (as I described here).

There's a shortcut I'm going to take but I don't have a proof it's guaranteed to work so it's risky. Instead of computing polynomials in A, doing polynomial division, and then evaluating at A2=-i, I'm going to do everything from the first step with the assumption that A2=-1. (The catch is that if a and b are polynomials in A, then a and b may have a common divisor that cancels when computing a/b. If we substitute A2=-1 from the beginning we get division of zero by zero. I don't know for sure if this can't happen. It's not a big deal if it does though, I just need to write a polynomial division routine.)

Now if A2=-1 then we could choose A=(1/2)1/2+(1/2)1/2i. That suggests using real arithmetic but we need to extract results that are rational numbers. So instead introduce the field K=Q((1/2)1/2), ie. the field of numbers of the form u+v(1/2)1/2. We can implement this as:


> data K = K { re2::Rational, im2::Rational } deriving (Eq,Show)

> instance Num K where
> K a b + K a' b' = K (a+a') (b+b')
> K a b * K a' b' = K (a*a'+2*b*b') (a*b'+a'*b)
> negate (K a b) = K (negate a) (negate b)
> abs _ = error ""
> signum _ = error ""
> fromInteger i = K (fromInteger i) 0

> instance Fractional K where
> recip (K a b) = let r = recip (a*a-2*b) in K (r*a) (-r*b)
> fromRational x = K x 0


We actually want to work with the field L=K(i), ie. complex numbers built from elements of K. We can implement this as:


> type L = Complex K


(Notice how it's not at all hard to build up algebraic structures like this in Haskell without a symbolic algebra package. For more powerful code in this vein, see David Amos's code).

Now we have our field, our monad is that of vector spaces over L:


> type KnotMonad = V L


Let's define some useful constants. The square root of 1/2, A, B=1/A and i:


> hr2 = K 0 (1%2)
> a = hr2 :+ hr2
> b = hr2 :+ (-hr2)
> i = 0 :+ 1


Here's the knot monad code I've presented before. (This version uses -|()|- in a way that may appear a bit useless. I'm just making explicit when we're working with a 1-dimensional vector space generated by -|()|-, which is much the same thing as the base field.)


> cup :: (Bool,Bool) -> KnotMonad ()
> cup (u,v) = case (u,v) of
> (False,True) -> (-i * b) .* return ()
> (True,False) -> (i * a) .* return ()
> otherwise -> vzero

> cap :: () -> KnotMonad (Bool,Bool)
> cap () = (-i * b) .* return (False,True) .+ (i * a) .* return (True,False)

> over :: Tangle
> over (u,v) = a .* do
> () <- cup (u,v)
> cap ()
> .+
> b .* return (u,v)

> under :: Tangle
> under (u,v) = b .* do
> () <- cup (u,v)
> cap ()
> .+
> a .* return (u,v)


Here's where we work out the coefficients a and b. We treat a tangle as the function that it is and look at what happens when we apply it to two different values in (Bool,Bool). Applying to value' to infinity and zero shows that they do in fact compute a and b correctly:


> value' p =
> let alpha = coefficient (False,False) $ p (False,False)
> beta = coefficient (True,False) $ p (False,True)
> in (alpha,beta)


From a and b we extract -ia/b as a rational:


> value :: Tangle -> Rational
> value p =
> let (a,b) = value' p
> in re2 $ realPart $ -i*a/b


And now we reduce to 0. This isn't the best algorithm for reducing rationals to zero. But it's what I wrote in my first attempt and I'm sticking with it for consistency with part one:


> steps :: Rational -> [String]
> steps 0 = []
> steps q | q<= -1 = "twist":steps (q+1)
> | -1<q && q<1 = "rotate":steps (-1/q)
> | q>=1 = "antitwist":steps (q-1)

> untangle t = steps (value t)


Here's the example from the first installment:


> example :: Tangle
> example (a,b) = do
> (c,d) <- over (a,b)
> (e,f) <- cap ()
> (g,h) <- over (c,e)
> (i,j) <- over (f,d)
> (m,n) <- cap ()
> (k,l) <- cap ()
> (q,r) <- over (h,k)
> (s,y) <- over (l,i)
> (o,p) <- over (n,g)
> (t,u) <- under (p,q)
> (v,w) <- under (r,s)
> (x,z) <- over (y,j)
> cup (o,t)
> cup (u,v)
> cup (w,x)
> return (m,z)


I'll leave you to type untangle example in ghci and refer back to my original video.

To find out more, and to read about the details I've left out, try the many papers by Kauffman, including this one.


That was hard! I didn't anticipate how much stuff I'd have to introduce to just sketch out how this works. It's one of those cases where the comments need to be about 30 or 40 times larger than the code and done properly, there's enough material to fill a good few chapters of a book. So I apologise if I skimped on some details along the way, and left out some needed explanations. But I hope that I've at least motivated you to take a peek at knot theory and appreciate how nicely tangles and monads play together.

And remember this is only code to illustrate some mathematics - it's inefficient and not how I'd really untangle big rational tangles! (For example, just using the monad associativity laws you can often rearrange the monad expressions for knots into equivalent forms that execute much quicker.)


The rest is similar to code I've used in previous installments:


> swap (x,y) = (y,x)

> class Num k => VectorSpace k v | v -> k where
> vzero :: v
> (.+) :: v -> v -> v
> (.*) :: k -> v -> v
> (.-) :: v -> v -> v
> v1 .- v2 = v1 .+ ((-1).*v2)

> data V k a = V { unV :: [(k,a)] } deriving (Show)

> reduce x = filter ((/=0) . fst) $ fmap swap $ M.toList $ M.fromListWith (+) $ fmap swap $ x

> instance Num k => Functor (V k) where
> fmap f (V as) = V $ map (\(k,a) -> (k,f a)) as

> instance Num k => Monad (V k) where
> return a = V [(1,a)]
> x >>= f = join (fmap f x)
> where join x = V $ concat $ fmap (uncurry scale) $ unV $ fmap unV x
> scale k1 as = map (\(k2,a) -> (k1*k2,a)) as

> instance Num r => MonadPlus (V r) where
> mzero = V []
> mplus (V x) (V y) = V (x++y)

> instance (Num k,Ord a) => VectorSpace k (V k a) where
> vzero = V []
> V x .+ V y = V (x ++ y)
> (.*) k = (>>= (\a -> V [(k,a)]))

> coefficient b (V bs) = maybe 0 id (lookup b (map swap (reduce bs)))

> data Complex a = (:+) { realPart :: a, imagPart :: a } deriving (Eq,Show)

> instance Num a => Num (Complex a) where
> (a :+ b) + (a' :+ b') = (a+a') :+ (b+b')
> (a :+ b) * (a' :+ b') = (a*a'-b*b') :+ (a*b'+a'*b)
> negate (a :+ b) = (-a) :+ (-b)
> fromInteger n = fromInteger n :+ 0
> abs (a :+ b) = undefined
> signum (a :+ b) = undefined

> instance Fractional a => Fractional (Complex a) where
> recip (a :+ b) = let r = recip (a*a+b*b) in ((a*r) :+ (-b*r))
> fromRational q = fromRational q :+ 0