Saturday, August 26, 2006

Algebraic Topology in Haskell

On my shelf I have the book "Basic Topology" by Armstrong. After you've fought your way through 173 pages you eventually get to the section on simplicial topology and you can start playing with one of the basic tools of modern topology: homology groups. But you don't want to go through all that hassle. If you can read Haskell code then you can get there in 20 lines. But don't get too excited - this isn't meant to be an introduction to algebraic topology. Instead it's a demonstration of how incredibly expressive Haskell is. Before starting on this project I expected it to take a couple of weeks of early morning hacking and many hundreds of lines of code. It actually took a couple of hours.


So the goal is this: use the standard chain complex from simplicial homology to compute the Betti numbers.


First we need some code to compute the rank of a matrix. As this is general purpose code I'm not counting it in my 20 line budget. It took me a little while to find a recursive implementation of this suitable for Haskell. Basically, the idea is that a zero matrix has zero rank and the following operations leave the rank of a matrix unchanged:


  1. Deleting a row of zeroes
  2. Deleting a column of zeroes
  3. Multiplying a row or column by a non-zero constant
  4. Adding one row to another.

One last thing is needed: if there is a non-zero value standing on its own in a row or column then deleting both the row and column that it is in reduces the matrix rank by one. Here's the complete code for the rank function:

import List

rank m = let m' = removeZeroRows $ removeZeroColumns m in
if m'==[]
then 0
else let (zero,(pivot:rest)) = break ((0 /=) . head) m' in
1+rank (map (cancelZero pivot) (zero++rest))
where
removeZeroColumns x = let n = foldl1 min (map countZeros x) in
map (drop n) x
removeZeroRows = filter (or . map (0 /=))
countZeros = length . fst . break (0 /=)
cancelZero (a:as) (b:bs) = zipWith (\x y -> a*y-b*x) as bs

With that out of the way, here is the code to compute the homology of a simplicial complex:

d x = zip (d' x) (cycle [1,-1]) where
d' [] = []
d' (x:xs) = xs : (map (x:) (d' xs))

matrix basis1 basis2 op = map (coefficients basis1 . op) basis2 where
coefficients basis x = map (flip coefficient x) basis
coefficient e = sum . lookup e
lookup k [] = fail ""
lookup k ((k',v):xs) = if k == k' then return v else lookup k xs

grade n = filter ((==(n+1)) . length)

dim c = foldl1 max (map length c)-1

dmatrix _ 0 = []
dmatrix c n = matrix (grade (n-1) c) (grade n c) d

betti c = let d = dim c
ranks = map rank $ map (dmatrix c) [0..d]
dims = map (\i -> length (grade i c)) [0..d] in
h dims ranks where
h (c:cs) (d:d':ds) = c-d-d' : h cs (d':ds)
h [c] [d] = [c-d]


That's it! No hundred thousand line algebra library, or polygonal mesh geometry library. Just the standard Haskell List library.


At this point I suggest looking at something like these notes on simplicial complexes. My d function corresponds to ∂ defined at the bottom of page 1.


Let's take the example at the bottom of page 2 and the top of page 3:


example = [
[1,2,3],
[1,2],[1,3],[2,3],[2,4],[3,4],
[1],[2],[3],[4],[5]]

(Note I don't have anything corresponding to F-1. This is always the set containing the empty set so the code is hard coded to work as if it were there.) If you play with the dmatrix routine you'll see that it gives the matrices on page 3 (transposed).


And now we can fire up ghci or hugs and get:


*Main> dmatrix example 1
[[-1,1,0,0,0],[-1,0,1,0,0],[0,-1,1,0,0],[0,-1,0,1,0],[0,0,-1,1,0]]
*Main> betti example
[2,1,0]

The example has 2 connected components, 1 1-hole, and no 2-holes. (Quick one line summary of the whole of algebraic topology: Make a hole in a piece of paper: that's a 1-hole. Hollow out the interior of a solid sphere: that's a 2-hole, etc.)


Here are some more topological spaces to play with:


point = [[0]]
line = [[0],[1],[0,1]]
ball n = sublists [0..n] \\ [[]]
sphere n = ball (n+1) \\ [[0..(n+1)]]
torus = [[1],[2],[3],[4],[5],[6],[7],[8],[9],
[1,2],[2,3],[1,3],
[5,9],[8,9],[5,8],
[4,6],[6,7],[4,7],

[1,5],[4,5],[1,4],
[2,9],[6,9],[2,6],
[3,8],[7,8],[3,7],

[1,9],[3,9],[1,8],
[4,9],[6,8],[5,7],
[1,6],[2,7],[1,7],

[1,5,9],[1,2,9],[2,3,9],[3,8,9],[1,3,8],[1,5,8],
[4,5,9],[4,6,9],[6,8,9],[6,7,8],[5,7,8],[4,5,7],
[1,4,6],[1,2,6],[2,6,7],[2,3,7],[1,3,7],[1,4,7]]

The torus is copied from page 4 of the notes. Here are some more examples:

*Main> betti (sphere 3)
[1,0,0,1]
*Main> betti (sphere 0)
[2]
*Main> betti torus
[1,2,1]

But that's not all. Pay your $19.95 right now and I'll throw in barycentric division for free:

sublists [] = [[]]
sublists (a:as) = let b = sublists as in b ++ map (a:) b

properSublists x = sublists x \\ [x]

chain f x = [x:xs| x' <- f x, xs <- chain f x'] ++ [[x]]

bary x = x >>= chain (filter (not . null) . properSublists)

As expected, barycentric division of a simplicial complex leaves the homology unchanged:

*Main> betti (bary torus)
[1,2,1]
*Main>

OK, I admit, it's a little slow on the last one. But it does have to perform unoptimised Gaussian elimination on fairly large matrices. (Type dmatrix ((bary torus)) 2)


And here's my laundry list of things to do with this project:


  1. More efficiency. The vectors over the simplices are stored incredibly inefficiently.
  2. Compute homology groups over Z, not Q. I just use dimension counting to compute the dimensions of the homology groups over a Q. I really ought to actually compute the kernels of these matrices and then find representatives of the homology groups. I also need to write code to manipulate Abelian groups rather than vector spaces. To my shame, I don't know what the algorithms to use for the latter are.
  3. Compute cohomology groups.
  4. Compute homology groups associated to a poset. (The barycentric subdivision code kinda does this already.) Actually, this is where I started from originally. alpheccar asked if there was a connection between certain types of Hopf algebra and the Euler characteristic. I think the answer is yes: it comes from computing the Euler characterstic of posets used to define the Hopf algebra. Actually, the whole renormalisation with Hopf algebras thing that alpheccar points out is reminiscent of a really beautiful paper I read years ago on "fat graphs" that computes the Euler characteristics of the very moduli spaces used in string theory.

And the last thing on my list I can do right now. Computing the Euler characteristic, via the Betti numbers, is a one liner:

euler c = sum (zipWith (*) (betti c) (cycle [1,-1]))

Of course I showed there was an easier way to compute this earlier.


How can Haskell not be the programming language that all mathematicians should learn?


Update: I just found some slides on the "fat graph" thing I mentioned above. I rate it as one of the most beautiful pieces of mathematics ever. Some time I'll devote a whole post to it.

14 comments:

alpheccar said...

I am a newcomer to Haskell. I just started my study a few weeks ago and I discovered your blog while I was looking for some info about monads.

I like that kind of posts because they are not just examples to learn Haskell so it is more interesting and we learn a bit more than just Haskell.

The more I learn about this language and the more I like it. I spent years using Ocaml which is also a very good language but I think Haskell is much more elegant.

I am totally lost with the fat graphs and the homology groups for poset :-) But, slides are generally not enough to start learning about a topic.

sigfpe said...

alpheccar,

I can't find the original paper on I read on "fat graphs" by RC Penner. It was actually pretty tough, 3 or 4 of us read it together in a reading seminar. It needed both physicists and pure mathematicians working together for us to make sense of it. But there are some really neat ideas buried in there that aren't too hard to understand and maybe at some point someone will write the easy version.

Ocaml isn't bad. You can get good performance out of ocaml. But ocaml has ugly syntax and I its lack of operator overloading is annoying.

Michi said...

Whoa!

I wish I had this post in front of me four years ago, when I started with my poincare project (Mathematical Reviews: MR2227050, ArXiv: math.AC/0502348).

My immediate questions to this expose are:
1) Does the algorithm work for finite fields? Say giving characteristic as a parameter and then calculating matrix rank over Q or over Z/p depending on the given characteristic.
2) How does it compare complexity-wise? Is it fast enough that you'd want to do Large Things with it?
3) How does hugs or ghci work with memory management? If I could expect ending up with several gigs worth of a matrix, would the interpreter choke?

augustss said...

I always like your blog entries and this is no exception. It's fun to see people use Haskell in their own fields and appreciate the power.

BTW, your code could be even shorter by using a few more predefined Haskell functions.

sigfpe said...

augustss,

With Haskell it's very tempting to keep simplifying code by using more and more higher order functions. But eventually you just have to stop being obsessed. Still, what changes might you make?

Mikael,

1. I was wondering about fields other than Q myself. I think it might work fine as my rank algorithm is really stupid - it just uses +, *, -, / and zero detection. (A real rank algorithm would probably do smarter pivoting and assume that the field has a norm.) What I really want to is to work over abelian groups.

2. The performance of this code is bad. The rank algorithm is crude. Also it does a lot of equality testing between non-trivial objects. Some kind of hashing of the vertex and simplex labels might speed things up. Oh...and the vectors aren't simplified - ie. it stores a+b+a as a+b+a without simplifying it to 2a+b. I chose simplicity over efficiency.

3. It starts going annoyingly slow with just (bary torus), not only for Gigs of matrix. Haskell's big weakness is array operations.

alpheccar said...

But you have not used the Haskell Array module in your code ?

sigfpe said...

alpheccar,

There are no arrays in that code. All of the algorithms work sequentially so I can use lists.

Slava Pestov said...

I'm working on some code to compute the cohomology of Lie algebras in my programming language Factor. You can take a look at the code.

sigfpe said...

slava,

I had a look at your code. Your language is awesome! (I've always liked Forth.) And I like your homology code. It's superfically similar to mine in places - like you have a dmatrix function. But you've done things properly with hashes. Nice.

sigfpe said...

Someone on reddit pointed out that the algorithm I need, to work over Z instead of Q, is reduction to Smith normal form.

bandika said...

There was a long time ago a paper by JO Shallit,
"Computational Simplicial Homology in APL" which appeared in the 1982 International Conference on APL. The paper quotes as the originator of this approach 2 papers by Pinkerton which appeared in "Mathematical Algorithms" in 1966.
You must also take a look at Lefschetz's books which still retain this flavor.

You must look over the Internet as there are some Maple implementations of this approach.

However, it would be very interesting if we can formalize the categorical approach too.

sigfpe said...

bandika,

I found the paper online. It gives all of the code. Unfortunately it's 20 years since I last used APL so I couldn't actually read it.

I was recently surprised to find that people are now autonatically computing de Rham cohomology. Simplicial homology is very 'discrete' making it easy to code up. But de Rham cohomology is very different.

Anonymous said...

Hi!
First, great thanks for the nice post. Sadly, I cannot access the "notes" mentioned in the post any more. Can you give me some hint where one can find them?
Thanks a lot!
Best

elviejo79 said...

The link for the notes is incorrect...
They are here:
http://people.reed.edu/~davidp/411/handouts/simplicial.pdf

Blog Archive