Thursday, March 30, 2006

A Neat Proof Technique

Last night I read this paper. There wasn't really anything in the paper I wanted, I was more interesting in flexing my newly developing computer science muscles. Anyway, like with all computer science papers I've managed to finish, I felt like I understood the paper but had no idea what the punch line was. Still, it was worthwhile because part way through there was a neat mathematical proof technique used that I think is worthy of a mention.

The paper is about the generalisations of 'fold' and 'unfold' that I played with recently. The usual fold function acts on a list like
[4,8,15,16,23,42].

Imagine it written as
cons(4,cons(8,cons(15,cons(16,cons(23,cons(42,[]))))))

where the cons function is a 'constructor' that constructs a list from a list element and another list. [] is the empty list. The fold function takes three arguments, a binary function f, a constant g, and a list. It then replaces cons by f and [] by g. For example
fold (+) 0 [4,8,15,16,23,42]
becomes
4+(8+(15+(16+(23+(42+0)))))

and so is 108, the sum of the numbers. An obvious theorem is that
fold cons [] x = x.


fold also generalises to structures such as trees. In this case there is another constructor that builds a tree from a node and its children. I'll stick with lists as the proof technique is much the same. The authors prove a bunch of theorems about the fold function. But now they want to prove something is true of all of the elements in a list (actually, a tree, but I use a list). Suppose the list is as above and you want to prove the property P holds for all elements in that list. Then you want to prove

P(4) && P(8) && P(15) && P(16) && P(23) && P(42) = True.

(I'm using Haskell's && to mean 'and'.) In other words you want to prove that
fold (&&) True [P(4),P(8),P(15),P(16),P(23),P(42)] = True.

The authors then proceed to show their theorem is true using the theorems about fold they just proved. Neat eh? But Maybe I haven't explained very well so I'll try to spell it out differently.

The authors want to prove theorems about all elements of a certain datastructure. So what they do is take the logical proposition that expresses what they want and rewrite the proposition so that the proposition itself in the same 'shape' as the datastructure. The statement of the truth of the proposition is an application of fold to the datastructure. So they can now use the theory of datastructures they develop to prove something about the structure of the logical proposition itself - namely that it's true. You need to read the actual paper (top left, page 5) to see the actual details of the proof technique.

In the particular case of lists the proof technique turns out to be standard induction. If you can prove P of an an element of the list, and you can prove "if P is true of an element it must be true of the next element", then P holds for every element of the list. Every time you generalise fold to a different datastructure there is a new induction principle that goes with it.

I suppose what they have actually done is prove a metatheorem about certain types of logical proposition and then used that to prove specific propositions.

Well, I hope that makes some kind of sense. I think it's quite a neat trick and that many of the paper's readers might miss the cool bit of self-reference that's going on here.

No comments:

Blog Archive