Saturday, April 02, 2011

Generalising Gödel's Theorem with Multiple Worlds. Part IV.

Interpolating Between Propoitions
Suppose a, b and c are propositions. Then a∧b→b∨c. It seems that the a is irrelevant to what's happening on the right hand side. We could remove it and still have a true proposition: b→b∨c. It seems that the c on the right hand side is also irrelevant so that this is also true: a∧b→b. In fact, we can "refactor" the original proposition as a∧b→b→b∨c. (By Curry Howard it *is* a kind of refactoring.)

Let's try using the methods I described in Part I to demonstrate the validity of a∧b→b∨c:



I've done two things slightly differently. I brought the negation inside the implication at the start. This means we start with an implication with a clearly defined left- and right-hand side. It makes no difference to the outcome. But I've also coloured terms coming from the left-hand side in bLue and those on the right in oRange. Eventually we see a b and a ¬b meet each other resulting in the big red X telling us that the negation of a∧b→b∨c is invalid. But notice how the b and ¬b came from opposite sides of the implication. This tells us that the implication is valid because of the b's and that the appearance of a and c plays no role in establishing validity.

Now suppose a blue a met a blue ¬a. They would have both originated from the left hand side, telling us that the left hand side was a contradiction, regardless of what's on the right. So if the original implication we wished to establish were written as L→R then we'd have established that we could factor it as L→⊥→R. Similarly, if an orange a met an orange ¬a we'd have established the invalidity of ¬R meaning that we get L→⊤→R.

In all three cases we've managed to find an "interpolating" formula F such that L→F→R with the property that F only refers to letters that occur *both* in L and R. It may seem intuitively obvious that we can do this. Irrelevant hypotheses shouldn't play a role in establishing an implication. In fact, this is generally true of propositional calculus and is known as the Craig Interpolation Lemma. It also holds for Provability Logic. This is certainly not an obvious fact. The interpolation property fails for many logics.

I'm only going to roughly sketch how the proof of the interpolation lemma looks. Essentially it will be a proof by construction with the construction being the following code. We'll implement a bunch of tableau rules that construct the interpolation lemma. We've also seen what some of these rules look like in the discussion above.

Constructing Interpolations

I coloured propositions orange and blue above. I can't use colour in Haskell code so I'll instead label propositions L and R using the following type:

> data SignedProp a = L a | R a deriving (Eq, Ord, Show)

Think of the colour or sidedness of a proposition as its 'sign'. Sometimes we'll make a selection based on the sign:

> select (L _) l r = l
> select (R _) l r = r

> instance Functor SignedProp where
>     fmap f (L a) = L (f a)
>     fmap f (R a) = R (f a)

Sometimes we'll want to remove the sign:

> unsign (L a) = a
> unsign (R b) = b

Now you may see why I had to use Haskell's ViewPattern extension. I want to do the same pattern matching as before on these propositions even though they are a different type. As all the rules view the patterns through propType we can achieve this by making SignedProp a an instance of PropTypeable:

> instance PropTypeable a => PropTypeable (SignedProp a) where
>     propType (L a)    = fmap L (propType a)
>     propType (R a)    = fmap R (propType a)
>     neg               = fmap neg
>     isF               = isF . unsign
>     positiveComponent = positiveComponent . unsign
>     negative          = negative . unsign

With this framework in place the code to interpolate is quite small. Instead of returning a Bool or a diagram, these rules return Just the interpolating proposition when possible, otherwise a Nothing. But the code is also going to do something slightly more general. If we close a potential world containing propositions l1,l2,...,r1,r2,... we have shown that l1∧l2∧...∧r1∧r2∧... isn't valid. Ie. that l1∧l2∧...→¬r1∨¬r2∨... *is* valid. If the li are L-propositions, and the ri are R-propositions, then the rules I define below will find an interpolating formula for l1∧l2∧...→¬r1∨¬r2∨.... For the original case of p→q we prime it with p on the left and ¬q on the right:

> interp (p :-> q) = let t = runTableau interpRules [L p, R (Neg q)]
>                    in simplify <$> t

And now we can give the rules:

> interpRules = TableauRules {

The first three branches of this case are the three rules discussed at the beginning. The fourth should be fairly obvious:

>     foundContradiction = \a -> Just $ case a of
>        (L _, L _) -> F
>        (R _, R _) -> T
>        (L n, R _) -> n
>        (R n, L _) -> Neg n,

The algorithm needs to know when a world has closed. The original validity rules returned a boolean. These rules return a Maybe, and we know a world closed if the algorithm succeeded in returning an interpolating proposition:

>     closes = isJust,

These are the trivial cases of finding a blue or orange ⊥:

>     foundF = \a -> Just $ select a F T,

Some rules that really just serve as glue:

>     open     = \_ -> Nothing,
>     doubleNegation = \_ t -> t,
>     conjRule = \_ _ t -> t,

Now suppose we want to find an interpolating formula for l1∨l2→r. If we set up our tableau we get:


(Note I'm using l and r as metavariables here, so l1, l2 and r represent propositions made up of (possibly many) ordinary single letter variables.) If we complete the two sides of the divide using our interpolation algorithm recursively we'll find propositions fi such that li→fi→r. Hence we find l1∨l2→f1∨f2→r. Clearly the middle proposition only contains letters that appear on both sides of the original implication. A similar analysis allows us to find an interpolating proposition for l→r1∧r2. That gives us this rule:

>     disjRule = \p _ _ tl tr -> select p (:\/) (:/\) <$> tl <*> tr,

Those are the rules for classical propositional calculus.

Now comes the tricky bit. If we draw a big red X in a subworld it allows us to back out and deduce an interpolating proposition in the parent world. The rule is simple but I'll leave the proof sketch as an appendix:

>     processWorld  = \p t -> select p Dia Box <$> t,

If any subworld of a world is closed, so is the parent world, so we can ignore all but the first closed subworld:

>     combineWorlds = mplus,

>     tableau = \_ t -> t
> }

You can reproduce the wikipedia example with this interp ((neg (p /\q) --> neg r /\q) --> (t --> p) \/ (t --> neg r)).

Definability

In my previous post I talked about how in provability logic we can define propositions implicitly. Now we're in a position to do the construction. Firstly we need to say precisely what we mean by a definition inside the language of provability logic, and then we need to say what ensures that definitions make sense.

An implicit definition of a proposition in provability logic is a function Prop -> Prop that doesn't analyze its argument. Informally, it defines a proposition p if f p is valid. Some candidates might be:

> def1 p = T --> p
> def2 p = p --> T

The first, def1, seems fine. It seems to uniquely single out p == T because def1 T is valid. But def1 (neg F) is also valid. So we can't uniquely pin down propositions, but only up to some sort of equivalence. In fact, we can use <-> as our equivalence relation.

But the second attempted definition is useless. Any proposition satisfies it. So we only consider a definition d to be valid if any two propositions satisfying it are equivalent. So we can say that d defines a proposition h if d p --> (p <-> h) is valid.

So here's the beginning of a function that attempts to find a proposition satisfying an implicit definition:

> beth d = let p = Letter "__p"
>              q = Letter "__q"
>          in if not $ valid $ (d p /\ d q) --> (p <-> q)
>             then error $ show (d p) ++ " doesn't satisfy precondition"

You can see how I've encoded our precondition for a definition to be good. I also used __p and __q to make sure we didn't clash with any letters in the definition. It's called beth because the theorem that ensures we can write the else clause is known as the Beth Definability Theorem. The last line is astonishingly short:

>             else interp (d p /\ p --> (d q --> q))

Suppose we have proved that d(p)∧d(q)→(p↔q). It immediately follows that d(p)∧p→(d(q)→q). (It's essentially a bit of currying.) This is a candidate for Craig interpolation as the left hand side has no q and the right hand side has no p. So we can make a sentence h so that d(p)∧p→h and h→(d(q)→q)). The lettes p and q are just letters. If a proposition is true for q, it's also true for p. So with a little rearrangement we get that d(p)→(p↔h). Or in English, if p satisfies our definition, h is equivalent to it. So we've constructed an h that does what we want.

Fixed Points
Now it's a small step to get a fixed point. We just make a definition of fixed point and apply beth. This looks like it might work:

> isFixedPoint' f p = p <-> f p

Unfortunately it's possible for p and p' to be inequivalent and yet both satisfy isFixedPoint' f. We have to use a "stronger" definition. I'll leave the proof to Boolos's book, but what we'll do is assert not just that p is a fixed point, but also that it is provably so. So we use:

> box' a = a /\ Box a
> isFixedPoint f p = box' (p <-> f p)

In order for the uniqueness to work, every occurence of the argument of f must be inside a Box or Dia.

And that's it. We can now churn out fixed points to our heart's content.

Gödel's Second Incompleteness Theorem again
Let's find a proposition that asserts its own unprovability:

> godel = fixedpoint $ \p -> neg (Box p)

We get Dia T, which is the claim that arithmetic is consistent. So godel shows that if arithmetic is consistent, then it can't prove its consistency.

Here's a large number that I use for regression testing that I lifted from Boolos's book:

> fpexamples = [
>         (\p -> Neg (Box p), Dia T),
>         (\p -> Box p, T),
>         (\p -> Box (Neg p), Box F),
>         (\p -> Neg (Box (Neg p)), F),
>         (\p -> Neg (Box (Box p)), Dia (Dia T)),
>         (\p -> Box p :-> Box (Neg p), Dia (Dia T) \/ Box F),
>         (\p -> Box (Neg p :-> Box F) :-> Box (p :-> Box F),
>                Dia (Dia (Neg (Box F) /\ Neg (Box F)) /\
>                Neg (Box F)) \/ Box (Box F)),
>         (\p -> Box p :-> q, Dia (Neg q) \/ q /\ q),
>         (\p -> Box (p :-> q), Box q),
>         (\p -> Box p /\ q, Box q /\ q),
>         (\p -> Box (p /\ q), Box q),
>         (\p -> q \/ Box p, T),
>         (\p -> Neg (Box (q :-> p)), Dia q),
>         (\p -> Box (p :-> q) :-> Box (Neg p),
>                Dia (Box F /\ Neg q) \/ Box F),
>         (\p -> q /\ (Box (p :-> q) :-> Box (Neg p)), q /\ Box (Neg q)),
>         (\p -> Dia p :-> (q /\ Neg (Box (p :-> q))),
>                Box F /\ Box F \/ q /\ Dia ((Box F /\ Box F) /\ Neg q)),
>         (\p -> Box (Box (p /\ q) /\ Box (p /\ r)),
>                Box (Box q /\ Box r))
>     ]

Note how in every case, p is inside a Box or Dia. It's pretty mind-bending to try to think about what all of these propositions could possibly mean.

We can easily write to test to see whether two propositions are equivalent (in the sense that they imply each other):

> equiv p q = valid (p <-> q)

> regress2 = do

>     print $ and $ map (\(f, x) -> fromJust (fixedpoint f) `equiv` x) fpexamples

I am disconcerted that one of the easy looking examples in Boolos fails my tests. Given that the difficult cases agree with my code I think it is likely an error by Boolos, though that's a scary claim to make against one of the best known logicians in the world.

The company Theory Mine has been selling theorems. But they don't seem too interesting. Far better to generate theorems that generalise Gödel's work and tell you about the very nature of provability. Just send me $19.99 and I'll send you a certificate.

> main = do
>           regress1
>           regress2

Notes
Here is some Prolog code by Melvyn Fitting to do much the same thing. Note that code relies crucially on backtracking whereas my code explicitly searches through subworlds.

The function runTableau is a kind of generalised fold. Such things have associated induction principles. In this case it's a generalisation of the "Unifying Principle" defined by Smullyan in First-Order Logic. I guess it is an ordinary generalised fold over a tree structure representing fully expanded tableaux, but we never explicitly build such a structure.

Oh...and apologies (if anyone has come this far) for how long this took. Last year I threw together code to find fixed points in a couple of hours thinking "that was easy, I can blog about it". But I hadn't realised how many hours of work it would take to explain what I had done. And even if nobody came this far, the rubber ducking improved my own understanding greatly.



Appendix
Suppose we successfully showed that this closed:


Then the generalised Craig Interpolation Lemma says that for some suitable f, both of the following close:


The way we'll prove this is to follow what we must have done to get the first potential world to close, assume inductively that the interpolation lemma works for all of the potential subworlds that we entered.

So let's start by supposing that we used ◊l2 to open up a subworld, giving us:


If the subworld closed then recursively using interpolation we know that for some appropriate choice of g, these closed too:


Now consider trying to close these two worlds:


Using Worlds 1 and 2 above we find they close immediately:


And that now tells us that when we recurse back up, we can use f=◊g.

I started with world containing just 6 propositions so I seem only to have proved this for that case. But in fact, any combination of propositions just ends up with an argument that is substantially the same. The only thing that might change is that if we use ◊r2 to try to open up a subworld we find that f=◻g for some other g. And this is all summarised compactly by the rule processWorld above.




2 comments:

  1. Can you recommend a good intro book on logic so that I might understand what any of this means? I have done pretty well overall as a self-taught programmer but this is definitely an area where I am deficient. Haskell has been on my to-learn list for quite a while and I am thinking maybe I should get some logic under my belt before I seriously tackle it.

    ReplyDelete
  2. Learning logic at the same time as Haskell. Tricky!

    Actually, I have the perfect 'course' for you. For the logic, read the first two chapters of the Smullyan book I cite above. Everything I say is a generalisation of what's in that book to provability logic.

    Then start learning Haskell. There are a couple of good books now.

    And then you can implement analytic tableau in Haskell based on chapter 2 of Smullyan's book. That's exactly what I did when I was learning Haskell. If you grasp basic Haskell, and grasp those two chapters, this shouldn't be hard. In fact, it's fun.

    These posts on provability logic have basically been the next step.

    (But note: the tableau I describe in these posts are not exactly 'analytic', but that won't mean anything at this stage.)

    ReplyDelete