A Neighborhood of Infinity: July 2017

Introduction

TL;DR The behaviour of a certain kind of delay component has a formal similarity to Löb's theorem which gives a way to embed part of provability logic into electronic circuits.

Here's a famous paradoxical sentence:

This sentence is false

If it's false then it's true and if it's true then it's false.

Here's a paradoxical electronic circuit:

The component in the middle is an inverter. If the output of the circuit is high then its input is high and then it's output must be low, and vice versa.

There's a similarity here. But with a bit of tweaking you can turn the similarity into an isomorphism of sorts.

In the first case we avoid paradox by noting that in the mathematical frameworks commonly used by mathematicians it's impossible, in general, for a statement to assert it's own falsity. Instead, a statement can assert its own unprovability and then we get Gödel's incompleteness theorems and a statement that is apparently true and yet can't be proved.

In the second case we can't model the circuit straightforwardly as a digital circuit. In practice it might settle down to a voltage that lies between the official high and low voltages so we have to model it as an analogue circuit. Or instead we can introduce a clock and arrange that the feedback in the circuit is delayed. We then get an oscillator circuit that can be thought of as outputting a stream of bits.

The observation I want to make is that if the feedback delay is defined appropriately, these two scenarios are in some sense isomorphic. This means that we can model classic results about provability, like Gödel's incompleteness theorems, using electronic circuits. We can even use such circuits to investigate what happens when logicians or robots play games like Prisoner's Dilemma. I'll be making use of results found in Boolos' book on The Logic of Provability and some ideas I borrowed from Smoryński's paper on Fixed Point Algebras. I'll be assuming the reader has at least a slight acquaintance with ithe ideas behind provability logic.

Provability Logic

There are many descriptions of provability logic (aka GL) available online, so I'm not going to repeat it all here. However, I've put some background material in the appendix below and I'm going to give a very brief reminder now.

Start with (classical) propositional calculus which has a bunch of variables with names like \(a, b, c, d, \ldots\) and connectives like \(\wedge\) for AND, \(\vee\) for OR, \(\neg\) for NOT and \(\rightarrow\) for implication. (Note that \(a\rightarrow b = \neg a\vee b\).)

Provability logic extends propositional calculus by adding a unary operator \(\Box\). (I apologise, that's meant to be a □ but it's coming out like \(\Box\) in LaTeX formulae. I think it's a bug in Google's LaTeX renderer.) The idea is that \(\Box p\) asserts that \(p\) is provable in Peano Arithmetic, aka PA. In addition to the axioms of propositional calculus we have

\(\Box(p\rightarrow q)\rightarrow\Box p\rightarrow\Box q\)

and

\(\Box p\rightarrow\Box\Box p\)

as well as a rule that allows us to deduce \(\Box p\) from \(p\).

We also have this fixed point property:

Let \(F(p)\) be any predicate we can write in the language of GL involving the variable \(p\), and suppose that every appearance of \(p\) in \(F(p)\) is inside a \(\Box\), e.g. \(F(p)=\Box p\vee\Box(\neg p)\). Then there is a fixed point, i.e. a proposition \(q\) that makes no mention of \(p\) such that \(q\leftrightarrow F(q)\) is a theorem. In effect, for any such \(F\), \(q\) is a proposition that asserts \(F(q)\).

See the appendix for a brief mention of why we should expect this to be true.

From the fixed point property we can deduce Löb's theorem: \(\Box(\Box p\rightarrow p)\rightarrow\Box p\). There is a proof at wikipedia that starts from the fixed point property.

We can also deduce the fixed point property from Löb's theorem so it's more usual to take Löb's theorem as an axiom of GL and show that the fixed point property follows. You can think of Löb's theorem as a cunning way to encode the fixed point property. In fact you can argue that it's a sort of Y-combinator, the function that allows the formation of recursive fixed points in functional programming languages. (That's also, sort of, the role played by the loeb function I defined way back. But note that loeb isn't really a proof of Löb's theorem, it just has formal similarities.)

Back to electronic circuits

In order to make digital circuits with feedback loops well-behaved I could introduce a circuit element that results in a delay of one clock cycle. If you insert one of these into the inverter circuit I started with you'll end up with an oscillator that flips back and forth between 0 and 1 on each clock cycle. But I want to work with something slightly stricter. I'd like my circuits to eventually stop oscillating. (I have an ulterior motive for studying these.) Let me introduce this component:

It is intended to serve as a delayed latch and I'll always have the flow of data being from left to right. The idea is that when it is switched on it outputs 1. It keeps outputting 1 until it sees a 0 input. When that happens, then on the next clock cycle its output drops to 0 and never goes back up to 1 until reset.

Because the output of our delay-latch isn't a function of its current input, we can't simply describe its operation as a mathematical function from \(\{0,1\}\) to \(\{0,1\}\). Instead let's think of electronic components as binary operators on bitstreams, i.e. infinite streams of binary digits like ...00111010 with the digits emerging over time starting with the one written on the right and working leftwards. The ordinary logic gates perform bitwise operations which I'll represent using the operators in the C programming language. For example,

...001110 & ...101010 = ...001010

and

~...101 = ...010

and so on. Let's use □ to represent the effect of latch-delay on a bitstream. We have, for example,

□...000 = ...001

and

□...11101111 = ...00011111.

The operator □ takes the (possibly empty) contiguous sequence of 1's at the end of the bitstream, extends it by one 1, and sets everything further to the left to 0. If we restrict ourselves to bitstreams that eventually become all 0's or all 1's on the left, then bitstreams are in one-to-one correspondence with the integers using the twos complement representation. For example ...111111, all 1's, represents the number -1. I'll simply call the bistreams that represent integers integers. With this restriction we can use a classic C hacker trick to write □p=p^(p+1) where ^ is the C XOR operator. The operator □ outputs the bits that get flipped when you add one.

Let's use the symbol → so that a → b is shorthand for ~a|b. Here are some properties of □:

1. □(-1) = -1

2. □p → □□p = -1

3. □(p → q) → □p → □q = -1

In addition we have the fixed point property:

Let F(p) be any function of p we can write using □ and the bitwise logical operators and such that all occurrences of p occur inside □. Then there is a unique bitstream q such that q=F(q).

We can make this clearer if we return to circuits. F(p) can be thought of as a circuit that takes p as input and outputs some value. We build the circuit using only boolean logic gates and delay-latch. We allow feedback loops, but only ones that go through delay-latches. With these restrictions it's pretty clear that the circuit is well-behaved and deterministically outputs a bitstream.

We also have the Löb property:

4. □(□p → p) → □p = -1

We can see this by examining the definition of □. Intuitively it says something like "once □ has seen a 0 input then no amount of setting input bits to 1 later in the stream make any different to its output".

I hope you've noticed something curious. These properties are extremely close to the properties of \(\Box\) in GL. In fact, these electronic circuits form a model of the part of GL that doesn't involve variable names, i.e. what's known as letterless GL. We can formalise this:

1. Map \(\bot\) to a wire set to 0, which outputs ...000 = 0.

2. Map \(\top\) to a wire set to 1, which outputs ...111 = -1.

3. Map \(p \circ q\), where \(\circ\) is a binary connective, by creating a circuit that takes the outputs from the circuits for \(p\) and \(q\) and passes them into the corresponding boolean logic gate.

4. Map \(\Box p\) to the circuit for \(p\) piped through a delay-latch.

For example, let's convert \(\Box(\Box\bot\rightarrow\bot)\rightarrow\Box\bot\) into a circuit. I'm translating \(a\rightarrow b\) to the circuit for \(\neg a\vee b\).

I'm using red wires to mean wires carrying the value 1 rather than 0. I hope you can see that this circuit eventually settles into a state that outputs nothing but 1s.

We have this neat result:

Because delay-latch satisfies the same equations as \(\Box\) in provability logic, any theorem, translated into a circuit, will produce a bistream of just 1s, i.e. -1.

But here's a more surprising result: the converse is true.

If the circuit corresponding to a letterless GL proposition produces a bistream of just 1s then the proposition is actually a theorem of GL.

I'm not going to prove this. (It's actually a disguised form of lemma 7.4 on p.95 of Boolos' book.) In the pictured example we got ...1111, so the circuit represents a theorem. As it represents Löb's theorem for the special case \(p=\bot\) we should hope so. More generally, any bitstream that represents an integer can be converted back into a proposition that is equivalent to the original proposition. This means that bitstreams faithfully represent propositions of letterless GL. I'm not going to give the translation here but it's effectively given in Chapter 7 of Boolos. I'll use \(\psi(p)\) to represent the translation from propositions to bitstreams via circuits that I described above. Use \(\phi(b)\) to represent the translation of bitstream \(b\) back into propositions. We have \(p\leftrightarrow\phi(\psi(p))\). But I haven't given a full description of \(\phi\) and I haven't proved here that it has this property.

Circuits with feedback

In the previous section I considered letterless propositions of GL. When these are translated into circuits they don't have feedback loops. But we can also "solve equations" in GL using circuits with feedback. The GL fixed point theorem above says that we can "solve" the equation \(p\leftrightarrow F(p)\), with one letter \(p\), to produce a letterless proposition \(q\) such that \(q\leftrightarrow F(q)\). Note here that \(p\) is a letter in the language of GL. But I'm using \(q\) to represent a proposition in letterless GL. If we build a circuit to represent \(F\), and feed its output back into where \(p\) appears, then the output bitstream represents the fixed point. Here's a translation of the equation \(p \leftrightarrow \neg(\Box p \vee \Box\Box\Box p)\):

I'll let you try to convince yourself that such circuits always eventually output all 0's or all 1's. When we run the circuit we get the output ...1111000 = -8. As this is not -1 we know that the fixed point isn't a theorem. If I'd defined \(\phi\) above you could use it to turn the bitstream back into a proposition.

The same, syntactically (optional section)

I have a Haskell library on github for working with GL: provability. This uses a syntactic approach and checks propositions for theoremhood using a tableau method. We can use it to analyse the above example with feedback. I have implemented a function, currently called value', to perform the evaluation of the bitstream for a proposition. However, in this case the fixedpoint function computes the fixed point proposition first and then converts to a bitstream rather than computing the bitstream directly from the circuit for F:

> let f p = Neg (Box p \/ Box (Box (Box p)))
> let Just p = fixedpoint f
> p
Dia T /\ Dia (Dia T /\ Dia (Dia T /\ Dia T))
> value' p
-8

(Note that Dia p means \(\Diamond p = \neg\Box\neg p\).)

The function fixedpoint does a lot of work under the hood. (It uses a tableau method to carry out Craig interpolation.) The circuit approach requires far less work.

Applications

1. Programs that reason about themselves

In principle we can write a program that enumerates all theorems of PA. That means we can use a quine trick to write a computer program that searches for a proof, in PA, of its own termination. Does such a program terminate?

We can answer this with Löb's theorem. Let \(p =\) "The program terminates". The program terminates if it can prove its termination. Formally this means we assume \(\Box p\rightarrow p\). Using one of the derivation rules of GL we get \(\Box(\Box p\rightarrow p)\). Löb's theorem now gives us \(\Box p\). Feed that back into our original hypothesis and we get \(p\). In other words, we deduce that our program does in fact terminate. (Thanks to Sridhar Ramesh for pointing this out to me.)

But we can deduce this using a circuit. We want a solution to \(p\leftrightarrow \Box p\). Here's the corresponding circuit:

It starts by outputting 1's and doesn't stop. In other words, the fixed point is a theorem. And that tells us \(p\) is a theorem. And hence that the program terminates.

2. Robots who reason about each others play in Prisoner's Dilemma

For the background to this problem see Robust Cooperation in the Prisoner's Dilemma at LessWrong. We have two robot participants \(A\) and \(B\) playing Prisoner's Dilemma. Each can examine the other's source code and can search for proofs that the opponent will cooperate. Suppose each robot is programmed to enumerate all proofs of PA and cooperate if it finds a proof that its opponent will cooperate. Here we have \(p =\) "A will cooperate" and \(q =\) "B will cooperate". Our assumptions about the behaviour of the robots are \(p \leftrightarrow \Box q\) and \(q \leftrightarrow \Box p\), and hence that \(p \leftrightarrow \Box\Box p\). This corresponds to the circuit:

This outputs ...1111 = -1 so we can conclude \(p\) and hence that these programs will cooperate. (Note that this doesn't work out nicely if robot B has a program that doesn't terminate but whose termination isn't provable in the formal system A is using. That means this approach is only good for robots that want to cooperate and want to confirm such cooperation. See the paper for more on this.)

At this point I really must emphasise that these applications are deceptively simple. I've shown how these simple circuits can answer some tricky problems about provability. But these aren't simply the usual translations from boolean algebra to logic gates. They work because circuits with delay-latch provide a model for letterless provability logic and that's only the case because of a lot of non-trivial theorem proving in Boolos that I haven't reproduced here. You're only allowed to use these simple circuits once you've seen the real proofs :-)

Things I didn't say above

1. I described the translation from propositions to circuits that I called \(\psi\) above. But I didn't tell you what \(\phi\) looks like. I'll leave this as an exercise. (Hint: consider the output from the translation of \(\Box^n\bot\) into a circuit.)

2. The integers, considered as bistreams, with the bitwise operators, and the unary operator □p=p^(p+1), form an algebraic structure. For example, if we define ⋄p=~□~p we have a Magari algebra. Structures like these are intended to capture the essential parts of self-referential arguments in an algebraic way.

3. Because of the interpretation of □ as a delayed latch in a circuit you could view it as saying "my input was always true until a moment ago". This surely embeds provability logic in a temporal logic of some sort.

4. (Deleted speculations about tit-for-tat that need rethinking.)

5. For even the most complex letterless proposition in Boolos you could check its theoremhood with a pretty small circuit. You could even consider doing this with a steam powered pneumatic circuit. I had to say that to fulfil a prophecy and maintain the integrity of the timeline.

Appendix on provability

The modern notion of a proof is that it is a string of symbols generated from some initial strings called "axioms" and some derivation rules that make new strings from both axioms and strings you've derived previously. Usually we pick axioms that represent "self-evident" truths and we pick derivation rules that are "truth-preserving" so that every proof ends at a true proposition of which it is a proof. The derivation rules are mechanical in nature: things like "if you have this symbol here and that symbol there then you can replace this symbol with that string you derived earlier" etc.

You can represent strings of symbols using numbers, so-called Gödel numbers. Let's pick a minimal mathematical framework for working with numbers: Peano Arithmetic, aka PA. Let's assume we've made some choice of Gödel numbering scheme and when \(p\) is a proposition, write \([p]\) for the number representing \(p\). You can represent the mechanical derivation rules as operations on numbers. And that makes it possible to define a mathematical predicate \(Prov\) that is true if and only if its argument represents a provable proposition.

In other words, we can prove \(Prov([p])\) using PA if and only if \(p\) is a proposition provable in PA.

The predicate \(Prov\) has some useful properties:

1.If we can prove \(p\), then we can prove \(Prov([p])\).

We take the steps we used to prove \(p\), and convert everything to propositions about numbers. If \(Prov\) is defined correctly then we can convert that sequence of numbers into a sequence of propositions about those numbers that makes up a proof of \(Prov(p)\).

2.\(Prov([p\rightarrow q])\) and \(Prov([p])\) imply \(Prov([q])\)

A fundamental step in any proof is modus ponens, i.e. that \(p\rightarrow q\) and \(q\) implies \(p\). If \(Prov\) does its job correctly then it had better know about this.

3.\(Prov([p])\) implies \(Prov([Prov([p])])\)

One way is to prove this is to use Löb's theorem.

4. \(Prov([\top])\)

The trivially true statement had better be provable or \(Prov\) is broken.

Constructing \(Prov\) is conceptually straightforward but hard work. I'm definitely not going to do it here.

And there's one last thing we need: self-reference. If \(p\) is a proposition, how can we possibly assert \(Prov([p])\) without squeezing a copy of \([p]\) inside \(p\)? I'm not going to do that here either - just mention that we can use a variation of quining to achieve this. That allows us to form a proposition \(p\) for which we can prove \(p\leftrightarrow Prov([p])\). In fact, we can go further. We can find propositions that solve \(p\leftrightarrow F(p)\) for any predicate \(F(p)\) built from the usual boolean operations and \(p\) as long as all of the occurrences of \(p\) are inside the appearances of \(Prov\). Even though we can't form a proposition that directly asserts its own falsity, we can form one that asserts that it is unprovable, or one that asserts that you can't prove that you can't prove that you can prove it, or anything along those lines.

Anyway, all that \([]\) and \(Prov\) business is a lot of hassle. Provability logic, also known as GL, is intended to capture specifically the parts of PA that relate to provability. GL is propositional calculus extended with the provability operator \(\Box\). The intention is that if \(p\) is a proposition, \(\Box p\) is a proposition in GL that represents \(Prov([p])\) in PA. The properties of \(Prov\) above become the axioms and derivation rules of GL in the main text.

A Neighborhood of Infinity

Search This Blog

Saturday, July 15, 2017

Self-referential logic via self-referential circuits

Blog Archive