Saturday, January 26, 2008

The Type that Should Not Be



> import Prelude hiding ((^))


I've been pondering a type that's so mind-bogglingly weird that I can no longer think about it without frying my neurons. So instead I'll just dump the contents of my brain here so you can fry your neurons too.

Here's an interesting type to warm up with. In Haskell:


> data X = Tree [X]


At first look it seems not be well-founded in the sense that a value of type X is a list of X's and we haven't yet defined what X's are. But we don't need an X to make an empty list of X's. So we can build elements inductively starting with Tree []. If you think of Tree [] as being a leaf, you can see why I called this type Tree (though they are trees whose limbs can have infinite sequences of children).

Even without the empty set forming a 'foundation' we can still build circular elements like


> x = Tree [x,x]


Now lists are ordered and can have repeated elements. What about if we eliminate these two properties. The way to do that would be with a set type. So how do we represent sets in Haskell. Well certainly not with Haskell's Data.Set because that only represents finite sets. On approach to supporting infinite sets is to identify a set with a predicate for telling whether an element is a member of a set. In other words, the type X→Bool makes a tolerably good set of X's.

So now we're ready for an interesting type:


> data U = U { f :: U -> Bool }


(BTW U stands for universe).

The first thing to think about is what this might mean mathematically. Well it looks like a solution to the equation X=2X, in other words, a set that equals its own power set. We know that in set theory this gives rise to Russell's paradox. One way to embed recursive types into set theory is to tweak set theory, for example by using non-well-founded set theory. But Russell's paradox is so bad that it breaks just about any attempt at devising a set theory where a set can equal its power set.

And yet we can build this type in Haskell. How come? The explanation is simple, Russell's paradox tells us that there must be some elements whose evaluation doesn't terminate. In fact, we can replay Russell's paradox directly in Haskell. But before that we need some definitions.

Firstly, are we sure we can even construct an element of U? In order to define an element of U we need to define a boolean predicate on U but we can't even compare two elements of U for equality. But we can define the 'empty' element phi, along with a membership test:


> phi = U $ const False
> x `e` y = f y x


Unlike conventional set theory, we can define negation of elements. We can also define union and intersection:


> neg a = U $ \x -> not (x `e` a)
> a \/ b = U $ \x -> x `e` a || x `e` b
> a /\ b = U $ \x -> x `e` a && x `e` b


We're ready for the Russell paradox. Define the 'set' (I give in, let's informally call these things sets) of all sets that don't contain themselves.


> nonSelfContaining = U $ \a -> not (a `e` a)


Now try to compute nonSelfContaining `e` nonSelfContaining. It doesn't terminate. It's not just a matter of having implemented it badly, it can't possibly terminate as there is no answer that could make sense. So if you're designing a language that's intended to be total, you'd better rule out types like U.

So we now have two elements of U, phi and neg phi. Can we make any more? Well here's one way:


> c a = U $ \x -> a `e` x


Intuitively, c a is the set of all sets that contain a. But there are two potential problems that come to mind with this. One is, we've only made two sets so far, so "the set of all sets that contain a" might not be interesting. And secondly, can we be sure c a is distinct from a?

The first issue is easily dismissed. neg phi is obviously in c phi, but phi isn't. So we can distinguish phi and c phi. c phi is also different from neg phi (check to see if phi is an element of each). In fact, define functional power by


> f ^ 0 = \x -> x
> f ^ n = \x -> (f^(n-1)) (f x)


and it can be shown that c^m phi is distinct from c^n phi for m≠n. (Hint: Try checking to see if c^m (neg phi) is an element of c^n phi for various m and n.) So we have constructed an infinite number of elements of U.

But, we know that we can write code for elements of U that doesn't terminate. Why not restrict ourselves to just the computable elements of U? But there's a catch. How do we define a computable element of U when elements of U are functions. One approach is this: define x to be computable if y `e` x is computable for all y. But that won't do because it won't terminate if y isn't computable. So how about defining x to be computable if y `e` x is computable for all computable y. But that's circular, and not in a nice inductive way. U is so perverse that just attempting to define computability for it gives a circular definition. But that's not necessarily a problem. Define a 'universe' to be a set, C, of computations of type U such that C = the set of x such that y`e` x terminates for all y in C. The question now is this: is there a unique set C satisfying this 'equation'. We can rule out the empty set as phi must be in any such C. There's also another thing we can prove. If we have two such sets, C and C', we can prove that the intersection of C and C' is a universe. So we can take the intersection of all universes to form the smallest universe. So we can define a something in U to be computable if it is in this smallest set.

So here's my first open question: Are there larger 'universes' than the smallest one, or is it unique? It's not hard to see that a universe is closed under (non-recursive applications of) negation, finite union, finite intersection and the function c above. My second open question: What other computable elements are there?

Time to stop this and start cooking dinner.

13 comments:

Pseudonym said...

The question is actually what does it mean to "solve" a recursive equation like X = 2^X. I recommend Chapter 10 (and hell, the rest of the book too) of Asperti and Longo's book, "Categories, Types and Structures". You can download it here:

http://www.di.ens.fr/users/longo/download.html

sigfpe said...

Pseudonym,

Aha! That looks like the chapter in Pierce's Basic Category Theory that I skipped. I guess I'll have to read it as soon as I get it back from the guy I lent it to.

augustss said...

I usually find it illuminating to try and draw the domain for a type like U to understand it better. Well, draw an approximation of it since it's infinite.

I've not tried in this case, because I'm lazy.

Anonymous said...

This is just a basic Haskell question, but I'm a little confused. If I load everything up to the definition of phi into ghci, I get the following:

*Main> :t f
f :: U -> U -> Bool

*Main> :t U
U :: (U -> Bool) -> U

I don't understand why f is that type. Why isn't it U -> Bool?

Peter Mc said...

Eager chap that I am, I drew a diagram of the domain. Rather than subject you to my ASCII art skills, I'll just describe what came out:

for all n >= m >= 0:

(c^n) (neg phi) `e` (c^m) (neg phi) ==> True
(c^m) (neg phi) `e` (c^n) (neg phi) ==> True

for all n > m >= 0:

(c^n) phi `e` (c^m) (neg phi) ==> True
(c^m) (neg phi) `e` (c^n) phi ==> True

for all n >= 0

(c^n) phi `e` (c^n) (neg phi) ==> True

every other statement of the form:

(c^n) ([neg]? phi) `e` (c^m ([neg]? phi))

evaluates to False.

So is the set (c^n) ([neg]? phi) a computable element of U, or am I missing the point here?

sigfpe said...

Peter Mc,

The point is that you've shown that various values in U are actually distinct from each other and the whole type doesn't just collapse down to something trivial.

Anonymous said...

I may be talking out of my a$$ on this one, but the first type you have defined seems to be a resonable computationnal model of ordinals: an ordinal is the set of ordinals smaller than it. Non well founded definition? No problem! the empty set satisfies every propriety we through at it (at least proprieties defined in terms of those of it's elements). You can then form the ordinal that contains only the empty set, etc. You can only form finite ordinals in this fashion, forming an infinite structure requires a special axiom (the infamous axiom of infinity in ZF!). But i digress.

The type U seems to be a well known example of inductive type without a positivity condition. The computationnal significance of this type is known to be connected to Russel's paradox, and thus a strongly typed system which enjoys termination can lose it in presence of this structure. This would even allow one to prove a false statement in a type-theory based theorem prover!

Unfortunately i have no very relevent references for all of this. You might want to try Coq'Art, by Yves Bertot and Pierre Casteran (not free :( ... )

P.S. Please excuse my speling mestakes...

Samuel Bronson said...

Hmm, some of these lines aren't wrapping right in IE6 -- at least, not in the normal view or in google reader, but the "show original post" link displays it with the lines wrapped properly (but in a different font).

janos said...

You could also define
cbar a = U $ \x -> not (a `e` x)

and compose arbitrary sequences of c and cbar. I wonder how interesting a program can be written in a restricted subset of Haskell where only the types U, Bool, and functions thereof are allowed. Is it Turing-complete?

Pseudonym said...

It's over a year later, but I have one more comment.

U -> Bool is not the same as 2^U in set theory, because it is not the collection of all functions from U to Bool. Rather, it's only the collection of all such computable functions.

Any computable function can be G&oump;del-encoded into an Integer, so the collection of all computable functions are isomorphic to some subset of Integers. Moreover, all Integers are computable. Therefore, by the Cantor–Bernstein–Schröder theorem, the collection of all computable functions is isomorphic to Integer.

So X = 2^X has a solution: X = Integer.

This is an important point for people learning category theory. The category of computable sets with computable functions is not the same as the category of sets with functions, and the main difference is the exponential objects.

g said...

http://en.wikibooks.org/wiki/Haskell/Denotational_semantics#Na.C3.AFve_Sets_are_unsuited_for_Recursive_Data_Types

sigfpe said...

Pseudonym,

The isomorphism of the Cantor–Bernstein–Schröder theorem is itself not computable. So X=2^X is not an isomorphism in the category of computable functions.

Vlad Patryshev said...

Good stuff, thanks.

Actually, if we don't care about keeping ZF as it is, foundation axiom may be extended without breaking any logic up to something like Karoubian closures... but not everything there has been proven yet.

Blog Archive