Saturday, March 06, 2010

Products, Limits and Parametric Polymorphism

`> {-# LANGUAGE RankNTypes #-}`

When I wrote about memoizing polymorphic types I mentioned that you can think of forall a. F(a) as the product over all types a of F(a), where F is some type level function. For example F might be a type constructor like []. That's not completely accurate, as I hope to now explain. Along the way we should get some insight into the meaning of the limit of a functor in a category.

Suppose we have two types, A and B. We can form their product (A, B). We have the two projections fst and snd and if we have an element x in (A, B) we know that there is no necessary relationship between fst x and snd x. We can freely choose x so that each of fst x and snd x can take on any values we like in A and B.

But now consider an element of forall a. F(a). For each concrete type X we have a projection πX::forall a. F(a) -> F(X). So it looks like a product of all types. However, we can't freely choose elements of forall a. F(a) so as to get any element of F(X) we like for each choice of X. To demonstrate this, consider an element x of the type forall a. [a]. For any choice of X we get a projection. For example, picking X to be Int or String gives:

`> p1 :: (forall a. [a]) -> [Int]> p2 :: (forall a. [a]) -> [String]> p1 x = x> p2 x = x`

We can draw a simple category diagram representing this:

However, forall a. [a] comes with a free theorem. For this particular type we have a free theorem that says that for any f :: X -> Y, fmap f (p1 x) == p2 x. For example consider the well known function show :: Int -> String. The free theorem tells us that this diagram commutes:

So if p1 x == [3] then p2 x == ["3"]. We have lost free choice. But we have lost a lot more freedom than this. We have a commuting triangle like this for absolutely any function f :: X -> Y. It should be clear that there is no way we can pick elements of our list to satisfy all of these constraints. So x must be the empty list.

Cones
This scenario of having one projection for each type has a name. It's an example of a cone. Let's borrow the definition from Wikipedia:

Let F:J→C be a functor. Let N be an object of C. A cone from N to F is a family of morphisms with one morphism for each X,

πX:N→F(X)

so that for every morphism f:X→Y, the following diagram commutes:

For any Haskell functor F, the free theorem tells us that we have exactly these diagrams with forall a. F(a) playing the role of N. So forall a. F(a), with its projections to F(X), forms a cone.

Limits
If F is an instance of the Haskell Functor type class, ie. an endofunctor on Hask, then the type forall a. F(a) gives us a cone. But not just any old cone. I don't know how to prove this but I'm pretty sure it's true that the free theorems for a functor are the only non-trivial relations between F(X) and F(Y) that we're forced to obey. If that's true, then, in a sense, forall a. F(a) is the "biggest" type satisfying the free theorems. We can make this more precise by saying that for any cone N with associated projections πX, we can can map it uniquely to forall a. F(a) so that the following diagram commutes:

This special kind of cone has a name. It's called a limit. In other words, ∀a. F(a) = lim F. In fact, this is exactly how Limit is defined in category-extras.

In a sense you can think of a limit of a functor in any category as being like a product for which a version of the free theorems for a functor holds.

Colimits
A dual story can be told for existential types and colimits. But to do this we need free theorems for existential types. I'll leave that until I've figured out a nice way to derive these free theorems...

Final words
I should have written this article before I wrote one on coends. Think of it as a prequel.

The fact that we don't have complete freedom of choice when defining polymorphic elements in Haskell is what we mean by 'parametric' polymorphism. Instead of specifying one individual value for each type we define the elements in a uniform way. In a language like C++ we can use template specialisation to freely construct a rule for getting a value from a type using 'ad hoc' polymorphism. That freedom comes at a price - it becomes harder to reason about polymorphism.

It's amazing that many definitions in category theory emerge naturally (pun fully intended) from the free theorems. I keep hoping that one day I'll find a paper on exactly what is going on here that I understand. The original free theorems paper is very uncategorical in its language.

Derek Elkins said...

There's an easy and direct way to prove that ∀a.F(a) is the limit.

We can describe cones in a nicely categorical way as simply natural transformations.

A cone N over the functor F is the same thing as a natural transformation, KN → F where KN is the constantly N functor. Now L is a limit of a functor F, if Hom(-,L) ~ Nat(K-,F), which is to say, if L represents the cone functor. This can be shown to be equivalent to the definition of limit in your post.

Nat(K-,F) is ∀a.K b a → f a or just ∀a.b → f a in Haskell. Then, essentially by continuity of Hom or by continuity of right adjoints, we can move the ∀ inwards and get: b → ∀a.f a, which, by Yoneda, means L is isomorphic to ∀a.f a. A similar manipulation can be done for colimits, and indeed, this was how Kan extensions, (co)ends, and (co)limits were derived for category-extras.

Sunday, 07 March, 2010
sigfpe said...

Janis Voigtländer pointed me to Types, abstraction, and parametric polymorphism, part 2 that might be just what I am after.

Sunday, 07 March, 2010
gelisam said...

Suppose i has type (∃a. a → a). For example, it might be (+1), or (++"foo"). Then there exists a function f of type X → Y, for some types X and Y, such that f (i x) == i (f x). f = id, for example, does the trick.

Similarly, suppose xs has type (∃a. [a]). For example, it might be [1, 2, 3], or ["foo", "bar"]. Then there exists a function f of type X → Y, for some types X and Y, such that fmap f xs == xs. f = id, for example, does the trick.

All in all, free theorems for existential types don't seem that useful (so far).

Tuesday, 09 March, 2010
sigfpe said...

I've already guessed the free theorems for those cases. A more interesting case is exists a.(a->X, F a) for functor F which I mentioned in my coend post. It gets more interesting for exists a.X a where X is an interesting algebraic structure of some sort.

In each case, I have a handwavey argument for what the free theorem should be. But I haven't derived them formally yet.

In some cases I think they are useful but they encode knowledge about the notion of an abstract interface that object oriented programmers already informally use.

When I work out a rigorous and simple derivation I'll write about it.

Tuesday, 09 March, 2010
sigfpe said...

gelisam,

A nice example that illustrates what I'm talking about. Try guessing the free theorem for

exists a. (a, a -> a, a -> X)

where X is a fixed type of your choosing.

Or try

exists a. ([a], a -> a -> a, a -> X).

Tuesday, 09 March, 2010
gelisam said...

Wait a minute; it's even worse than I thought. We're not free to functionalize the relation since all we know is that there exists some relation such that so and so. Here is a detailed example of the mistake.

The free theorem for (1) is (2), which we can specialize to (3) by instantiating R as in (4).

(1) id : ∀ A → (A → A)
(2) id~ : ∀ X Y R
→ (x : X)
→ (y : Y)
→ R x y
→ R (id x) (id y)
(3) id~' : ∀ X Y
→ (f : X → Y)
→ (x : X)
→ (y : Y)
→ f (id x) == id (f x)
(4) R x y = (f x == y)

But the free theorem for (5) is (6), which we can not specialize to (7) by instantiating R as in (4).

(5) i : ∃ A → (A → A)
(6) i~ : ∃ X Y R
→ (x : X)
→ (y : Y)
→ R x y
→ R (id x) (id y)
(7) i~' : ∃ X Y f
→ (x : X)
→ (y : Y)
→ f (i x) == i (f x)

Tuesday, 09 March, 2010
gelisam said...

This comment has been removed by the author.

Tuesday, 09 March, 2010
sigfpe said...

Consider

data A = forall a. A (a, a->Z)

Then we have that for all f:X -> Y,

A (f a, g) = A (a, g . f)

This is a non-trivial theorem (in the form of a universally quantified proposition) about an existential type.

Wednesday, 10 March, 2010
Dan Doel said...

Are you sure that's a theorem? If f : X -> Y, it seems to me we can't possibly use it with (a, g) : exists a. (a, a -> Z) because:

1) We don't know if a = X, Y or something else entirely.
2) Even if we did, then

(f a, g) : (Y, X -> Z) (in the a = X case)
(a, g . f) : (Y, X -> Z) (in the a = Y case)

Neither of which has the right type to be exists a. (a, a -> Z). And in the other two combinations of rules with instantiations of the variable, there are just straight-forward type errors.

I could believe something like:

f : forall a. a -> a

(f a, g) = (a, g . f)

Perhaps this can even be derived in a roundabout way from the free theorem for

forall r. (forall a. a -> (a -> Z) -> r) -> r.

All that said, I don't think it's surprising that a type like 'exists a. a -> a' doesn't have an interesting free theorem. That type is isomorphic to the unit type. What is the free theorem for ()?

Wednesday, 10 March, 2010
Dan Doel said...

(Sorry for the double comment.)

Oh! I see what you're saying. You're saying that for:

f : X -> Y
g : Y -> Z
a : X

(f a, g) = (a, g . f) as existentials.

Yes, I can believe this.

Wednesday, 10 March, 2010
sigfpe said...

Dan,

So can you derive that theorem about ∃a.(a,a->Z) from the free theorem theorem?

I think these theorems for existentials are non-trivial and useful. They encode the folk knowledge that if a type is hidden behind an interface, and you can replace the type with an isomorphic one, people in front of the interface won't be able to tell. The free theorem theorem should tell you the precise meaning of 'isomorphism' for each type. (Actually, it doesn't have to be an *iso*morphism, just a morphism that doesn't lose too much information.)

Wednesday, 10 March, 2010
gelisam said...

@Dan: The free theorem for () is that () == (). This is because constant types, like Bool and (), are read as identity relations, which are only satisfied when both arguments are equal (see Wadler's paper, page 5, first paragraph).

Wednesday, 10 March, 2010
Dan Doel said...

I still have a rudimentary understanding of free theorems. However, if you ask lambdabot for the free theorem for my encoding:

ep : forall r. (forall a. a -> (a -> Z) -> r) -> r

you get back:

(forall g z h. f (x z (h . g)) = y (g z) h) => f (ep x) = ep y

which obviously has something similar to your interchange in there (x z (h . g) vs. y (g z) h).

But, really, your theorem is probably about the constructor, or whatever operation you want to call it, for putting things in an existential box. Thus we should probably look at:

econ :: forall t. t -> (t -> Z)
-> (forall r. (forall a. a -> (a -> Z) -> r) -> r)

And what @free spits back at us is:

(forall k u p. h (y u (p . k)) = z (k u) p)
=> h (econ x (g . f) y) = econ (f x) g z

But, if we ask for the free theorem of:

z :: forall a. a -> (a -> Z) -> R

we get:

z x (g . f) = z (f x) g

Which means that we can choose h = id and y = z in the free theorem for econ, giving:

forall z. econ x (g . f) z = econ (f x) g z

And then by extensionality, we get:

econ x (g . f) = econ (f x) g

So it turns out your theorem isn't as general as possible, I guess. But we've done it. :)

Wednesday, 10 March, 2010
gelisam said...

I need to distinguish the A type from the A constructor, so let me give them distinct names.

data T = forall a. C (a, a->Z)

The theorem you gave is the free theorem for C, not T. The fact that T is an existential type plays no role in the derivation of the free theorem, since the type of C begins with a universal quantifier. If it is theorems like the one you gave which you want to derive, it suffices to follow the technique given in Wadler's paper, treating T as a constant type.

For example, here is how to derive the theorem you gave using parametricity.

The free theorem for (1) is (2), which can be specialized to the theorem you gave, (3), by instantiating R as in (4).

(1) C : forall a. (a, a→Z) → T
(2) C# : forall X Y R
→ (x : X)
→ (y : Y)
→ R x y
→ (zx : X→Z)
→ (zy : Y→Z)
→ (forall u v → R u v → zx u == zy v)
→ C (x, zx) == C (y, zy)
(3) C#' : forall X Y R
→ (x : X)
→ (zy : Y→Z)
→ C (x, zy . f) == C (f x, zy)
(4) R x y = (f x == y)

Wednesday, 10 March, 2010
sigfpe said...

Yes, I'd got as far as realizing these were theorems for the constructor type. But when I used the free theorem generator I was using it was taking me a long time to interpret the result. I just tried @free on #haskell and I'm finding it much easier to parse the result. I'll have to install the underlying generator.

Wednesday, 10 March, 2010
Dan Doel said...

I guess the interesting properties of existentials may be related to free theorems about their 'constructors'. For instance, one likes to think of (exists a. a) as isomorphic to the unit type. And the free theorem for:

c :: forall a. a -> R

is c = c . f, which if we take f to be of the form 'const y', can turn into

forall x, y. c x = c y

which, if c is the 'constructor' for exists a. a, states that all elements of that type are equal.

forall a. (a -> a) -> R

comes with a similar theorem, although it has a precondition that might limit it from saying that all elements are equal (I'm not really sure).

Wednesday, 10 March, 2010
gelisam said...

Dan, restricting yourself to the functional specialization of the free theorems is slowing you down.

The free theorem for (1) is (2), which can indeed be specialized to (3) by instantiating R as in (4). But if you instead let R be always satisfied, as in (5), you immediately get (6), a proof that c always returns the same value.

(1) c : forall a. a → R
(2) c# : forall X Y R
→ (x : X)
→ (y : X)
→ R x y
→ c x == c y
(3) c#' : forall X Y f
→ (x : X)
→ c x == c (f x)
(4) R x y = (f x == y)
(5) R x y = true
(6) c#'' : forall X Y
→ (x : X)
→ (y : Y)
→ c x == c y

The second type (7) you state also guarantees that results are always the same, but as you have noticed, the functional specialization (9) of its free theorem (8) hides this fact behind an assumption. If you once again let R be always satisfied, as in (5), you easily obtain (10), a proof that d always returns the same value.

(7) d : forall a. (a → a) → R
(8) d# : forall X Y R
→ (fx : X → X)
→ (fy : Y → Y)
→ (f# : forall x y → R x y → R (fx x) (fy y))
→ d fx == d fy
(9) d#' : forall X Y f
→ (fx : X → X)
→ (fy : Y → Y)
→ (f# : forall x → f (fx x) == fy (f x))
→ d fx == d fy
(10) d#'' : forall X Y
→ (fx : X → X)
→ (fy : Y → Y)
→ d fx == d fy

Thursday, 11 March, 2010
Daniel Brown said...

You all should have a look at Plotkin, Abadi. "A logic for parametric polymorphism", TLCA 1993.

Although it has no diagrams, all the constructions are essentially categorical. And the free theorem for existentials is at the bottom of p. 11 below Thm 7—see the brief paragraph broken over pp. 11–12.

Moreover, this paper finally enabled me to understand parametricity (i.e. Reynolds' "Abstraction Theorem", which Wadler renamed the "Parametricity Theorem"). Only after reading Plotkin and Abadi was I able to make sense of Reynolds' original papers and Wadler's follow up. In case anyone finds it helpful, I also have an annotated bibliography for these papers from a course last spring.

Monday, 13 September, 2010