Fun with Derivatives of Containers
      This is a neat paper.
So more thinking out loud...
Consider functions that map streams of X's to streams of Y's. As streams are a linear ordering of elements we can label positions in the output stream using natural numbers starting at zero. So a function that maps streams of X's to streams of Y's can be viewed as a function that maps a stream of X's, and a natural number, to a Y. We assemble the output stream by evaluating this function on the input stream for each natural number.
We could imagine generalising this construction to other container datatypes besides streamns. But streams have this particularly convenient way to address their elements. Each element of a stream is the nth element, for some n. For other structures such as trees it becomes more complicated to address particular elements in the tree. For a binary tree, say, we can specify an element by listing all of the binary decisions, left or right, that you need to make when descending to that element from the root of the tree.
Look at a slightly different construction. Instead of trying to address a particular element of a container consider the type which is containers of X's with one of the X's replaced by a 'hole'. For example a pair of X's can be written as X×X=X². A pair with a 'hole' is either something like (x,ˆ) or (ˆ,x) where ˆ is the 'hole'. So a pair with a hole is simply of type X+X=2.X. More generally, suppose for a given type of container F is the functor that takes a type X to the type F[X] of containers of X's. Then F'[X] is the type of containers of X's with one of the X's replaced by a hole, where F' is the derivative of F. By 'derivative' we mean derivative in the ordinary sense of calculus in that it's computed using the usual Leibniz rules. This is a completely amazing fact - I think it's one of the most amazing facts in all of computer science. I wrote about it in more detail here. That's loosely based on this paper, but my document is completely non-rigorous and probably full of errors.
There are all kinds of interesting algebraic properties that relate F and F'. For example if F'[X] is a container with a hole then X.F'[X] is a container with a hole and an X that could potentially be put into the hole. If we put the X into the hole then we don't just get an F[X] again because we still have the information about where the hole was. In other words, X.F'[X] is an F[X] together with the address of one of the X's within it. (Pity we can't write X.F'[X]-F[X] to mean just the address, but '-' makes no sense in this context.) In fact we have a projection map throwing away the address: X.F'[X]→F[X]. I think this map is a natural transformation from X.d/dX to 1.
Anyway, in the paper I started off with, the authors show that the functor X→X.F'[X] is a comonad. One of the key results in the paper can be summarised as defining a function called 'run' with this type signature:
In other words, given a function that maps a container, and the address of an X within it, to a Y, we can construct a function that maps containers of X's to containers of Y's. Intuitively this is obvious form what I said in the third paragraph above: we're specifying how for each location in an F[Y] its value should be computed from an F[X]. You can't construct any old function F[X]→F[Y] in this way. These are only the functions obtained by replacing each X-value in the original container with a Y-value. In other words they are maps F[X]→F[Y] that preserve the structure of the original container. For example, if F[X] is lists of X's, then we are talking about functions that map lists of length n to lists of length n. In other words, X.dF[X]/dX→Y is isomorphic to the type of 'structure-preserving' maps F[X]→F[Y]. I have to say, X.dF[X]/dX→Y is not what I would have expected this to look like a priori. In fact, it's pretty amazing that you can express the concept of a structure-preserving map as a regular type.
Anyway, the paper does something more useful, it shows how when F is a tree-structure then the function X.dF[X]/dX→Y can be defined 'locally' giving a really beautiful way of implementing attribute grammars. In this case the objects in F'[X] are called 'zippers'. In fact, the paper makes no reference to functors F and their derivatives, that's my generalisation. I'm not interested in the attribute grammar thing but instead I'm fascinated by the fact that there are these natural maps:
and wonder if there is a more general theory saying when such maps might exist.
Oh well, that's enough for now. I'm hoping to post some very high resolution pictures of quaternionic Julia sets some time soon - just gotta get permission to display them from my employer whose machines were used to render them.
    So more thinking out loud...
Consider functions that map streams of X's to streams of Y's. As streams are a linear ordering of elements we can label positions in the output stream using natural numbers starting at zero. So a function that maps streams of X's to streams of Y's can be viewed as a function that maps a stream of X's, and a natural number, to a Y. We assemble the output stream by evaluating this function on the input stream for each natural number.
We could imagine generalising this construction to other container datatypes besides streamns. But streams have this particularly convenient way to address their elements. Each element of a stream is the nth element, for some n. For other structures such as trees it becomes more complicated to address particular elements in the tree. For a binary tree, say, we can specify an element by listing all of the binary decisions, left or right, that you need to make when descending to that element from the root of the tree.
Look at a slightly different construction. Instead of trying to address a particular element of a container consider the type which is containers of X's with one of the X's replaced by a 'hole'. For example a pair of X's can be written as X×X=X². A pair with a 'hole' is either something like (x,ˆ) or (ˆ,x) where ˆ is the 'hole'. So a pair with a hole is simply of type X+X=2.X. More generally, suppose for a given type of container F is the functor that takes a type X to the type F[X] of containers of X's. Then F'[X] is the type of containers of X's with one of the X's replaced by a hole, where F' is the derivative of F. By 'derivative' we mean derivative in the ordinary sense of calculus in that it's computed using the usual Leibniz rules. This is a completely amazing fact - I think it's one of the most amazing facts in all of computer science. I wrote about it in more detail here. That's loosely based on this paper, but my document is completely non-rigorous and probably full of errors.
There are all kinds of interesting algebraic properties that relate F and F'. For example if F'[X] is a container with a hole then X.F'[X] is a container with a hole and an X that could potentially be put into the hole. If we put the X into the hole then we don't just get an F[X] again because we still have the information about where the hole was. In other words, X.F'[X] is an F[X] together with the address of one of the X's within it. (Pity we can't write X.F'[X]-F[X] to mean just the address, but '-' makes no sense in this context.) In fact we have a projection map throwing away the address: X.F'[X]→F[X]. I think this map is a natural transformation from X.d/dX to 1.
Anyway, in the paper I started off with, the authors show that the functor X→X.F'[X] is a comonad. One of the key results in the paper can be summarised as defining a function called 'run' with this type signature:
run :: X.dF[X]/dX → Y → F[X] → F[Y]
In other words, given a function that maps a container, and the address of an X within it, to a Y, we can construct a function that maps containers of X's to containers of Y's. Intuitively this is obvious form what I said in the third paragraph above: we're specifying how for each location in an F[Y] its value should be computed from an F[X]. You can't construct any old function F[X]→F[Y] in this way. These are only the functions obtained by replacing each X-value in the original container with a Y-value. In other words they are maps F[X]→F[Y] that preserve the structure of the original container. For example, if F[X] is lists of X's, then we are talking about functions that map lists of length n to lists of length n. In other words, X.dF[X]/dX→Y is isomorphic to the type of 'structure-preserving' maps F[X]→F[Y]. I have to say, X.dF[X]/dX→Y is not what I would have expected this to look like a priori. In fact, it's pretty amazing that you can express the concept of a structure-preserving map as a regular type.
Anyway, the paper does something more useful, it shows how when F is a tree-structure then the function X.dF[X]/dX→Y can be defined 'locally' giving a really beautiful way of implementing attribute grammars. In this case the objects in F'[X] are called 'zippers'. In fact, the paper makes no reference to functors F and their derivatives, that's my generalisation. I'm not interested in the attribute grammar thing but instead I'm fascinated by the fact that there are these natural maps:
X.dF[X]/dX → F[X] and X.dF[X]/dX → Y → F[X] → F[Y]
and wonder if there is a more general theory saying when such maps might exist.
Oh well, that's enough for now. I'm hoping to post some very high resolution pictures of quaternionic Julia sets some time soon - just gotta get permission to display them from my employer whose machines were used to render them.


7 Comments:
Much of this stuff originates with Andres Joyal, when he invented the concept of a species of structures, and showed how you can give a combinatorial interpretation of the calculus of formal power series. It's basically the theory of the category of (objects) finite sets and (arrows) isomorphisms between them.
I desperately want to learn more about this.
Yeah, Joyal found a nice way to interpret generating functions, not just as generating functions, which merely count the size of various structures, but as a functorial representation of the combinatorial objects themselves. Unfortunately Joyal's paper isn't available on the web (and my French isn't too good anyway) and his book is way too expensive (even on ebay) and I don't have easy access to a mathematics library. So most of what I know on this is second hand from John Baez.
There's this paper but it's over my head. (Need to read up on linear logic...)
I don't mean Joyal's book, I mean this.
You wrote:
(Pity we can't write X.F'[X]-F[X] to mean just the address, but '-' makes no sense in this context.)
Since having two information is represented by multiplication (product type), one would rather expect to divide than subtract, giving X.F'[X]/F[X]. If we don't mind just canceling those Xs, that's even useful for tuples. For sum types, we need the fact that the terms are related and have to 'divide' like this: (a+b)/(c+d):=a/c+b/d.
While I have no good justification for this, it works for sums of tuples like Maybe, i.e. 1+X: X.(0+1)/(1+X)=(0+X)/(1+X)=0/1+X/X=0+1=1.
Note also that the function you call projection is not generally onto, its image is what might be called F[X]-F[0].
(Subtraction is useful if it somehow is of the form A+B-A.)
I had my last comment in my head for quite a while before I wrote it down. But once again, writing down something helped rethinking it, I see it much more clearly now:
If X.F'[X] is an F[X] with an address, and we only want the address, let's forget the "content" via the map X->1, which induces a map X.F'[X]->F'[1].
A value of type F'[1] represents a position in an F[X] including the shape. For example, if F[X]=X^2+X^3 (pairs and triples), it doesn't just say something like "the first component", but also if it belongs to a pair or triple.
Not sure if that is exactly the "address" you wanted, but it seems to be something a category theorist should be happy with.
I just wanted to check my understanding of this stuff; but shouldn't the type signature of run be:
run :: (X.F'[X] -> Y) -> F[X] -> F[Y]
?
(i.e. isn't -> right associative?)
cheers,
Duncan
Duncan,
You're right. Not sure how I missed that.
Post a Comment
<< Home