Comments on A Neighborhood of Infinity: Perturbation confusion confusion

To be fair, that "impossible to compute deriv...

2012-11-20T08:05:00.414-08:00

To be fair, that "impossible to compute derivative" proof relies on assumption under which it is also impossible to compute, say, 1/x, or to test if x is positive. These are interesting results, but I don't think they actually imply that it is impossible to compute derivative functions automatically, in the sense we desire.

@Anonymous I don't know a canonical source of ...

2011-07-26T09:12:39.012-07:00

@Anonymous I don't know a canonical source of the theorem but here's one I found through web searching: http://eccc.hpi-web.de/resources/pdf/ica.pdf

This book goes into quite a bit of detail: http://www.amazon.com/Computable-Analysis-Introduction-Theoretical-Computer/dp/3540668179/

One way to think of it is that differentiation is itself not uniformly continuous. You can make arbitrarily small perturbations to a function and wildly change its derivative.

Dan, can you point me to a proof that there is no ...

2011-07-26T06:45:16.008-07:00

Dan, can you point me to a proof that there is no way to implement differentiation in a way that is also a function, or perhaps name the theorem in question?

Thanks in advance!

Barak, If by glib you mean that more could be sai...

2011-07-01T06:36:42.241-07:00

Barak,

If by glib you mean that more could be said on the subject, then of course you are correct. There is no computable function for differentiation of type Num a=>(a->a)->(a->a) and this ultimately reflects the fact that classically, differentiation is not uniformly continuous, and constructively, there is no differentiation function at all.

A mathematical proof that there is no way to implement differentiation in a way that it is also a function (in the mathematical sense) seems like a good justification for using lifting to me.

It seems a bit glib to say that the type of diff i...

2011-07-01T04:31:45.191-07:00

It seems a bit glib to say that the type of diff is "Num a=>(D a->D a)->(a->a)" and use that to justify the needed lifting etc. Because the type of diff is, mathematically speaking, "Num a=>(a->a)->(a->a)". The dual numbers are an implementation technique being inappropriately exposed in the signature of the API. This is the root cause of the modularity and nesting problems that make AD so brittle in most current systems.

@Oleksandr You did a good job of convincing me :-)...

2011-05-01T07:23:37.432-07:00

@Oleksandr You did a good job of convincing me :-)

More generally, we can also build jet bundles. It's also fun to look at Lie algebras built this way from Lie groups: http://blog.sigfpe.com/2008/04/infinitesimal-rotations-and-lie.html

@Anonymous No, it's not just about literals.

2011-05-01T07:16:54.060-07:00

@Anonymous No, it's not just about literals.

Dan, thank you for your reply. In fact, I knew th...

2011-05-01T04:55:22.023-07:00

Dan, thank you for your reply. In fact, I knew the answer to my question; I only wanted to emphasize that the type system of Haskell does not rule out perturbation confusions -- you still need to carefully insert lifts in appropriate places. I must admit that I haven't written any substantial program in Haskell that uses AD, so it may be that in practice lifts don't present a real problem. I guess it is time to write such a program.

By the way, I happen to be a PhD student of Barak, and one of the things I've been thinking about recently is typology of AD. There are a few interesting ideas floating around. For example, dual numbers are really the tangent bundle over the real line. What if we try to extend the tangent bundle construction to other types? Tangent bundle is a functor T from the category of smooth manifolds to itself. Furthermore, from the point of view of synthetic differential geometry, it is a representable functor represented by the "set" D of nilsquare elements. In particular, it enjoys some nice properties. Obviously, it commutes with products, so one can define T (a, b) = (T a, T b) for types. More interestingly, if follows from the representability that T (a -> b) is isomorphic to a -> T b. Furthermore, the representability of T implies that it admits the structure of a monad, of which lift is the unit. This structure can also be derived directly, without appealing to synthetic differential geometry. I can go on.

Ultimately I would like to have these ideas embodied in the form of a type system that would allow us to reason about AD. This dream is still under construction...

CC Shan gave a pretty good overview of the issue a...

2011-05-01T04:05:58.651-07:00

CC Shan gave a pretty good overview of the issue and presented (even if he did not invent) the approach used to resolve it that I currently use, which is to use universal quantification to keep you from mixing up your infinitesimals, here:

http://conway.rutgers.edu/~ccshan/wiki/blog/posts/Differentiation/

In http://hackage.haskell.org/package/ad, I make that quantified infinitesimal do double duty as the differentiation mode. (That said, with hindsight, I would like have figured out a way to quantify over the mode separately to enable fewer things to have to dispatch with the dictionary.)

Is the problem Haskell's overloading of numeri...

2011-04-30T23:01:24.532-07:00

Is the problem Haskell's overloading of numeric literals? I.e. would explicitly annotating them always fix things?

@Oleksandr Just inserting lifts anywhere to make t...

2011-04-30T18:32:05.944-07:00

@Oleksandr Just inserting lifts anywhere to make the expression typecheck isn't good practice. It's easy to write code that typechecks but isn't correct.

The easiest rule i can suggest for now is that you need to place it around any real-valued variable involved in the computation that the lambda captures from its environment. In this case 'lift x'. It'd be nice if the type system could spot when we need this but I don't know if it can. I've never had any difficulty applying this rule but now I'm concerned that there may be situations where it's hard to apply it. I can't think of one.

I get a slightly different error message for examp...

2011-04-30T17:23:48.916-07:00

I get a slightly different error message for example1:

*Main> d (\x -> x*(d (\y -> x+y) 1)) 1

<interactive>:1:12:
    Occurs check: cannot construct the infinite type: a = D a
      Expected type: D a
      Inferred type: a
    In the second argument of `(*)', namely `(d (\ y -> x + y) 1)'
    In the expression: x * (d (\ y -> x + y) 1)

And now something curious happens: based solely on this error message my initial, arguably naive guess would be to insert lift around the second argument of (*), like this:

*Main> d (\x -> x * lift (d (\y -> x+y) 1)) 1
2

This even typechecks, but produces the wrong answer. How do you know where to insert lift? The type system does not help you here.

Chris, > the Haskell code really does give an ...

2011-04-30T14:18:08.256-07:00

Chris,

> the Haskell code really does give an incorrect answer.

That's no incorrect answer. You've asked to evaluate the derivative at a weird value, not the integer 1. But I agree that it could be misread as 1, especially to someone not familiar with the Num typeclass. So you've convinced me that there are indeed dangers here. I've modified my position with an update.

If you write some Haskell code that requires lifts...

2011-04-30T13:17:15.377-07:00

If you write some Haskell code that requires lifts (that includes code like that using monad transformers or using fmaps with functors) but deliberately leave out the lifts and then at the end hunt around at random to find places where you can insert lifts to keep the type checker happy, then you can expect buggy code.

S&P exhibit the function called constant_one with type signature: Num a => D a -> a. The type signature should make you suspicious. In fact constant_one (D 10 1), say, is not 1. You don't fix the code by inserting a lift at "at the point of type violation". I'm not sure what to call that kind of debugging. When I debug code I think about the semantics of what I'm doing.

Note that you can write an analogous version of should_be_one for the symbolic differentiator too.

Anonymous, I wrote what I think is a very accessib...

2011-04-30T13:07:39.634-07:00

Anonymous, I wrote what I think is a very accessibly introduction some 4 or so years ago.

http://cdsmith.wordpress.com/2007/11/29/some-playing-with-derivatives/

Here's an example (by Barak Perlmutter, which ...

2011-04-30T13:06:04.672-07:00

Here's an example (by Barak Perlmutter, which he provided me some time ago) where the Haskell code really does give an incorrect answer.

d (\x -> (d (x*) 2)) 1

This gives an incorrect answer when you keep the rank 1 type assigned by type inference to d. However, if you write the rank 2 type that d should logically have:

dd :: Num t => (forall a. Num a => a -> a) -> t -> t

then it fails with an error as well.

Where can I find a nice accessible introduction to...

2011-04-30T12:56:59.085-07:00

Where can I find a nice accessible introduction to AD? It intrigues me.

I was also puzzled by the problem claimed in that ...

2011-04-30T12:44:10.359-07:00

I was also puzzled by the problem claimed in that paper. But then at the end of the paper S&P presented Haskell code that they claimed would demonstrate the problem.

I didn't try out this code. Does their presented code actually demonstrate a problem? Or does it fail with a type error such as the ones you showed? Or something else?