OK, I will explain. Consider this line of C++ code:
a += b
We can write this in matrix form as
(a) = (1 1) (a)
(b)   (0 1) (b)
If we take the transpose of the matrix we get
(a) = (1 0) (a)
(b)   (1 1) (b)
which is the same as the code
b += a
More generally, any block of code that performs linear updates to variables can be converted to its adjoint by writing the steps in reverse order and transposing each line of code like this. A more complex example (exercise) is
a = 2*a+3*b-4*c; b = 2*a
which maps to
a += 2*b; b = 0; b += 3*a; c -= 4*a; a *= 2
Here the first two statements are the transpose of b = 2*a and the last three are the transpose of a = 2*a+3*b-4*c. If your code has loops, e.g. it iterates over arrays of variables, you just run the loops backwards. The application for me is that there is a published fast algorithm for some linear operator, but I need the adjoint. Surprisingly, the adjoint of the fast algorithm can turn out to be a whole new fast algorithm.
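If you want to convince yourself, here is a minimal sketch that checks the exercise and a simple loop with the usual dot-product test: code F and its adjoint F^T should satisfy <F x, y> = <x, F^T y> for any x and y. The names (Vec3, forward, adjoint, prefix_sum and so on) and the running-sum loop are just my choices for illustration.

```cpp
#include <array>
#include <cassert>
#include <vector>

using Vec3 = std::array<long, 3>;  // holds (a, b, c)

// Forward code from the exercise: a = 2*a+3*b-4*c; b = 2*a
Vec3 forward(Vec3 v) {
    long &a = v[0], &b = v[1], &c = v[2];
    a = 2*a + 3*b - 4*c;
    b = 2*a;
    return v;
}

// Adjoint: statements in reverse order, each one transposed.
Vec3 adjoint(Vec3 v) {
    long &a = v[0], &b = v[1], &c = v[2];
    a += 2*b; b = 0;             // transpose of b = 2*a
    b += 3*a; c -= 4*a; a *= 2;  // transpose of a = 2*a+3*b-4*c
    return v;
}

long dot(const Vec3& x, const Vec3& y) {
    return x[0]*y[0] + x[1]*y[1] + x[2]*y[2];
}

// Forward loop: running sum, x[i] += x[i-1] for i = 1 .. n-1.
std::vector<long> prefix_sum(std::vector<long> x) {
    const long n = static_cast<long>(x.size());
    for (long i = 1; i < n; ++i) x[i] += x[i-1];
    return x;
}

// Adjoint loop: same index range run backwards, statement transposed.
std::vector<long> prefix_sum_adjoint(std::vector<long> x) {
    const long n = static_cast<long>(x.size());
    for (long i = n - 1; i >= 1; --i) x[i-1] += x[i];
    return x;
}

long dot(const std::vector<long>& x, const std::vector<long>& y) {
    long s = 0;
    for (std::size_t i = 0; i < x.size(); ++i) s += x[i] * y[i];
    return s;
}

// Dot-product test on one pair of vectors for each example.
void run_checks() {
    Vec3 x{1, 2, 3};
    Vec3 y{4, 5, 6};
    assert(dot(forward(x), y) == dot(x, adjoint(y)));

    std::vector<long> u{1, 2, 3};
    std::vector<long> w{4, 5, 6};
    assert(dot(prefix_sum(u), w) == dot(u, prefix_sum_adjoint(w)));
}
```

Calling run_checks() from main should pass silently. I used integers so the two sides agree exactly; with floating point you would compare up to rounding error. Note that the adjoint of the running sum is a running sum from the other end of the array.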
Incidentally, I asked the journal editor for the expected turnaround time for the decision on whether to publish. He responded with the mean, standard deviation, extrema and quartiles. If only people like plumbers, doctors and cable installers could respond in this way. "If I attempt to fix this leak it has a 50% chance of taking 1 hour to fix, a 25% chance of taking 2 and there's a 5% chance I won't get it done today" rather than "I'm charging by the hour and it'll take as long as it takes unless there's a complication in which case it'll take longer".
This is just reverse mode AD. But I've never seen anyone else factor reverse mode AD as the product of two program transformations: forward AD followed by the adjoint. The adjoint step on its own is useful for applications besides differentiating.
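To make the factorization concrete, here is a small sketch on a toy program of my own choosing, t = x*y; z = t*t. Step 1 is forward AD, which gives code that is linear in the derivative variables. Step 2 takes the adjoint of that linear code using the same reverse-and-transpose rule as before. The struct and function names are hypothetical and not taken from any AD tool.

```cpp
#include <cassert>

// Derivative variables, one per program variable x, y, t, z.
struct Tangent { double dx, dy, dt, dz; };

// Step 1: forward-mode AD of  t = x*y; z = t*t.
// For fixed x and y this is a linear update of (dx, dy, dt, dz).
Tangent tangent(double x, double y, Tangent d) {
    double t = x * y;
    d.dt = y * d.dx + x * d.dy;
    d.dz = 2 * t * d.dt;
    return d;
}

// Step 2: adjoint of the tangent code (reverse order, each line transposed).
// The primal value t is recomputed here; a real tool would store it.
Tangent tangent_adjoint(double x, double y, Tangent d) {
    double t = x * y;
    d.dt += 2 * t * d.dz;  d.dz = 0;                // transpose of dz = 2*t*dt
    d.dx += y * d.dt;  d.dy += x * d.dt;  d.dt = 0; // transpose of dt = y*dx + x*dy
    return d;
}

// Seeding dz = 1 and running the adjoint yields the whole gradient in one pass.
void run_checks() {
    Tangent seed{0, 0, 0, 1};
    Tangent g = tangent_adjoint(3.0, 2.0, seed);
    assert(g.dx == 24.0);  // d/dx (x*y)^2 = 2*x*y*y at (3, 2)
    assert(g.dy == 36.0);  // d/dy (x*y)^2 = 2*x*y*x at (3, 2)
}
```

The tangent code needs one run per input direction, while the adjoint gets both partial derivatives from a single seed on the output, which is the usual reverse-mode trade-off.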
My description of loops is correct, as you say, for fixed loops. For iterate-until-fixed-point, the implementations I have seen simply 'record' the forward pass and 'replay' it backwards with the same number of iterations on the reverse pass (e.g. by building a tree that represents the unrolled loop during the forward run). So I don't see that my description is off. However, it seems to me that you're talking about something more interesting than this simple-minded approach. Maybe you have a reference. If you don't post back here I might have to email you...
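For what it's worth, here is a rough sketch of the record-and-replay idea as I understand it, using a flat tape rather than a tree. The example (Newton's iteration for a square root, the starting guess, the tolerance and all the names) is an assumption of mine for illustration, not taken from any particular implementation.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Forward pass: Newton iteration x = 0.5*(x + s/x) for sqrt(s), run until converged.
// Each pre-update x is recorded on the tape so the reverse pass can replay it.
double sqrt_forward(double s, std::vector<double>& tape) {
    assert(s > 0.0);
    double x = 1.0;  // fixed starting guess (an assumption of this sketch)
    while (std::fabs(x * x - s) > 1e-12) {
        tape.push_back(x);
        x = 0.5 * (x + s / x);
    }
    return x;
}

// Reverse pass: pop the tape, so the loop runs backwards for exactly as many
// iterations as the forward pass took. Each step is the transpose of the
// tangent statement
//   dx = 0.5*(1 - s/(x*x))*dx + (0.5/x)*ds
double sqrt_reverse(double s, std::vector<double> tape) {
    double dx = 1.0;  // seed on the output
    double ds = 0.0;
    while (!tape.empty()) {
        double x = tape.back();
        tape.pop_back();
        ds += (0.5 / x) * dx;              // uses dx before it is rescaled
        dx *= 0.5 * (1.0 - s / (x * x));
    }
    return ds;  // derivative of the computed root with respect to s
}
```

For s = 4 the forward pass converges to 2 and the replayed adjoint comes out close to 1/(2*sqrt(s)) = 0.25. The iteration count is fixed by whatever the forward run happened to do, which is exactly the simple-minded part.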