Sunday, February 26, 2006

The ambiguous operator, pt.2

In 1963 John McCarthy, the inventor of Lisp, published the paper A Basis for a Mathematical Theory of Computation in which he proposed the function (in the computer program sense of the word) amb(.,.). The idea is that amb(x,y) is first equal to x. But if later in the computation it is found that this leads to some sort of contradiction the value of x is retracted and replaced with y. This is a much more complex business than it may seem to be at first. Retracting a value essentially means winding back the entire state of the computation to where it was when amb returned the value x, and then slipping in the value of y. This means somehow freezing and copying the entire state when x was first returned. When a contradiction is found the entire state of the program is discarded and replaced with the frozen version which is reactivated. These frozen states are known as continuations. In many ways it's like a GOTO statement on acid. It can cause a jump to an arbitrary spot on your code. But continuations are nicer than GOTOs because they are more amenable to logical reasoning.

There are a number of languages that have support for continuations built in, including Scheme with its call-with-current-continuation function. But it takes a little care to understand exactly what is meant by "wind back" in this context. Although the state of the current computation is wound back, including things like local variables and the state of the functions currently being executed, global variables remain unchanged. So even though we wind back time and throw away much of the current state some information can still "leak through" from the path that we are now pretending never happened.

So, to get back to the devilish story of part 1: suppose we have a C keywords called TRY and FAIL with the bizarre property that TRY functions just like the return statement except that if a later FAIL statement is met your program winds back to the last TRY statement, undoing any effects caused by the previous return value, and continues execution after the TRY statement. The ambiguous operator would be implemented like this:


int amb(int a,int b)
{
    TRY(a);
    return b;
}


amb(a,b) would return a. But if a later FAIL is met it would unwind the computer back to its previous state and then continue execution with the return b line. Global variables would remain unundone. Given such keywords we can now define:


int devilChoice()
{
    printf("I choose B\n");
    TRY(B);
    printf("Sorry I meant to say A\n");
    TRY(A);
}


void devilCheats()
{
    FAIL;
}


But C has no such keywords and there's no way to write such functions - unless you cheat a bit. The required code is on my web site here. The trick is to freeze the state of the program by literally reaching in and taking a copy of the C stack. Remarkably this code runs fine, without modification, when compiled with gcc, mipspro or MSVC on IA32, AMD64, PowerPC and mips under Windows, MacOS X, Irix and Linux for all legal combinations of those compilers, processors and operating systems.

But despite the implementation of TRY and FAIL being an ugly hack they really are nice operators to have. Papers like Wadler's lend a little legitimacy to continuation shenanigans by showing there is a nice theory underlying them, and that by allowing them you bridge the gap between intuitionistic and classical logic in the Curry-Howard isomorphism. They can also make for some surprisingly simple implementations of problems. For example, using TRY you can write an elegant non-brute force C sudoku solver in a couple of lines. In fact, you find yourself able to write code that is declarative rather than imperative, even though you are working in C. The backtracking is handled behind the scenes by TRY and FAIL which can be nested recursively (as suggested in McCarthy's original paper) without any difficulty. I originally wrote that code in order to implement a pattern matcher similar to Mathematica's over ten years ago. It took me several years to figure out how to rewrite it without using continuations - using continuation passing style instead. (I had no Computer Science training, I hadn't even heard of this stuff.) Nowadays I'd just use a backtracking monad in Haskell...

For an implementation in Ruby see randomhacks. For code similar to my C code see Aubrey Jaffer's implementation (in C) of Scheme. You can also do the same thing semilegally through the setcontext() and getcontext() functions if they are available for your operating system. You can probably do something similar using threads or fork().

2 comments:

Jonathan said...

Could you use fork to implement this? i.e., before calling an amb() function, call fork. Tell the child process its alternate return value, and then have the process just wait around. When FAIL is called from the parent process, the parent should effectively kill itself and call the child process.

sigfpe said...

I think the amb function itself could call fork() though I haven't worked out the details. There might be a bit of work involved in juggling multiple processes when amb() is used recursively, but it doesn't sound impossibly hard. An issue with fork() is that the entire process is saved/copied/restored including global variables. You may or may not want this behaviour. A thread based solution would allow some sharing of variables between 'counterfactualities'. (Reminds me...I must write something here about counterfactual computation...)

Blog Archive