Wednesday, February 28, 2007

Monads in C, pt. II

This is a quick followup to a post on reddit. This time a crude approximation to the Maybe monad implemented in C:


#include <stdio.h>

typedef struct {
    int something;
    int just;
} MaybeInt;

MaybeInt returnMaybe(int i)
{
    MaybeInt r;
    r.something = 1;
    r.just = i;
    return r;
}

MaybeInt nothing()
{
    MaybeInt r;
    r.something = 0;
    return r;
}

MaybeInt bind(MaybeInt (*f)(int), MaybeInt x)
{
    if (x.something)
    {
        return f(x.just);
    } else
    {
        return nothing();
    }
}

MaybeInt print(int i)
{
    int written = printf("%d\n", i);
    if (written >= 0)
    {
        return returnMaybe(written);
    } else
    {
        return nothing();
    }
}

MaybeInt printplus_bad(int i)
{
    MaybeInt x = print(i);
    return print(x.just); /* cheating! */
}

MaybeInt printplus(int i)
{
    MaybeInt x = print(i);
    return bind(print, x);
}


Again the idea is that printplus() is a version of printplus_bad() that uses only the two-function bind/return interface to the MaybeInt type to achieve its effect. This time, instead of simply tainting IO with a particular type, the Maybe monad is able to deal gracefully with failure. If, for some obscure reason, printf() fails, returning an integer less than zero, print() returns an object representing this fact. If, as in printplus(), you have two calls to print() in a row, bind() handles all the plumbing for you automatically. The really important thing is this: the implementation of printplus() is identical to my previous example, and yet the semantics are quite different because printplus() is able to bail out early. This bailing out is completely hidden inside bind().
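For comparison, here's the same plumbing written out in Haskell with a hand-rolled MaybeInt, so the correspondence with the C struct is explicit. (The halve function is my stand-in for a computation that can fail; printf-style IO would only cloud the point.)

```haskell
-- A hand-rolled Maybe mirroring the C struct: either a value or nothing.
data MaybeInt = Nothing' | Just' Int deriving (Eq, Show)

returnMaybe :: Int -> MaybeInt
returnMaybe = Just'

-- bind does the plumbing: only call f if there is a value to pass along.
bind :: (Int -> MaybeInt) -> MaybeInt -> MaybeInt
bind f (Just' i) = f i
bind _ Nothing'  = Nothing'

-- A stand-in for print: a computation that can fail (on odd numbers).
halve :: Int -> MaybeInt
halve i = if even i then Just' (i `div` 2) else Nothing'

-- Two failure-prone steps chained with bind, exactly like printplus().
halveTwice :: Int -> MaybeInt
halveTwice i = bind halve (halve i)
```

Just as in the C version, if the first step fails then bind bails out early and the second step never runs.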

I hope that this gives some hint of what monads can do to even hardcore non-functional programmers. If not, I'll probably write another example soon.

(Remember of course that this isn't meant to be practical code. It was a response to someone who wanted to see at least some C code for monads to get an idea of what they're about.)

Sunday, February 25, 2007

Monads for vector spaces, probability and quantum mechanics pt. I

I've been enjoying Eric Kidd's articles on probability theory with Haskell. So I thought I'd follow them up with two things: (1) finding the underlying algebraic structure and (2) showing how it's general enough to do more than just probability.

Firstly, Eric found a really neat factoring of the probability monad as a 'product' of two monads: the Perhaps monad and the List monad. This factoring can be interpreted algebraically.

A monoid is defined to be a set m with a binary operator (typically written as abuttal, ie. the product of a and b is ab) that is associative, and with an element, 1, that is an identity for the binary operator. The List monad gives rise to the monoid freely generated by a set, in other words it defines the smallest monoid containing the set and with no extra relationships between the elements that don't come from the definition of a monoid. The binary operator for List is called ++ and we embed the original set in the monoid using the function \x -> [x], otherwise known as return. [] is the identity. It's not hard to see that we have associativity as (a ++ b) ++ c == a ++ (b ++ c). If we use Set instead of List then we get the free commutative monoid instead, ie. one where a+b=b+a.
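Concretely, in Haskell (a sketch; embed is just another name for the List monad's return):

```haskell
-- The free monoid on a set: lists over it, with ++ as the binary
-- operator and [] as the identity.
embed :: a -> [a]
embed x = [x]

-- The monoid laws, checked on a few sample elements.
monoidLawsHold :: Bool
monoidLawsHold =
     ((embed 1 ++ embed 2) ++ embed 3) == (embed 1 ++ (embed 2 ++ embed (3 :: Int)))
  && ([] ++ [1,2]) == [1,2 :: Int]
  && ([1,2] ++ []) == [1,2 :: Int]
```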

(Eh? Where's my paragraph on m-sets. Don't tell me I have to format all those HTML subscripts again! I hate it when computers do that. I wrote a paragraph and it just vanished. Oh well, maybe version 2.0 will be better. But without subscripts this time...)

If m is a monoid then an m-set is a set s with an action of m on it. An action of a monoid on a set is a scheme that converts each element of the monoid, say g, into a map g*:S→S so that 1* is the identity map and g*h*=(gh)*. The type Writer m s corresponds to pairs in s×m but this also doubles as the free m-set. We simply define f*(x,g)=(x,fg). It's free because we never get f*(x,g)=f'*(x',g') unless x=x' and fg=f'g', so we never get any surprising equalities that aren't inherent in the definition of an m-set. Perhaps, which is actually a pseudonym for Writer Float, defines an R-set, where R is the monoid of reals under multiplication.
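As a sketch in Haskell, using a bare pair for Writer m and the Product monoid in the role of Perhaps's probabilities (the names act and halfThenHalf are mine):

```haskell
import Data.Monoid (Product(..))

-- Writer m s as pairs (s, m); the free m-set action multiplies
-- the annotation on the left: f*(x,g) = (x,fg).
act :: Monoid m => m -> (s, m) -> (s, m)
act f (x, g) = (x, f <> g)

-- With m the reals under multiplication we recover Perhaps-style weights.
halfThenHalf :: (Char, Product Double)
halfThenHalf = act (Product 0.5) (act (Product 0.5) ('a', Product 1))
```

Here act f plays the role of f*, and acting twice multiplies the annotations, exactly as f*g* = (fg)* requires.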

In an earlier post I showed how you could layer up algebraic structures. There I showed that you could combine a commutative monoid structure with a monoid structure to get a semiring. So what happens if we layer up a commutative monoid and an m-set? We'll get something that has a commutative binary operator but that can also be acted on by elements of m. Specialise to the case when m is a field like the real or complex numbers. Does this sound like a familiar structure? I hope so, it's a description of a vector space. What Eric Kidd has shown is that we can build a vector space monad by applying an m-set monad transformer (with m a field) to a commutative monoid monad. (This isn't a completely trivial fact and it depends on the fact that the definition of PerhapsT handles distributivity correctly.) I think that's pretty neat. But I'm getting slightly ahead of myself here as I haven't shown what Eric's stuff has to do with vector spaces.

A probability distribution can be thought of as a vector with outcomes forming a basis. Any distribution attaches a probability 'weight' to each possible outcome in the same way that a vector can be written as a weighted sum of basis elements. Each time we have a probabilistic transition, we're effectively multiplying our distribution vector by a stochastic matrix. Eric (and others)'s monad allows you to write these matrix multiplications in a very natural way.

A vector space forms a monad in a straightforward way. If V(B) is the vector space generated by basis B, and V(C) is another vector space, then any function B→V(C) can be lifted to a linear map V(B)→V(C). This lift is clearly of type (B→V(C))→(V(B)→V(C)). With a little flip we get V(B)→(B→V(C))→V(C). This is the main step in showing V to be a monad. And what the probability monad allows us to do is write our stochastic matrices in terms of what happens to the individual basis elements (ie. outcomes) instead of having to write out the entire matrix.
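Here's a minimal sketch of that monad structure in Haskell (the names are mine, and a serious implementation would collect together duplicated basis elements):

```haskell
-- A formal linear combination of basis elements drawn from b.
newtype V b = V [(b, Double)] deriving Show

-- return: a single basis vector with weight 1.
vreturn :: b -> V b
vreturn x = V [(x, 1)]

-- bind: extend a function b -> V c linearly to a map V b -> V c.
vbind :: V b -> (b -> V c) -> V c
vbind (V xs) f = V [ (c, w * w') | (b, w) <- xs, let V ys = f b, (c, w') <- ys ]

-- Example: one step of a random walk, i.e. a "stochastic matrix"
-- given column by column, by what happens to each basis element.
step :: Int -> V Int
step n = V [(n - 1, 0.5), (n + 1, 0.5)]

twoSteps :: V Int
twoSteps = vreturn 0 `vbind` step `vbind` step
```

twoSteps is two applications of a stochastic matrix, written without ever spelling out the whole matrix: you only say what happens to each basis element.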

Anyway, this is all just waffle and it really needs some code to make it more concrete. I have an ulterior motive here. It's not just probability theory that is described by vector spaces. So is quantum mechanics. And I plan to work my way up to defining quantum computers and implementing a bunch of well known quantum algorithms in Haskell. In the next installment I expect to get at least as far as writing the code to play with the square root of NOT operator.

PS This idea of layering up algebraic structures is one I haven't found in the textbooks yet. I'm guessing it's in Mac Lane, but I haven't yet justified the cost of that book to myself. But I don't actually have a clue what a category theorist would call a monad transformer and don't recall reading about them in any texts that weren't specifically about computer science. Maybe someone else can fill me in. I do know that it's absolutely standard to get free algebraic structures from monads.

Errata: As notfancy points out, I should really be talking about multisets rather than sets, otherwise we have a+a=a. As programmers often use lists to represent both sets and multisets it can sometimes get confusing.

Sunday, February 18, 2007

The Essence of Quantum Computing

A popular view of quantum computation is that its power (at least the power of certain well known quantum algorithms) derives from a form of parallel computation coming from the fact that the state of a quantum computer can be a superposition of many classical states. I want to argue that this is in fact a bit of a red herring because there are other models of computation that allow formal superpositions and yet clearly don't provide a form of parallel computation. Instead, the essence of quantum computing is derived from a kind of non-convexity property and this is the common feature of all of the well known quantum algorithms that isn't shared by classical algorithms.

Before continuing there's something important I should say. Let me quote Bennett et al. in Strengths and Weaknesses of Quantum Computing

"The power of quantum computers comes from their ability to follow a coherent superposition of computation paths."

In other words, what I am saying is in direct contradiction with what is said by some of the founding fathers of quantum computing. Actually, I think that's a good thing, it means that whether I'm right or wrong, I must be saying something non-trivial.

Consider a probabilistic computer. We can represent the probability of its being in any state by an element of a vector space whose basis is the set of possible states and where the coefficient of each basis element is the probability of the computer being in that state. Given any pair of vectors v1 and v2 that represent possible distributions of states, av1+bv2, where a,b>=0 and a+b=1, is also a possible distribution. In other words we can form superpositions exactly analogously to the way we can form superpositions of quantum computer states. The mathematics that describes the notion of a quantum computer performing computations in parallel by superposing N states is formally almost identical to the mathematics describing a probabilistic computer as a superposition of states.

So if we can superpose N states in both classical probabilistic computers and quantum computers, why does neither seem to offer an N times speedup? In the case of quantum computers one popular explanation is that even though we run N states in parallel, we can't perform I/O with those N states in parallel. When we observe a quantum computer we cause a wavefunction collapse (it doesn't matter whether you are a believer in such collapse, that is how we model quantum computers) and instead we make a single observation from a probability distribution. Mathematically this is identical to what happens in a probabilistic machine - a superposition is nondeterministically mapped into one of many definite states.

So whatever the difference is between probabilistic computers and quantum computers, it doesn't lie in the fact that you can form superpositions. Mathematically speaking, probabilistic computers and quantum computers are almost identical in this respect.

But we know that quantum algorithms can do things that no probabilistic algorithm can, so there must be some difference. There is of course. Here's an approach to seeing what it is.

Consider a probabilistic branching process:


      __ A
     /
    2
   / \__ B
  1   __ A
   \ /
    3
     \__ C


Starting at state 1 we can either go to state 2 or state 3, and from there we end up at either A, B or C. Assume that the choice of branch is nondeterministic and that we can assign probabilities to each branch. Suppose also that the branch probabilities are all non-zero. The important thing to note in that diagram is that there are two ways to reach A. The probability of reaching A is the sum of the probabilities for each of the different ways to reach A. The key point I want to make here is that A is more likely because there are two ways to reach it. If I were to replace one or other of the A's with a D then the likelihood of finding our process in state A at the end would be reduced.

But in quantum mechanics we replace probabilities with complex 'amplitudes'. When we make our observation at the end, the probability of any outcome is given by the modulus of the amplitude squared. The amplitudes still add, just as in probability theory. But this has a very important consequence. If the amplitudes for the two different ways to reach A lie in opposite directions in the complex plane, then the probability of ending up in state A after observation is reduced by having two ways of reaching A. This is completely different to the probabilistic case. We can paradoxically make something less likely by adding a new way to achieve it. This is where the essence of quantum computing lies.
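To make the contrast concrete, here's the arithmetic for the two paths into A, once with probabilities and once with amplitudes (the particular weights 1/2 and 1/√2 are just an illustrative choice of mine):

```haskell
import Data.Complex

-- Classically, a second path into A can only increase P(A):
-- the path probabilities add.
probA :: Double
probA = 0.5 * 0.5 + 0.5 * 0.5

-- Quantum mechanically the amplitudes add first, and equal and
-- opposite amplitudes cancel, so a second path can *decrease* P(A):
ampA :: Complex Double
ampA = (1 / sqrt 2) + (-(1 / sqrt 2))

-- Observation probability: modulus of the amplitude, squared.
probFromAmp :: Complex Double -> Double
probFromAmp a = magnitude a ^ (2 :: Int)
```

probA comes out as 0.5, but probFromAmp ampA is 0: the extra path has made A unreachable.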

Pick up just about any paper describing a quantum algorithm and you'll see that the important step is a 'conspiracy' to make the amplitudes of 'undesirable' states sum to zero (or at least something small). Probably the simplest example is counterfactual computation where it is arranged that the computation is executed along two different branches with equal and opposite amplitudes. The neatest example is probably Shor's cycle discovery algorithm that forms the basis of his factoring algorithm. By clever use of the Fourier transform he arranges that branches leading to non-cycles cancel each other out leaving only the periods of cycles with high amplitudes.

However, I think it's important to stress that given any individual quantum computer Bennett et al. and I probably don't disagree on what it actually does. This is an issue of semantics. But semantics does make a difference. The notion of a quantum computer as a super-parallel machine is one being promulgated in the popular science press and I think this causes much confusion with people expecting all kinds of algorithms to run in O(1) time.

So why do people talk about parallel computation if that's not what quantum computers do? I think it's because people aren't comfortable with quantum computers yet. This means they don't want to think of quantum computers as something fundamental and irreducible and instead want to reduce them to something classical. The easiest classical model is the parallel one and so they 'reduce' it to that.

PS This is essentially a specialisation of an old post to the context of quantum computing.

Modular arithmetic with regular expressions

Problem: Find a regular expression, compatible with GNU grep, that recognises strings of 1's and 0's that form the binary expansion of multiples of 7. (Consider the empty string to be zero.)

Solution:
Here's one (though not necessarily a good one):

^\(0\|1\(1\(01*00\)*01*010\|\(0\|1\(01*00
\)*01*011\)\(10\(01*00\)*01*011\|\(0\|11\
)1\)*\(10\(01*00\)*01*010\|\(0\|11\)0\)\)
*\(1\(01*00\)*1\|\(0\|1\(01*00\)*01*011\)
\(10\(01*00\)*01*011\|\(0\|11\)1\)*10\(01
*00\)*1\)\)*$


The standard approach to solving this problem is probably to construct a finite state automaton to compute the remainder modulo 7 of a binary string and then use the proof of Kleene's theorem to convert the automaton to a regular expression. But as I have some regular expression code left over from a previous post I thought I'd try a purely algebraic approach. Although it's ultimately equivalent to deriving a finite state automaton I thought it might be more fun to use intuitions about linear algebra to get to the solution. And anyway, drawing diagrams with HTML is hard...

Consider first recognising multiples of 3. Assume we have three regular expressions, A, B and C. A corresponds to multiples of 3, B corresponds to numbers that are 1 modulo 3 and C matches numbers that are 2 modulo 3. Then we know

A = A"0"+B"1"+1

B = C"0"+A"1"

C = B"0"+C"1"


First a note about notation: I'm writing the binary digit symbols in quotes, "0" and "1", but using plain 0 to represent the regular expression matching nothing and plain 1 to represent the regular expression matching the empty string. a+b means match a or b. So we can interpret the equation for A, say, as saying that a multiple of 3 is the empty string, or a multiple of 3 followed by a "0", or a "1" after one more than a multiple of 3. Similar interpretations apply for the other two equations. We can rewrite this as

x = xM+v

where

M=
"0" "1"  0
"1"  0  "0"
 0  "0" "1"

and

x = [A B C], v = [1 0 0]


We can now use our ordinary intuitions about solving linear systems. We want to solve for A so we need to eliminate B and C. Start by eliminating C. We have

C = B"0"+C"1"


If we were doing ordinary linear algebra we could say

C(1-"1") = B"0"

C = B"0"(1-"1")⁻¹

but for regular expressions we can neither subtract nor divide. However, arguing informally now, (1-a)⁻¹=1+a+a²+a³+…=a* so now we can write

C = B"0""1"*


Anyway, those are all the principles we need. Just eliminate each variable in turn by solving for it in terms of the other remaining variables and substituting the solution back into the remaining equations. You should be able to finish the job and solve for A. But you'll notice the work gets tedious for larger integers. That's where the code comes in.
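Carrying the elimination through by hand for 3 gives the familiar expression ^\(0\|1\(01*0\)*1\)*$ — or, unescaped, (0|1(01*0)*1)*. Here's a quick brute-force sanity check of the remainder automaton that the equations encode; this is throwaway test code of mine, separate from the literate source above:

```haskell
-- The remainder automaton underlying the equations: reading digit d
-- in state r moves to (2r+d) `mod` 3, and acceptance (A) is state 0.
remainder :: String -> Int
remainder = foldl (\r d -> (2 * r + fromEnum (d == '1')) `mod` 3) 0

-- The numeric value of a binary digit string, for comparison.
value :: String -> Int
value = foldl (\n d -> 2 * n + fromEnum (d == '1')) 0

-- Over every string of length <= 8 (including the empty string),
-- automaton acceptance agrees with divisibility by 3.
checks :: Bool
checks = and [ (remainder s == 0) == (value s `mod` 3 == 0)
             | n <- [0 .. 8], s <- mapM (const "01") [1 .. n] ]
```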

It starts as before but we can delete a whole lot of material:


> import List
> data RE a = Symbol a | Star (RE a) | Sum [RE a] | Prod [RE a]
>   deriving (Eq,Show,Ord)

> a <*> b = Prod [a,b]
> a <+> b = Sum [a,b]

> zero = Sum []
> one = Prod []


My simplification rules are still really poor. (Buggy, in fact, but not buggy in a way that leads to incorrect results.) That means we don't expect the final regular expressions to be in any way efficient. But I'm not trying to get the best solution, just a solution.


> simplify a = let b = simplify' a in if a==b then b else simplify b where
>   simplify' (Prod [a]) = simplify a
>   simplify' (Prod a) | zero `elem` a = zero
>   simplify' (Prod (a:b)) = case simplify a of
>     Prod x -> Prod $ x ++ map simplify b
>     a -> Prod $ filter (/= one) $ map simplify (a:b)

>   simplify' (Sum [a]) = a
>   simplify' (Sum (a:b)) = case simplify a of
>     Sum x -> Sum $ x ++ map simplify b
>     a -> Sum $ nub $ sort $ filter (/= zero) $ map simplify (a:b)

>   simplify' (Star (Sum [])) = one
>   simplify' (Star (Prod [])) = one
>   simplify' (Star a) = Star (simplify a)
>   simplify' a = a


I'm representing the vector v and matrix M above as simple associative arrays whose elements are found using lookup. Elements not represented are assumed to be 0. fetch handles the sparse lookup:


> type Matrix s a = [((s,s),RE a)]
> type Vector s a = [(s,RE a)]

> fetch i x = maybe zero id $ lookup i x


solve implements exactly the solution method described above. The use of Star corresponds to my abuse of the binomial series for (1-a)⁻¹ above. However rather than manipulate equations I'm always working in terms of matrices and vectors over the regular expressions:


> solve range i (m,v) = (
>     [(j,simplify $ fetch (i,j) m <*> Star (fetch (i,i) m)) |
>         j <- range, j/=i],
>     simplify $ fetch i v <*> Star (fetch (i,i) m)
>   )


substituteFor substitutes a solution to an equation given by solve into the remainder of the equations.


> substituteFor range k (v',c') m v = (
>     [((i,j),(fetch j v' <*> fetch (i,k) m) <+> fetch (i,j) m) |
>         i <- range, i/=k, j <- range, j/=k],
>     [(i,(c' <*> fetch (i,k) m) <+> fetch i v) |
>         i <- range, i /= k]
>   )


As I describe above, a variable is eliminated by solving for it and then substituting:


> elim range s (m,v) = let (v',c') = solve range s (m,v)
>                      in substituteFor range s (v',c') m v



Now some code to output our regular expressions in a form compatible with grep:



> bracket n (s,n') = if n'>=n then s else "\\(" ++ s ++ "\\)"
> pretty (Symbol a) = ([a],9)
> pretty (Star a) = (bracket 8 (pretty a) ++ "*",8)
> pretty (Prod a) = (concat $ map (bracket 7) $ map pretty a,7)
> pretty (Sum a) = (foldr1 (\a b -> a ++ "\\|" ++ b) $
>     map (bracket 6) $ map pretty a,6)


Now construct M and v for the equations describing the regular expressions for the classes of binary strings modulo n:


> m n = [(((2*i) `mod` n,i),Symbol '0')|i <- [0..n-1]] ++
>       [(((2*i+1) `mod` n,i),Symbol '1')|i <- [0..n-1]]
> v n = [(0,one)]


To solve a complete set of equations we reduce it one equation at a time by eliminating variables:


> reduce [a] (m,v) = solve [a] a (m,v)
> reduce (a:as) (m,v) = reduce as $ elim (a:as) a (m,v)
> test n = let (_,r) = reduce [n-1,n-2..0] (m n,v n) in simplify r


Print regular expression to detect multiples of 7:


> main = putStrLn $ "^" ++ fst (pretty (test 7)) ++ "$"


Try integers besides 7. But note that the expressions get more and more complex. And of course you can use the equation solving approach for problems besides recognising multiples of integers.

PS I just noticed some of the indentation went wrong on translation to HTML. But the code appears to still compile fine with GHC.

Sunday, February 04, 2007

Exceptions, Disjunctions and Continuations

At the end of section 4 of Call-by-value is dual to call-by-name, Wadler tells a story about the Devil and a billion dollars. After a long time thinking about it, the meaning of it suddenly hit me like an epiphany a few weeks ago, and I'm finally in a position to give an explanation of what it means. (This is also an exercise in TMR6.) And just for a bit of variety I'm going to throw in some C++ and Scheme, starting with some simple C++ code:


#include <iostream>
#include <string>

using namespace std;

template<class A,class B>
class Either
{
public:
    virtual B value() const throw (A) = 0;
};

template<class A,class B>
class Left : public Either<A,B>
{
    A a;
public:
    Left(A a0) : a(a0) { }
    B value() const throw (A) { throw a; }
};

template<class A,class B>
class Right : public Either<A,B>
{
    B b;
public:
    Right(B b0) : b(b0) { }
    B value() const throw (A) { return b; }
};

template<class A,class B,class C>
C either(const Either<A,B> &e, C (*f)(A), C (*g)(B))
{
    try {
        return g(e.value());
    } catch (A &a) {
        return f(a);
    }
}

string f(int a)
{
    return "int";
}

string g(float b)
{
    return "float";
}

int main()
{
    std::cout <<
        either<int,float,string>(Left<int,float>(7),f,g)
        << std::endl;
    std::cout <<
        either<int,float,string>(Right<int,float>(7.0f),f,g)
        << std::endl;
}


This provides a (somewhat unreliable, I think) implementation of a discriminated union. An object of type Either<A,B> is either an object of type A or an object of type B. But unlike the ordinary union, it remembers the type of its contents. I've also provided an either function. Given a function of type A→C and a function of type B→C, then if we're handed an object of type Either<A,B> we know that we can apply one or other function to it to get an object of type C. The function either applies the appropriate one for you. What's unusual is the way I've implemented it. I'm representing an object of type Either<A,B> as a function of no arguments (in effect a lazy value) of type B (*)() throw (A). There are two paths by which an object can get out of such a function: either it is returned or it is thrown out. Having two paths is what I've used to implement a template that represents either one thing or another. This simple example should make it clear that there is some sort of a connection between exceptions and logical or (ie. disjunction).

In functional languages exceptions tend to be handled a little differently. Whereas C++ has a throw keyword, there is no equivalent in Scheme, say. Instead
the call-with-current-continuation function acts like catch. The argument to call-with-current-continuation is a function (lambda (throw) ...) and the argument to this function (itself a function), here called throw, does the throwing. For example consider the scheme expression:


(call-with-current-continuation
  (lambda (throw) (+ 1 (throw 7))))


The result is 7, not 8. The 7 is thrown 'past' the (+ 1 ...) and is caught by the surrounding call-with-current-continuation. The important difference from C++ is that if any subexpression is to throw some value, it must be 'handed' a suitable throw function. Note also that Scheme is dynamically typed. In a static functional language throwing exceptions is a tiny bit more subtle. Not only must a subexpression be handed a suitable function for throwing, it must be handed a function of a suitable type. In particular, if the data thrown is to be of type a, then the subexpression must be handed a suitable 'throw' function that can take an object of type a as argument. Think of such a function as an a-consumer. So if we want to implement the C++ idea above in a statically typed functional language that supports exceptions we need to replace the function of type B (*)() throw (A) with a function that consumes an A-consumer and returns an object of type B. Such a function can either throw a result of type A or return a result of type B.

Anyway, here's a complete Scheme example (BTW it's my first bit of Scheme code longer than one line):


(define (right b) (lambda (throw) b))
(define (left a) (lambda (throw) (throw a)))

(define (compose f g) (lambda (x) (f (g x))))
(define (id x) x)

(define (either f g x)
  (call-with-current-continuation
    (lambda (throw)
      (g (x (compose throw f))))))

(define (f x) (cons "left" x))
(define (g x) (cons "right" x))

(either f g (left 1))
(either f g (right 2))


OK, time for some logic.

As everyone familiar with propositional calculus knows, a ∨ b = ~a → b. Or in English "a or b" is the same as "if a isn't true then b is".

Less well known is the Curry-Howard isomorphism. In a nutshell it goes like this: if you know a, and you know a → b, then you can deduce b (using modus ponens). Similarly, in a strictly typed programming language, if you have an object of type a, and a function of type a → b, then you can use them to make an object of type b (just apply the function to a). So given a bunch of propositions in 1-1 correspondence with a bunch of objects, you should be able to see that the set of propositions you can deduce by modus ponens is going to be in 1-1 correspondence with the objects you can make by function application.

But there's more to logic than modus ponens. In classical logic you also have double negation elimination. In other words, if you know ~~a then you can deduce a. So double negation elimination is like having a polymorphic function ~~a → a for any a. What does "~" mean in terms of functions? In logic, define ⊥ to be the unprovable false proposition. Then ~a is simply a→⊥. In programming define ⊥ to be the type with no instances and we can define ~a to be the type of functions a→⊥. A function of type ~a is a function that consumes an instance of type a and gives you back nothing in return. Clearly if you can make an object of type a then no function of type ~a can exist or else you could apply it to a and get an object of type ⊥, which is impossible by definition.

It's not hard to see that there can be no well behaved function that for any type a can map an ~~a to an a. An ~~a is a thing that consumes a-consumers. In a sense it's destructive. There are no raw ingredients from which you could possibly make an a for all a. So there is no function of type ~~a → a. But this type can serve very well as the type of the catch operator described above. Rewriting the scheme example above with Haskell-like notation we expect something like

catch $ \throw -> ... (throw a) ...

Because throw never returns we can assign it the type a -> ⊥ without problems and hence catch can be assigned type ~~a. This now means that the Curry-Howard isomorphism can be extended to classical logic if we throw in catch along with the usual functions you expect in a pure typed functional language.

So if we use catch to throw an exception out of a function, then the function must be handed a ~a as argument. So in order to implement Either, we must replace the C++ type B (*)() throw(A) with ~a -> b. In other words, if we use the symbol ∨ to mean either, then exactly as in classical logic, we have a ∨ b = ~a → b.

You can think of the argument to catch as being a bit like a bungee cord with one end attached to the catch. If a function is handed the other end of such a bungee cord it can either use it to catapult out a value of type a, or the function can ignore it and return something of type b. But there's a neat twist on this. If b = ~a then the function, on being handed a bungee cord, could simply hand it back as a return value. In other words, whatever type a is, we can always construct an object of type a ∨ ~a from the identity function. This corresponds to the fact that a ∨ ~a is a tautology of classical logic.
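Here's a rough sketch of this encoding in Haskell, using the Cont monad from the transformers package (the names Or, left, right and either' are mine):

```haskell
import Control.Monad.Trans.Cont

-- a ∨ b encoded as ~a -> b: hand the computation an a-consumer
-- ("the other end of the bungee cord"); it either fires the consumer
-- (left) or ignores it and returns a b (right).
type Or r a b = (a -> Cont r r) -> Cont r b

left :: a -> Or r a b
left a = \throw -> throw a >>= \r -> cont (\_ -> r)

right :: b -> Or r a b
right b = \_ -> return b

-- either: catch whatever comes out, by whichever of the two paths.
either' :: (a -> r) -> (b -> r) -> Or r a b -> r
either' f g x = runCont (callCC (\k -> fmap g (x (k . f)))) id
```

The bungee cord twist is visible in the types too: \throw -> return throw hands the cord straight back, and inhabits a ∨ ~a.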

And now you can reread Wadler's story about the Devil and see that my story about a bungee cord is really the same one.

Exercise: Implement Left, Right and either using the method above in the Haskell Cont monad.

PS Reasoning about exceptions in this way is new to me and so I don't feel I have the above on a firm footing. So expect some errors. In particular I haven't thought about what happens if you try to nest disjunctive structures this way. You can be sure that the C++ implementation will fail because the exception may be caught by the wrong try.

Update: fixed some typos. But also here's an exercise:

Exercise: a ∧ b = ~(~a ∨ ~b). What does that mean in the above context? (I haven't got my head around this yet.)

Saturday, February 03, 2007

Comonads and reading from the future

I'm too busy reading Bernie Pope and Russell O'Connor's articles in the latest Monad Reader to think about much else right now, so I'm only going to make a brief post. I want to mention something pointed out by Nick Frisby over here. The loeb function is in fact closely related to cfix, the dual of mfix for monads, defined by Dave Menendez. (Nick Frisby's statement isn't quite correct but its interest value outweighs its truthiness.) In fact, I can borrow Russell's language to say that whereas mfix sends data back in time, cfix and loeb read data from the future!

Think of it like this: when you use cobind with a comonad you provide some function that distills down a comonadic data structure into a single value and then cobind applies this distillation repeatedly at every point in the comonadic structure to give you another comonadic structure. My canonical example is the cellular automaton where you give a rule that defines how to compute each individual cell as a function of the whole grid, and cobind applies this throughout an entire input grid to give you back an entire output grid. The functions cfix and loeb allow you to give rules for each cell where you're allowed to make reference to the entire output grid. If you think of the output grid as something in the future that you haven't yet computed, then cfix and loeb allow you to read from the future.

Here's an example of cfix in use (where I make [] a comonad by interpreting it as a kind of one-sided zipper):


> instance Show (x -> a)
> instance Eq (x -> a)

> instance (Num a,Eq a) => Num (x -> a) where
>   fromInteger = const . fromInteger
>   f + g = \x -> f x + g x
>   f * g = \x -> f x * g x
>   negate = (negate .)
>   abs = (abs .)
>   signum = (signum .)

> class Comonad w where
>   coreturn :: w a -> a
>   cobind :: (w a -> b) -> w a -> w b

> instance Comonad [] where
>   coreturn (x:xs) = x
>   cobind f [] = []
>   cobind f (x:xs) = f (x:xs) : cobind f xs

> cfix :: Comonad d => d (d a -> a) -> a
> cfix d = coreturn d (cobind cfix d)

> ouroboros = [2*head.tail,1+head.tail,17]
> test = cfix ouroboros


In the expression test we can use head and tail to refer to the sublist to the right of each element from within the list itself. For example, in the subexpression 2*head.tail the head.tail picks out cfix [1+head.tail,17].

Note how in this example (1) loeb gives you back an entire list, not just one element and (2) cfix respects the comonadic structure in the sense that you can only refer to elements to the right of the current one whereas loeb lets you refer to the entire output list. I think this helps to make clear the differences and similarities between cfix and loeb.
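For reference, here's loeb itself (in the form from the earlier post) next to a tiny example of mine where a cell looks leftwards, something cfix's one-sided view can't express; this is separate from the post's literate code:

```haskell
-- loeb: every cell's rule sees the *whole* output structure.
loeb :: Functor f => f (f a -> a) -> f a
loeb x = fmap (\f -> f (loeb x)) x

-- The second cell copies the first, and the third adds the first two:
-- references to the left are fine here, unlike with cfix.
spreadsheet :: [Int]
spreadsheet = loeb [const 3, (!! 0), \xs -> xs !! 0 + xs !! 1]
```

spreadsheet evaluates to [3,3,6].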

That wasn't too brief. But I did cheat by copying and pasting code from an old post. Anyway, back to tinkering with my implementation of dropWhile...
