Firstly, let's review classical computing. In a classical computer you have a machine in a state, call it s. You let a bit of time pass and it ends up in a new state, call it t. We can write this using the notation
|s> => |t>
That's basically all there is to know about computing, the rest is details. Let's look at an example. Suppose we have a computer with two one-bit registers. The first is an on-off switch and the second will contain the result of the computation. For technical reasons we'll make it so that the answer doesn't just go into the second register but is exclusive-ored with it. We initialise the second register to zero so it's effectively the same thing as storing the result there. Write the state as |s,x> where s is the state of the switch and x is the second register. Suppose the result of the computation is r - this is the thing we are actually interested in. Then we have
|s,x> => |s,x^(s&r)>
In other words, this is saying that we only exclusive-or x with r if s is 1. (I used '^' to mean boolean 'exclusive-or' and '&' to mean 'and'.) For example
|0,0> => |0,0> and |1,0> => |1,r>.
Obviously s must be 1 in order to get a useful value of x.
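Here is a minimal sketch of this classical rule in Python, just to make the bookkeeping concrete. The function name step and the way r is passed in as an argument are illustrative assumptions; in the real machine r is whatever the computation produces.

```python
def step(s, x, r):
    """One classical step: |s,x> => |s, x ^ (s & r)>."""
    return s, x ^ (s & r)

# Enumerate every starting state for both possible results r.
for r in (0, 1):
    for s in (0, 1):
        for x in (0, 1):
            s2, x2 = step(s, x, r)
            print(f"r={r}: |{s},{x}> => |{s2},{x2}>")
```

Note that applying step twice always returns the starting state, which is a first hint of why exclusive-or is the convenient choice here.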
Now let's move on to quantum computing. Now we no longer have a discrete set of states but instead a vector space. The time evolution of a quantum computer is given by a unitary linear operator. If the state s evolves to state t we write this as
|s> => |t>
where we now interpret |s> and |t> as vectors in a normed vector space. The first important fact to know about quantum computers is that if |a> and |b> are possible states then so is any complex linear combination of these states. The second (fact (2)) is that if system A has states in a vector space V(A), and system B has states given by vector space V(B), then the combined system is given by V(A) (x) V(B), where (x) is the usual tensor product over C. The third important fact (fact (3)) is about the interpretation of the state |a>+|b>. This makes no sense classically, but in QM linear combinations (called superpositions) are allowed. The rule is that if you observe this state to see whether it's in state |a> or state |b> then it 'collapses' into state |a> or |b>. The probability of state |s> collapsing into state |a> is given by |<a|s>|^2/(<a|a><s|s>) where <x|y> is fancy notation for the inner product of |x> and |y>. (There's another rule of QM: if |a> and |b> are distinct possible observable outcomes of an experiment, then <a|b>=0.)
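To make fact (3) concrete, here is a small Python sketch of the collapse probability |<a|s>|^2/(<a|a><s|s>). It assumes states are stored as plain lists of amplitudes in the {|0>, |1>} basis; the helper names inner and prob are made up for illustration.

```python
def inner(x, y):
    """Inner product <x|y>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))

def prob(a, s):
    """Probability that state s collapses into state a."""
    return abs(inner(a, s)) ** 2 / (inner(a, a).real * inner(s, s).real)

ket0 = [1, 0]   # |0>
ket1 = [0, 1]   # |1>
both = [1, 1]   # |0> + |1>, deliberately left unnormalised

print(prob(ket0, both), prob(ket1, both))
```

Because the formula divides by <a|a><s|s>, it doesn't matter that |0>+|1> hasn't been normalised: each outcome still comes out with probability one half.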
We now move our classical computer into Quantumland by considering the states |s,x> to be the basis elements of a vector space. Using fact (2) above we can consider these to be |s> (x) |x>. The rule
|s,x> => |s,x^(s&r)> (rule A)
defines what happens to each basis element and hence defines a linear map. On basis elements it acts just like the classical computer. But with a quantum computer we have available all kinds of interesting linear operators that make no sense for a classical computer. In particular, given a register we can apply this operator to it:
|0> => a*(|0>+|1>)
|1> => a*(|0>-|1>)
where a = 1/sqrt(2). This is linear and unitary. Notice also that if we do it twice we get back to where we started from. To make things easy we'll drop the constant a and work with linear operators that are unitary up to multiplication by a constant. The state |0>+|1> is a kind of weird mixture of on and off-ness.
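As a quick sanity check on the 'do it twice' claim, here is a small Python sketch. It assumes a one-register state is stored as a pair of amplitudes (c0, c1) for |0> and |1>; the name apply_op is purely illustrative. This version keeps the constant a so that lengths are preserved exactly.

```python
import math

A = 1 / math.sqrt(2)

def apply_op(state):
    """Apply |0> => a(|0>+|1>), |1> => a(|0>-|1>) to amplitudes (c0, c1)."""
    c0, c1 = state
    return (A * (c0 + c1), A * (c0 - c1))

once = apply_op((1.0, 0.0))   # start in |0>
twice = apply_op(once)
print(once, twice)
```

Up to floating-point rounding, once is a(|0>+|1>) and twice is |0> again.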
So the trick to counterfactual computation is that instead of switching on the computer you 'half' switch it on using the linear operator above. Note that if you 'half' switch it on twice it just ends up being off again. But instead of 'half' switching it on twice in a row, we 'half' switch it on, allow the quantum computer to evolve by rule (A) and then 'half' switch it on again. Let's work through the details. First note that we are only 'half' switching on the first register. This means the linear operator applies only to the first factor in the tensor product |s>(x)|x>. So we get:
|0,0> => |0,0> + |1,0>
Now consider the result of this followed by allowing the computer to evolve by rule (A):
|0,0> => |0,0> + |1,r>.
And now follow this by another 'half' switch on:
|0,0> => |0,0> + |1,0> - |1,r> + |0,r>.
If r = 0 then:
|0,0> => |0,0> (modulo a constant multiple)
In other words, the state is unchanged.
If r = 1 then:
|0,0> => |0,0>+|1,0>+|0,1>-|1,1>
We no longer have the cancellation. Following fact (3), because of the |0,1> term above we have a non-zero probability of finding that the register x contains the answer r and yet the on-off switch is off.
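The whole sequence can be checked mechanically. Below is a rough Python sketch that stores a state as a dictionary from basis labels (s, x) to amplitudes and, as in the text, drops the constant a. The function names half_switch, rule_a, run and probabilities are my own labels for illustration, and probabilities simply applies fact (3) with orthonormal basis states.

```python
from collections import defaultdict

def half_switch(state):
    """Unnormalised 'half' switch on the first register only:
    |0,x> => |0,x> + |1,x>  and  |1,x> => |0,x> - |1,x>."""
    out = defaultdict(int)
    for (s, x), amp in state.items():
        out[(0, x)] += amp
        out[(1, x)] += amp if s == 0 else -amp
    return {k: v for k, v in out.items() if v != 0}

def rule_a(state, r):
    """Rule (A), extended linearly: |s,x> => |s, x ^ (s & r)>."""
    out = defaultdict(int)
    for (s, x), amp in state.items():
        out[(s, x ^ (s & r))] += amp
    return {k: v for k, v in out.items() if v != 0}

def run(r):
    """Half switch, evolve by rule (A), half switch again, starting from |0,0>."""
    state = {(0, 0): 1}
    return half_switch(rule_a(half_switch(state), r))

def probabilities(state):
    """Collapse probabilities onto the basis states, via fact (3)."""
    total = sum(abs(a) ** 2 for a in state.values())
    return {k: abs(a) ** 2 / total for k, a in state.items()}

for r in (0, 1):
    print(r, run(r), probabilities(run(r)))
```

For r = 0 only the |0,0> term survives. For r = 1 all four terms have equal magnitude, so in this simple version the chance of finding the switch off with the answer in x (the |0,1> term) is 1/4.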
And that's all there is to it. With a bit more cunning you can actually get the probability of finding r as high as you like. There are some great potential applications here. We have effectively left x untouched and this translates into practical 'interaction-free' experiments that allow us to make certain kinds of measurement leaving the measured system unchanged. (You may see why I used 'exclusive-or' above - just copying the result into the second register wouldn't have given me a unitary operator.)
BUT, and this is a big but, I see no valid way to interpret this as 'counterfactual computation'. In other words, I completely disagree with the interpretation of this result. In particular, suppose r equals zero. Then the reason we end up with the result |0,0> is that we have had a fortuitous cancellation. But we have actually passed the register |0>+|1> through the system. We only get |0,0> at the end because our second 'half' switching operation has made the effects of this 'destructively' interfere. In other words, it's just like we put on noise-cancelling headphones and then claim that there was no sound at all. That's silly, there was lots of sound, we just arranged for it to cancel at our ears. In our quantum computer both |0> and |1> waves passed through the system, but the second 'half' switch operation made them cancel. In the r=1 case you can think of the shapes of these waves being messed up so they no longer cancel. (By the way, the talk of waves is quite fair here, you can think of the elements of these state vector spaces in fact being wavefunctions and the linear operators being time evolution of differential equations remarkably similar to the wave equation.)
In other words, I'm calling these physicists' bluff. There is nothing counterfactual about this computation at all. Though we might be able to get the result of our computation with the switch in the off state, that's only because it was partly on during the computation and we arranged for the state of that register to be destructively cancelled out at the end. I claim that the descriptions given of the experiment are completely bogus.
In fact, if you read Jozsa and Mitchison's paper you'll see that in their discussion their language is quite guarded. I think they realise that this talk of counterfactuality is a little bogus.
Update: Scott Aaronson rants about this too over here. Make sure you read Kwiat's response.