Before continuing there's something important I should say. Let me quote Bennett et al. in Strengths and Weaknesses of Quantum Computing
"The power of quantum computers comes from their ability to follow a coherent superposition of computation paths."
In other words, what I am saying is in direct contradiction with what is said by some of the founding fathers of quantum computing. Actually, I think that's a good thing, it means that whether I'm right or wrong, I must be saying something non-trivial.
Consider a probabilistic computer. We can represent the probability of its being in any state by an element of a vector space whose basis is the set of possible states and where the coefficient of each basis element is the probability of the computer being in that state. Given any pair of vectors that represent possible ditributions of states. v1 and v2, then av1+bv2, where a,b>=0 and a+b = 1 is also a possible distribution.In other words we can form superpositions exactly analogously to the way we can form superpositions of quantum computer states. The mathematics that describes the notion of a quantum computer performing computations in parallel by superposing N states is formally almost identical to the mathematics describing a probabilistic computer as a superposition of states.
So if we can superpose N states in both classical probabilistic computers and quantum computers, why do neither seem to offer an N times speedup? In the case of quantum computers one popular explanation is that then though we run N states in parallel, we can't perform I/O with those N states in parallel. When we observe a quantum computer we cause a wavefunction collapse (it doesn't matter whether you are a believer in such collapse, that is how we model quantum computers) and instead we make a single observation from a probability distribution. Mathematically this is identical to what happens in a probabilistic machine - a superposition is nondeterministically mapped into one of many definite states.
So whatever the difference is between probabilistic computers and quantum computers, it doesn't lie in the fact that you can form superpositions. Mathematically speaking, probabilistic computers and quantum computers are almost identical in this respect.
But we know that quantum algorithms can do things that no probabilistic algorithm can, so there must be some difference. There is of course. Here's an approach to seeing what it is.
Consider a probabilistic branching process:
/ \__ B
1 __ A
Starting at state 1 we can either go to state 2 or state 3, and from there we end up at either A, B or C. Assume that the choice of branch is nondeterministic and that we can assign probabilities to each branch. Suppose also that the branch probabilities are all non-zero. The important thing to note in that diagram is that there are two ways to reach it. The probability of reaching A is the sum of the probabilities for each of the different ways to reach A. The key point I want to make here is that A is more likely because there are two ways to reach it. If I were to replace one or other of the A's with a D then the likelihood of finding our process in state A at the end is reduced.
But in quantum mechanics we replace probabilities with complex 'amplitudes'. When we make our observation at the end, the probability of any outcome is given by the modulus of the amplitude squared. The amplitudes still add, just as in probability theory. But this has a very important consequence. If the amplitudes for the two different ways to reach A lie in opposite directions in the complex plane, then the probability of ending up in state A after observation is reduced by having two ways of reaching A. This is completely different to the probabilistic case. We can paradoxically make something less likely by adding a new way to achieve it. This is where the essence of quantum computing lies.
Pick up just about any paper describing a quantum algorithm and you'll see that the important step is a 'conspiracy' to make the amplitudes of 'undesirable' states sum to zero (or at least something small). Probably the simplest example is counterfactual computation where it is arranged that the computation is executed along two different branches with equal and opposite amplitudes. The neatest example is probably Shor's cycle discovery algorithm that forms the basis of his factoring algorithm. By clever use of the Fourier transform he arranges that branches leading to non-cycles cancel each other out leaving only the periods of cycles with high amplitudes.
However, I think it's important to stress that given any individual quantum computer Bennett et al. and I probably don't disagree on what it actually does. This is an issue of semantics. But semantics does make a difference. The notion of a quantum computer as a super-parallel machine is one being promulgated in the popular science press and I think this causes much confusion with people expecting all kinds of algorithms to run in O(1) time.
So why do people talk about parallel computation if that's not what quantum computer do? I think it's because people aren't comfortable with quantum computers yet. This means they don't want to think of quantum computers as something fundamental and irreducible and instead want to reduce them to something classical. The easiest classical model is the parallel one and so they 'reduce' it to that.
PS This is essentially a specialisation of an old post to the context of quantum computing.