The Essence of Quantum Computing
A popular view of quantum computation is that its power (at least the power of certain well-known quantum algorithms) derives from a form of parallel computation, coming from the fact that the state of a quantum computer can be a superposition of many classical states. I want to argue that this is in fact a bit of a red herring, because there are other models of computation that allow formal superpositions and yet clearly don't provide a form of parallel computation. Instead, the essence of quantum computing derives from a kind of non-convexity property, and this is the common feature of all of the well-known quantum algorithms that isn't shared by classical algorithms.
Before continuing, there's something important I should say. Let me quote Bennett et al. in Strengths and Weaknesses of Quantum Computing:
"The power of quantum computers comes from their ability to follow a coherent superposition of computation paths."
In other words, what I am saying is in direct contradiction with what is said by some of the founding fathers of quantum computing. Actually, I think that's a good thing: it means that whether I'm right or wrong, I must be saying something non-trivial.
Consider a probabilistic computer. We can represent the probability of its being in any state by an element of a vector space whose basis is the set of possible states and where the coefficient of each basis element is the probability of the computer being in that state. Given any pair of vectors that represent possible distributions of states, v1 and v2, then av1 + bv2, where a, b >= 0 and a + b = 1, is also a possible distribution. In other words, we can form superpositions exactly analogously to the way we can form superpositions of quantum computer states. The mathematics that describes the notion of a quantum computer performing computations in parallel by superposing N states is formally almost identical to the mathematics describing a probabilistic computer as a superposition of states.
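To make the formal point concrete, here's a toy numpy sketch (my own illustration, with made-up numbers) of such a superposition of probability distributions:

```python
import numpy as np

# Toy illustration (not from the post): two possible distributions over {A, B, C}.
v1 = np.array([1.0, 0.0, 0.0])   # definitely in state A
v2 = np.array([0.0, 0.5, 0.5])   # equally likely to be in B or C

# A "superposition" a*v1 + b*v2 with a, b >= 0 and a + b = 1.
a, b = 0.3, 0.7
mix = a * v1 + b * v2
print(mix, mix.sum())            # [0.3 0.35 0.35] 1.0 -- still a valid distribution
```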
So if we can superpose N states in both classical probabilistic computers and quantum computers, why does neither seem to offer an N-times speedup? In the case of quantum computers one popular explanation is that even though we run N states in parallel, we can't perform I/O with those N states in parallel. When we observe a quantum computer we cause a wavefunction collapse (it doesn't matter whether you are a believer in such collapse, that is how we model quantum computers) and instead we make a single observation from a probability distribution. Mathematically this is identical to what happens in a probabilistic machine - a superposition is nondeterministically mapped into one of many definite states.
So whatever the difference is between probabilistic computers and quantum computers, it doesn't lie in the fact that you can form superpositions. Mathematically speaking, probabilistic computers and quantum computers are almost identical in this respect.
But we know that quantum algorithms can do things that no probabilistic algorithm can, so there must be some difference. There is of course. Here's an approach to seeing what it is.
Consider a probabilistic branching process:
       __ A
      /
    2
   / \__ B
  1
   \  __ A
    \ /
     3
      \__ C
Starting at state 1 we can either go to state 2 or state 3, and from there we end up at either A, B or C. Assume that the choice of branch is nondeterministic and that we can assign probabilities to each branch. Suppose also that the branch probabilities are all non-zero. The important thing to note in that diagram is that there are two ways to reach A. The probability of reaching A is the sum of the probabilities for each of the different ways to reach A. The key point I want to make here is that A is more likely because there are two ways to reach it. If I were to replace one or other of the A's with a D then the likelihood of finding our process in state A at the end is reduced.
But in quantum mechanics we replace probabilities with complex 'amplitudes'. When we make our observation at the end, the probability of any outcome is given by the modulus of the amplitude squared. The amplitudes still add, just as in probability theory. But this has a very important consequence. If the amplitudes for the two different ways to reach A lie in opposite directions in the complex plane, then the probability of ending up in state A after observation is reduced by having two ways of reaching A. This is completely different to the probabilistic case. We can paradoxically make something less likely by adding a new way to achieve it. This is where the essence of quantum computing lies.
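To see the contrast in the smallest possible example, here's a toy calculation (my own; the numbers are made up, only the two rules of combination matter):

```python
# Toy illustration (not from the post): two routes lead to outcome A.

# Classical: the probabilities of the two routes simply add.
p_via_2, p_via_3 = 0.25, 0.25
print("classical P(A):", p_via_2 + p_via_3)               # 0.5

# Quantum: the amplitudes of the two routes point in opposite directions.
# Amplitudes add *before* squaring, so the routes cancel.
amp_via_2, amp_via_3 = 0.5, -0.5
print("quantum  P(A):", abs(amp_via_2 + amp_via_3) ** 2)  # 0.0
```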
Pick up just about any paper describing a quantum algorithm and you'll see that the important step is a 'conspiracy' to make the amplitudes of 'undesirable' states sum to zero (or at least something small). Probably the simplest example is counterfactual computation, where it is arranged that the computation is executed along two different branches with equal and opposite amplitudes. The neatest example is probably Shor's cycle-discovery (period-finding) algorithm that forms the basis of his factoring algorithm. By clever use of the Fourier transform he arranges that branches corresponding to the wrong periods cancel each other out, leaving only the true period with high amplitude.
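To see such a conspiracy end to end, here is a small numpy simulation of Deutsch's algorithm (my own illustration, not one of the examples above): a single query to f, followed by interference that cancels the amplitude of the wrong answer, reveals whether f is constant or balanced.

```python
# Toy simulation of Deutsch's algorithm (my own sketch, assuming numpy).
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I = np.eye(2)

def oracle(f):
    """U_f |x, y> = |x, y XOR f(x)> as a 4x4 permutation matrix."""
    U = np.zeros((4, 4))
    for x in (0, 1):
        for y in (0, 1):
            U[2 * x + (y ^ f(x)), 2 * x + y] = 1
    return U

def deutsch(f):
    state = np.kron([1, 0], [0, 1])   # start in |0>|1>
    state = np.kron(H, H) @ state     # put the query register in superposition
    state = oracle(f) @ state         # a single call to f
    state = np.kron(H, I) @ state     # interfere the two branches
    p_first_qubit_1 = np.sum(np.abs(state[2:]) ** 2)
    return "balanced" if p_first_qubit_1 > 0.5 else "constant"

print(deutsch(lambda x: 0))      # constant
print(deutsch(lambda x: x))      # balanced
print(deutsch(lambda x: 1 - x))  # balanced
```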
However, I think it's important to stress that given any individual quantum computer Bennett et al. and I probably don't disagree on what it actually does. This is an issue of semantics. But semantics does make a difference. The notion of a quantum computer as a super-parallel machine is one being promulgated in the popular science press and I think this causes much confusion with people expecting all kinds of algorithms to run in O(1) time.
So why do people talk about parallel computation if that's not what quantum computers do? I think it's because people aren't comfortable with quantum computers yet. This means they don't want to think of quantum computers as something fundamental and irreducible and instead want to reduce them to something classical. The easiest classical model is the parallel one and so they 'reduce' it to that.
PS This is essentially a specialisation of an old post to the context of quantum computing.
Labels: physics
13 Comments:
There is no real contradiction. Bennett wrote:
"... follow a coherent superposition of computation paths."
Coherence means phase-correct superposition: |a + b|^2.
Probability superposition is superposition of the absolute square of the wave functions: |a|^2 + |b|^2.
Quantum computing of course only works with the first kind of superposition. This is well known and of course also known by Bennett et al.
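As a quick numerical check of the distinction, take the toy values a = 1/sqrt(2) and b = -1/sqrt(2):

```python
# Toy check of the two kinds of superposition (my own example values).
import numpy as np

a, b = 1 / np.sqrt(2), -1 / np.sqrt(2)
print(abs(a + b) ** 2)            # coherent superposition:    0.0
print(abs(a) ** 2 + abs(b) ** 2)  # probability superposition: 1.0
```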
You might be interested in the work of Scott Aaronson. He discusses the limitations and nature of quantum computing very well. Don't miss his Quantum Computing lectures.
I especially like his treatment of the intractability of NP-complete problems as a conjectured physical principle!
I don't see the contradiction between what you said and the "usual" claims about why quantum computing's speedup works.
You are correct to put the emphasis on "we can paradoxically make something less likely by adding a new way to achieve it". But the reason we are able to do it, in terms of quantum mechanics, is the fact that the system exists in a superposition of states which are interfering with each other. Quantum interference is the reason that we can paradoxically etc.
The crucial difference between a probabilistic classical computer and a quantum computer is not merely the fact that in the latter the amplitudes are complex numbers. In the classical case, you know that the computer exists in some particular state, you just don't know which one it is. You can call that state of affairs a superposition of states if you wish, but the fact is, only one of these states is actually real, and the non-real ones have no way of influencing the real one. No interference can happen. In the quantum case, saying that the computer exists in a superposition of states is not merely saying that it'll be discovered to be in one of them with such and such probabilities when we look under the hood later. It means more than that: it means that all of those states are simultaneously "real"; until the wave collapses they are all "there" co-existing simultaneously in the superposition and that is why they can influence each other and create quantum interference.
In the classical case, "nature" needs to keep track of one state only, it's just that we don't know yet which one it is. In the quantum state, "nature" is forced to keep track of exponentially many different states at once - otherwise interference wouldn't happen. As we subject the system to various operators, all the states undergo the operations in parallel; and while we can't reach that huge freebie exponential store of information directly, we can manipulate interference into revealing some of it to us, hence Shor's algorithm etc.
So when Bennett et al. say that the power of quantum computers lies in following a superposition through a computation, they refer to a "real" superposition rather than a "formal" one in the classical probabilistic case, and the difference between them is manifested precisely in the "real" superposition's ability to create interference between its states - the "paradoxical" property that you stress. So you're saying more or less the same thing, in different words, it seems to me.
The link to the "old post" just goes to your blog.
Anatoly,
Some of what you're saying is metaphysical speculation, especially your claim that "only one of these states is actually real". I tend to ignore metaphysical speculation and concentrate on what we can actually measure in experiments.
> You can call that state of affairs a superposition of states if you wish
I do. It seems like the same sort of thing to me, and the mathematics is very similar.
> but the fact is, only one of these states is actually real, and the non-real ones have no way of influencing the real one. No interference can happen.
Interference happens all the time in classical probabilistic systems - interference is nothing but the linear combination of vectors. I don't see why you think there's some magic whereby classical systems have one "real" state and quantum systems somehow have all states real. What's different about quantum mechanics is that the weightings in the linear combinations aren't all directed along the positive real axis. I don't see how this makes one state "real". My branching diagram was a perfectly good example of interference in a classical system. The only difference from a quantum system is that in a classical system the interference is always 'constructive' because of the positive weights.
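Here's a tiny numerical sketch of what I mean (a toy example of my own, nothing physical in particular): both kinds of machine evolve a state vector by a linear map; a stochastic matrix has non-negative entries so branches can only reinforce each other, while a unitary matrix can carry a minus sign that cancels a branch.

```python
# Toy comparison (my own sketch): stochastic versus unitary evolution of a state vector.
import numpy as np

stochastic = np.array([[0.5, 0.5],
                       [0.5, 0.5]])              # a classical "mixing" step
hadamard = np.array([[1, 1],
                     [1, -1]]) / np.sqrt(2)      # a quantum step with a minus sign

start = np.array([1.0, 0.0])
print(stochastic @ stochastic @ start)  # [0.5 0.5]: branches only ever reinforce
print(hadamard @ hadamard @ start)      # [1. 0.]: the second branch cancels out
```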
Can you explain what you mean further?
Could your entire post be summarized by the phrase "Bell's Theorem"?
In other words, doesn't Bell's Theorem provide the underpinning of the difference between probabilistic and quantum computing, by showing that quantum mechanics cannot be simulated by any "hidden variable" theory?
I agree that the existing quantum algorithms employ superposition in a clever way to enhance/decrease the amplitudes for the wanted/unwanted states respectively. For that we need probability amplitudes. However, the point is that to the extent that quantum computers will eventually be built, they will rely on these superpositions actually existing and evolving during execution. As far as I understand classical computation, such superpositions do not exist classically (with neither real nor complex number coefficients). I mean they do not exist as real physical states even though we could perhaps formally write down a linear superposition of states. I guess you see this as soon as you try to run (simulate really) a probabilistic or non-deterministic computer (or program) - in the general case you get exponential slowdown just as you get when you simulate a quantum computer classically. You simply have to go down all the paths in the computational tree (in a breadth-first search and/or using many parallel processes). Hope I have not misunderstood your post!
> Can you explain what you mean further?
Yes. Although I spoke of states being "real" in a quantum superposition, there was less metaphysics in that than the words perhaps implied. The claim has physical rather than metaphysical content, and it is the situation of quantum interference that shows this, especially vividly in the case of a single particle.
In the double-slit experiment, a narrow beam of light falls on a black surface with two very narrow vertical slits in it close to each other. After passing through the slits the light falls onto a screen parallel to the surface. On the screen there is an interference picture: periodic bands of darker and lighter regions, rather than two separate identical lit regions with darkness everywhere else, which is what you would expect if light behaved like a lot of very small particles (like bullets). In the classical model, the interference is explained by looking at the light as a wave that is split into two waves by the slits and those interfere with each other.
When, however, the beam of light is more and more diluted until it consists of a single photon, something happens that cannot be explained in the classical picture: if we repeatedly send in single photons and record the places they illuminate the screen, over time exactly the same interference picture emerges as in the case of a beam of light. In the classical model, a photon, being a particle, must pass through one of the slits if it passes at all; we can set up the light emissions so that the probability of it passing through each of the slits is equal. A single particle cannot 'interfere' with itself in the classical model; if it passes through one slit, it'll strike the screen somewhere in one band; if the other - in another; if the two bands overlap, it can even contribute to them both at the same time (but this isn't interference as understood in the classical physics model, where it is studied in terms of waves).
In reality, however, what we see is that a single photon appears to "know" about two slits simultaneously, even though, being a particle, it cannot pass through both at the same time. If we place a detector on one of the slits to tell us when photons pass by, the interference picture on the screen immediately changes to a bullet-like picture of two bands - that is, even photons that pass through the other slit, without the detector in it, are somehow affected by the fact that we are observing a different slit. There is simply no way to explain coherently in the classical model how a single particle, passing through one slit at a time, can build up an interference picture. The inescapable conclusion is that the theory must allow for some kind of interference of the particle with itself. That is the physical content, and the formulas give the exact numbers. The metaphysical explanations (the so-called interpretations of quantum mechanics) differ: some say there is no photon as an entity, there is only the wavefunction that evolves, interferes and then collapses when we make an observation; some say there are parallel universes in one of which the photon passed through one slit, in another through the other, and the only way those universes can interact is through interference of those "copies" of the same photon with each other. And so on. But the essential difference with the classical probabilistic model of a single particle moving around is physical, not metaphysical.
When you say that in the quantum case the formalism is the same as in the classical probabilistic case, merely the "weightings" are complex and not just real positive, that misses the picture of how probabilities are in fact used in both models. In the classical probabilistic case, the probabilities of different outcomes are "distributed" at the moment of probabilistic choice, even if we don't know those distributions yet. The photon passes through one slit with probability 50% and the other with probability 50%, and at that moment the probabilities are final. Sure, we can construct our final result sets in an overlapping manner so that sometimes either possibility contributes to a particular result, and its probability is thereby increased. But that's just an example of additivity of probabilities. In the quantum case, the probabilities only 'get assigned' at the moment an observation happens - the photon strikes the screen. You are correct in saying that a complicated picture of interference is possible at that time because the amplitudes are complex and not just real positive; but no less important is the fact that the probabilities are calculated from those complex amplitudes only at the moment of observation. If the amplitudes were complex, but the probabilities were calculated at the slits, interference wouldn't happen. In fact, this is precisely what happens when we put a detector on one of the slits (that detects a photon but still lets it pass through). In the classical probabilistic case, placing a detector that tells us which slit the photon went through but otherwise leaves the photon be cannot change the picture on the screen, quite obviously.
In the double-slit experiment, during the short time the photon flies from the surface with the slits to the screen, the photon exists in a superposition of two states: one of having passed through the first slit, the other - through the second. Because we know that the final calculation of probabilities, a.k.a. the collapse of the wavefunction, happens only at the moment of observation on the screen, we are forced to conclude that each of those states "evolves" independently and simultaneously with the other. In the two states the photon has different position and vector of movement, and each state changes according to the physical laws (here a simple movement in a straight line with the speed of light towards the screen) simultaneously with the other, at the same time also interfering with the other according to particular formulas. Another way to phrase it is to say that the universe is forced to keep track of each state's information (position and vector of movement) and process that information in parallel. In a system with N photons arranged carefully into a suitable superposition, the universe will be forced to keep track of information about an exponential number of states, simultaneously, until at some point we observe and the wavefunction collapses. By carefully guiding the states through some physical changes (that apply to them simultaneously in parallel) that affect interference between them in a direction suitable to us, we increase the chances of an interesting, to us, outcome when we collapse the wavefunction later by observing. Thus Shor's algorithm etc.
Hi Anatoly --- I'm not sure you can gain anything through thinking of the "evolution" of a pure state through time.
Whether you take your measurement after one millimeter or one meter, the interference pattern will be precisely analogous to the one described by sigfpe's linear superposition. No extra "processing" is going on.
Indeed it could well be wrong to assume that there is a background metric of time ticking away alongside pure quantum states. This is probably an example of what Jaynes would call "the Mind Projection Fallacy".
Here's a nice bombastic Jaynes quote, just to get everyone riled up :)
"A standard of logic that would be considered a psychiatric disorder in other fields, is the accepted norm in quantum theory"
Anatoly,
Suppose nature keeps track of all of the superposed states simultaneously. Now suppose we have N+1 entangled qubits in a row. You claim that nature is tracking 2^(N+1) states. If this were the case then we could get nature to store 2^N bits in N+1 qubits. Here's the scheme I propose: list your 2^N bits in a row. Label the first one 0, the next one 1 and so on. So we have 2^N pairs (i,f(i)) for i = 0,...,2^N-1. We can encode the 'i' part as N qubits in the obvious way and we can store f(i) as the (N+1)-th qubit. Write this as |i;f(i)>. Now form the state psi that is sum_i |i;f(i)>. This encodes our entire set of 2^N bits in one state. For example, suppose we wish to store the 4 bits of data (0,1,1,1) then we could encode these as the state
|0,0;0>+|0,1;1>+|1,0;1>+|1,1;1>
For example, the term |0,1;1> tells us that element 01 (in binary) of the sequence is 1. So if you claim that nature tracks these 2^N superposed states then you must also agree that nature must be using at least 2^N bits to store this state. So we have a pretty cool database here that allows 2^N bits to be stored in N+1 qubits.
But now consider how to use this database. For any i in the range 0,...,2^N-1 we expect to be able to extract f(i). Let's reduce it to the simple case: extracting f(0) from psi. We'd like f(0) to be an observable, so we need a Hermitian operator, H, for which f(0) is the eigenvalue. In other words, we want H such that H psi = f(0) psi.
Let's take the simplest case N=1. Then our state is |0;f(0)>+|1;f(1)>. We want
H(|0;0>+|1;0>) = 0
H(|0;1>+|1;0>) = |0;1>+|1;0>
H(|0;0>+|1;1>) = 0
H(|0;1>+|1;1>) = |0;1>+|1;1>
From the first and third line we get
H(|1;0>-|1;1>) = 0
From the second and fourth lines we get
H(|1;0>-|1;1>) = |1;0>-|1;1>
So |1;0>=|1;1> which is clearly nonsense.
So there is no such Hermitian operator, f(0) isn't an observable and the database idea fails.
So either (1) nature isn't really keeping track of our 2^N superposed states or (2) nature keeps track of our 2^N states but when we make an observation it throws that information away. I don't know about you, but entities that magically disappear when you observe them give me a bad feeling. At the very least, I'd refrain from calling them 'real'.
This isn't really my original point. But it does show that a quantum computer in a superposition of N states falls a long way short of being anything like N computers running in parallel.
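If you want to check the non-existence of H numerically, here's a throwaway numpy sketch (my own code, not part of the argument above) that searches by least squares for any linear operator satisfying the four requirements and finds that even the best fit leaves a non-zero residual:

```python
# Toy numerical check (my own sketch, assuming numpy) of the N=1 argument above.
import numpy as np
from itertools import product

def ket(i, b):
    """Basis vector |i;b> of two qubits, stored at flat index 2*i + b."""
    v = np.zeros(4)
    v[2 * i + b] = 1.0
    return v

# For each of the four functions f on {0,1}: the state |0;f(0)> + |1;f(1)>
# and the vector we would need H to send it to, namely f(0) times the state.
states, targets = [], []
for f0, f1 in product([0, 1], repeat=2):
    psi = ket(0, f0) + ket(1, f1)
    states.append(psi)
    targets.append(f0 * psi)
Psi = np.column_stack(states)
T = np.column_stack(targets)

# Best possible linear H with H @ Psi close to T, via the pseudo-inverse.
H = T @ np.linalg.pinv(Psi)
print(np.linalg.norm(H @ Psi - T))  # > 0, so no operator satisfies all four equations
```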
Neat post, thanks
Really neat, thank you!
To those insisting on "coherent superposition" as an accurate description of quantum computing, I must ask: Which supersedes, the noun or the adjective?
Non-convexity, FTW!