Saturday, March 31, 2007

The Curious Rotational Memory of the Electron, Part 1

There's a curious and bizarre fact about the universe that is introduced in physics courses without anyone stopping to point out just how curious and bizarre it is. I think it deserves to be more widely known to non-physicists. It's a little esoteric, but not so esoteric it can't be grounded in ordinary everyday concepts. And it isn't something hypothetical but something that is fundamental to modern physics and testable in the lab. It is also the reason why I chose to use theta/2 instead of theta in my quantum computing code.

This post will be in two parts. This part will mainly be about geometry and topology, and in the second part I'll talk about the physics.

In order to explain the idea I want to introduce the concept of a universal cover. If you look at the definition of this notion anywhere it can seem pretty daunting to non-mathematicians, but in actual fact the key idea is so simple that even a child could understand it. I can say that without fear of exaggeration because universal covers used to play a role in a childhood fantasy of mine. (OK, I was a weird kid, but not that weird.)

My crazy idea as a kid was this: if there were two ways to get from A to B, maybe the B you reached by taking either of the two routes were actually slightly different, even though they looked the same. I was actually more concerned with journeys from A to A. Suppose the kitchen has a door at each end and that you start in the bedroom, go down to the kitchen at one end, come out the other end, and return by a different path to the bedroom. How do you know this is the same bedroom? (I also admit I read a lot of science fiction as a kid.)



Suppose instead you took a simpler path: you left the bedroom and went to another room and then returned by the path you went. If you had taken a piece of string and tied one end to the bedroom and walked this path playing out the string behind you as you went then when you returned to your bedroom you would have both ends of the string visible. You could then reel in the string from your journey, without letting go of the ends, and see that indeed, both ends were in fact the ends of the same string and so you had retuned to the start of your journey. But if you'd taken the round trip through the double-doored kitchen you wouldn't be able to pull the string tight and so you could no longer be sure.

In what follows we consider two paths from A to B equivalent in this way: mark the two paths with strings going from A to B. If you can pull one string to line up exactly with the other (you're allowed to stretch) without breaking the string or moving the endpoints then they are considered equivalent. (The correct technical term is homotopic.) So if a path that appears to go from A to A is homotopic to the trivial zero length path, then you know that the path really is from A back to A. But if you can't shrink the path down to zero length you're faced with the possibility that the string is extended from A to a similar looking place, A'. (Of course this implies a considerable conspiracy with someone tying a distinct identical string to an identical bedroom, but if you can't see both ends, alien beings from a higher dimension could do anything...)

The best case scenario is that all of the bedrooms are the same - a conventional house. Call this house X. But what's the worst case scenario? How many different copies of the bedroom might there be? Well let's call the starting bedroom 0. If we walk anti-clockwise round the house, going through the kitchen, we end up back at a bedroom. In the worst case scenario this will be another bedroom. Let's call it 1. Walk round again and we get to another copy of the bedroom, call it 2 and so on. So we have an infinite sequence of bedrooms, (0,1,2,…). But we could have walked clockwise. In that case we'd get a sequence (-1,-2,…). So now think about the set of all points in this infinite house. Each point can be described as a pair (x,n) where x is a point in the house X and n labels which of the infinite number of paths you could have taken to get there. (Remember, if the two paths are homotopic we consider them to be the same path.)

If using a house as an example seems too far-fetched think of a mult-storey car park. It's easy to walk what you think is a closed loop but find that you've in fact walked to almost identical looking, but different, floor.

(Incidentally, this kind of infinite house is the kind of thing that can be created in a Portal engine such as in the game Portal.)

Let's think about this more generally. Given a topological space X we can form another space X' from it by (1) picking a point p in it and (2) forming the set of all pairs (x,n) where x is a point in X and n is a path (up to homotopy) from p to x. X' is known as the universal cover of X.

Enough with houses, let's try a more purely geometric example. Consider the space of rotations in 2D, otherwise known as SO(2). A rotation can simply be described by an angle with the proviso that a rotation through an angle θ is the same as a rotation through θ+2nπ where n is an integer. The space of rotations is essentially the same thing as the circle (by circle I mean the 1-dimensional boundary of a 2-dimensional disk). Looping once round the circle brings you back to where you started. So how can we visualise the universal cover of SO(2)? There's a nice trick for this. Imagine a disk that's free to rotate around one axis. SO(2) describes the rotations of this disk. But now imagine a belt, fixed at one end, connected at the other end to the centre of this disk like this:



The rule is that we're allowed to twist the belt as much as we like as long as the centre line of the belt is along the axis. Notice how if we rotate the disk through an angle of 2nπ the disk returns to its starting point, but the belt 'remembers' how many twists we took to get there. Notice another thing: if we travel along the belt from one end to the other, the orientation of the belt at each point gives a 2D rotation. In other words, the belt itself represents a path in SO(2) with one one representing a fixed point in SO(2) and the other end matching the current orientation of the disk. So if the orientations of the disk represent elements of SO(2), the possible twistings of the belt represent elements in the universal cover of SO(2). And what the twists do is allow us to differentiate between rotations through θ and rotations through θ+2nπ. In other words, the universal cover is just the real line, but not folded down so that θ is the same as θ+2nπ.

The universal cover is the worst case scenario where all non-homotopic paths are considered to represent distinct points. But it's also possible to construct an in-between scenario where some paths that aren't homotopic to each other are still considered to lead to the same place. For example, take the real line again. If we identify 0 and 2π we get SO(2). If we consider 0 and 2nπ to be distinct for n≠0 we get the universal cover. But we pick an N≠1 and consider 2Nπ to be the same as 0, but that 2nπ is distinct from 0 for 0<n<N. What we now end up with is an N-fold cover of SO(2). Unfortunately I don't know a way construct a physical example of a general N-fold cover using belts and discs. (At least for N>2. For N=2 you should easily be able to come up with a mechanism based on what's below.)

With the examples so far, the universal cover of a space has always had an infinite number of copies of the original space. That isn't always the case and now we'll meet an example where the universal cover is a double cover: the real projective plane, RP2. The easiest way to visualise this is to imagine the game of Asteroids played on a circular TV screen. In the usual Asteroids, when you go off one side of the screen you reappear on the opposite side. In this version it's much the same, points on the edge that are diametrically opposite are considered to be the same points. Such a universe is a real projective plane. Let P be the point at the centre of the screen. Suppose we attach one end of our string to P and fly off towards the edge, reappearing on the other side. Eventually we get back to P again. Can we shrink this path down to a point? Unfortunately, any path that crosses the edge of our screen will always cross the edge no matter how much we try to deform it. If we move the point at which it crosses one edge of the screen, the point at the other side always remains diametrically opposite.




This means you can never bring these points together in order to collapse down a loop. On the other hand, suppose we loop twice round the edge. The following diagram shows how this path can be shrunk down to a loop:



So given two distinct points in RP2 there are two inequivalent paths between them, and so the universal cover of RP2 is made from two copies of the space and is in fact a double cover.

Now we'll try a 3D space - the space of 3D rotations, otherwise known as SO(3). This space is 3-dimensional because it takes 3 numbers to define a rotation, for example we can specify a rotation using a vector whose direction is the axis of rotation and whose length is the angle of rotation. Now all possible rotations are rotations through an angle 0≤θ≤π for some exis. So every rotation can be represented as a point in the ball of radius π. But a rotation around the axis a through an angle of π is also the same as a rotation around the axis -a through π. In other words, SO(3) is the closed 3-dimensional ball with diametrically opposite points identified. This is very similar to the situation with RP2. In fact, almost exactly the same argument shows that the universal cover of SO(3) is the double cover. (For an alternative wording, see this posting.) There is also another way of seeing this that is analogous to the disk with belt example above. This time we have a sphere instead of a disk. The belt is fixed at one end and the other end is attached to the sphere. We are free to rotate the sphere however we want, and we're no longer confined to having the belt along one axis. It is an amazing but easily demonstrable fact that if you rotate the sphere through 2π, you cannot untwist the belt, but if you rotate it through 4π you can! This is known as the Dirac Belt Trick, and if you don't believe it, watch the Java applet written by science fiction author Greg Egan.

If you've done much computer graphics you'll have already met the double cover of SO(3). Computer graphics people sometimes like to represent rotations using unit quaternions. The unit quaternions form a 3-dimensional sphere but there are two such quaternions for each rotation. So the unit quaternions actually represent this double cover, and SO(3) is a 3-dimensional sphere with antipodean points identified.

And that's the geometry dealt with. In the next installment I'll say what this has to do with electrons and my quantum computing code.


Footnote


Balls and Spheres. The closed n-dimensional ball is the set of points (x1,...,xn) such that x12+...+xn2≤1. The (n-1)-dimensional sphere is the set of points (x1,...,xn) such that x12+...+xn2=1.

Wednesday, March 21, 2007

The Angel Problem has been Solved! (Maybe)

Forget computing the Kazhdan-Lusztig polynomials for E8, a far more important theorem has apparently been proved. I won't state the answer here as it's fun to think about the problem for yourself. But a number of proposed proofs of solutions to the Angel problem have been informally published, at least electronically.

Firstly, what is the problem? Wikipdia has a succinct write up. But here's a quick summary anyway:

The game of Angels and Devils, invented by John Conway (the same Conway as in the Game of Life, the Monstrous Moonshine conjectures, the proof that subatomic particles have free will and so on), is played on an infinite chessboard. Fix some constant k. One player plays the Angel. The other plays the Devil. On the Devil's turn, he may eat one square on the chessboard. (The devil doesn't actually have a piece, he just eats whatever square he wants.). On the Angel's turn she jumps to a square (distinct from the current one) up to k king's moves away. The Devil wins if the Angel has no legal move. The Angel wins if she has a strategy that guarantees always to have a legal move. For each k, who wins? Until recently it was unknown if the Angel could win even for arbitrarily large k. This may seem like a surprise, after all, if the angel can move 1,000 king's moves, surely she can easily evade the jaws of the devil who can only chomp one square at a time. But surprisingly, the obvious Angel strategies that come to mind can be countered.

What's nice about this problem is that it's so easy to state, and yet has stumped some of the smartest people who like to think about these problems.

Two papers with possible solutions are Máthé's and Kloster's. I don't think either of them have formally been published yet so I don't know if the peer review is complete.

Sunday, March 18, 2007

The Shor Quantum Error Correcting Code (and a Monad for Heat)

Quantum computers promise great things, but that promise will only come to be if quantum computers can be made reliable. In order to achieve that reliability we need qubits that can maintain their state for extended periods of time. Unfortunately, just about any type of interaction between a qubit and its environment can render the qubit useless. But there is a solution - a quantum error correcting code. This is very similar to classical error correcting code in which a number of bits are stored, in a distributed fashion, over a larger set of bits, in such a way that if some of the bits are corrupted, the original state can be recovered. In the following will be some code to simulate the implementation of one of these codes: the nine bit Shor code.

Firstly, let me include the quantum monad that I've developed over earlier weeks:


> import Prelude hiding (flip)
> import Maybe
> import List
> import Data.Map (toList,fromListWith)
> import Complex
> import Data.Bits hiding (rotate)

> infixl 1 ->-, -><, =>=
> infixl 7 .*


Remember that W b a is the type of elements of a vector space over b with a basis whose elements are labelled by elements of a. I'm using the letter W as I'm thinking of elements of b as being weights.


> data W b a = W { runW :: [(a,b)] } deriving (Eq,Show,Ord)


It's useful to be able to apply transforms to all of the weights:


> mapW f (W l) = W $ map (\(a,b) -> (a,f b)) l


W is an instance of Num, Functor and Monad. Vector spaces form monads because you can specify a function between vectors spaces in terms of how it maps basis elements in the first vector space. The monad bind function lifts the function on basis elements to the entire vector space.


> instance Functor (W b) where
> fmap f (W a) = W $ map (\(a,p) -> (f a,p)) a

> instance Num b => Monad (W b) where
> return x = W [(x,1)]
> l >>= f = W $ concatMap (\(W d,p) -> map (\(x,q)->(x,p*q)) d) (runW $ fmap f l)

(.*) gives us scalar multiplication:

> a .* b = mapW (a*) b

> instance (Eq a,Show a,Num b) => Num (W b a) where
> W a + W b = W $ (a ++ b)
> a - b = a + (-1) .* b
> _ * _ = error "Num is annoying"
> abs _ = error "Num is annoying"
> signum _ = error "Num is annoying"
> fromInteger a = if a==0 then W [] else error "fromInteger can only take zero argument"


It'd be cool if the bind function for W could simplfy our vectors by collecting together multiples of the same basis elements. Unfortunately this is tricky to do for reasons described in Eric Kidd's recent Set monad article. Instead I'll use an explicit function, collect, that collects our terms together. I've also used a crude method for throwing out terms that probably ought to be zero. A better solution might be to work only with rational numbers, or exact algebraic numbers.


> nonTrivial x = 10+abs x/=10
> collect :: (Ord a,Num b) => W b a -> W b a
> collect = W . filter (nonTrivial . snd) . toList . fromListWith (+) . runW


And here is the quantum monad:


> type Q a = W (Complex Double) a


There are a number of ways to apply functions to vector spaces. I'll be writing functions on the right so that I can apply operations to a quantum state reading from left to right. So here's a bunch of combinators to implement this. First there's ordinary function application:


> (->-) :: a -> (a -> b) -> b
> x ->- f = f x


Then there are functions that operate on the vector space by simply relabelling the basis:


> (=>=) :: Q a -> (a -> b) -> Q b
> x =>= f = fmap f x


Then there are functions that map from Q a to Q b but are defined by how they act on the basis elements of Q a:


> (-><) :: Q a -> (a -> Q b) -> Q b
> g ->< f = g >>= f


As I mentioned last post on entanglement, we can form states for joint independent systems using the tensor product:


> (<*>) :: Q a -> Q b -> Q (a,b)
> x <*> y = do
> a <- x
> b <- y
> return (a,b)


The tensor product is a lot like a multiplication table. The coefficient of (a,b) in x <*> y is the product of the coefficients of a and b in x and y respectively. If two systems are unentangled then you can simulate the physics of each part separately. So it'd be nice to have a way to test whether or not a given state is unentangled, in other words, to test whether a state is a tensor product. As I'm not developing a serious application here, here's a quick and dirty algorithm that more or less does what anyone would do when faced with the task of recovering the row and column headings from a multiplcation table:


> norm x = sqrt $ sum $ map (\(_,a) -> magnitude a^2) $ runW $ collect x

> untensor :: (Ord a,Ord b) => Q (a,b) -> (Q a,Q b)
> untensor (W a) =
> let u = nub $ sort $ map (fst . fst) a
> v = nub $ sort $ map (snd . fst) a
> p = W $ map (\x -> (x,fromJust (lookup (x,head v) a))) u
> q = W $ map (\x -> (x,fromJust (lookup (head u,x) a))) v
> in ((1/norm p :+ 0) .* p,(1/norm q :+ 0) .* q)


untensor recovers x and y from x <*> y (up to multiplication by a complex number with unit norm) but it gives garbage if the argument isn't a tensor product. So we need to test whether or not the original vector is the tensor product of the inputs. So here's a test of how close the original is to the product of the putative factors:


> independence x = collect (x-uncurry (<*>) (untensor x))


We'll be doing lots of stuff with tensor products so it's useful to have some tools to apply functions to one or other factor:


> mapFst f (a,b) = (f a,b)
> mapSnd f (a,b) = (a,f b)
> mapFstM f (a,b) = do
> a' <- a ->- f
> return (a',b)
> mapSndM f (a,b) = do
> b' <- b ->- f
> return (a,b')


If a bit is a Bool then by rights, a qubit should be a Q Bool. But I think 0 and 1 look nicer so I'm using Q Integer instead. So here's the fundamentally quantum rotation operator:


> rotate :: Double -> Integer -> Q Integer
> rotate theta 1 = let theta' = theta :+ 0
> in cos (theta'/2) .* return 1 - sin (theta'/2) .* return 0
> rotate theta 0 = let theta' = theta :+ 0
> in cos (theta'/2) .* return 0 + sin (theta'/2) .* return 1


Although rotate theta is a map Integer -> Q Integer, using ->< it becomes a map Q Integer to Q Integer. Note that the Double argument isn't float data inside a quantum device, it's a hard-coded feature of the hardware itself. For example, if we choose theta to be pi/2 and -pi/2 we get the standard square root of not operation and its inverse:


> hnot = rotate (pi/2)
> unhnot = rotate (-pi/2)


(You may be wondering about the use of theta/2 in my definition of rotate. There's an interesting story behind that involving what I think is one of the weirdest features of physics. For another day.)

I'm also goint to manipulate lots of lists of qubits so here are some tools for that:


> mapAt i f x = let (h,t:ts) = splitAt i x in h ++ (f t : ts)

> mapAtM :: Monad m => Int -> (a -> m a) -> [a] -> m [a]
> mapAtM i f x = let (h,t:ts) = splitAt i x in do
> y <- f t
> return $ h ++ (y : ts)

> flip a = 1-a
> flipAt i = mapAt i flip


So here's a statement of the problem. We have a qubit, ie. something of type Q Integer. But quantum systems are very delicate and our qubit may wander into a different state. In particular, after a time t it will be as if rotate theta has been applied to it, except that we can't control theta. We need to find a way to recover the original qubit. In fact, it's worse than that: our qubit is in some environment which we can consider to be a list of qubits. So our state is given by an instance of Q ([Integer],Integer) and the interaction might not simply be a rotation of the qubit but an operation that entangles the qubit with the environment. This might happen if the qubit isn't perfectly isolated.

Let's forget about the full generality here and consider just one type of data corruption - a qubit that might or might not have been flipped. How can we make a device robust against such occurences? Well this is practically identical to the problem of making classical data robust against noise. We can simply use a 3 bit repetition code to store our qubits as three qubits. So here's an implementation of a function to repair a codeword from the classical 3 bit repetition code:


> repairRepetition' [b,c,d] = (if a<3 then mapAt a flip else id) [b,c,d] where
> a = case (b,c,d) of
> (0,0,0) -> 3
> (1,1,1) -> 3
> (0,0,1) -> 2
> (1,1,0) -> 2
> (0,1,0) -> 1
> (1,0,1) -> 1
> (1,0,0) -> 0
> (0,1,1) -> 0


But there are a couple of problems with this. Firstly this is a many-to-one map because [0,0,0] and [0,0,1], say, both get mapped to [0,0,0]. In quantum mechanics, all of our maps need to be unitary, and in particular, invertible. A many-to-one map isn't invertible. Secondly, we're creating a register a, and again this is not an invertible operation. These two problems will actually cancel each other out because the apparent destruction of information that comes about from our many-to-one map can be fixed by allowing a to carry that information away. And we can ensure reversibility by using a supplied register rather than creating a new one:


> repairRepetition'' (a,[b,c,d]) = (a,(if a<3 then mapAt a flip else id) [b,c,d]) where
> a = case (b,c,d) of
> (0,0,0) -> a `xor` 3
> (1,1,1) -> a `xor` 3
> (0,0,1) -> a `xor` 2
> (1,1,0) -> a `xor` 2
> (0,1,0) -> a `xor` 1
> (1,0,1) -> a `xor` 1
> (1,0,0) -> a `xor` 0
> (0,1,1) -> a `xor` 0


It's easy to write an inverse for this function so it's a valid unitary transform. But it's actually a bit tedious to write out all those xors. So we can cheat a bit. Let's just assume there's an infinite supply of input zeroes and use them at our leisure. We no longer need to write the xors, we just note that every time we assign a value to a new register the hardware engineer building your device installs some means of import zero qubits. The a register in the above function will never be touched again. But we can't simply throw it away. It's a kind of side effect and we'll call these bits <em>ancillae</em>. We'll have lots of these side effects that we need to collect up, so here's the final quantum repetition codeword repair code:


> repairRepetition (a,[b,c,d]) = (i:a,(if i<3 then mapAt i flip else id) [b,c,d]) where
> i = case (b,c,d) of
> (0,0,0) -> 3
> (1,1,1) -> 3
> (0,0,1) -> 2
> (1,1,0) -> 2
> (0,1,0) -> 1
> (1,0,1) -> 1
> (1,0,0) -> 0
> (0,1,1) -> 0


There's one last thing we could do to this code: the pairs being handed in and out of this function are essentially just instances of the Writer monad. So we could just squirrel away our ancillae inside a monad. I won't do that here because I think it's good to see everything explicitly when figuring this stuff out for the first time. But think about the bits that we'd be writing out. They correspond to information, or entropy, that we'd ultimately like to pump out of our device as exhaust. In other words they are heat. So this monad would, in effect, be a heat monad. I found this amusing given what I wrote in reason number eight in my list from last year.

So far so good. But that only corrects one type of error: a bit flip. But note that it also protects superpositions from bit flips. So let's take a quantum state built from repetition codewords placed in an empty environment:


> state1 = return [] <*> (return [0,0,0] + return [1,1,1])


Now corrupt the last bit:


> state1' = state1 =>= mapSnd (mapAt 2 flip)


And repair it:


> state1'' = state1' =>= repairRepetition


Although we have repaired the state of our system, the state of the environment has changed. But that's fine because this state is still an unentangles tensor product of the environment and our system. You can see this just by looking at the state, but for peace of mind use independence state1'' to see that the deviation from being a tensor product is the zero vector.

But that's still only a narrow class of error that we can fix. Another type of error maps return 0 to return 0 but maps return 1 to -return 1. This is known as a phase flip. One way to view this as as a flip in a rotated basis:

> phaseFlip x = x ->- hnot ->- fmap flip ->< unhnot
> pFlipAtM i = mapAtM i phaseFlip

There's a nice strategy to protect against phase flips: rotate our data into a basis where phase flips become ordinary flips, and use the repetition code. So we'll use these two states to represent 0 and 1:


> plus,minus :: Q [Integer]
> plus = return [1,1,1] ->< mapM unhnot
> minus = return [0,0,0] ->< mapM unhnot


So let's make a state that is a superposition of these states in some environment:


> state2 = return [] <*> (plus + minus) ->- collect


Corrupt the middle bit with a phase flip:


> state2' = state2 ->< mapSndM (mapAtM 1 phaseFlip)


And repair it:


> state2'' = state2 ->< mapSndM (mapM hnot) =>= repairRepetition ->< mapSndM (mapM unhnot) ->- collect


Comparing state2 and untensor state2'' we see that the repair is successful. Unfortunately, we can use this scheme to repair phase flips, but not plain flips. We need a scheme that can repair both of these. So now we turn to Shor's ingenious solution: use both of these schemes as the same time. So we'll use the scheme that protects against phase flips with the twist that instead of using three qubits we use three triplets of qubits. We can use the three triplets to fix phase flips and we can use the three bits within a triplet to fix flips. We'll represent 0 and 1 using shor 0 and shor 1 as defined here:


> tripletPlus = (1/sqrt 2) .* return [0,0,0] + (1/sqrt 2) .* return [1,1,1]
> tripletMinus = (1/sqrt 2) .* return [0,0,0] - (1/sqrt 2) .* return [1,1,1]

> shor 1 = do
> a <- tripletPlus
> b <- tripletPlus
> c <- tripletPlus
> return $ a ++ b ++ c

> shor 0 = do
> a <- tripletMinus
> b <- tripletMinus
> c <- tripletMinus
> return $ a ++ b ++ c


Repair is a little more complex before. First we use 6 ancilla bits to fix qubits within each triplet using the repetition code three times:


> ancilla (i,j,k) x = ((x!!i) `xor` (x!!j),(x!!j) `xor` (x!!k))

> fix (i,j,k) (e,x) = let (u,v) = ancilla (i,j,k) x
> b = case (u,v) of
> (0,0) -> x
> (0,1) -> flipAt k x
> (1,0) -> flipAt i x
> (1,1) -> flipAt j x
> in (u:v:e,b)

> flipRepair start = start
> =>= fix (0,1,2)
> =>= fix (3,4,5)
> =>= fix (6,7,8)


We also use two ancilla bits to locate phase flips between the triplets. phaseFlipFix will be applied in a rotated basis inside phaseFlipRepair:


> phaseFlipFix :: ([Integer],[Integer]) -> Q ([Integer],[Integer])
> phaseFlipFix (e,y) = do
> let u = (y!!0) `xor` (y!!1) `xor` (y!!2) `xor` (y!!3) `xor` (y!!4) `xor` (y!!5)
> let v = (y!!3) `xor` (y!!4) `xor` (y!!5) `xor` (y!!6) `xor` (y!!7) `xor` (y!!8)
> let c = case (u,v) of
> (0,0) -> y
> (0,1) -> y ->- flipAt 6
> (1,0) -> y ->- flipAt 0
> (1,1) -> y ->- flipAt 3
> return (u:v:e,c)

> phaseFlipRepair x = x
> ->< mapSndM (mapM hnot)
> ->- collect
> ->< phaseFlipFix
> ->- collect
> ->< mapSndM (mapM unhnot)
> ->- collect


Now we can pull all of this together into a single process to repair both kinds of error:


> shorRepair x = x ->- flipRepair ->- collect ->- phaseFlipRepair ->- collect


So now all that's needed is to try corrupting some qubits to see if we can repair them.


> state3 = return [] <*> shor 1
> state3' = state3 ->< mapSndM (pFlipAtM 0)
> state3'' = state3' ->- shorRepair


Let's try a more complex superposition with a more complex corruption:


> state4 = return [] <*> (0.6.*shor 0 + 0.8.*shor 1) ->- collect
> state4' = state4 ->< mapSndM (pFlipAtM 5) =>= mapSnd (flipAt 5) ->- collect
> state4'' = state4' ->- shorRepair ->- collect


This is all well and good. But there are more ways to corrupt a qubit beyond simple flips. Let's mess up our state using two additional kinds of 'noise' that we haven't discussed. Firstly, instead of just flipping, we'll rotate one of the bits. Secondly, we'll additionally 'leak' that bit to the environment so as to cause decoherence. We've discussed solutions to neither of these problems, so let's see what happens.


> leak i x = do
> (e,v) <- x
> return ((v!!i):e,v)

> state5 = state4
> state5' = state5 ->- leak 4 ->< mapSndM (mapAtM 4 (rotate (pi/10))) ->- collect
> state5'' = state5' ->- shorRepair ->- collect


A minor miracle has taken place. The Shor code is robust against just about anything you can throw at a single qubit. In fact, any code that can survive flips and phase flips of a single qubit can survive any kind of corruption, including entanglement with the environment. (Despite having read the proof, I was still pretty surprised when the code actually worked!) And that gives researchers one hope that these methods might serve as ways to make quantum computing reliable.

I'll mention a couple more things in closing. Firstly, nowhere above have I mentioned wavefunction collapse, observations or superoperators. You don't need any of these things to define the Shor code and show that it can correct errors. (You probably do need some extra assumptions about quantum mechanics if you want to calculate probabilities of various outcomes in the event that the code fails - eg. due to multiple qubits being corrupted.) And secondly, despite it being my April 1st post last year, I still have the doubts I expressed here about quantum computing.

Oh...and expect those lines to take a few seconds to evaluate. When you simulate the Shor code you're performing computations in a 512-dimensional vector space!

Bibliography

  1. Peter Shor's original paper has good discussion of the issues but I found the description of the correction algorithm a little laconic.
  2. John Preskill's Lecture Notes were probably the most useful document.
  3. A student term paper by Dheera Venkatraman was also pretty useful though I was confused for a while by what might be considered errors in it.


Update
Fixed a couple of minor typos. And remember this code should just run, without modifications, in GHCi, so you can play with it yourself.

Sunday, March 11, 2007

Independence, entanglement and decoherence with the quantum monad

Before I can do anything more sophisticated with my code for probability theory and quantum mechanics I need to talk about a couple of related subjects - independence and entanglement. Again, I'm using the fact that formally, quantum mechanics is almost identical to probability theory, to kill two birds with one stone. So firstly, here's the probability/quantum code that I need from before:

> import Data.Map (toList,fromListWith)
> import Complex
> infixl 7 .*

> data W b a = W { runW :: [(a,b)] } deriving (Eq,Show,Ord)

> mapW f (W l) = W $ map (\(a,b) -> (a,f b)) l

> instance Functor (W b) where
>   fmap f (W a) = W $ map (\(a,p) -> (f a,p)) a

> instance Num b => Monad (W b) where
>   return x = W [(x,1)]
>   l >>= f = W $ concatMap (\(W d,p) -> map (\(x,q)->(x,p*q)) d) (runW $ fmap f l)

> a .* b = mapW (a*) b

> instance (Eq a,Show a,Num b) => Num (W b a) where
>    W a + W b = W $ (a ++ b)
>    a - b = a + (-1) .* b
>    _ * _ = error "Num is annoying"
>    abs _ = error "Num is annoying"
>    signum _ = error "Num is annoying"
>    fromInteger a = if a==0 then W [] else error "fromInteger can only take zero argument"

> collect :: (Ord a,Num b) => W b a -> W b a
> collect = W . filter ((/= 0) . snd) . toList . fromListWith (+) . runW

> type P a = W Float a
> type Q a = W (Complex Float) a


Now consider this probability distribution:

> dice = sum [1/36 .* return (i,j) | i <- [1..6], j <- [1..6]]
This corresponds to the usual distribution you expect from rolling two dice. As we know, under normal circumstances, the outcome of one die is doesn't depend on the other - in other words they are independent. We can express this independence mathematically as follows. First make the distribution for one die:
> die = sum [1/6 .* return i | i <- [1..6]]
Now we can write code to make two independent die rolls:
> two_rolls = do
>   a <- die
>   b <- die
>   return (a,b)
We find that dice=two_rolls. We can make this more generic by defining a joint distribution as follows:
> tensor dist1 dist2 = do
>   a <- dist1
>   b <- dist2
>   return (a,b)
So tensor die die=dice. (This operation is actually a tensor product.) So now we can give a definition: if we have a distribution on (a,b), then the first and second components are said to be independent if the distribution can be written as tensor dist1 dist2 for some choice of dist1 and dist2. But so far I've only been talking about the probability monad. What about the quantum monad Q? In that case the notion is almost identical, except the language is inverted. In that case, a distribution on (a,b) is said to be entangled if it can't be written as tensor state1 state2. Despite all of the mystification in the popular literature, that's all it means. So why is entanglement important? One of the key ideas is this. Suppose we have a process that acts only one one part of a combined system. In other words, consider something like this:
> apply_part f x = do
>   (a,b) <- x
>   b' <- f b
>   return (a,b')
If the two subcomponents are independent (or unentangled) so that x=tensor dist1 dist2 then apply_part f x=tensor dist1 (dist2 >>= f). In other words, if we're dealing with a distribution like tensor dist1 dist2, and we're interested in studying operations on the second component, then we can completely ignore the first component during the process.
> coin = 0.75 .* return True + 0.25 .* return False
> f x = 0.125 .* return x + 0.875 .* return (not x)
> ex1 = collect (tensor coin (coin >>= f))
> ex2 = collect (apply_part f (tensor coin coin))
ex1=ex2. Now suppose we have a combined system consisting of two parts: some subsystem, and its environment, ie. something of type (Environment,Subsystem). Then as long as the environment and the subsystem are independent, we can study the subsystem as if the environment weren't there. But now suppose information can 'leak' out from the subsystem. For example consider a process like:
> a `xor` b = a/=b
> process x = do
>   (e,s) <- x
>   let e' = e `xor` s
>   return (e',s)
Because there is some flow from s to e', there is no guarantee that process x has independent subparts. As a result, we can no longer treat s as an independent subsystem, and theorems that are true about an independent s no longer hold. For example, define:
> rotate :: Float -> Bool -> Q Bool
> rotate theta True = let theta' = theta :+ 0
>   in cos (theta'/2) .* return True - sin (theta'/2) .* return False
> rotate theta False = let theta' = theta :+ 0
>   in cos (theta'/2) .* return False + sin (theta'/2) .* return True
We know that return True >>= rotate (pi/10) >>= rotate (-pi/10) gives us back the state we started with. Similarly, with
> pure x = do
>   (a,b) <- x
>   b' <- rotate (pi/10) b
>   b'' <- rotate (-pi/10) b'
>   return (a,b'')
pure acts as the identity on the second component. In particular, pure (e,True) is the same as tensor e (return True) because the terms involving false cancel out. But if we have a 'leak' as in:
> impure x = do
>   (a,b) <- x
>   b' <- rotate (pi/10) b
>   a' <- return (a `xor` b') -- information flowing out
>   b'' <- rotate (-pi/10) b'
>   return (a',b'')
and consider impure (e,True) we end up with something where the False terms don't cancel out. Compare collect $ pure $ return (False,False) with collect $ impure $ return (False,False). We're familiar with the idea that something flowing into a system might mess it up. But here we have the situation that information flowing out of a physical system can also mess it up. Any kind of entanglement between a system and its environment is called decoherence. And what I've shown is that interactions between a system and its environment can cause a physical system to 'decohere', even if the direction of interaction appears to be only outwards.

And that's one reason why quantum computing is so hard. Not only must nothing in the environment affect our qubits, neither must any information about our qubits leave the system before we're ready to have them do so.

Anyway, now I'm equipped to start writing code to simulate quantum error correction...

Update: Tried to reuse some of this code and found a bug which I've now fixed. Sorry if this pops it to the front of various feeds. It's an old post.

Tuesday, March 06, 2007

Monads, Vector Spaces and Quantum Mechanics pt. II

Back from wordpress.com:

I had originally intended to write some code to simulate quantum computers and implement some quantum algorithms. I'll probably eventually do that but today I just want to look at quantum mechanics in its own right as a kind of generalisation of probability. This is probably going to be the most incomprehensible post I've written in this blog. On the other hand, even though I eventually talk about the philosophy of quantum mechanics, there's some Haskell code to play with at every stage, and the codeives the same results as appear in physics papers, so maybe that will help give a handle on what I'm saying.

First get some Haskell fluff out of the way:

> import Prelude hiding (repeat)
> import Data.Map (toList,fromListWith)
> import Complex
> infixl 7 .*

Now define certain types of vector spaces. The idea is the a W b a is a vector in a space whose basis elements are labelled by objects of type a and where the coefficients are of type p.

> data W b a = W { runW :: [(a,b)] } deriving (Eq,Show,Ord)

This is very similar to standard probability monads except that I've allowed the probabilities to be types other than Float. Now we need a couple of ways to operate on these vectors.

mapW allows the application of a function transforming the probabilities...

> mapW f (W l) = W $ map (\(a,b) -> (a,f b)) l

and fmap applies a function to the basis element labels.

> instance Functor (W b) where

> fmap f (W a) = W $ map (\(a,p) -> (f a,p)) a

We want our vectors to support addition, multiplication, and actually form a monad. The definition of >>= is similar to that for other probability monads. Note how vector addition just concatenates our lists of probabilities. The problem with this is that if we have a vector like a+2a we'd like it to be reduced to 3a but in order to do that we need to be able to spot that the two terms a and 2a both contain multiples of the same vector, and to do that we need the fact that the labels are instances of Eq. Unfortunately we can't do this conveniently in Haskell because of the lack of restricted datatypes and so to collect similar terms we need to use a separate collect function:

> instance Num b => Monad (W b) where
> return x = W [(x,1)]
> l >>= f = W $ concatMap (\(W d,p) -> map (\(x,q)->(x,p*q)) d) (runW $ fmap f l)

> a .* b = mapW (a*) b

> instance (Eq a,Show a,Num b) => Num (W b a) where
> W a + W b = W $ (a ++ b)
> a - b = a + (-1) .* b
> _ * _ = error "Num is annoying"
> abs _ = error "Num is annoying"
> signum _ = error "Num is annoying"
> fromInteger a = if a==0 then W [] else error "fromInteger can only take zero argument"

> collect :: (Ord a,Num b) => W b a -> W b a

> collect = W . toList . fromListWith (+) . runW

Now we can specialise to the two monads that interest us:

> type P a = W Float a

> type Q a = W (Complex Float) a

P is the (hopefully familiar if you've read Eric's recent posts) probability monad. But Q allows complex probabilities. This is because quantum mechanics is a lot like probability theory with complex numbers and many of the rules of probability theory carry over.

Suppose we have a (non-quantum macroscopic) coin that we toss. It's state might be described by:

> data Coin = Heads | Tails deriving (Eq,Show,Ord)

> coin1 = 0.5 .* return Heads + 0.5 .* return Tails :: P Coin

Suppose that if Albert sees a coin that is heads up he has a 50% chance of turning it over and if he sees a coin that is tails up he has a 25% chance of turning it over. We can describe Albert like this:

> albert Heads = 0.5 .* return Heads + 0.5 .* return Tails
> albert Tails = 0.25 .* return Heads + 0.75 .* return Tails

We can now ask what happens if Albert sees a coin originally turned up heads n times in a row:

> repeat 0 f = id
> repeat n f = repeat (n-1) f . f
> (->-) :: a -> (a -> b) -> b

> g ->- f = f g

> (-><) :: Q a -> (a -> Q b) -> Q b
> g ->< f = g >>= f

> albert1 n = return Heads ->- repeat n (->< albert) ->- collect


Let me explain those new operators. ->- is just function application written from left to right. The > in the middle is intended to suggest the direction of data flow. ->< is just >>= but I've written it this way with the final < intended to suggest the way a function a -> M b 'fans out'. Anyway, apropos of nothing else, notice how Albert approaches a steady state as n gets larger.

Quantum mechanics works similarly but with the following twist. When we come to observe the state of a quantum system it undergoes the following radical change:

> observe :: Ord a => Q a -> P a

> observe = W . map (\(a,w) -> (a,magnitude (w*w))) . runW . collect

Ie. the quantum state becomes an ordinary probabilistic one. This is called wavefunction collapse. Before collapse, the complex weights are called 'amplitudes' rather than probabilities. The business of physicists is largely about determining what these amplitudes are. For example, the well known Schrödinger equations is a lot like a kind of probabilistic diffusion, like a random walk, except with complex probabilities instead of amplitudes. (That's why so many physicists have been hired into finance firms in recent years - stocks follow a random walk which has formal similarities to quantum physics.)

The rules of quantum mechanics are a bit like those of probability theory. In probability theory the sum of the probabilites must add to one. In addition, any process (like albert) must act in such a way that if the input sum of probabilities is one, then so is the output. This means that probabilistic process are stochastic. In quantum mechanics the sum of the squares of the magnitudes of the amplitudes must be one. Such a state is called 'normalised'. All processes must be such that normalised inputs go to normalised outputs. Such processes are called unitary ones.

There's a curious subtlety present in quantum mechanics. In classical probability theory you need to have the sum of the probabilities of your different events to sum to one. But it's no good having events like "die turns up 1", "die turns up 2", "die turns up even" at the same time. "die turns up even" includes "die turns up 2". So you always need to work with a mutually exclusive set of events. In quantum mechanics it can be pretty tricky to figure out what the mutually exclusive events are. For example, when considering the spin of an electron, there are no more mutually exclusive events beyond "spin up" and "spin down". You might think "what about spin left?". That's just a mixture of spin up and spin down - and that fact is highly non-trivial and non-obvious. But I don't want to discuss that now and it won't affect the kinds of things I'm considering below.

So here's an example of a quantum process a bit like albert above. For any angle $latex \theta$, rotate turns a boolean state into a mixture of boolean states. For $latex \theta=0$ it just leaves the state unchanged and for $latex \theta=\pi$ it inverts the state so it corresponds to the function Not. But for $latex \theta=\pi/2$ it does something really neat: it is a kind of square root of Not. Let's see it in action:

> rotate :: Float -> Bool -> Q Bool
> rotate theta True = let theta' = theta :+ 0
> in cos (theta'/2) .* return True - sin (theta'/2) .* return False

> rotate theta False = let theta' = theta :+ 0
> in cos (theta'/2) .* return False + sin (theta'/2) .* return True

> snot = rotate (pi/2)

> repeatM n f = repeat n (>>= f)

> snot1 n = return True ->- repeatM n snot ->- observe

We can test it by running snot1 2 to see that two applications take you to where you started but that snot1 1 gives you a 50/50 chance of finding True or False. Nothing like this is possible with classical probability theory and it can only happen because complex numbers can 'cancel each other out'. This is what is known as 'destructive interference'. In classical probability theory you only get constructive interference because probabilities are always positive real numbers. (Note that repeatM is just a monadic version of repeat - we could have used it to simplify albert1 above so there's nothing specifically quantum about it.)

Now for two more combinators:

> (=>=) :: P a -> (a -> b) -> P b
> g =>= f = fmap f g
> (=><) :: P (Q a) -> (a -> Q b) -> P (Q b)
> g =>< f = fmap (>>= f) g

The first just uses fmap to apply the function. I'm using the = sign as a convention that the function is to be applied not at the top level but one level down within the datastructure. The second is simply a monadic version of the first. The reason we need the latter is that we're going to have systems that have both kinds of uncertainty - classical probabilistic uncertainty as well as quantum uncertainty. We'll also want to use the fact that P is a monad to convert doubly uncertain events to singly uncertain ones. That's what join does:

> join :: P (P a) -> P a
> join = (>>= id)

OK, that's enough ground work. Let's investigate a physical process that can be studied in the lab: the Quantum Zeno effect, otherwise known as the fact that a watched pot never boils. First an example related to snot1:

> zeno1 n = return True ->- repeatM n (rotate (pi/fromInteger n)) ->- collect ->- observe

The idea is that we 'rotate' our system through an angle $latex \pi/n$ but we do so in n stages. The fact that we do it in n stages makes no difference, we get the same result as doing it in one go. The slight complication is this: suppose we start with a probabilistic state of type P a. If we let it evolve quantum mechanically it'll turn into something of type P (Q a). On observation we get something of type P (P a). We need join to get a single probability distribution of type P a. The join is nothing mysterious, it just combines the outcome of two successive probabilistic processes into one using the usual laws of probability.

But here's a variation on that theme. Now we carry out n stages, but after each one we observe the system causing wavefunction collapse:

> zeno2 n = return True ->- repeat n (
> \x -> x =>= return =>< rotate (pi/fromInteger n) =>= observe ->- join
> ) ->- collect

Notice what happens. In the former case we flipped the polarity of the input. In this case it remains closer to the original state. The higher we make n the closer it stays to its original state. (Not too high, start with small n. The code suffers from combinatorial explosion.) Here's a paper describing the actual experiment. Who needs all that messing about with sensitive equipment when you have a computer? :-)

A state of the form P (Q a) is called a mixed state. Mixed states can get a bit hairy to deal with as you have this double level of uncertainty. It can get even trickier because you can sometimes observe just part of a quantum system rather than the whole system like oberve does. This inevitably leads mixed states. von Neumann came up with the notion of a density matrix to deal with this, although a P (Q a) works fine too. I also have a hunch there is an elegant way to handle them through an object of type P (Q (Q a)) that will eliminate the whole magnitude squared thing. However, I want to look at the quantum Zeno effect in another way that ultimately allows you deal with mixed states in another way. Unfortunately I don't have time to explain this elimination today, but we can look at the general approach.

In this version I'm going to consider a quantum system that consists of the logical state in the Zeno examples, but also include the state of the observer. Now standard dogma says you can't can't form quantum states out of observers. In other words, you can't form Q Observer where Observer is the state of the observer. It says you can only form P Observer. Whatever. I'm going to represent an experimenter using a list that representing the sequence of measurements they have made. Represent the complete system by a pair of type ([Bool],Bool). The first element of the pair is the experimenter's memory and the second element is the state of the boolean variable being studied. When our experimenter makes a measurement of the boolean variable, its value is simply prepended to his or her memory:

> zeno3 n = return ([],True) ->- repeatM n (
> \(m,s) -> do
> s' <- rotate (pi/fromInteger n) s
> return (s:m,s')
> ) ->- observe =>= snd ->- collect

Note how we now delay the final observation until the end when we observe both the experimenter and the poor boolean being experimented on. We want to know the probabilities for the final boolean state so we apply snd so as to discard the state of the observer's memory. Note how we get the same result as zeno2. (Note no mixed state, just an expanded quantum state that collapses to a classical probabilistic state.)

There's an interesting philosophical implication in this. If we model the environment (in this case the experimenter is part of that environment) as part of a quantum system, we don't need all the intermediate wavefunction collapses, just the final one at the end. So are the intermediate collapses real or not? The interaction with the environment is known as decoherence and some hope that wavefunction collapse can be explained away in terms of it.

Anyway, time for you to go and do something down-to-earth like gardening. Me, I'm washing the kitchen floor...


I must mention an important cheat I made above. When I model the experimenter's memory as a list I'm copying the state of the measured experiment into a list. But you can't simply copy data into a quantum register. One way to see this is that unitary processes are always invertible. Copy data into a register destroys the value that was there before and hence is not invertible. So instead, imagine that we really have an array that starts out zeroed out and that each time something is added to the list, the new result is xored into the next slot in the array. The list is just non-unitary looking, but convenient, shorthand for this unitary process.

Saturday, March 03, 2007

The allure of LaTeX

I'm moving! Get over here. I don't know if this will be a permanent move but I'll try it for a bit. Old posts haven't been imported too well so I may just keep the old posts here and delete the ones I imported there.

Monads in C pt. III

OK, the last C monad example:


typedef struct {
char *string;
int value;
} WriterInt;

WriterInt returnWriter(int i)
{
WriterInt r;
r.string = "";
r.value = i;
return r;
}

WriterInt bind(WriterInt (*f)(int),WriterInt x)
{
WriterInt y = f(x.value);
WriterInt z;
z.value = y.value;
int len = strlen(x.string)+strlen(y.string);
z.string = malloc(len+1);
strcpy(z.string,x.string);
strcat(z.string,y.string);
return z;
}

WriterInt print(int i)
{
WriterInt x;
x.string = malloc(32);
x.value = sprintf(x.string,"%d\n",i);
return x;
}

WriterInt printplus_bad(int i)
{
WriterInt x = print(i);
return print(x.value); /* cheating! */
}

WriterInt printplus(int i)
{
WriterInt x = print(i);
return bind(print,x);
}


This time, instead of printing, we build up a string as a side effect. printplus() is implemented exactly as before, without knowledge of how to handle side effects, and yet it correctly handles strings returned as a side effect and concatenates them together. The magic is bind() which allows a function expecting an int input to be applied to a WriterInt.

I'm hoping that programmers of languages such as C can now begin to see the pattern shared by these examples.

Anyway, this is all just stalling for time while I try to get my quantum computation code working.

Blog Archive