We start by tweaking probability theory a bit. One of the axioms of probability theory says that all probabilities must lie in the range zero to one. However, we could imagine relaxing this rule, even though on the face of it that seems meaningless. For example, suppose we have a coin that has a 3/2 chance of landing heads and a -1/2 chance of landing tails. We can still reason that the chance of getting two heads in a row is 3/2×3/2=9/4 by the usual multiplication rule. But obviously no situation like this could ever arise in the real world, because after tossing such a coin 10 times we'd expect to see -5 tails on average.
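To make the arithmetic concrete, here is a minimal sketch of that coin using exact fractions. The variable names are mine, purely for illustration; the only inputs are the two 'probabilities' quoted above.

```python
from fractions import Fraction

# The hypothetical coin from the text: "probabilities" outside [0, 1].
p_heads = Fraction(3, 2)
p_tails = Fraction(-1, 2)

# They still sum to one, so normalisation survives the tweak.
total = p_heads + p_tails

# Usual multiplication rule for two independent tosses.
p_two_heads = p_heads * p_heads

# Expected number of tails in 10 tosses, by linearity of expectation.
expected_tails = 10 * p_tails

print(total, p_two_heads, expected_tails)  # prints: 1 9/4 -5
```

The rules of the calculus carry on working; it's only the interpretation of the answers as observed frequencies that breaks.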
But what if we could contrive a system with some kind of internal state governed by negative probabilities even though we couldn't observe it directly? So consider this case: a machine produces boxes with (ordered) pairs of bits in them, each bit viewable through its own door. Let's suppose the probability of each possible combination of two bits is given by the following table:
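| First bit | Second bit | Probability |
|-----------|------------|-------------|
| 0         | 0          | -1/2        |
| 0         | 1          | 1/2         |
| 1         | 0          | 1/2         |
| 1         | 1          | 1/2         |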
Obviously if we were able to look through both doors we'd end up with the meaningless prediction that we'd expect to see 00 a negative number of times. But suppose that things are arranged so that we can only look through one door. Maybe the boxes self-destruct if one or other door is opened but you still get enough time to see what was behind the door. Now what happens?
If you look through the first door the probability of seeing 1 is P(10)+P(11)=1. We get the same result if we look through the second door. We only get probabilities in the range zero to one. As long as we're restricted to one door we get meaningful results.
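Here's a small sketch of that single-door calculation. I'm assuming the joint values -1/2, 1/2, 1/2, 1/2 for 00, 01, 10, 11, which are the only numbers consistent with the sums quoted in the text; the `marginal` helper is just an illustrative name.

```python
from fractions import Fraction

half = Fraction(1, 2)

# Joint quasi-probabilities keyed by (first bit, second bit).
# Assumed values: the assignment consistent with the sums in the text.
joint = {(0, 0): -half, (0, 1): half, (1, 0): half, (1, 1): half}

def marginal(door, value):
    """Chance of seeing `value` through door 0 (first) or door 1 (second)."""
    return sum(p for bits, p in joint.items() if bits[door] == value)

for door in (0, 1):
    # Each line shows: door index, P(see 0), P(see 1).
    print(door, marginal(door, 0), marginal(door, 1))
# prints:
# 0 0 1
# 1 0 1
```

The negative entry for 00 is cancelled by a positive one in every sum a single door can reveal, which is why nothing odd shows up.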
If we were to perform this experiment repeatedly with different runs of the machine, each time picking a random door to look through, we'd eventually become very confident that every box contained 11. After all, if we freely choose which door to look through, and we always see 1, there's no place 0's could be 'hiding'.
But now suppose a new feature is added to the box that allows us to compare the two bits to see if they are equal. It reveals nothing about what the bits are, just their state of equality. And of course, after telling us, it self-destructs. We now find that the probability of the two bits being different is P(01)+P(10)=1. So if we randomly chose one of the three possible observations each time the machine produced a box we'd quickly run into the strange situation that the two bits both appear to be 1, and yet are different. But note that although the situation is weird, it's not meaningless. As long as we never get to see both bits at the same time we never directly observe a paradox. If we met such boxes in the real world we'd be forced to conclude that maybe the boxes knew which bit you were going to look at and changed value as a result, or that maybe you didn't have the free will to choose the door that you thought you had, or maybe, even more radically, you'd conclude that the bits generated by the machine were described by negative probabilities.
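It's worth checking that the negative entry isn't optional. The sketch below takes the three observations as constraints, adds the assumption that the four joint values sum to one, and solves by substitution. The function name `solve_joint` is mine, not anything standard.

```python
from fractions import Fraction

def solve_joint():
    """Solve the constraints from the text by substitution.

    Unknowns: p00, p01, p10, p11.
      p10 + p11 = 1               (first door always shows 1)
      p01 + p11 = 1               (second door always shows 1)
      p01 + p10 = 1               (comparison always says 'different')
      p00 + p01 + p10 + p11 = 1   (normalisation, assumed)
    """
    one = Fraction(1)
    # Adding the first three equations gives 2*(p01 + p10 + p11) = 3.
    rest = 3 * one / 2
    p00 = one - rest   # from normalisation
    p11 = rest - one   # rest minus (p01 + p10)
    p01 = one - p11    # from the second-door equation
    p10 = one - p11    # from the first-door equation
    return {"00": p00, "01": p01, "10": p10, "11": p11}

joint = solve_joint()
p_different = joint["01"] + joint["10"]
p_equal = joint["00"] + joint["11"]

print(joint["00"])           # prints: -1/2
print(p_different, p_equal)  # prints: 1 0
```

So under those four constraints the solution is unique, and it has P(00) = -1/2. Any ordinary, non-negative distribution over the two bits has to fail at least one of the three observations.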
That's all very well, but obviously the world doesn't really work like this and we never see boxes like this. Except that actually it does! The EPR experiment has many similarities to the scenario I described above. The numbers aren't quite the same, and we're not talking about bits in boxes, but we do end up with a scenario involving observations of bits that simply don't add up. If we do try to explain what's going on using probability theory, we either conclude there's something weird about our assumptions of locality or causality or we end up assigning negative probabilities to the internal states of our systems. In fact, you can read the details in an article by David Schneider. Being forced to conclude that we have negative probabilities in a physical system is usually taken as a sign that we have a contradiction. In the case of Bell's theorem it shows that we can't interpret what we see in terms of probability theory, and hence that the weirdness of quantum mechanics can't be explained in terms of some hidden random variable. QM simply doesn't obey the rules you'd expect of such hidden variables.
But in a paper called Negative Probability, Feynman tried taking the idea of negative probabilities seriously. He showed that you could reformulate quantum mechanics completely in terms of them so that you no longer needed to think in terms of the complex number valued 'amplitudes' that physicists normally use. This means the above isn't just an analogy, it's actually a formal system within which you can do QM, although I haven't touched on the bit that refers to the dynamics of quantum systems. So if you can get your head around the ideas I've talked about above you're well on your way to understanding some reasons why quantum mechanics seems so paradoxical.
At this point you may be wondering how nature contrives to hide these negative probabilities from direct observation. Her trick is that making one kind of observation disturbs the state of what you've observed so that you can't make the other kind of observation on a pristine state. You have to pick one kind of observation or the other. Electrons and photons really are a lot like the boxes I just described.
So why don't physicists use this formulation? Despite the fact that negative numbers seem simpler to most people than imaginary numbers, the negative number formulation of QM is much more complicated. What's more, because it makes exactly the same predictions as regular QM there's no compelling reason to switch to it. And anyway, it's not as if directly observing negative probabilities is any more intuitive or meaningful than imaginary ones. Once you've introduced negative ones, you might as well go all the way!
This all ties in with what I said a while back. The important thing about QM is that having two ways to do something can make it less likely to happen, not more.
For a different perspective this is an interesting comment.
Footnote: We can embed QM in negative probability theory. But can we do the converse? Can every negative probability distribution be physically realised in a quantum system? I've a hunch the answer is obvious but I'm too stupid to see it.