The idea behind the theory is simple. Suppose you have a set M of messages, one of which you want to send over a noisy communication channel. If you simply send the message it might be corrupted in transit and interpreted as another. So instead you embed M inside a larger set C and send the corresponding element of C. If the message is corrupted, the receiver might get an element of C that isn't in M, and they'll know right away that the message has been corrupted. But even better, if the embedding of M in C is designed cunningly enough, they'll be able to make a good guess at the original message, assuming some probability model for the corruption. You pay a price, of course: elements of C typically take longer to transmit than elements of M, reducing the rate of data transmission.

There have been lots of proposed schemes in the years since Shannon started the field, drawing on a variety of methods from discrete mathematics and algebraic geometry. There are nice connections between the sporadic groups and coding theory: the Mathieu groups arise directly as symmetries of the Golay code, and the Golay code can itself be derived from the Leech lattice, whose symmetries are described by the Monster. A common feature of these codes is that they are linear. The set C is a vector space over F_{2}, the field with two elements, and M is a subspace of it: the kernel of an F_{2}-linear map represented by a matrix called the *syndrome* (more commonly known as the *parity check matrix*; strictly, the syndrome is the vector the matrix produces). If the syndrome, applied to a received message, gives zero, then it's likely that the message has arrived uncorrupted. Much of the work in coding theory has been about generating suitable syndrome matrices.
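To make this concrete, here is a small sketch using the (7,4) Hamming code, a standard textbook example of a linear code over F_{2} (it isn't mentioned in the post itself, just a convenient illustration). The codewords are exactly the kernel of the parity check matrix H, and a nonzero syndrome flags corruption:

```python
import numpy as np

# Parity check matrix of the (7,4) Hamming code: a standard small
# example of a linear code over F_2 (chosen for illustration only).
H = np.array([
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
])

def syndrome(word):
    # All arithmetic is mod 2, i.e. over the field F_2.
    return H @ word % 2

codeword = np.array([1, 0, 1, 1, 0, 1, 0])   # lies in the kernel of H
assert not syndrome(codeword).any()          # zero syndrome: looks uncorrupted

corrupted = codeword.copy()
corrupted[4] ^= 1                            # flip one bit in transit
assert syndrome(corrupted).any()             # nonzero syndrome: corruption detected
```

For this particular code the syndrome does more than detect an error: it equals the column of H at the flipped position, so a single bit error can be located and corrected outright.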

But in the sixties Robert Gallager looked at generating random sparse syndrome matrices and found that the resulting codes, called Low Density Parity Check (LDPC) codes, were good in the sense that they allow messages to be transmitted at rates near the optimum found by Shannon - the so-called Shannon limit. Unfortunately the computers of the day weren't up to the task of finding the most likely element of M given a received element of C. But now they are. We now have near-optimal error correcting codes, and the design of these codes is ridiculously simple. There was no need for exotic mathematics: random sparse matrices are as good as almost anything else. The past forty years of coding theory have been, more or less, a useless excursion. Any further research in the area can only yield tiny improvements.
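The decoding step Gallager proposed can be surprisingly simple. Below is a toy sketch of his hard-decision "bit flipping" decoder. The matrix here is hand-made (every check touches 3 bits, every bit sits in 2 checks) so the example is deterministic; real LDPC codes use much larger random sparse matrices and usually a probabilistic belief-propagation decoder instead:

```python
import numpy as np

# A toy regular parity check matrix in the LDPC spirit: every check
# involves 3 bits and every bit appears in 2 checks. Hand-made so the
# demonstration is deterministic; real LDPC matrices are large, sparse
# and random.
H = np.array([
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [1, 0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0, 1],
])

def bit_flip_decode(word, max_iters=10):
    # Gallager-style bit flipping: repeatedly flip the bit that
    # participates in the most unsatisfied parity checks.
    word = word.copy()
    for _ in range(max_iters):
        unsatisfied = H @ word % 2        # one bit per parity check
        if not unsatisfied.any():
            return word                   # all checks pass: done
        votes = H.T @ unsatisfied         # per-bit count of failing checks
        word[np.argmax(votes)] ^= 1       # flip the worst offender
    return word                           # gave up: decoding failure

sent = np.zeros(6, dtype=int)             # the all-zeros codeword
received = sent.copy()
received[4] ^= 1                          # channel flips one bit
decoded = bit_flip_decode(received)
assert (decoded == sent).all()            # single error corrected
```

The sparsity is what makes this fast: each parity check only involves a handful of bits, so the per-iteration cost stays low even for very long codewords.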

A good place to learn about these codes is the book by one of the 'rediscoverers' of Gallager's codes, David MacKay. There's also the Wikipedia article. I originally heard of these codes from a friend of mine and was sceptical at first. Then I read this Science News article and I'm now a bit embarrassed about my scepticism.

## 5 comments:

I was still skeptical, so I asked about this on Stack Exchange: http://math.stackexchange.com/questions/215938/has-error-correction-been-solved

@Snowball,

Excellent. That's the kind of thing that keeps me honest and makes sure I don't post complete BS :-)

@sigfpe:

In any case, I'm glad you posted this. I would not have learned about LDPC codes or MacKay's excellent book if you hadn't.

wow this was in 2006. it indeed was BS :).. coding theory is nowhere close to an end and we are in 2013 now

@dave

Coding theory is alive and kicking, but the article was about the death of the algebraic approach to coding theory. It gave us beautiful mathematics but I don't think it plays a big role in designing real world codes any more.
