Thursday, September 22, 2005
Are Men Innately Better than Women at Mathematics?
Wednesday, September 21, 2005
Tetris and matrices over semirings
  X    X
XXXX XXX X
We can describe this as a list of heights (1,1,2,1,0,1,1,2,0,1).
Here's a Tetris piece:
  Y
YYY
Y
and here's what you get when you drop one on the other
   Y
 YYY
 YX    X
XXXX XXX X
We get a new list of heights (1,3,3,4,0,1,1,2,0,1). This representation has failed to capture the 'overhang' in the 4th column, but let's just think about Tetris pieces dropped from the top, so that overhangs make no difference. What is the relationship between the old and the new list of heights?
It turns out that you can get one from the other by a single matrix 'multiplication'. The catch is that we have to work over the semiring (R union {-infinity}, max, +) instead of the more usual semiring (well, actually field) (R,+,*).
Let hn be the initial height in the nth column and hn' be the height after dropping the new piece. Think about how low the new piece made of Ys can fall. If the leftmost column of the Ys is in contact with the Xs then its lowest point ends up at height h2+1. If the middle column is in contact then the lowest point ends up at height h3. And if the right column is in contact the lowest point ends up at height h4. The piece comes to rest on whichever support it hits first, so the lowest point ends up at height max(h2+1,h3,h4). So we can now write down expressions for the new heights: h2' = max(h2+1,h3,h4)+1, h3' = max(h2+1,h3,h4)+1, h4' = max(h2+1,h3,h4)+2. We can rewrite this as:
h2' = max(h2+2,h3+1,h4+1)
h3' = max(h2+2,h3+1,h4+1)
h4' = max(h2+3,h3+2,h4+2)
Now let's switch to the semiring (R union {-infinity}, max, +) (with 'additive' identity OO = -infinity). Use the notation (+) = max and (*) = +. Then we can write:
h2' = 2(*)h2 (+) 1(*)h3 (+) 1(*)h4
h3' = 2(*)h2 (+) 1(*)h3 (+) 1(*)h4
h4' = 3(*)h2 (+) 2(*)h3 (+) 2(*)h4
We can now write this in matrix form h' = Mh, where h is the column vector of heights (h1,...,h10) and M is
( 0  OO OO OO      )
( OO 2  1  1       )
( OO 2  1  1       )
( OO 3  2  2       )
(       ...        )
(                0 )
with 0s on the rest of the diagonal and OO everywhere else.
So corresponding to each Tetris piece in a particular column we have a matrix. To find the effect of dropping multiple pieces we just multiply the matrices in the semiring in turn.
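To make the matrix action concrete, here's a minimal Python sketch (my own, not from the paper; the names are illustrative) of the (max,+) operations and of the piece's matrix acting on the height list above:

```python
NEG_INF = float('-inf')  # additive identity OO of the (max,+) semiring

def maxplus_matvec(M, h):
    # (Mh)_i = max_j (M[i][j] + h[j]): 'sum' is max, 'product' is +
    return [max(m + x for m, x in zip(row, h)) for row in M]

def maxplus_matmul(A, B):
    # C[i][j] = max_k (A[i][k] + B[k][j])
    n = len(B[0])
    return [[max(a + B[k][j] for k, a in enumerate(row)) for j in range(n)]
            for row in A]

def piece_matrix(n=10):
    # (max,+) identity: 0 on the diagonal, OO everywhere else...
    M = [[0 if i == j else NEG_INF for j in range(n)] for i in range(n)]
    # ...except the rows for columns 2-4, taken from the height equations:
    # h2' = max(h2+2, h3+1, h4+1), likewise h3', and h4' = max(h2+3, h3+2, h4+2)
    M[1][1:4] = [2, 1, 1]
    M[2][1:4] = [2, 1, 1]
    M[3][1:4] = [3, 2, 2]
    return M

h = [1, 1, 2, 1, 0, 1, 1, 2, 0, 1]
M = piece_matrix()
print(maxplus_matvec(M, h))  # [1, 3, 3, 4, 0, 1, 1, 2, 0, 1]
```

Dropping the piece twice is the matrix product M(*)M: maxplus_matvec(maxplus_matmul(M, M), h) agrees with applying the piece's map two times, which is exactly the 'multiply the matrices in turn' observation.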
Turns out this has all kinds of applications, for example resource allocation in a queueing system. I'll leave you to figure out how falling Tetris pieces might represent jobs submitted to a queueing system and how the columns might represent types of resource.
Here's an easy exercise: If M is the matrix of a piece, what piece is the transpose M^{T} the matrix of?
I learnt this from this paper by Gaubert and Max Plus. (I think Max Plus is a relative of Bourbaki.)
Many algorithms are 'linear' in this semiring. Another example that seems quite different is the dynamic programming algorithm for approximate DNA string matching. I tried to work out what the transpose matrix did (i.e. what the adjoint algorithm was) and then used the code reversal method I mentioned earlier to see what new algorithm I obtained. It was another approximate string matcher, but this time it worked through the strings in the opposite direction. (Note how each operation in this description is essentially of the form a = max(some terms added to constants).)
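As an illustration (my own sketch, with illustrative scoring constants), here's the kind of alignment dynamic program meant here, written so that every update has the form a = max(terms plus constants), i.e. it is linear over (max,+) in the previously computed values:

```python
def similarity(s, t, match=1, mismatch=-1, gap=-1):
    # Global alignment score (Needleman-Wunsch style).
    prev = [gap * j for j in range(len(t) + 1)]   # aligning "" against t[:j]
    for i, a in enumerate(s, 1):
        cur = [gap * i]                           # aligning s[:i] against ""
        for j, b in enumerate(t, 1):
            sub = match if a == b else mismatch
            # every update is a = max(terms + constants): (max,+)-linear
            cur.append(max(prev[j - 1] + sub,     # align a with b
                           prev[j] + gap,         # gap in t
                           cur[-1] + gap))        # gap in s
        prev = cur
    return prev[-1]

print(similarity("abc", "axc"))  # 1: two matches, one mismatch
```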
Tuesday, September 20, 2005
The 100 Greatest Theorems
Monday, September 19, 2005
A Certain 24th Root
The theta function of a lattice is simply the sum of x^{u.u} over all u in the lattice. Clearly it is a formal power series with integer coefficients if the lattice is such that u.u is always an integer. These are often pretty amazing functions; for example, for certain types of lattice they are modular forms. It turns out that the theta function of the Leech lattice is the 24th power of another power series with integer coefficients. More amazingly: this is a new result!
It's clear that the theta function is the generating function for the number of lattice elements of a given length. So if it's a 24th power you'd expect this to have a combinatorial explanation too. Unfortunately the proof isn't combinatorial. So even though the 24th root appears to be counting something, nobody knows what it is that is being counted!
Anyway, Sloane and friends had fun trawling through their sequence database trying to find other sequences that were powers. Another interesting one they found was the exponential generating function for the number of partitions of {1,...,n} into an odd number of ordered subsets. It turns out this is a perfect square. This is begging for a combinatorial explanation but it hasn't yet been found. I'll have to see if I can reproduce the square root of this series with my (unfinished) Haskell code for formal power series.
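As a stopgap for that unfinished power-series code, here's a small Python sketch (my own; exact rationals, naive truncated multiplication) that extracts the k-th root of a formal power series with constant term 1. If the input is a perfect k-th power of an integer series the computed root has integer coefficients; otherwise fractions show up:

```python
from fractions import Fraction

def mul(p, q, N):
    # product of coefficient lists, truncated at x^N
    r = [Fraction(0)] * (N + 1)
    for i, a in enumerate(p[:N + 1]):
        for j, b in enumerate(q[:N + 1 - i]):
            r[i + j] += a * b
    return r

def power(p, k, N):
    r = [Fraction(1)] + [Fraction(0)] * N
    for _ in range(k):
        r = mul(r, p, N)
    return r

def kth_root(c, k, N):
    # Solve for the root coefficient by coefficient: since the constant term
    # is 1, bumping r[n] changes the x^n coefficient of r^k by exactly k*r[n].
    r = [Fraction(1)] + [Fraction(0)] * N
    for n in range(1, N + 1):
        r[n] = (Fraction(c[n]) - power(r, k, N)[n]) / k
    return r

print(kth_root([1, 2, 1, 0, 0], 2, 4))  # sqrt((1+x)^2) = 1 + x, all integers
print(kth_root([1, 1, 0, 0, 0], 2, 4))  # sqrt(1+x) has fractional coefficients
```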
And of course this paper is interesting to me because of the prominent role of the number 24 :)
Monday, September 12, 2005
Of Partitions and Primes
First a digression: There's a bizarre formula for p(n) due to Rademacher and you can see it here. Note the (n - 1/24) subexpression. You might wonder where that 1/24 comes from. Actually it has a neat physical interpretation. In bosonic string theory you have a string which can oscillate with different harmonics. Given a string with a certain amount of energy there are different ways the energy could be shared out between the different harmonics. In quantum mechanics the amount of energy that can go into each harmonic is 'quantised' so that it is proportional to an integer. In addition, the energy of a single quantum in the nth harmonic is proportional to n. What this means is that if you have m_{1} quanta in the lowest harmonic, m_{2} in the second and so on, then the total energy in the system, in suitable units, is E = m_{1}+2*m_{2}+3*m_{3}+... In other words the number of different states with energy E is just the partition function p(E). In practice it's a little more complex because strings oscillate in a high dimensional background space, so in fact there are harmonics in different directions. But essentially it's a similar counting problem. What I've actually described is a type of conformal field theory. Bosonic string theory (which is a 26D theory) is basically 24 copies of this system.
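The counting claim is easy to verify with a short dynamic program (a sketch of mine):

```python
def partitions(E):
    # Count solutions of E = m_1 + 2*m_2 + 3*m_3 + ... with each m_k >= 0,
    # i.e. the number of ways of sharing energy E among the harmonics.
    ways = [1] + [0] * E
    for k in range(1, E + 1):      # admit quanta of the k-th harmonic
        for e in range(k, E + 1):
            ways[e] += ways[e - k]
    return ways[E]

print([partitions(n) for n in range(10)])  # [1, 1, 2, 3, 5, 7, 11, 15, 22, 30]
```

These are exactly the partition numbers p(0), p(1), ..., as they should be.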
The interesting twist with quantum string theory is that the minimum energy you can put into each harmonic isn't zero. You can argue this from the Heisenberg uncertainty principle: the string can't be at rest because then you'd know its position and momentum simultaneously. So it must have nonzero energy. You can treat each harmonic independently, so each harmonic has some energy; in fact they each have a minimum of half a quantum. (This is standard stuff.) So the correct formula is really E = (m_{1}+1/2)+2*(m_{2}+1/2)+3*(m_{3}+1/2)+... The catch is that when you try to add up the zero-point energies you get a divergent series. But this is no problem for physicists as you can sum the series using the methods outlined here. In fact, the 'zero-point' energy is essentially 1/2*(1+2+3+...), and zeta regularisation gives 1/2*zeta(-1) = -1/24, which is the 1/24 in Rademacher's formula. So the partition function of n actually counts the states of a conformal field theory in its nth excited level. As I've obsessed about before, this is the same 24 that is the dimensionality of the Leech lattice, the Golay code and lots of other good stuff. It's all connected mysteriously! Somehow this information about the quantum theory of strings is already encoded in the properties of the partition function discovered 70 years ago, even though the partition function is defined by the elementary properties of numbers. This is also why I think String Theory is cool even if it might be useless as a physical theory. Even if we wanted to give up the string theory 'research programme' today it'd still be there hidden away inside the mathematics we study, regardless of whether we think that mathematics has physical applications.
I'll stop there as I don't even pretend to understand the connections. But do check out the proof of Rademacher's formula via Farey sequences and Ford circles. It's in a few analytic number theory books (e.g. Apostol's Springer text on the subject) and it's one of the more beautiful proofs out there.
Anyway, that's all irrelevant. What's relevant is that the partition function has neat congruence properties. Ramanujan proved p(5m+4) is a multiple of 5 for all natural numbers m. He also showed p(7m+5) is a multiple of 7 and p(11m+6) is a multiple of 11. Many years ago Dyson explained the 5 and 7 cases with his 'rank' and conjectured a 'crank' that would explain 11; the crank was later found by Andrews and Garvan. But in the last couple of months Mahlburg has proved that the crank explains such congruences for all primes greater than 3, giving a way to produce congruences for any such prime. This is a pretty major result. People have been investigating these congruences at least since Ramanujan's day but now we have a handle on all of them simultaneously. More detail is posted here and this brings to an end a long chapter in number theory. Note, by the way, how often the number 24 appears throughout this paper. :)
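Those congruences are easy to spot-check numerically. Here's a sketch (mine) using Euler's pentagonal number recurrence for p(n):

```python
def p(n, _cache=[1]):
    # Euler: p(m) = sum_{k>=1} (-1)^(k+1) * (p(m - k(3k-1)/2) + p(m - k(3k+1)/2))
    while len(_cache) <= n:
        m = len(_cache)
        total, k = 0, 1
        while True:
            g1 = k * (3 * k - 1) // 2   # generalised pentagonal numbers
            g2 = k * (3 * k + 1) // 2
            if g1 > m:
                break
            sign = 1 if k % 2 else -1
            total += sign * _cache[m - g1]
            if g2 <= m:
                total += sign * _cache[m - g2]
            k += 1
        _cache.append(total)
    return _cache[n]

# Ramanujan's congruences, spot-checked for the first fifty cases of each:
assert all(p(5 * m + 4) % 5 == 0 for m in range(50))
assert all(p(7 * m + 5) % 7 == 0 for m in range(50))
assert all(p(11 * m + 6) % 11 == 0 for m in range(50))
```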
Update: I've edited this so it makes slightly more sense.
Adjoints
OK, I will explain. Consider the line of code, in C++, a += b. We can write this in matrix form as
( a )   ( 1 1 ) ( a )
( b ) = ( 0 1 ) ( b )
If we take the transpose of the matrix we get
( a )   ( 1 0 ) ( a )
( b ) = ( 1 1 ) ( b )
which is the same as the code b += a.
More generally, any block of code that performs linear updates to variables can be converted to its adjoint by writing the steps backwards and transposing each line of code like this. A more complex example (exercise) is a = 2*a+3*b+4*c; b = 2*a, which maps to a += 2*b; b = 0; b += 3*a; c += 4*a; a *= 2. If your code has loops, e.g. iterating over arrays of variables, you just run the loops backwards. The application for me is that there is a published fast algorithm for some linear operator but I need the adjoint. Surprisingly, the adjoint of the fast algorithm can actually turn out to be a whole new fast algorithm.
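Here's a quick numerical check of that example (my own sketch, reading the example's constants as 2, 3 and 4). The defining property of the adjoint is that <Mx, y> = <x, M^T y>, where M is the matrix of the forward block:

```python
def forward(a, b, c):
    a = 2 * a + 3 * b + 4 * c   # a = 2*a + 3*b + 4*c
    b = 2 * a                   # b = 2*a (uses the updated a)
    return a, b, c

def adjoint(a, b, c):
    # the steps reversed, each line transposed
    a += 2 * b                  # transpose of b = 2*a ...
    b = 0                       # ... which also zeroes b
    b += 3 * a                  # transpose of a = 2*a + 3*b + 4*c,
    c += 4 * a                  # with a *= 2 done last so that b and c
    a *= 2                      # see the pre-update value of a
    return a, b, c

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

x, y = (1, 2, 3), (4, 5, 6)
assert dot(forward(*x), y) == dot(x, adjoint(*y))  # both equal 298
```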
Incidentally, I asked the journal editor for the expected turnaround time for the decision on whether to publish. He responded with the mean, standard deviation, extrema and quartiles. If only people like plumbers, doctors and cable installers could respond in this way. "If I attempt to fix this leak it has a 50% chance of taking 1 hour to fix, a 25% chance of taking 2 and there's a 5% chance I won't get it done today" rather than "I'm charging by the hour and it'll take as long as it takes unless there's a complication in which case it'll take longer".
Tuesday, September 06, 2005
Infinitesimals
One of the earliest ways to construct the real numbers from the rationals was as equivalence classes of sequences of rational numbers, two sequences being equivalent if their difference tends to zero. In Non-Standard Analysis, developed by Robinson, this definition is tweaked slightly so that two sequences are equivalent if they agree on a 'large' set of indices. The trick is to define 'large' appropriately, leading to the concept of an ultrafilter. This results in a set *R which is an extension of the usual set of reals R but also contains infinitely small and infinitely large numbers. There's also a nice 'metatheorem' that goes with this construction showing that 'reasonable' theorems about *R are also theorems about R. This basically gives us license to use infinitesimals in our proofs and still be sure that our results make sense as propositions about real numbers. So many of Leibniz's old 'proofs' using infinitesimals, say, can now be reinterpreted as perfectly valid proofs.
Anyway, you don't need all this machinery to do automatic differentiation, just an algebraic structure with elements whose square is zero.
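For instance (a minimal sketch of my own), adjoin an element eps with eps*eps = 0. Evaluating a polynomial at x + eps then carries f'(x) along in the eps coefficient, with no limits or ultrafilters required:

```python
class Dual:
    # numbers of the form a + b*eps, where eps squares to zero
    def __init__(self, a, b=0):
        self.a, self.b = a, b
    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.a + other.a, self.b + other.b)
    __radd__ = __add__
    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a1 + b1*eps)(a2 + b2*eps) = a1*a2 + (a1*b2 + b1*a2)*eps
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)
    __rmul__ = __mul__

def derivative(f, x):
    return f(Dual(x, 1)).b   # the eps coefficient is f'(x)

print(derivative(lambda x: x * x * x + 2 * x, 5))  # 3*5**2 + 2 = 77
```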
Incidentally, Gödel used ultrafilters to 'prove' the existence of God.
Dan Piponi
Blog: A Neighborhood of Infinity | Code: Github | Twitter: sigfpe | Home page: www.sigfpe.com