Thursday, August 25, 2005

Caustics and Catastrophes

Caustics, at least the optical ones, are the bright regions you see when light passes through a refracting medium. Examples include the patterns you see on the bottom of a pool, the bright patches you might see on a wall opposite a curved window, the focal point of a lens when a parallel beam falls onto it, or the familiar cardioid shape you see in a cup of coffee. The twinkling of stars are caustics caused by atmospheric turbulence passing across your retina. Caustics aren't always optical effects. For example rogue waves at sea can be caused by refraction-like effects as waves move between regions of sea with different currents.

The shape of a caustic is a complicated function of the physical system. But whatever the system you frequently end up with the same features. For example in my image above you see many regions where two twisty lines meet forming a kind of long twisty V shape where the sides of the V protrude a little past the intersection to make a kind of swallowtail shape. This shape is universal. You'll see it in caustics formed by all kinds of refracting media. You'll see it in the kind of caustics formed by the gravitational fields of astronomical bodies in microlensing events. There are other characteristic shapes that frequently come up and what makes them so interesting is that they are independent of the precise details of the physical system. The underlying explanation is catastrophe theory. Unfortunately, few people talk about catastrophe theory these days as it became tarnished through misapplication. I guess it was the chaos theory of the 70s. But like chaos theory, it has completely sound mathematical underpinnings.

Very roughly, in catastrophe theory we look at (infinitely) differentiable functions f:M->N where M and N are manifolds. The size of the set f-1(x) may vary with x depending on how f 'folds' M into N. Obviously the map from x to the size of f-1(x) is discontinuous if it's non-constant on a connected manifold N. Catastrophe theory is basically about the singular points where this set jumps in size. The interesting thing is that in low dimensions, then there are basically 7 types of 'interesting' singularity up to local diffeomorphism. This is known as Thom's theorem. These 7 types have interesting names from the simple 'fold', through the swallowtail to the parabolic umbilic. And you've seen the swallowtail already, it's what I pointed out in the caustic image above. The reason why these catastrophes appear in caustics is that you can think of an optical system mapping incoming rays to the destination surface on which they eventually impinge. For non-trivial systems this is a many-to-one function and it can fold back on itself in all sorts of interesting ways. Thom's Theorem therefore allows us to classify the types of shapes we may see in caustics without knowing specific details about the system.

Anyway, I was interested in finding a way to generate caustic-like images cheaply. Armed with this knowledge that caustics can be seen as folds of some manifold mapping to another I was able to produce I really cheap algorithm. Basically I take the plane. I deform it using a random function (eg. Perlin noise or something synthesised in the Fourier domain). I then render the deformed plane. I do this by tesselating the plane as triangles. Each triangle in the tesselation represents a beam with triangular cross section. So the key thing is to ensure that the amount of light going through each triangle is the same before and after deformation. In other words, if the deformation makes a triangle twice as large I make it half as bright and vice versa. That's it. Animate the deformation nicely and you get something that looks just like the patterns at the bottom of a pool rendered in real-time and without have to use any knowledge of optics. It essentially looks like caustics simply because of the universality of catastrophes meaning that a wide variety of randomly made up deformations look like the right sort of thing. This paper does something similar except that they attempt to approximate the correct caustic for a given water surface whereas I'm just trying to capture a look. The image above is an example.

Catastrophe was used in the 70s for things like modelling conflict, understanding the difference between erotica and pornography, predicting when prison riots will happen and so on. You can see why the subject fell out of favour. As I say, it was the Chaos theory of its day.

Incidentally, you may wonder what deeper significance the number 7 has here. It turns out the catastrophes are each closely tied to Dynkin Diagrams. Dynkin diagrams have a bizarre habit of popping up in all sorts of branches of mathematics - quivers being a popular one right now. But at this point I really have no clue what's going on as I've barely read a few pages of my Catastrophe Theory book.


Wednesday, August 24, 2005

The Unnatural Naturals

In a comment from my previous post, vito got me thinking about the naturals. Most mathematicians use the term 'natural' to refer to numbers in the set {1,2,3,...}. However, I prefer to think of the set {0,1,2,3,...} as the naturals. As the precise definition of 'natural' is still debatable I can get away with my usage, as long as I define exactly what I mean. My gripe with the more usual definition of 'natural' is that it's not natural at all.

Consider some of the natural ways to build the integers. For example there are the Peano axioms and the finite ordinals. These define sequences that start at zero, not one. In fact, both of those web pages define zero to be natural. Countless theorems are more naturally written as statements about {0,1,2,...} rather than {1,2,3,...}. I can't even see why the set {1,2,3,...} needs a name any more than the set {2,3,4,5,6,7,...}. Having said that, when I was a student at Cambridge there was a formal debate over the definition of the naturals. Things suddenly became more complicated when a faction suddenly appeared arguing for the set {2,3,4,...} on the grounds that if numbers are for counting then the number one doesn't exactly do much counting. I think I should have formed a pro-{3,4,5,6,...} faction at this point, after all, the ancient greeks, who are noted for their wisdom, only marked their nouns as plural for numbers strictly greater than 2.

Anyway, there is some discussion of the debate on Wikipedia and the conclusion is probably the same one I would come to: the set {0,1,2,3,...} is more popular among logicians and set theorists (I'd probably add category theorists) and {1,2,3,4,...} is more popular among number theorists.

And of course, when programming, I think arrays should start at zero. You won't catch me using Fortran or matlab. Come to think of it, as the first ordinal is zero, shouldn't we actually start counting at zero?


Sunday, August 21, 2005

The Alternative Numbers

I thought I'd recycle a couple of old posts I made to K5. There may be some broken links which I'll come back and fix. Here's the first:

Fed up of seeing the same numbers over and over again? Need a little change? Well look no further than here for a quick survey of alternative number systems and their applications.

I guess most people are pretty familiar with the natural numbers {0,1,2,3,...}, otherwise known collectively as N, the integers {...,-2,-1,0,1,2,...} called Z (from Ger. Zahl), the rationals (the fractions like 1/3 and -27/11) called Q (from Ger. Quotient) and the reals (because numbers like e and pi can't be written as fractions) called simply R. Of course quite a few of you will be pretty au fait with the complex numbers, C, too.
Start with the real numbers and throw in the number i with the property i2=-1. Some of you might debate all night over whether they exist but the rest of us will just get on and use them in applications such as electronic engineering, digital signal processing and computer graphics. Q, R and C are all fields - basically this means you can add, subtract, multiply and divide them in the familiar way and in other ways are very familiar too. And if you haven't seen it before, this formula due to Euler ought to blow your mind:


Talking of graphics, quaternions are pretty popular too. You've seen how to make the complex numbers by adding in a new number i so why not try our luck again and throw in j whose square is -1. The catch now is: what is ij? Well let's call it k.
In fact if we add the rules ij=ji=k, jk=-kj=i, j2=k2=-1, and a few more, I'm sure you're getting the idea by now, we end up with H, the quaternions (damn! notice Q's already taken!).
Anyway, any quaternion can be represented by 4 real numbers. For example 1+2i-4j+11k is represented by (1,2,-4,11). That means we can embed good old 4 dimensional spacetime in there.
In fact that's what Hamilton originally used them for.
Well, Einstein hadn't come along yet so actually he was only using the last three components to represent points in space but you can use the first for time too. Turns out the quaternions do a really sweet job of doing geometry which is why graphics people like to use them. If you thought Euler's formula was cool that's nothing compared to what you can do with these babies. In fact their main application is in eliminating all those complicated trig formulas you get when you try to carry out rotations in 3D.
Notice that we lost something by using these numbers. Two times three is three times two but i times j isn't j times i. We've sacrificed what's known in the game as commutativity.

Well there's a bit of a pattern going on. Start with R, move on to C and then Q. The quaternions were 4-dimensional. What about 3-dimensional? Or 5-dimensional? Well you can try making these things (exercise!) but you find they're not all that interesting. They just aren't well behaved.
That is until you come to 8 dimensions. Then you find you can make the octonions, otherwise known as the Cayley numbers and labelled O. Still, you have to make more sacrifices. When you multiply a by b and c it matters whether you do ab or bc first.
No longer are we guaranteed things like (2.3).5=2.(3.5). We have lost associativity. Still, there's fun to be had with octonions. You can discover the delights of spinors and triality, find that there is a well defined cross product in 7 dimensions and if you've enough stamina you'll see they have neat connections with string theory.

How long can we keep this up for? Well if you really insist you can make the 16-dimensional sedenions but they're so unruly that I just wouldn't bother if I were you!

But you might have noticed. All of these things were built from the real numbers. The whole point here is to get away from them. What other alternatives do we have? Well you could play with the integers modulo N. Just do normal math but every time you get an answer bigger than or equal to N divide by N and keep only the remainder. For example modulo 9, 8+7=6 and 3*3=...hmmm...0. That zero isn't too good, we don't want to end up with zero when we multiply perfectly respectable numbers. That's easy to fix: choose N to be prime. (Think about it!). Now we can even divide these numbers. In fact Z/NZ (that's their other name BTW) forms a field, just like R and C. That means you can add, subtract, multiply and divide by anything non-zero. Still, numbers that only go up to N-1. Sounds a bit useless. Au contraire! They're the at the heart of algorithms like RSA encryption.

But we're still not being alternative enough.
After all, numbers modulo N are still, well, numbers. Can we get a bit more radical?
Try this for size: you're used to decimal expansions that go on forever like 1/7=0.14285714285...
Maybe we can make digits go infinitely far to the left of the decimal point instead.
Consider a number like ...99999999. Looks pretty big. But try adding 1. Well 9+1=0, carry 1. Write down the zero. Now 9+1=0, carry 1. Hmmm...looks like the answer is going to be ...000000 with that 1 getting carried out to infinity and its final cashing out getting postponed forever. So what we have is ...9999999=-1. That might look familiar to some of you. I'll give you a clue: this is called tens complement arithmetic. Or at least I've just called it that. So it seems that arithmetic with infinite sequences of digits might just work out. The catch is that you can't always divide these things. However, if we work not with base 10, but base p digits it turns out that we can always divide by any non-zero number. These numbers are called the p-adics, or Zp. For each prime p the p-adics form a field. They're very useful for doing number theory and of course that means they're good for cryptography. BTW, going back to the 10-adics again, try calculating directly that ...9999992=1.

But what about infinity? Can't we make that a number? Turns out that there are lots of ways to introduce infinity to a number system. Let's start with the cardinals - numbers used to count how many elements there are in a set. For example {red,green,blue} has 3 elements meaning that the set of primary colors has three elements.
Here 3 is being used as a cardinal. But what about this perfectly respectable set {0,1,2,3,...}=N, what is the size of that? Well that's the first transfinite cardinal and it goes by the name of aleph0. (I apologize, I don't know how to do hebrew HTML.) The next biggest cardinal is called aleph1 and you can guess the rest of the pattern. But here's an interesting conundrum: how big is the set of real numbers R?
Well it's called c and it's bigger than aleph0. But is it bigger than aleph1? Turns out it's your choice!
Whichever way you go the rest of mathematics is happy to accommodate you.

There's more than one way to make an infinity. You can play with the transfinite ordinals too. Go back to the beginning when nobody had invented any mathematics. We had nothing, in other words the empty set {}, which we'll call 0. On the next day we can try making another set - after all we now have something to put in a set, the empty set itself. So define 1={0}. Now we can define 2={0,1} and I'm sure you see the pattern n={0,...,n-1}. These are the ordinals. But why stop there, how about {0,1,2,...}?
Well that's the first transfinite ordinal and we call it w (that's meant to be omega, I don't do Greek HTML either). Now we can make {0,1,2,...,w}. That's called w+1. How about {0,1,2,w,w+1,w+2,...}. Well that has a name too, 2w. Want to carry on? Don't let me stop you. But I'd better warn you that you might not be able to get all cardinals this way because I've only shown you how to get the accessible ones.

Maybe you're not interested in big. We could try small. In calculus you probably got fed up of always using the fact that as dx->0, dx2 could be ignored. Why don't you just set dx2 exactly equal to zero and be done with it. Well that's what nonstandard analysis is about. It introduces a new kind of infinitesimal number that has just this property. Useful if you're lazy about proofs.

But maybe mathematics isn't your thing and none of these systems looks like fun to you. Then there is one last thing I can throw at you.
Combinatorial games. What do games have to do with numbers?
Though invented by Conway (yes, that's the Conway), Don Knuth gave them another name: Surreal Numbers. Here are some surreal numbers: 0, *1, *2, *3... And here's how to add them: *a+*b=*(a xor b). That's the good
old xor that's written '^' in C. If you know how to win at Nim then you know why these are good for winning games. The *n are called nimbers and they are just a tiny part of the entire collection of surreal numbers, all of which represent positions in games. What's more, the surreal numbers include Z, Q and R. Hard to believe but real numbers are all positions in games. But that's not all, the surreal numbers include the transfinite ordinals, lots of different kinds of infinitesimals and countless other weird and wonderful numbers. If you want to beat your friends at Hackenbush, Nim, Mogul, and maybe even Go, then these are the numbers for you.


Tuesday, August 16, 2005

Toric Varieties

I keep meaning to read up on this subject but my algebraic geometry ability isn't quite there yet. One of the reasons I'd like to understand toric varieties is that they give you a concrete handle on certain types of algebraic curves and thereby give solid interpretations of results like the Riemann-Roch theorem in terms of counting features of simple geometrical objects.

Part of the idea is this: consider a convex polytope in Rn whose vertices all have integer coordinates. A simple example is the polytope in R2 with vertices (0,0), (1,0) and (0,1). This is also the list of integer points in this polytope. Now write out the monomials in x and y whose exponents are given by the coordinates of the integer points. In this case we get:


Use these to define a mapping from Cn to Cm where m is the number of integer points in the polytope. In this case we get

(x,y) -> (1,x,y)

This defines a map from Cn to CPm-1 by considering (1,x,y) to be homogeneous coordinates. The closure of the image of this map is an example of a toric variety. In this case it's just all of CP2. This looks like a highly contrived definition. But what's neat is that many natural properties of the toric variety are closely related to natural properties of the convex polytope. In particular, a result like Pick's Theorem and its higher dimensional analogues are consequences of the Riemann-Roch theorem applied to the toric variety. We get a dictionary converting statements of algebraic geometry, some of them quite tricky (for me) to grasp, to elementary statements about convex polytopes.

Here's another quick example of a toric variety to illustrate the above process. Consider the convex polytope with corners (0,0), (1,0), (0,1), (1,1). This gives monomials


so we have the mapping

(x,y) -> (p,q,r,s) = (1,x,y,xy)

It's not hard to see that the closure of the image of this map is the surface ps=qr in CP3. So this time we have a non-trivial variety.

But I don't yet understand how Pick's theorem comes from the Riemann-Roch theorem.

I'm also interested in the connection to the ubiquitous number 12. (That is the same 12 that appears in 1+2+3+...=-1/12 in an earlier posting of mine. Again, a result of the Riemann-Roch theorem.)


Thursday, August 11, 2005

Absence of Evidence is Evidence of Absence

I know Carl Sagan said the opposite, but he was clearly wrong.

Suppose that a priori X has probability p of being true. We now look for evidence for X of a certain type. Suppose that there is a probablity q that we find this evidence if X is true and probability q' that we find this evidence if X is false. We will assume p<1 (otherwise we wouldn't bother looking for evidence) and that q>q' (otherwise it couldn't be said that the evidence we're looking for is evidence for X).

So we have four possibilities:

  1. X is true and we find evidence for X: probability pq
  2. X is true and we don't find evidence for X: probability p(1-q)
  3. X is false and we find evidence for X: probability (1-p)q'
  4. X is false and we don't find evidence for X: probability (1-p)(1-q')

Under the hypotheses above, the conditional probability that X is true given that we failed to find the evidence is p(1-q)/(p(q'-q)+1-q').

Use Bayes' Theorem.

Some elementary rearrangement shows this is always less than p given the above hypotheses. It doesn't matter if we are unable to assign an a priori probability, this holds whatever value p has as long as it's less than 1. And if we don't know that q>q' then we shouldn't be in the business of looking for evidence. If the experiment we're doing is any good then q'=0 but as I have shown, the result holds even if we relax this condition.

So clearly failing to find evidence for X should lower our estimate of the probability that X is true.

I wonder what made Sagan say this. I think that maybe he meant to say "absence of evidence is not proof of absence". The theorem shows that under the original hypotheses the conditional probability is never 1, and so while we have evidence of absence, we don't have a proof. But if we can look for enough independent types of evidence it's quite possible for the conditional probability to get close to 1.

Update: The first paragraph was badly written and so I've edited it slightly. The main argument is unchanged. The argument stands or falls regardless of what Sagan actually said, but read the comments below on some context for this quotation. (02 Jan 09)

Wednesday, August 10, 2005

A Neighbourhood of Infinity

A neighbourhood of infinity is a subset of the Riemann sphere that contains an open set containing the point at infinity. But it's also a song by The Fall, from their album Perverted by Language, that I was listening to long before I did my first course on Riemann surfaces. Try as I might I can't find any mathematical references in the lyrics and as far as I know no other songs by The Fall have mathematical references in their titles.

Tuesday, August 09, 2005

ars mathematica

I'm posting stuff over at ars mathematica for a bit. Of course the stuff that's not fit to print will get posted here.

Monday, August 08, 2005

The Use and Abuse of Godel's Theorem

I just finished a first reading of Gödel's Theorem: An Incomplete Guide to its Use and Abuse by Torkel Franzén.

Gödel's theorem has a habit of appearing in all sorts of literature. Hofstadter makes it a central part of his discussion of AI in the classic Gödel, Escher, Bach. Penrose uses it to argue for the specialness of human brains. Chaitin uses it as part of various discussions about irreducible complexity. It has been invoked by theologists and literary critics. And yet, it's hard to find work on Gödel's theorem that deals with these issues and yet is written by a logician with a good grasp of the subject. This book fills that gap. It looks closely at the various ways people have abused the theorem. And I say 'abused' because all of the extra-mathematical accounts he quote are abuses.

One of the ways in which the theorem is abused is the way it is applied to all kinds of formal and non-formal systems regardless of applicability. Gödel's theorem is explicitly about formal systems that are capable of formalizing 'part of arithmetic'. It's not about complexity. As Franzén points out, there are very complex systems that are incapable of being used for arithmetic and yet there are also some very simple systems that can be used for it. Additionally, Gödel's thoerem deals only with the arithmetic component of such systems. For example if we find a way to axiomatise physics it is likely we will be able to use some of those axioms for doing arithmetic. As a result the entire system will be incomplete. Freeman Dyson has been using this to argue that physics is 'inexhaustible'. But the incompleteness might only be a feature of the arithmetic part of this system and have nothing to do with the physical part. Interestingly Dyson conceded this point when it was raised by Solomon Feferman in the New York Review of Books.

Franzén makes some very simple statements that show the falsity of some of the claims made about the theorem. For example if G is undecidable in formal system P then clearly it can be proved in P+G. So statements like "Godel's theorem demonstrates the existence of unprovable propositions" that abound in the literature are shown to be trivially false. He similarly dismisses some statements about the complexity of axiom systems made by Gregory Chaitin. Chaitin has done some great work, in particular he has a beautiful proof of Gödel's theorem using Kolmogorov complexity. But a quick visit to his web site reveals that he has a slight problem with understanding the importance of his own work. Franzén does a great job of trying to understand which parts of Chaitin's work really are significant.

Importantly Franzén is pedantic in just the right measure. When working in logic it is important to be clear about details. For example it's easy to make mistakes keeping track of which formal systems we have used to prove which results and we often need to make clear whether we use the word 'proof' to mean a string of symbols or something that convinces a mathematician (or both). But Franzén also understands that many people use Gödel's theorem, not literally, but as a metaphor or inspiration in other fields.

Franzén also devotes a good amount of space to the issue of Gödel's theorem and mathematical skepticism. He shows that the theorem can do nothing to help skeptical arguments. After all, if a formal system were able to prove its own consistency would we trust it more?

There's a reason why I mention "first reading" above. Among the statements about Gödel's theorem that Franzén dismantles are statements that I myself have been making [1]. So I now feel less confident of my understanding of the theorem than before I started the book. This is definitely not an easy book for non-specialists who really want to make sure they understand every statement made in it and it is going to require repeated readings. But any book that I immediately want to reread on completing it can't be all that bad.

[1] I've always understood Gödel's theorem as requiring a step outside of the formal system in which we interpret the symbols of a formal system as a tool for making predictions about what happens when we manipulate symbols, allowing us to turn a formal system in on itself. Franzén explicitly argues that this view is incorrect. This has left me a littlle confused.


Saturday, August 06, 2005

The Resurrection of God Incarnate

You may think this is a little off topic. But I want to point out the existence of this book: The Resurrection of God Incarnate by Richard Swinburne. What makes it on topic is that he uses Bayes' Theorem to conclude that there is a 97% chance that Jesus was God incarnate who rose from the dead. I also found this paper on ths subject and some related work here. There has been a long tradition of using logic and mathematics to argue for the existence of God - the most well known being Pascal's wager, but these look like the most sophisticated attempts I have seen so far. I haven't yet taken the time to study either this book or the papers and so I make no comment. But maybe you'd like to.

Thursday, August 04, 2005

Wavelet Noise

I'm attending a conference this week: SIGGRAPH 2005. It's nice to see that every year the papers get a little bit more mathematical. My favourite so far was on Wavelet Noise.

For those who don't work in graphics, a 'noise' function is a pseudo-random continuous, and usually differentiable, function. The most fanous example is Perlin noise. Typically a noise function is first defined by data defined on a lattice in R^n, eg. through some kind of hash function (ie. a function whose value is 'hard' to predict) and then extended to all of R^n via some kind of filter - for example fitting a smooth Catmull-Rom spline through the data points. These types of function are ubiquitous in computer graphics where they can be used to synthesise anything from the shape of a simulated tornado to the distribution of dirt on a car body.

Unfortunately one of the problems with most noise functions is that they are hard to 'anti-alias'. When you render a 3D image you typically need to colour an individual pixel by integrating over all of the light that falls onto it. The light that falls on a single pixel is typically the integral of the colour of an object over a small part of its surface - and this colour is often given by a noise function of some sort. The problem is that integration is expensive and so people often just 'fire' a bunch of rays from the pixel into the object and average the colour of the ray-object intersections. But we know from the Nyquist
that if you don't sample the colour enough you may end up getting high frequency garbage 'aliasing' as a low frequency signal. With regular colour patterns like checkerboards you get moire patterns but with noise functions you get unwanted low frequency noise. What we really need is a noise function that is band limited so that the high frequency components simply aren't present to cause aliasing. Perlin noise isn't band limited so that if you try to reduce the amount of it to suppress aliasing you end up not having any pseudo-random detail at all. In Wavelet Noise the authors explicitly construct a noise function that has a hard frequency cutoff. It becomes a true band limited function and hence we can get a reasonable approximation to its integral by sampling.

The great thing about this paper was its presentation. It had a plot. As the publication deadline approached the authors realises their noise function had a flaw. Typically we're not interested in sampling the value of the noise function in R^3 but over an R^2 surface embedded in R^3. The catch is that such a 'slice' of a band limited 3D function is not a 2D band-limited function. This meant that their noise function couldn't actually be used to colour the surfaces of objects! Fortunately they figured out the solution: instead of colouring the surface simply by pulling back the colour from the value of noise in R^3, you instead integrate along the normal to the R^2 surface. The Fourier transform of such an integral is simply a slice through the Fourier transform of the original function on R^3 and hence is band limited. In the nick of time the authors figured this out and then amazingly the authors were able to implement this with a small piece of code that runs reasonably fast.

By the way, this is also where the paper on Dual Photography was presented. I mentioned that a few months ago as I copied one of their results here.

Apologies: this entry probably has many typos and other kinds of error. For some reason the Safari web browser refuses to work at my hotel and for some reason I'm having trouble editing text in Firefox.

Monday, August 01, 2005

The Product of All Primes is 4π2

Here's the paper. A neat result that can go down alongside

1+2+3+4+5+... = -1/12

and = sqrt(2π)

Before you think I've lost my sanity, these results are examples of zeta regularisations which are frequently used in theoretical physics to 'renormalise' the infinite sums that have a habit of appearing there. Although their use in physics isn't terribly well justified, the zeta regularisation, due to Hawking, is well defined as a mathematical operation.

Methods for finding 'sums' of divergent series have a long pedigree. In fact, Hardy wrote an entire book on the subject. The techniques he uses range from Cesaro means to Borel summation.

Zeta regularised sums arise in the selection of the 'critical' dimensions in which various string theories work. For example the fact that bosonic string theory works in 26 dimensions can be seen as coming from 1+2+3+...=-1/12. Coming back down to Earth - a similar computation arises when computing the Casimir force, something that's actually measurable in a lab.