Saturday, October 10, 2009

Vectors, Invariance, and Math APIs

Many software libraries, especially those for physics or 3D graphics, are equipped with tools for working with vectors. I'd like to point out how in these libraries the functions for manipulating vectors sometimes have special and useful properties that make it worthwhile to separate them out into their own interface.

Two types of Function


Suppose an object of mass m is moving with velocity v and we apply force f to it for time t. What is the final velocity? This is given by g:

g(m,t,f,v) = v+(t/m)*f

Now suppose that R is a rotation operation, typically represented by a matrix. What happens if we apply it to both of the vector arguments of g?

g(m,t,Rf,Rv) = Rv+(t/m)*Rf = Rg(m,t,f,v)

In other words, rotating the vector arguments is the same as rotating the vector result.

Another example: Consider the function that gives the force on an electric charge as a function of its velocity and the magnetic field:

f(e,v,B) = ev×B

It's essentially just the cross product. If you rotate both of the arguments to the cross product then the result is rotated too. The result is that

f(e,Rv,RB) = Rf(e,v,B)


On the other hand, many 3D APIs come with a function to perform componentwise multiplication of vectors. Write vectors x as triples (x0,x1,x2), and so on, we can write such a function as:

f(x,y) = (x0y0,x1y1,x2y2)

You can show that this doesn't have a similar property.

Rotational Invariance


To make things easy, let's restrict ourselves to functions of scalars and vectors. And when I say vector, I'm talking strictly about vectors representing magnitude and direction, but not positions. Examples of such vectors are velocities, accelerations, angular velocities, magnetic fields, and the difference between two positions. A function is said to be rotationally invariant if applying a rotation R to all of its vector arguments results in the same thing as applying R to all of the vectors in its value. This allows you to have a function that returns multiple vectors, like a tuple or array.

The first two functions I described above were rotationally invariant but the third wasn't. Notice how the first two examples also described physical processes. This is the important point: as far as we know, all of the laws of physics are rotationally invariant. If you write down an equation describing a physical process then replacing all of the vectors in it by their rotated counterparts must also result in a valid equation. So if you're describing a physical process with a computer program, and you end up with a function that isn't rotationally invariant, you've made a mistake somewhere.

Vector APIs


Vector APIs frequently come with all manner of functions. Some have the invariance property and some don't. If you write code that you'd like to be rotationally invariant, but it turns out that it isn't, you usually have to examine the code to find the bug. But if you separate the invariant functions into their own interface, and then write code using just that interface, the code is guaranteed to be invariant. If your programming language has reasonably strict types then you may even be able to arrange things so that the type signature of the function alone is enough to tell you that the function is invariant. In effect you are able to make the compiler prove that your function is invariant.

(As an aside, this is an example of why a good type system does much more than you might at first have guessed. They don't just stop you making typos, they can do things like prove that your programs satisfy certain geometrical properties.)

So what functions would you have in such an API? Among the essential rotationally invariant functions are:

1. Multiplcation of a vector by a scalar
2. Addition of vectors
3. Dot product
4. Cross product

In terms of these you can build functions such as

1. Vector length
2. Vector normalization
3. Rotation of one vector around an axis specified by another vector
4. Linear interpolation between vectors

What kinds of functions would be excluded?

1. Constructing a vector from three scalars, ie. f(x,y,z) = (x,y,z).
2. Constructing a vector form a single scalar, ie. f(x) = (x,x,x).
3. Extracting the ith component of a vector, ie. f(i,(x0,x1,x2)) = xi.
4. Pointwise multiplication of vectors.
5. Computing the elementwise cosine of a vector.

On seeing the first excluded example above you might ask "how am I supposed to construct vectors?" The point is that you don't program exclusively with an invariant API, you simply use it whenever you need to prove invariance.

Coordinate System Invariance


One purpose of writing to a particular interface is that it allows the API to hide implementation details from the user. Using a rotationally invariant API has a role to serve here. For example, many 3D renderers allow you to write shaders. These are essentially functions that compute the colour of a piece of geometry that needs rendering. You write a shader and the renderer then calls your shader as needed when a fragment of geometry passes through its pipeline. Frequently these are used for lighting calculations but there are all kinds of other things that may be computed in shaders.

In a 3D renderer different parts of the computation are often performed in different coordinate systems. For example it may be convenient to perform lighting calculations in a coordinate system oriented with the direction of the light. But the author of a renderer doesn't want to be committed to a particular choice. In order to do this, it is essential to be able to write shaders that are agnostic about which coordinate system is being used. If we work with rotationally invariant functions, our shaders are guaranteed to be agnostic in this way (assuming that the only kind of coordinate change that takes place is a rotation).

Lots More Types


I've concentrated on just one type of invariance, rotational invariance. If we consider more types of invariance then more types of interface naturally emerge. It would take too long to cover all of the details here so I'm just going to briefly sketch the beginnings of the more general case. So just read this section as a list of pointers to further reading.

For example, some functions are invariant under translations. These can be thoght of as functions of points in space. If we allow more general linear transformations then we find that some common functions transform 'oppositely' to vectors. In particular, normals to surfaces transform in this way. In fact, Pixar's Renderman has three distinct types, vectors, points and normals that captures these different invariances.

If we go back to rotations again but now extend these by allowing reflections then we find an interesting new phenomenon. For example, consider the result of reflecting in the x-y-plane, followed by reflecting in the y-z-plane followed by reflecting in the x-z-plane. This simply multiplies vectors by -1. Dot product is invariant under this: (-x)·(-y)=x·y. But cross product isn't because (-x)×(-y)=x×y. Even though the cross product is apparently vector, it doesn't get multiplied by -1. When we start considering invariance under reflection we find that some vectors behave differently. These are the pseudovectors and in effect they have their own separate type and interface. Interestingly, nature likes to keep pseudovectors and vectors separate except in parity violating phenomena. There are even pseudoscalars.

Incidentally, if you consider invariance under scaling you're led to the idea of encoding dimensions in your types.

Conclusion


If you're writing a vector API think about the invariance properties that your functions may have. If any are useful then it may be worth placing those in a separate interface. The more distinct types you have, the more properties you can make your compiler prove. Obviously this needs to be balanced against practicality, complexity for users and what you actually need. To some extent, many existing APIs make some of these distinctions with varying degrees of strictness. The main point I want to make clear is that these distinctions are based on invariance properties, something that not all developers of such APIs are aware of.

At some point I hope to return to this topic and enumerate all of the common vector-like types in a single framework. Unfortunately it's a big topic and I've only been able to scratch the surface here. In particular there are some subtle interplays between dimensions and types.

On a deeper level, I think there must be some type theoretical framework in which these invariance properties are free theorems.

Update: I believe some of this material is covered in Jimm Blinn's Vectors and Geometry and Objects, Oh My!, but I don't have access to that. I suspect that there is one big difference in my presentation: I'm not so interested here in vectors (or normals or whatever) in themselves but as defining interfaces to functions with invariance properties. Like the way category theorists shift the focus from objects to arrows. It makes a difference because it immediately gives theorems that our code is guaranteed to satisfy. It's the invariance property of the cross product (say) that is useful here, not the fact that the components of a vector transform a certain way when we change coordinates (because I might not even want to refer to coordinates in my code).

Example Code



To show that I'm talking about something very simple, but still powerful, here's some Haskell code:


> data Triple = T Float Float Float deriving Show

> class Vector v where
> (.+) :: v -> v -> v
> (.*) :: Float -> v -> v
> dot :: v -> v -> Float
> cross :: v -> v -> v

> instance Vector Triple where
> T x y z .+ T u v w = T (x+u) (y+v) (z+w)
> a .* T x y z = T (a*x) (a*y) (a*z)
> dot (T x y z) (T u v w) = x*u+y*v+z*w
> cross (T x y z) (T u v w) = T
> (y*w-v*z)
> (z*u-x*w)
> (x*v-y*u)


You can freely apply the four primitive functions to elements of type Triple but if you have a function of, say, signature


> f :: Vector v => (v,v,Float) -> [(v,v)]


you are guaranteed it is invariant.

4 comments:

Derek Elkins said...

As I'm sure you well know, the geometric algebra and particularly the conformal geometric algebra makes it easier and clearer to recognize and maintain these invariants. For example, in any geometric algebra the distinction between a free vector and a normal vector is clear as normals are represented by tangent bivectors and which automatically transform appropriately under reflections. Similarly, the conformal geometric algebra distinguishes free vectors, tangent vectors, position vectors and normal "vectors" readily. Usually when I write my own 3D code, I make a Point type distinct from a Vector type, and also, Bivector, Spinor, and Pseudoscalar types.

Any rotationally invariant function is a natural transformation between (products of) the inclusion of the category of vectors and orthogonal transformations into the category of vectors and general linear transformations (or more or less general categories, e.g. it could be the identity functor).

Call the inclusion I, then g : I -> I means for a vector space V and an orthogonal transformation on it, R, we have g(IRv) = IRg(v).

Anonymous said...

Amazing post as usual! I'll soon learn Haskell just to try out your mind warping examples. :)

Andrew Sackville-West said...

Isn't this really just a distinction between linear versus non-linear transformations? I think as a matter of practicality, lots of people stick all sorts of vector operations into libraries but don't make the distinction that certain operations are most definitely *not* linear transformations.

And there is probably a lot of misunderstanding among users of libraries as to what rules apply to linear or non-linear transformations.

Rotation is a linear transformation and thus preserves vector addition and scalar multiplication. So you can do all sorts of stuff with it and the order of the operations doesn't really matter.

The component-wise multiplication of vectors is not a linear transformation and so none of those rules apply.

I think you are right, that the segregation of APIs makes some sense, but even more important is getting the user's of APIs to understand the distinctions between different types of transformations. (But maybe I'm missing your point and being too simplistic).

gelisam said...

This comment spells out the free theorem instantiation in which invariance properties are free theorems.

First, I do not think that the rotational invariance of (.+), (.*), dot and cross are free theorems. Rather, for all other operations made out of those primitives, the fact that it too is rotation invariant is a free theorem.

With a class annotation like (Vector v =>), the free relation corresponding to the type v can be any relation as long as it satisfies the free theorems corresponding to the types of the class methods. In our case, we wish to use the relation (Rx == x'), which needs to satisfy the following requirements:

R (x + y) == Rx + Ry
R (a * x) == a * Rx
dot x y == dot Rx Ry
R (cross x y) == cross Rx Ry

Those requirements are specifically stating that the (.+), (.*), dot and cross primitives are rotation invariant.

Now if those requirements are satisfied (which they are), we can compute the free relation corresponding to the right-hand side of (Vector v =>) and we will obtain a free theorem. With the f example at the end of the post, for example, the free theorem is

map rotpair (f (x,y,a)) == f (Rx, Ry, a)
where
rotpair (s, t) = (Rs, Rt)

which, as desired, states that f is rotation invariant.

Blog Archive