Many software libraries, especially those for physics or 3D graphics, are equipped with tools for working with vectors. I'd like to point out how in these libraries the functions for manipulating vectors sometimes have special and useful properties that make it worthwhile to separate them out into their own interface.
Two types of Function
Suppose an object of mass m is moving with velocity v
and we apply force f
to it for time t. What is the final velocity? This is given by g:
g(m,t,f,v) = v+(t/m)*f
Now suppose that R is a rotation operation, typically represented by a matrix. What happens if we apply it to both of the vector arguments of g?
g(m,t,Rf,Rv) = Rv+(t/m)*Rf = Rg(m,t,f,v)
In other words, rotating the vector arguments is the same as rotating the vector result.
Another example: Consider the function that gives the force on an electric charge as a function of its velocity and the magnetic field:
f(e,v,B) = ev×B
It's essentially just the cross product. If you rotate both of the arguments to the cross product then the result is rotated
too. The result is that
f(e,Rv,RB) = Rf(e,v,B)
On the other hand, many 3D APIs come with a function to perform componentwise multiplication of vectors. Write vectors x
as triples (x0
), and so on, we can write such a function as:
f(x,y) = (x0y0,x1y1,x2y2)
You can show that this doesn't have a similar property.
To make things easy, let's restrict ourselves to functions of scalars and vectors. And when I say vector, I'm talking strictly about vectors representing magnitude and direction, but not positions. Examples of such vectors are velocities, accelerations, angular velocities, magnetic fields, and the difference between two positions. A function is said to be rotationally invariant
if applying a rotation R to all of its vector arguments results in the same thing as applying R to all of the vectors in its value. This allows you to have a function that returns multiple vectors, like a tuple or array.
The first two functions I described above were rotationally invariant but the third wasn't. Notice how the first two examples also described physical processes. This is the important point: as far as we know, all of the laws of physics are rotationally invariant
. If you write down an equation describing a physical process then replacing all of the vectors in it by their rotated counterparts must also result in a valid equation. So if you're describing a physical process with a computer program, and you end up with a function that isn't rotationally invariant, you've made a mistake somewhere.
Vector APIs frequently come with all manner of functions. Some have the invariance property and some don't. If you write code that you'd like to be rotationally invariant, but it turns out that it isn't, you usually have to examine the code to find the bug. But if you separate the invariant functions into their own interface, and then write code using just that interface, the code is guaranteed to be invariant. If your programming language has reasonably strict types then you may even be able to arrange things so that the type signature of the function alone is enough to tell you that the function is invariant. In effect you are able to make the compiler prove that your function is invariant.
(As an aside, this is an example of why a good type system does much more than you might at first have guessed. They don't just stop you making typos, they can do things like prove that your programs satisfy certain geometrical properties.)
So what functions would you have in such an API? Among the essential rotationally invariant functions are:
1. Multiplcation of a vector by a scalar
2. Addition of vectors
3. Dot product
4. Cross product
In terms of these you can build functions such as
1. Vector length
2. Vector normalization
3. Rotation of one vector around an axis specified by another vector
4. Linear interpolation between vectors
What kinds of functions would be excluded?
1. Constructing a vector from three scalars, ie. f(x,y,z) = (x,y,z).
2. Constructing a vector form a single scalar, ie. f(x) = (x,x,x).
3. Extracting the ith component of a vector, ie. f(i,(x0
)) = xi
4. Pointwise multiplication of vectors.
5. Computing the elementwise cosine of a vector.
On seeing the first excluded example above you might ask "how am I supposed to construct vectors?" The point is that you don't program exclusively with an invariant API, you simply use it whenever you need to prove invariance.
Coordinate System Invariance
One purpose of writing to a particular interface is that it allows the API to hide implementation details from the user. Using a rotationally invariant API has a role to serve here. For example, many 3D renderers allow you to write shaders
. These are essentially functions that compute the colour of a piece of geometry that needs rendering. You write a shader and the renderer then calls your shader as needed when a fragment of geometry passes through its pipeline. Frequently these are used for lighting calculations but there are all kinds of other things that may be computed in shaders.
In a 3D renderer different parts of the computation are often performed in different coordinate systems. For example it may be convenient to perform lighting calculations in a coordinate system oriented with the direction of the light. But the author of a renderer doesn't want to be committed to a particular choice. In order to do this, it is essential to be able to write shaders that are agnostic about which coordinate system is being used. If we work with rotationally invariant functions, our shaders are guaranteed to be agnostic in this way (assuming that the only kind of coordinate change that takes place is a rotation).
Lots More Types
I've concentrated on just one type of invariance, rotational invariance. If we consider more types of invariance then more types of interface naturally emerge. It would take too long to cover all of the details here so I'm just going to briefly sketch the beginnings of the more general case. So just read this section as a list of pointers to further reading.
For example, some functions are invariant under translations. These can be thoght of as functions of points in space. If we allow more general linear transformations then we find that some common functions transform 'oppositely' to vectors. In particular, normals to surfaces transform in this way. In fact, Pixar's Renderman
has three distinct types, vectors, points and normals that captures these different invariances.
If we go back to rotations again but now extend these by allowing reflections then we find an interesting new phenomenon. For example, consider the result of reflecting in the x-y-plane, followed by reflecting in the y-z-plane followed by reflecting in the x-z-plane. This simply multiplies vectors by -1. Dot product is invariant under this: (-x
. But cross product isn't because (-x
. Even though the cross product is apparently vector, it doesn't get multiplied by -1. When we start considering invariance under reflection we find that some vectors behave differently. These are the pseudovectors
and in effect they have their own separate type and interface. Interestingly, nature likes to keep pseudovectors and vectors separate except in parity violating
phenomena. There are even pseudoscalars
Incidentally, if you consider invariance under scaling you're led to the idea of encoding dimensions
in your types
If you're writing a vector API think about the invariance properties that your functions may have. If any are useful then it may be worth placing those in a separate interface. The more distinct types you have, the more properties you can make your compiler prove. Obviously this needs to be balanced against practicality, complexity for users and what you actually need. To some extent, many existing APIs make some of these distinctions with varying degrees of strictness. The main point I want to make clear is that these distinctions are based on invariance
properties, something that not all developers of such APIs are aware of.
At some point I hope to return to this topic and enumerate all of the common vector-like types in a single framework. Unfortunately it's a big topic and I've only been able to scratch the surface here. In particular there are some subtle interplays between dimensions and types.
On a deeper level, I think there must be some type theoretical framework in which these invariance properties are free theorems
Update: I believe some of this material is covered in Jimm Blinn's Vectors and Geometry and Objects, Oh My!
, but I don't have access to that. I suspect that there is one big difference in my presentation: I'm not so interested here in vectors (or normals or whatever) in themselves but as defining interfaces to functions with invariance properties. Like the way category theorists shift the focus from objects to arrows. It makes a difference because it immediately gives theorems that our code is guaranteed to satisfy. It's the invariance property of the cross product (say) that is useful here, not the fact that the components of a vector transform a certain way when we change coordinates (because I might not even want to refer to coordinates in my code).
To show that I'm talking about something very simple, but still powerful, here's some Haskell code:
> data Triple = T Float Float Float deriving Show
> class Vector v where
> (.+) :: v -> v -> v
> (.*) :: Float -> v -> v
> dot :: v -> v -> Float
> cross :: v -> v -> v
> instance Vector Triple where
> T x y z .+ T u v w = T (x+u) (y+v) (z+w)
> a .* T x y z = T (a*x) (a*y) (a*z)
> dot (T x y z) (T u v w) = x*u+y*v+z*w
> cross (T x y z) (T u v w) = T
You can freely apply the four primitive functions to elements of type Triple
but if you have a function of, say, signature
> f :: Vector v => (v,v,Float) -> [(v,v)]
you are guaranteed it is invariant.