Inside the Matrix
by Enno Cramer • Math • Computer Graphics • Linear Algebra
Table of Contents
Vectors
A vector is simply a tuple of numbers, either written in a row as
v = (v1, v2, ..., vn),
or in a column as
/ v1 \ | v2 | v = | ... |. \ vn /
For brevity, I'll mostly stick to the row format in this text, but they are absolutely equivalent.
In computer graphics, we can use vectors to hold directions and
distances in euclidean space. A vector then contains one number for
each dimension of space, representing the distance along a single
axis. Thus v = (x, y, z)
is only a short form of writing v = x*a_x + y*a_y + z*a_z
, where a_x
, a_y
, and a_z
are the three axes of
space.
We can also use vectors to represent positions in space, by using a
reference point O
and storing the distance between a point and the
reference point. However, distances and points now look the same,
which is a Bad Thing, because they aren't. This is where homogeneous
coordinates come into play. We can solve this ambiguity by adding
another element, the homogenous coordinate, to the vector. For
distances, this coordinate is 0, for points it is 1. (Actually, for
points it must only be non-0, but I won't go into all the details of
homogeneous coordinates. Suffice to say that a vector v = (x, y, z, w)
with w != 0
represents the same point as v' = (x/w, y/w, z/w, 1)
.)
With this addition, the vector v = (x, y, z, w)
is a short form of
writing v = x*a_x + y*a_y + z*a_z + w*O
. Vectors are always
relative to a /coordinate system/, defined by the three axes a_x
,
a_y
, and a_z
, and the reference point O
, also called the
/origin/.
Transformations
Affine transformations, such as scaling, rotation, translation, shearings, and any combination thereof, can be expressed by defining a new coordinate system. Scalings are simply changes to the length of the axes, rotations change the orientation of the axes, translations move the reference point, and shearing change the angles between the axes.
For example, an object in a coordinate system whose axes all have a length of two units, is twice as big as the same object in a coordinate system whose axes all have a length of one unit.
Now, if we express the axes and origin of the transformed coordinate system in terms of the original coordinate system, we can apply the transformation to any vector (point or distance) by simply evaluating the equation given above.
We'll call the transformed coordinate system T
, with axes T_x
,
T_y
, and T_z
, and origin T_O
, and the original coordinate system
O
, with axes O_x
, O_y
, and O_z
, and origin O_O
. The
transformed coordinate system T
is defined relative to the original
coordinate system O
:
T_x = (T_xx, T_xy, T_xz, 0) T_y = (T_yx, T_yy, T_yz, 0) T_z = (T_zx, T_zy, T_zz, 0) T_O = (T_Ox, T_Oy, T_Oz, 1)
(Notice how the axes all have a homogeneous coordinate of 0, as they are distances, and the origin has a homogeneous coordinate of 1, as it is a point.)
With these definitions, we get
v' = (x', y', z', w') = x*T_x + y*T_y + z*T_z + w*T_O = x * (T_xx * O_x + T_xy * O_y + T_xz * O_z + 0 * O_O) + y * (T_yx * O_x + T_yy * O_y + T_yz * O_z + 0 * O_O) + z * (T_zx * O_x + T_zy * O_y + T_zz * O_z + 0 * O_O) + w * (T_Ox * O_x + T_Oy * O_y + T_Oz * O_z + 1 * O_O) = (x*T_xx + y*T_yx + z*T_zx + w*T_Ox) * O_x + (x*T_xy + y*T_yy + z*T_zy + w*T_Oy) * O_y + (x*T_xz + y*T_yz + z*T_zz + w*T_Oz) * O_z + (x* 0 + y* 0 + z* 0 + w* 1) * O_O.
Now, this looks rather complicated at first sight, but if you look closely, you'll notice a certain pattern.
Enter the Matrix
People familiar with linear algebra will probably recognize the pattern. It look suspiciously like the product of matrizes, and indeed, it can be written as such.
We have to combine the axes and origin of the transformed coordinate
system into a 4x4 matrix M
. This can be done in more than one way,
but I'll stick to the convention used by OpenGL. The other
possibilities lead to different multiplication orders (remember matrix
multiplication is not commutative) and vector notations.
If we combine the axes and origin such that each vector occupies one column of the final matrix, and consider the vector as a column matrix (a matrix with only a single column, much like the column notation of vectors), the transformation can be expressed as a simple
v' = M * v / T_xx T_yx T_zx T_Ox \ / x \ | T_xy T_yy T_zy T_Oy | | y | = | T_xz T_yz T_zz T_Oz | * | z |. \ 0 0 0 1 / \ w /
Working with Transformations
Now, that we have shown how affine transformations can be expressed as matrix multiplications, it is trivial to show how to combine transformations.
Suppose we want to move and scale an object. We have the scaling
transformation stored in the matrix S
, and the translation in the
matrix T
.
v' = T * (S * v) = (T * S) * v = TS * v
Thus, we can combine both transformations into a single matrix, simply
by multiplying the two transformation matrizes. Note, however, that
the order is important. T * S
moves the scaled object, whereas S * T
scales the already moved object, /thus amplifying the movement/.
When using OpenGL, transformations are accumulated from left to right.
Thus, to produce the transform T * S
, one has to call the
gl-functions in the order
glTranslate(...); glScale(...);
Or, put another way, the first transformation affecting an object, is the transformation last executed.