
We'll start with a visual and intuitive representation of what a projection is. In the following diagram, we have vector b in the usual 3-dimensional space and two possible projections - one onto the z axis, and another onto the x,y plane.

If we think of 3D space as spanned by the usual basis vectors, a projection onto the z axis is simply:

$$\vec{p} = \begin{pmatrix} 0 \\ 0 \\ b_3 \end{pmatrix}$$

where $b_3$ is the z component of $\vec{b}$.

A couple of intuitive ways to think about what a projection means:

• The projection of b on the z axis is a vector in the direction of the z axis that's closest to b.
• The projection of b on the z axis is the shadow cast by b when a flashlight is pointed at it in the direction of the z axis.

We'll see a more formal definition soon. A projection onto the x,y plane is similarly easy to express:

$$\vec{p} = \begin{pmatrix} b_1 \\ b_2 \\ 0 \end{pmatrix}$$

## Projection onto a line

Projecting onto an axis is easy - as the diagram shows, it's simply taking the vector component in the direction of the axis. But how about projections onto arbitrary lines?

In vector space, a line is just all possible scalings of some vector [1].

Speaking more formally now, we're interested in the projection of $\vec{b}$ onto $\vec{a}$, where the arrow over a letter means it's a vector. The projection (which we call $\vec{p}$) is the closest vector to $\vec{b}$ in the direction of $\vec{a}$. In other words, the dotted line in the diagram is at a right angle to the line a; therefore, the error vector $\vec{e} = \vec{b} - \vec{p}$ is orthogonal to $\vec{a}$.

This orthogonality gives us the tools we need to find the projection. We'll want to find a constant c such that:

$$\vec{e} = \vec{b} - c\vec{a}$$

is orthogonal to $\vec{a}$, meaning that their dot product is zero: $(\vec{b} - c\vec{a}) \cdot \vec{a} = 0$. We'll use the distributive property of the dot product in what follows:

$$\vec{a} \cdot \vec{b} - c\,(\vec{a} \cdot \vec{a}) = 0 \quad \Longrightarrow \quad c = \frac{\vec{a} \cdot \vec{b}}{\vec{a} \cdot \vec{a}}$$

Note that $\vec{a} \cdot \vec{a}$ is the squared magnitude of $\vec{a}$; for a unit vector this would be 1. This is why it doesn't matter if $\vec{a}$ is a unit vector or not - the denominator normalizes it anyway.

We have a formula for c now - we can find it given $\vec{a}$ and $\vec{b}$. To prepare for what comes next, however, we'll switch notations. We'll use matrix notation, in which vectors are - by convention - column vectors, and a dot product can be expressed by a matrix multiplication between a row and a column vector. Therefore:

$$c = \frac{a^T b}{a^T a}$$
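To make the formula concrete, here's a small numeric sketch in Python (NumPy is assumed to be available; the vectors are made-up examples, not taken from the post):

```python
import numpy as np

# Made-up example vectors: project b onto the line spanned by a.
a = np.array([1.0, 2.0, 2.0])
b = np.array([3.0, 0.0, 4.0])

# c = (a . b) / (a . a)
c = np.dot(a, b) / np.dot(a, a)

# The error vector e = b - c*a must be orthogonal to a.
e = b - c * a
print(np.isclose(np.dot(e, a), 0.0))  # True
```

The orthogonality check holds for any choice of a and b, since it's exactly the condition we solved for.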

## Projection matrix

Since the fraction representing c is a scalar, we can switch the order of its multiplication with a, and then use the fact that matrix multiplication is associative to write:

$$\vec{p} = a\,c = a\,\frac{a^T b}{a^T a} = \frac{a\,a^T}{a^T a}\,b$$

In our case, since a is a 3D vector, $a a^T$ is a 3x3 matrix [2], while $a^T a$ is a scalar. Thus we get our projection matrix - call it P:

$$P = \frac{a\,a^T}{a^T a}$$

A recap: given some vector $\vec{a}$, we can construct a projection matrix P. This projection matrix can take any vector $\vec{b}$ and help us calculate its projection onto $\vec{a}$ by means of a simple matrix multiplication!

### Example of line projection

Consider our original example - projection onto the z axis. First, we'll find a vector that spans the subspace represented by the z axis; a trivial choice is the unit vector:

$$a = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$$

What's the projection matrix corresponding to this vector?

$$P = \frac{a\,a^T}{a^T a} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Now, given any arbitrary vector $\vec{b}$, we can find its projection onto the z axis by multiplying with P. For example:

$$P \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ b_3 \end{pmatrix}$$
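As a quick sanity check, a few lines of Python (NumPy assumed) reproduce this matrix and its effect on an arbitrary, made-up vector:

```python
import numpy as np

a = np.array([[0.0], [0.0], [1.0]])   # column vector spanning the z axis
P = (a @ a.T) / (a.T @ a)             # P = a a^T / (a^T a)

b = np.array([[2.0], [-5.0], [7.0]])  # made-up input vector
p = P @ b                             # projection onto the z axis

print(p.ravel())  # [0. 0. 7.]
```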

Another example - less trivial this time. Say we want to project vectors onto the line spanned by the vector:

Let's compute the projection matrix:

Now we'll use it to calculate the projection of $\vec{b}$ onto this line:

To verify this makes sense, we can calculate the error vector $\vec{e} = \vec{b} - \vec{p}$:

And check that it's indeed orthogonal to $\vec{a}$:
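The post's specific numbers for this example didn't survive, so here's a sketch of the same computation with assumed stand-in values - a line spanned by a = (1, 1, 1) and a target b = (1, 2, 3) - using NumPy:

```python
import numpy as np

# Assumed stand-ins for the post's example vectors.
a = np.array([[1.0], [1.0], [1.0]])
b = np.array([[1.0], [2.0], [3.0]])

P = (a @ a.T) / (a.T @ a)  # projection matrix onto the line
p = P @ b                  # the projection
e = b - p                  # the error vector

print(p.ravel())           # [2. 2. 2.]
print((e.T @ a).item())    # 0.0 -- e is orthogonal to a
```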

## Projection onto a vector subspace

A subspace of a vector space is a subset of vectors from the vector space that's closed under vector addition and scalar multiplication. For $\mathbb{R}^3$, some common subspaces include lines that go through the origin and planes that go through the origin.

Therefore, the projection onto a line scenario we've discussed so far is just a special case of a projection onto a subspace. We'll look at the general case now.

Suppose we have an m-dimensional vector space $\mathbb{R}^m$, and a set of n linearly independent vectors $\vec{a}_1, \ldots, \vec{a}_n$ in it. We want to find a combination of these vectors that's closest to some target vector $\vec{b}$ - in other words, to find the projection of $\vec{b}$ onto the subspace spanned by $\vec{a}_1, \ldots, \vec{a}_n$.

Arbitrary m-dimensional vectors are difficult to visualize, but the derivation here follows exactly the path we've taken for projections onto lines in 3D. There, we were looking for a constant c such that $c\vec{a}$ was the closest vector to $\vec{b}$. Now, we're looking for a vector of coefficients $\vec{c}$ such that the corresponding linear combination of $\vec{a}_1, \ldots, \vec{a}_n$ is closest to a target $\vec{b}$.

If we organize $\vec{a}_1, \ldots, \vec{a}_n$ as columns into a matrix called A, we can express this combination as:

$$A\vec{c}$$

This is a matrix multiplication: $\vec{c}$ is a list of coefficients that describes some linear combination of the columns of A. As before, we want the error vector $\vec{e} = \vec{b} - A\vec{c}$ to be orthogonal to the subspace onto which we're projecting: this means it's orthogonal to every one of $\vec{a}_1, \ldots, \vec{a}_n$. The fact that these vectors are orthogonal to $\vec{e}$ can be expressed as [3]:

$$a_i^T (b - Ac) = 0 \qquad \text{for } i = 1, \ldots, n$$

This is a system of linear equations, and thus it can be represented as a matrix multiplication by a matrix with the vectors $a_i^T$ in its rows; this matrix is just $A^T$:

$$A^T \vec{e} = 0$$

But $\vec{e} = \vec{b} - A\vec{c}$, so:

$$A^T (b - Ac) = 0 \quad \Longrightarrow \quad A^T A\,c = A^T b$$

Since the columns of A are linearly independent, $A^T A$ is an invertible matrix [4], so we can isolate c:

$$c = (A^T A)^{-1} A^T b$$

Then the projection is:

$$\vec{p} = A c = A (A^T A)^{-1} A^T b$$

Similarly to the line example, we can also define a projection matrix as:

$$P = A (A^T A)^{-1} A^T$$

Given a vector $\vec{b}$, P projects it onto the subspace spanned by the vectors $\vec{a}_1, \ldots, \vec{a}_n$:

$$\vec{p} = P \vec{b}$$

Let's make sure the dimensions work out. Recall that A consists of n columns, each with m rows, so A is an m x n matrix. So we have:

$$\underset{m \times n}{A} \;\; \underset{n \times n}{(A^T A)^{-1}} \;\; \underset{n \times m}{A^T}$$

The product P is therefore an m x m matrix. Since the vector $\vec{b}$ is m-dimensional, $P\vec{b}$ is valid and the result is another m-dimensional vector - the projection $\vec{p}$.

### Example of subspace projection

At the beginning of this post there's a diagram showing the projection of an arbitrary vector onto a line and onto a plane. We'll find the projection matrix for the plane case now. The projection is onto the x,y plane, which is spanned by these vectors:

$$\vec{a}_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \qquad \vec{a}_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$$

Collecting them into a single matrix A, we get:

$$A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix}$$

To find P, let's first calculate $A^T A$:

$$A^T A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

This happens to be the identity matrix, so its inverse is itself. Thus, we get:

$$P = A (A^T A)^{-1} A^T = A A^T = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

We can now project an arbitrary vector $\vec{b}$ onto this plane by multiplying it with this P:

$$P \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ 0 \end{pmatrix}$$

Granted, this is a fairly trivial example - but it works in the general case. As an exercise, pick a different pair of independent vectors and find the projection matrix onto the plane spanned by them; then, verify that the resulting error is orthogonal to the plane.
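One way to approach the exercise is numerically; here's a sketch in Python (NumPy assumed), using one arbitrarily chosen pair of independent vectors:

```python
import numpy as np

# Two made-up linearly independent vectors spanning a plane in R^3.
a1 = np.array([1.0, 1.0, 0.0])
a2 = np.array([0.0, 1.0, 2.0])
A = np.column_stack([a1, a2])         # 3x2 matrix with a1, a2 as columns

P = A @ np.linalg.inv(A.T @ A) @ A.T  # P = A (A^T A)^{-1} A^T

b = np.array([3.0, 1.0, 4.0])         # arbitrary vector to project
e = b - P @ b                         # the error vector

# The error is orthogonal to both spanning vectors (up to float rounding).
print(np.isclose(np.dot(e, a1), 0.0), np.isclose(np.dot(e, a2), 0.0))  # True True
```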

## Properties of projection matrices

Projection matrices have some interesting properties that are educational to review.

First, projection matrices are symmetric. To understand why, first recall how a transpose of a matrix product is done:

$$(AB)^T = B^T A^T$$

As a warm-up, we can show that $A^T A$ is symmetric:

$$(A^T A)^T = A^T (A^T)^T = A^T A$$

Now, let's transpose P:

$$P^T = \left( A (A^T A)^{-1} A^T \right)^T = A \left( (A^T A)^{-1} \right)^T A^T = A (A^T A)^{-1} A^T = P$$

Here we've used the fact that the inverse of a symmetric matrix is also symmetric, and we see that indeed $P^T = P$.

Second, projection matrices are idempotent: $P^2 = P$; this isn't hard to prove either:

$$P^2 = A (A^T A)^{-1} A^T A (A^T A)^{-1} A^T = A (A^T A)^{-1} \left( A^T A \right) (A^T A)^{-1} A^T = A (A^T A)^{-1} A^T = P$$

Intuitive explanation: think about what a projection does - given some $\vec{b}$, it calculates the closest vector to it in the desired subspace. If we try to project this projection again - what will we get? Well, still the closest vector in that subspace - itself! In other words:

$$P (P \vec{b}) = P \vec{b}$$
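Both properties are easy to spot-check numerically; here's a sketch in Python (NumPy assumed, with a made-up matrix A):

```python
import numpy as np

# A made-up 3x2 matrix with linearly independent columns.
A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [1.0, 0.0]])
P = A @ np.linalg.inv(A.T @ A) @ A.T

print(np.allclose(P.T, P))    # True: P is symmetric
print(np.allclose(P @ P, P))  # True: P is idempotent
```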

## Projections onto orthogonal subspaces

There's another special case of projections that is interesting to discuss: projecting a vector onto orthogonal subspaces. We'll work through this using an example.

Consider the vector, which we'll call $\vec{a}_1$:

We'll find the projection matrix $P_1$ for this vector:

Now, consider the following vector $\vec{a}_2$, which is orthogonal to $\vec{a}_1$:

Its projection matrix $P_2$ is:

It's trivial to check that both $P_1$ and $P_2$ satisfy the properties of projection matrices; what's more interesting is that $P_1 + P_2$ does as well - so it's also a proper projection matrix!

To take it a step further, consider yet another vector, $\vec{a}_3$:

The vectors $\vec{a}_1$, $\vec{a}_2$, and $\vec{a}_3$ are all mutually orthogonal, and thus form an orthogonal basis for $\mathbb{R}^3$. We can calculate $P_3$ in the usual way, and get:

Not only is $P_1 + P_2 + P_3$ a projection matrix, it's a very familiar matrix in general:

$$P_1 + P_2 + P_3 = I$$

This is equivalent to saying that for any vector $\vec{b}$:

$$P_1 \vec{b} + P_2 \vec{b} + P_3 \vec{b} = \vec{b}$$

Hopefully this makes intuitive sense, because it's just expressing $\vec{b}$ in an alternative basis for $\mathbb{R}^3$ [5].
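Since the post's specific vectors didn't survive, here's a sketch that verifies both observations with an assumed mutually orthogonal triple (NumPy assumed):

```python
import numpy as np

def proj_matrix(a):
    """Projection matrix onto the line spanned by column vector a: a a^T / (a^T a)."""
    a = a.reshape(-1, 1)
    return (a @ a.T) / (a.T @ a)

# Three made-up, mutually orthogonal vectors forming a basis of R^3.
a1 = np.array([1.0, 1.0, 0.0])
a2 = np.array([1.0, -1.0, 0.0])
a3 = np.array([0.0, 0.0, 1.0])

P1, P2, P3 = proj_matrix(a1), proj_matrix(a2), proj_matrix(a3)

S = P1 + P2
print(np.allclose(S.T, S) and np.allclose(S @ S, S))  # True: P1 + P2 is a projection
print(np.allclose(P1 + P2 + P3, np.eye(3)))           # True: the sum is the identity
```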

 [1] We're dealing with vector spaces, where we don't really have lines - only vectors. A line is just a visual way to think about certain subspaces of the vector space $\mathbb{R}^3$. Specifically, a line through the origin (lines that don't go through the origin belong in affine spaces) is a way to represent $\forall c,\, c\vec{a}$ where $\vec{a}$ is a vector in the same direction as this line and c is a constant; in other words, it's the subspace of $\mathbb{R}^3$ spanned by $\vec{a}$.
 [2] By the rules of matrix multiplication: we're multiplying a column vector (a 3x1 matrix) by a row vector (a 1x3 matrix). The multiplication is allowed because the inner dimensions match, and the result is a 3x3 matrix.
 [3] Recall from the earlier example: we're dropping the explicit vector markings to be able to write matrix arithmetic more naturally. By default vectors are column vectors, so $v^T w$ expresses the dot product between vectors $\vec{v}$ and $\vec{w}$.
 [4] It's possible to prove this statement, but this post is already long enough.
 [5] This is a special case of a change of basis, in which the basis is orthogonal.