Understand the orthogonal decomposition of a vector with respect to a subspace.

Understand the relationship between orthogonal decomposition and orthogonal projection.

Understand the relationship between orthogonal decomposition and the closest vector on / distance to a subspace.

Learn the basic properties of orthogonal projections as linear transformations and as matrix transformations.

Recipes: orthogonal projection onto a line, orthogonal decomposition by solving a system of equations, orthogonal projection via a complicated matrix product.

Let $W$ be a subspace of $\mathbb{R}^n$ and let $x$ be a vector in $\mathbb{R}^n$. In this section, we will learn to compute the closest vector $x_W$ to $x$ in $W$. The vector $x_W$ is called the orthogonal projection of $x$ onto $W$.

Subsection 7.3.1 Orthogonal Decomposition

We begin by fixing some notation.

Notation

Let $W$ be a subspace of $\mathbb{R}^n$ and let $x$ be a vector in $\mathbb{R}^n$. We denote the closest vector to $x$ on $W$ by $x_W$.

To say that $x_W$ is the closest vector to $x$ on $W$ means that the difference $x - x_W$ is orthogonal to the vectors in $W$.

In other words, if $x_{W^\perp} = x - x_W$, then we have $x = x_W + x_{W^\perp}$, where $x_W$ is in $W$ and $x_{W^\perp}$ is in $W^\perp$. The first order of business is to prove that the closest vector always exists.

Theorem (Orthogonal decomposition)

Let $W$ be a subspace of $\mathbb{R}^n$ and let $x$ be a vector in $\mathbb{R}^n$. Then we can write $x$ uniquely as
\[ x = x_W + x_{W^\perp}, \]
where $x_W$ is the closest vector to $x$ on $W$ and $x_{W^\perp}$ is in $W^\perp$.

Proof

Let $m = \dim(W)$, so $\dim(W^\perp) = n - m$ by this fact in Section 7.2. Let $v_1, v_2, \ldots, v_m$ be a basis for $W$ and let $v_{m+1}, v_{m+2}, \ldots, v_n$ be a basis for $W^\perp$. We showed in the proof of this fact in Section 7.2 that $\{v_1, v_2, \ldots, v_n\}$ is linearly independent, so it forms a basis for $\mathbb{R}^n$. Therefore, we can write
\[ x = (c_1v_1 + \cdots + c_mv_m) + (c_{m+1}v_{m+1} + \cdots + c_nv_n) = x_W + x_{W^\perp}, \]

where $x_W = c_1v_1 + \cdots + c_mv_m$ and $x_{W^\perp} = c_{m+1}v_{m+1} + \cdots + c_nv_n$. Since $x_{W^\perp}$ is orthogonal to the vectors in $W$, the vector $x_W$ is the closest vector to $x$ on $W$, so this proves that such a decomposition exists.

As for uniqueness, suppose that
\[ x = x_W + x_{W^\perp} = y_W + y_{W^\perp} \]
for $x_W, y_W$ in $W$ and $x_{W^\perp}, y_{W^\perp}$ in $W^\perp$. Rearranging gives
\[ x_W - y_W = y_{W^\perp} - x_{W^\perp}. \]

Since $W$ and $W^\perp$ are subspaces, the left side of the equation is in $W$ and the right side is in $W^\perp$. Therefore, $x_W - y_W$ is in $W$ and in $W^\perp$, so it is orthogonal to itself, which implies $x_W - y_W = 0$. Hence $x_W = y_W$ and $x_{W^\perp} = y_{W^\perp}$, which proves uniqueness.

Definition

Let $W$ be a subspace of $\mathbb{R}^n$ and let $x$ be a vector in $\mathbb{R}^n$. The expression
\[ x = x_W + x_{W^\perp} \]
for $x_W$ in $W$ and $x_{W^\perp}$ in $W^\perp$ is called the orthogonal decomposition of $x$ with respect to $W$, and the closest vector $x_W$ is the orthogonal projection of $x$ onto $W$.

Since $x_W$ is the closest vector on $W$ to $x$, the distance from $x$ to the subspace $W$ is the length of the vector from $x_W$ to $x$, i.e., the length of $x_{W^\perp}$. To restate:

Closest vector and distance

Let $W$ be a subspace of $\mathbb{R}^n$ and let $x$ be a vector in $\mathbb{R}^n$.

The orthogonal projection $x_W$ is the closest vector to $x$ in $W$.

The distance from $x$ to $W$ is the length $\|x_{W^\perp}\|$.
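These two statements are easy to check numerically. Here is a minimal NumPy sketch using a hypothetical choice of subspace, the $xy$-plane in $\mathbb{R}^3$, where the decomposition can be read off coordinate by coordinate:

```python
import numpy as np

# Hypothetical example: W is the xy-plane in R^3, so the orthogonal
# decomposition of x just separates the first two coordinates from the third.
x = np.array([1.0, 2.0, 3.0])
x_W = np.array([1.0, 2.0, 0.0])   # closest vector to x in W
x_Wperp = x - x_W                 # (0, 0, 3), orthogonal to every vector in W

# The distance from x to W is the length of x_Wperp.
distance = np.linalg.norm(x_Wperp)
print(distance)  # 3.0
```

For a general subspace the closest vector is not visible by inspection like this; the rest of the section develops methods for computing it.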

Now we turn to the problem of computing $x_W$ and $x_{W^\perp}$. Of course, since $x_{W^\perp} = x - x_W$, really all we need is to compute $x_W$. The following theorem gives a method for computing the orthogonal projection onto $W$ in terms of a spanning set.

Theorem

Let $W$ be a subspace of $\mathbb{R}^n$, let $v_1, v_2, \ldots, v_m$ be a spanning set for $W$ (e.g., a basis), and let $A$ be the $n \times m$ matrix with columns $v_1, v_2, \ldots, v_m$.

Let $x$ be a vector in $\mathbb{R}^n$. Then the matrix equation
\[ A^TAc = A^Tx \]
in the unknown vector $c$ is consistent, and $x_W = Ac$ for any solution $c$.

Proof

Since $x_W$ is in $W$, we can write $x_W = c_1v_1 + c_2v_2 + \cdots + c_mv_m$ for some scalars $c_1, c_2, \ldots, c_m$. Let $c$ be the vector in $\mathbb{R}^m$ with entries $c_1, c_2, \ldots, c_m$. Then $Ac = x_W$, so
\[ 0 = A^Tx_{W^\perp} = A^T(x - Ac) = A^Tx - A^TAc, \]
because $x_{W^\perp}$ lies in $W^\perp = \operatorname{Nul}(A^T)$.

This proves that the matrix equation $A^TAc = A^Tx$ is consistent, and that $x_W = Ac$ for a solution $c$.

Example (Orthogonal projection onto a line)

Let $L = \operatorname{Span}\{u\}$ be a line in $\mathbb{R}^n$ and let $x$ be a vector in $\mathbb{R}^n$. By the theorem, to find $x_L$ we must solve the matrix equation $u^Tuc = u^Tx$, where we regard $u$ as an $n \times 1$ matrix. But $u^Tu = u \cdot u$ and $u^Tx = u \cdot x$, so $c = (u \cdot x)/(u \cdot u)$ is a solution of $u^Tuc = u^Tx$, and hence
\[ x_L = uc = \frac{u \cdot x}{u \cdot u}\,u. \]
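The line-projection formula can be checked numerically in a couple of lines. A minimal NumPy sketch, with hypothetical vectors $u$ and $x$ in $\mathbb{R}^3$:

```python
import numpy as np

# Hypothetical line L = Span{u} in R^3 and a vector x to project.
u = np.array([1.0, 2.0, 2.0])
x = np.array([3.0, 0.0, 3.0])

# x_L = (u . x)/(u . u) u, the solution of the 1x1 system (u.u) c = u.x.
c = u.dot(x) / u.dot(u)
x_L = c * u

print(x_L)             # [1. 2. 2.]
print(u.dot(x - x_L))  # 0.0: the remainder is orthogonal to u
```

Here $u \cdot x = 9$ and $u \cdot u = 9$, so $c = 1$ and the projection is $u$ itself.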

When $W$ has dimension greater than one, computing the orthogonal projection of $x$ onto $W$ means solving the matrix equation $A^TAc = A^Tx$, where $A$ has columns $v_1, v_2, \ldots, v_m$. In other words, we can compute the closest vector by solving a system of linear equations. To be explicit, we state the theorem as a recipe:

Recipe: Compute an orthogonal decomposition

Let $W = \operatorname{Span}\{v_1, v_2, \ldots, v_m\}$, and let $A$ be the matrix with columns $v_1, v_2, \ldots, v_m$. Here is a method to compute the orthogonal decomposition of a vector $x$ with respect to $W$:

1. Compute the matrix $A^TA$ and the vector $A^Tx$.

2. Form the augmented matrix for the matrix equation $A^TAc = A^Tx$ in the unknown vector $c$, and row reduce.

3. This equation is always consistent; choose one solution $c$. Then
\[ x_W = Ac, \qquad x_{W^\perp} = x - x_W. \]
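The recipe above translates directly into a few lines of NumPy. A sketch with a hypothetical plane $W = \operatorname{Span}\{v_1, v_2\}$ in $\mathbb{R}^3$; here `lstsq` stands in for row reduction, since it also handles a non-invertible $A^TA$:

```python
import numpy as np

# Hypothetical spanning set for W and a vector x to decompose.
v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
A = np.column_stack([v1, v2])
x = np.array([1.0, 2.0, 0.0])

# Step 1: compute A^T A and A^T x.
AtA = A.T @ A
Atx = A.T @ x

# Step 2: solve A^T A c = A^T x (always consistent; lstsq picks one solution).
c, *_ = np.linalg.lstsq(AtA, Atx, rcond=None)

# Step 3: read off the decomposition x = x_W + x_Wperp.
x_W = A @ c
x_Wperp = x - x_W

print(x_W)            # [0. 1. 1.]
print(A.T @ x_Wperp)  # [0. 0.]: x_Wperp is orthogonal to W
```

The final check confirms that $x_{W^\perp}$ is orthogonal to both spanning vectors, which is exactly the condition $x_{W^\perp} \in W^\perp = \operatorname{Nul}(A^T)$.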

In the context of the above theorem, if we start with a basis $v_1, v_2, \ldots, v_m$ of $W$, then it turns out that the square matrix $A^TA$ is automatically invertible! (It is always the case that $A^TA$ is square and that the equation $A^TAc = A^Tx$ is consistent, but $A^TA$ need not be invertible in general.)

Corollary

Let $W$ be a subspace of $\mathbb{R}^n$, let $v_1, v_2, \ldots, v_m$ be a basis for $W$, and let $A$ be the matrix with columns $v_1, v_2, \ldots, v_m$.

Then the $m \times m$ matrix $A^TA$ is invertible, and for all vectors $x$ in $\mathbb{R}^n$, we have
\[ x_W = A(A^TA)^{-1}A^Tx. \]

Proof

We will show that $\operatorname{Nul}(A^TA) = \{0\}$, which implies invertibility by the invertible matrix theorem in Section 6.1. Suppose that $A^TAc = 0$. Then $A^TAc = A^T0$, so $0_W = Ac$ by the theorem. But $0_W = 0$ (the orthogonal decomposition of the zero vector is just $0 = 0 + 0$), so $Ac = 0$, and therefore $c$ is in $\operatorname{Nul}(A)$. Since the columns of $A$ are linearly independent, we have $c = 0$, so $\operatorname{Nul}(A^TA) = \{0\}$, as desired.

Let $x$ be a vector in $\mathbb{R}^n$ and let $c$ be a solution of $A^TAc = A^Tx$. Then $c = (A^TA)^{-1}A^Tx$, so $x_W = Ac = A(A^TA)^{-1}A^Tx$.
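The corollary packages the whole projection into a single matrix $A(A^TA)^{-1}A^T$. A hedged NumPy sketch, reusing a hypothetical basis $\{v_1, v_2\}$ of a plane in $\mathbb{R}^3$:

```python
import numpy as np

# Hypothetical basis for W; A^T A is invertible because the columns
# of A are linearly independent.
v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
A = np.column_stack([v1, v2])

# B = A (A^T A)^{-1} A^T sends any x to its orthogonal projection x_W.
B = A @ np.linalg.inv(A.T @ A) @ A.T

x = np.array([1.0, 2.0, 0.0])
print(B @ x)                  # [0. 1. 1.]
print(np.allclose(B @ B, B))  # True: projecting twice changes nothing
```

The idempotence check at the end reflects the geometry: a vector already in $W$ is its own closest vector in $W$.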

Subsection 7.3.2 Orthogonal Projection

In this subsection, we change perspective and think of the orthogonal projection $x_W$ as a function of $x$. This function turns out to be a linear transformation with many nice properties, and is a good example of a linear transformation which is not originally defined as a matrix transformation.

Define $T \colon \mathbb{R}^n \to \mathbb{R}^n$ by $T(x) = x_W$. We have to verify the defining properties of linearity in Section 4.3. Let $x, y$ be vectors in $\mathbb{R}^n$, and let $x = x_W + x_{W^\perp}$ and $y = y_W + y_{W^\perp}$ be their orthogonal decompositions. Since $W$ and $W^\perp$ are subspaces, the sums $x_W + y_W$ and $x_{W^\perp} + y_{W^\perp}$ are in $W$ and $W^\perp$, respectively. Therefore, the orthogonal decomposition of $x + y$ is $(x_W + y_W) + (x_{W^\perp} + y_{W^\perp})$, so
\[ T(x + y) = (x + y)_W = x_W + y_W = T(x) + T(y). \]

Now let $c$ be a scalar. Then $cx_W$ is in $W$ and $cx_{W^\perp}$ is in $W^\perp$, so the orthogonal decomposition of $cx$ is $cx_W + cx_{W^\perp}$, and therefore
\[ T(cx) = (cx)_W = cx_W = cT(x). \]

Since $T$ satisfies the two defining properties in Section 4.3, it is a linear transformation.

Any vector $x$ in $W$ is in the range of $T$, because $T(x) = x$ for such vectors. On the other hand, for any vector $x$ in $\mathbb{R}^n$, the output $T(x) = x_W$ is in $W$, so $W$ is the range of $T$.

We compute the standard matrix of the orthogonal projection in the same way as for any other transformation: by evaluating on the standard coordinate vectors $e_1, e_2, \ldots, e_n$. In this case, this means projecting the standard coordinate vectors onto the subspace.

For the final assertion, we showed in the proof of this theorem that there is a basis of $\mathbb{R}^n$ of the form $\{v_1, \ldots, v_m, v_{m+1}, \ldots, v_n\}$, where $\{v_1, \ldots, v_m\}$ is a basis for $W$ and $\{v_{m+1}, \ldots, v_n\}$ is a basis for $W^\perp$. Each $v_i$ is an eigenvector of $T$: indeed, for $i \leq m$ we have
\[ T(v_i) = v_i = 1 \cdot v_i \]
because $v_i$ is in $W$, and for $i > m$ we have
\[ T(v_i) = 0 = 0 \cdot v_i \]
because $v_i$ is in $W^\perp$. Therefore, we have found a basis of eigenvectors, with associated eigenvalues $1, \ldots, 1, 0, \ldots, 0$ ($m$ ones and $n - m$ zeros). Now we use the diagonalization theorem in Section 6.4.

As we saw in this example, if you are willing to compute bases for $W$ and $W^\perp$, then this provides a third way of finding the standard matrix $B$ for projection onto $W$: indeed, if $\{v_1, v_2, \ldots, v_m\}$ is a basis for $W$ and $\{v_{m+1}, v_{m+2}, \ldots, v_n\}$ is a basis for $W^\perp$, then
\[ B = \begin{pmatrix} | & & | \\ v_1 & \cdots & v_n \\ | & & | \end{pmatrix} \begin{pmatrix} 1 & & \\ & \ddots & \\ & & 0 \end{pmatrix} \begin{pmatrix} | & & | \\ v_1 & \cdots & v_n \\ | & & | \end{pmatrix}^{-1}, \]
where the middle matrix in the product is the diagonal matrix with $m$ ones and $n - m$ zeros on the diagonal. However, since you already have a basis for $W$, it is faster to multiply out the expression $A(A^TA)^{-1}A^T$ as in the corollary.
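As a sketch of this third method (with a hypothetical basis: $v_1, v_2$ spanning a plane $W$ in $\mathbb{R}^3$ and $v_3$ spanning $W^\perp$), one can verify numerically that the diagonalized product agrees with the corollary's formula:

```python
import numpy as np

# Hypothetical bases: v1, v2 span W; v3 = (1, 1, -1) is orthogonal to both,
# so it spans W^perp (dimension 3 - 2 = 1).
v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = np.array([1.0, 1.0, -1.0])

C = np.column_stack([v1, v2, v3])  # columns: basis of W, then basis of W^perp
D = np.diag([1.0, 1.0, 0.0])       # m = 2 ones, n - m = 1 zero

B = C @ D @ np.linalg.inv(C)

# Same matrix as the corollary's A (A^T A)^{-1} A^T:
A = np.column_stack([v1, v2])
print(np.allclose(B, A @ np.linalg.inv(A.T @ A) @ A.T))  # True
```

Note that the extra work here is finding $v_3$: the corollary's formula needs only a basis for $W$, which is why it is usually the faster route.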