Diagonal matrices are the easiest kind of matrices to understand: they just scale the coordinate directions by their diagonal entries. This section is devoted to the question: “When is a matrix similar to a diagonal matrix?” We will see that the algebra and geometry of such a matrix is relatively easy to understand.
Subsection5.4.1Diagonalizability
First we make precise what we mean when we say two matrices are “similar”.
Definition
Two matrices and are similar if there exists an invertible matrix such that
A fundamental question about a matrix is whether or not it is diagonalizable. The following is the primary criterion for diagonalizability. It shows that diagonalizability is an eigenvalue problem.
Diagonalization Theorem
An matrix is diagonalizable if and only if has linearly independent eigenvectors.
In this case, for
where are linearly independent eigenvectors, and are the corresponding eigenvalues, in the same order.
Therefore, the columns of are multiples of the standard coordinate vectors:
Now suppose that where has columns and is diagonal with diagonal entries Since is invertible, its columns are linearly independent. We have to show that is an eigenvector of with eigenvalue We know that the standard coordinate vector is an eigenvector of with eigenvalue so:
By this fact in Section 5.1, if an matrix has distinct eigenvalues then a choice of corresponding eigenvectors is automatically linearly independent.
An matrix with distinct eigenvalues is diagonalizable.
We saw in the above example that changing the order of the eigenvalues and eigenvectors produces a different diagonalization of the same matrix. There are generally many different ways to diagonalize a matrix, corresponding to different orderings of the eigenvalues of that matrix. The important thing is that the eigenvalues and eigenvectors have to be listed in the same order.
There are other ways of finding different diagonalizations of the same matrix. For instance, you can scale one of the eigenvectors by a constant
you can find a different basis entirely for an eigenspace of dimension at least etc.
In the above example, the (non-invertible) matrix is similar to the diagonal matrix Since is not invertible, zero is an eigenvalue by the invertible matrix theorem, so one of the diagonal entries of is necessarily zero. Also see this example below.
The following point is often a source of confusion.
Diagonalizability has nothing to do with invertibility
Of the following matrices, the first is diagonalizable and invertible, the second is diagonalizable but not invertible, the third is invertible but not diagonalizable, and the fourth is neither invertible nor diagonalizable, as the reader can verify:
As in the above example, one can check that the matrix
is not diagonalizable for any number We claim that any non-diagonalizable matrix with a real eigenvalue is similar to Therefore, up to similarity, these are the only such examples.
To prove this, let be such a matrix. Let be an eigenvector with eigenvalue and let be any vector in that is not collinear with so that forms a basis for Let be the matrix with columns and consider We have and so and We can compute the first column of as follows:
Therefore, has the form
Since is similar to it also has only one eigenvalue since is upper-triangular, this implies so
As is not diagonalizable, we know is not diagonal ( is similar to ), so Now we observe that
We have shown that is similar to which is similar to so is similar to by this example.
Subsection5.4.2The Geometry of Diagonalizable Matrices¶ permalink
A diagonal matrix is easy to understand geometrically, as it just scales the coordinate axes:
A daigonalizable matrix is not much harder to understand geometrically. Indeed, if are linearly independent eigenvectors of an matrix then scales the -direction by the eigenvalue in other words, Since the vectors form a basis of this determines the action of on any vector in
Example
Consider the matrices
One can verify that see this example. Let and the columns of These are eigenvectors of with corresponding eigenvalues and
The matrix is diagonal: it scales the -direction by a factor of and the -direction by a factor of
If we write a vector in terms of the basis say, then it is easy to compute
Here we have used the fact that are eigenvectors of Since the resulting vector is still expressed in terms of the basis we can visualize what does to the vector it scales the “-coordinate” by and the “-coordinate” by
For instance, let We see from the grid on the right in the picture below that so
The picture illustrates the action of on the plane in the usual basis, and the action of on the plane in the -basis.
Now let We see from the grid on the right in the picture below that so
This is illustrated in the picture below.
In the following examples, we visualize the action of a diagonalizable matrix in terms of its dynamics. In other words, we start with a collection of vectors (drawn as points), and we see where they move when we multiply them by repeatedly.
Subsection5.4.3Algebraic and Geometric Multiplicity
In this subsection, we give a variant of the diagonalization theorem that provides another criterion for diagonalizability. It is stated in the language of multiplicities of eigenvalues.
In algebra, we define the multiplicity of a root of a polynomial to be the number of factors of that divide For instance, in the polynomial
the root has multiplicity and the root has multiplicity
Definition
Let be an matrix, and let be an eigenvalue of
The algebraic multiplicity of is its multiplicity as a root of the characteristic polynomial of
The geometric multiplicity of is the dimension of the -eigenspace.
Since the -eigenspace of is its dimension is the number of free variables in the system of equations i.e., the number of columns without pivots in the matrix
We saw in the above examples that the algebraic and geometric multiplicities need not coincide. However, they do satisfy the following fundamental inequality, the proof of which is beyond the scope of this text.
Theorem(Algebraic and Geometric Multiplicity)
Let be a square matrix and let be an eigenvalue of Then
In particular, if the algebraic multiplicity of is equal to then so is the geometric multiplicity.
If has an eigenvalue with algebraic multiplicity then the -eigenspace is a line.
The sum of the geometric multiplicities of the eigenvalues of is equal to
The sum of the algebraic multiplicities of the eigenvalues of is equal to and for each eigenvalue, the geometric multiplicity equals the algebraic multiplicity.
We will show First suppose that is diagonalizable. Then has linearly independent eigenvectors This implies that the sum of the geometric multiplicities is at least for instance, if have the same eigenvalue then the geometric multiplicity of is at least (as the -eigenspace contains three linearly independent vectors), and so on. But the sum of the algebraic multiplicities is greater than or equal to the sum of the geometric multiplicities by the theorem, and the sum of the algebraic multiplicities is at most because the characteristic polynomial has degree Therefore, the sum of the geometric multiplicities equals
Now suppose that the sum of the geometric multiplicities equals As above, this forces the sum of the algebraic multiplicities to equal as well. As the algebraic multiplicities are all greater than or equal to the geometric multiplicities in any case, this implies that they are in fact equal.
Finally, suppose that the third condition is satisfied. Then the sum of the geometric multiplicities equals Suppose that the distinct eigenvectors are and that is a basis for the -eigenspace, which we call We claim that the collection of all vectors in all of the eigenspace bases is linearly independent. Consider the vector equation
Grouping the eigenvectors with the same eigenvalues, this sum has the form
Since eigenvectors with distinct eigenvalues are linearly independent, each “something in ” is equal to zero. But this implies that all coefficients are equal to zero, since the vectors in each are linearly independent. Therefore, has linearly independent eigenvectors, so it is diagonalizable.
The first part of the third statement simply says that the characteristic polynomial of factors completely into linear polynomials over the real numbers: in other words, there are no complex (non-real) roots. The second part of the third statement says in particular that for any diagonalizable matrix, the algebraic and geometric multiplicities coincide.
Let be a square matrix and let be an eigenvalue of If the algebraic multiplicity of does not equal the geometric multiplicity, then is not diagonalizable.
The examples at the beginning of this subsection illustrate the theorem. Here we give some general consequences for diagonalizability of and matrices.
Diagonalizability of Matrices
Let be a matrix. There are four cases:
has two different eigenvalues. In this case, each eigenvalue has algebraic and geometric multiplicity equal to one. This implies is diagonalizable. For example:
has one eigenvalue of algebraic and geometric multiplicity To say that the geometric multiplicity is means that i.e., that every vector in is in the null space of This implies that is the zero matrix, so that is the diagonal matrix In particular, is diagonalizable. For example:
has one eigenvalue of algebraic multiplicity and geometric multiplicity In this case, is not diagonalizable, by part 3 of the theorem. For example, a shear:
has no eigenvalues. This happens when the characteristic polynomial has no real roots. In particular, is not diagonalizable. For example, a rotation: