
Section 5.4 Diagonalization

Objectives
  1. Learn two main criteria for a matrix to be diagonalizable.
  2. Develop a library of examples of matrices that are and are not diagonalizable.
  3. Recipes: diagonalize a matrix, quickly compute powers of a matrix by diagonalization.
  4. Pictures: the geometry of diagonal matrices, why a shear is not diagonalizable.
  5. Theorem: the diagonalization theorem (two variants).
  6. Vocabulary words: diagonalizable, algebraic multiplicity, geometric multiplicity.

Diagonal matrices are the easiest kind of matrices to understand: they just scale the coordinate directions by their diagonal entries. This section is devoted to the question: “When is a matrix similar to a diagonal matrix?” We will see that the algebra and geometry of such a matrix are relatively easy to understand.

Subsection 5.4.1 Diagonalizability

First we make precise what we mean when we say two matrices are “similar”.

Definition

Two n × n matrices A and B are similar if there exists an invertible n × n matrix C such that A = CBC^{-1}.

If two matrices are similar, then their powers are similar as well.

Proof

First note that

A^2 = AA = (CBC^{-1})(CBC^{-1}) = CB(C^{-1}C)BC^{-1} = CBI_nBC^{-1} = CB^2C^{-1}.

Next we have

A^3 = A^2A = (CB^2C^{-1})(CBC^{-1}) = CB^2(C^{-1}C)BC^{-1} = CB^3C^{-1}.

The pattern is clear: in general, A^n = CB^nC^{-1} for all n ≥ 1.
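
This pattern is easy to confirm numerically. The following NumPy sketch uses an arbitrarily chosen B and invertible C (illustrative values only, not taken from the text) and checks that A^3 agrees with CB^3C^{-1}:

```python
import numpy as np

# Illustrative choices: any B and any invertible C will do.
B = np.array([[2.0, 1.0],
              [0.0, 3.0]])
C = np.array([[1.0, 1.0],
              [1.0, 2.0]])          # det = 1, so C is invertible
C_inv = np.linalg.inv(C)

A = C @ B @ C_inv                   # A is similar to B

# A^3 computed directly agrees with C B^3 C^{-1}.
lhs = np.linalg.matrix_power(A, 3)
rhs = C @ np.linalg.matrix_power(B, 3) @ C_inv
print(np.allclose(lhs, rhs))        # True
```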

In this chapter, we will determine when a matrix is similar to a diagonal matrix. This property is important enough to deserve its own name.

Definition

An n × n matrix A is diagonalizable if it is similar to a diagonal matrix: that is, if there exists an invertible n × n matrix C and a diagonal matrix D such that

A = CDC^{-1}.
Example

Any diagonal matrix D is diagonalizable because it is similar to itself. For instance,

\begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix} = I_3\begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}I_3^{-1}.
Example

If a matrix A is diagonalizable, and if B is similar to A, then B is diagonalizable as well. Indeed, if A = CDC^{-1} for D diagonal, and B = EAE^{-1}, then

B = EAE^{-1} = E(CDC^{-1})E^{-1} = (EC)D(EC)^{-1},

so B is similar to D .

Powers of diagonalizable matrices

Multiplying diagonal matrices together just multiplies their diagonal entries:

\begin{pmatrix} x_1 & 0 & 0 \\ 0 & x_2 & 0 \\ 0 & 0 & x_3 \end{pmatrix}\begin{pmatrix} y_1 & 0 & 0 \\ 0 & y_2 & 0 \\ 0 & 0 & y_3 \end{pmatrix} = \begin{pmatrix} x_1y_1 & 0 & 0 \\ 0 & x_2y_2 & 0 \\ 0 & 0 & x_3y_3 \end{pmatrix}.

Therefore, it is easy to take powers of a diagonal matrix:

\begin{pmatrix} x & 0 & 0 \\ 0 & y & 0 \\ 0 & 0 & z \end{pmatrix}^n = \begin{pmatrix} x^n & 0 & 0 \\ 0 & y^n & 0 \\ 0 & 0 & z^n \end{pmatrix}.

By this fact, if A = CDC^{-1} then A^n = CD^nC^{-1}, so it is also easy to take powers of diagonalizable matrices. This is often very important in applications.

Recipe: Compute powers of a diagonalizable matrix

If A = CDC^{-1}, where D is a diagonal matrix, then A^n = CD^nC^{-1}:

A = C\begin{pmatrix} x & 0 & 0 \\ 0 & y & 0 \\ 0 & 0 & z \end{pmatrix}C^{-1} \quad\Longrightarrow\quad A^n = C\begin{pmatrix} x^n & 0 & 0 \\ 0 & y^n & 0 \\ 0 & 0 & z^n \end{pmatrix}C^{-1}.
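
In NumPy this recipe amounts to raising the diagonal entries of D to the n-th power. The diagonalization below is an arbitrary illustrative choice, not an example from the text:

```python
import numpy as np

# An illustrative diagonalization A = C D C^{-1} (values chosen for the example).
D = np.diag([2.0, -1.0, 0.5])
C = np.array([[1.0, 1.0, 0.0],
              [1.0, -1.0, 1.0],
              [0.0, 1.0, 1.0]])      # invertible
C_inv = np.linalg.inv(C)
A = C @ D @ C_inv

n = 10
D_n = np.diag(np.diag(D) ** n)       # D^n: raise each diagonal entry to the n-th power
A_n = C @ D_n @ C_inv

print(np.allclose(A_n, np.linalg.matrix_power(A, n)))   # True
```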

A fundamental question about a matrix is whether or not it is diagonalizable. The following is the primary criterion for diagonalizability. It shows that diagonalizability is an eigenvalue problem.

Diagonalization Theorem

An n × n matrix A is diagonalizable if and only if A has n linearly independent eigenvectors v_1, v_2, ..., v_n. In this case, A = CDC^{-1}, where C is the matrix whose columns are v_1, v_2, ..., v_n, and D is the diagonal matrix whose diagonal entries are the corresponding eigenvalues λ_1, λ_2, ..., λ_n.

An n × n matrix with n distinct eigenvalues is diagonalizable.

Proof

By this fact in Section 5.1, if an n × n matrix A has n distinct eigenvalues λ_1, λ_2, ..., λ_n, then a choice of corresponding eigenvectors v_1, v_2, ..., v_n is automatically linearly independent. By the diagonalization theorem, A is therefore diagonalizable.

Non-Uniqueness of Diagonalization

We saw in the above example that changing the order of the eigenvalues and eigenvectors produces a different diagonalization of the same matrix. There are generally many different ways to diagonalize a matrix, corresponding to different orderings of the eigenvalues of that matrix. The important thing is that the eigenvalues and eigenvectors have to be listed in the same order.

A = \begin{pmatrix} | & | & | \\ v_1 & v_2 & v_3 \\ | & | & | \end{pmatrix}\begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}\begin{pmatrix} | & | & | \\ v_1 & v_2 & v_3 \\ | & | & | \end{pmatrix}^{-1} = \begin{pmatrix} | & | & | \\ v_3 & v_2 & v_1 \\ | & | & | \end{pmatrix}\begin{pmatrix} \lambda_3 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_1 \end{pmatrix}\begin{pmatrix} | & | & | \\ v_3 & v_2 & v_1 \\ | & | & | \end{pmatrix}^{-1}.

There are other ways of finding different diagonalizations of the same matrix. For instance, you can scale one of the eigenvectors by a nonzero constant c:

A = \begin{pmatrix} | & | & | \\ v_1 & v_2 & v_3 \\ | & | & | \end{pmatrix}\begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}\begin{pmatrix} | & | & | \\ v_1 & v_2 & v_3 \\ | & | & | \end{pmatrix}^{-1} = \begin{pmatrix} | & | & | \\ cv_1 & v_2 & v_3 \\ | & | & | \end{pmatrix}\begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}\begin{pmatrix} | & | & | \\ cv_1 & v_2 & v_3 \\ | & | & | \end{pmatrix}^{-1},

you can find a different basis entirely for an eigenspace of dimension at least 2, etc.
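
For a concrete check, here is a small NumPy sketch with an illustrative matrix A (eigenvalues 2 and 3, eigenvectors found by hand; these values are chosen for the example, not taken from the text). Both orderings of the eigenvalues and eigenvectors recover the same A:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])           # illustrative matrix with eigenvalues 2 and 3

v1 = np.array([1.0, 0.0])            # eigenvector for eigenvalue 2
v2 = np.array([1.0, 1.0])            # eigenvector for eigenvalue 3

# One diagonalization ...
C1, D1 = np.column_stack([v1, v2]), np.diag([2.0, 3.0])
# ... and another, with the eigenvalues and eigenvectors listed in the opposite order.
C2, D2 = np.column_stack([v2, v1]), np.diag([3.0, 2.0])

print(np.allclose(C1 @ D1 @ np.linalg.inv(C1), A))   # True
print(np.allclose(C2 @ D2 @ np.linalg.inv(C2), A))   # True
```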

In the above example, the (non-invertible) matrix A = \frac13\begin{pmatrix} 2 & 4 \\ 2 & 4 \end{pmatrix} is similar to the diagonal matrix D = \begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix}. Since A is not invertible, zero is an eigenvalue by the invertible matrix theorem, so one of the diagonal entries of D is necessarily zero. Also see this example below.

Here is the procedure we used in the above examples.

Recipe: Diagonalization

Let A be an n × n matrix. To diagonalize A :

  1. Find the eigenvalues of A using the characteristic polynomial.
  2. For each eigenvalue λ of A, compute a basis B_λ for the λ-eigenspace.
  3. If there are fewer than n total vectors in all of the eigenspace bases B_λ, then the matrix is not diagonalizable.
  4. Otherwise, the n vectors v_1, v_2, ..., v_n in the eigenspace bases are linearly independent, and A = CDC^{-1} for
    C = \begin{pmatrix} | & | & & | \\ v_1 & v_2 & \cdots & v_n \\ | & | & & | \end{pmatrix} \quad\text{and}\quad D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix},
    where λ_i is the eigenvalue for v_i.
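
Numerically, numpy.linalg.eig produces the result of steps 1 and 2 at once (though not by the hand computation of the recipe): it returns the eigenvalues together with a matrix whose columns are corresponding eigenvectors. The sketch below applies it to the non-invertible matrix from the remark above; checking that the eigenvector matrix is invertible plays the role of steps 3 and 4.

```python
import numpy as np

A = np.array([[2.0, 4.0],
              [2.0, 4.0]]) / 3.0     # the non-invertible matrix from the remark above

# Steps 1-2: eigenvalues, and a matrix C whose columns are eigenvectors.
eigenvalues, C = np.linalg.eig(A)
D = np.diag(eigenvalues)

# Steps 3-4: if C is invertible, its columns are linearly independent and A = C D C^{-1}.
# (For a non-diagonalizable matrix, C would be singular or nearly so.)
print(eigenvalues)                               # 0 and 2, in some order
print(np.allclose(C @ D @ np.linalg.inv(C), A))  # True
```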

We will justify the linear independence assertion in part 4 in the proof of this theorem below.

The following point is often a source of confusion.

Diagonalizability has nothing to do with invertibility

Of the following matrices, the first is diagonalizable and invertible, the second is diagonalizable but not invertible, the third is invertible but not diagonalizable, and the fourth is neither invertible nor diagonalizable, as the reader can verify:

\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \qquad \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \qquad \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \qquad \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.
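
One way to verify these four claims is with SymPy, whose exact arithmetic makes the diagonalizability test reliable; this is a sketch of the check, not part of the argument above:

```python
from sympy import Matrix

examples = [Matrix([[1, 0], [0, 1]]),   # diagonalizable and invertible
            Matrix([[1, 0], [0, 0]]),   # diagonalizable, not invertible
            Matrix([[1, 1], [0, 1]]),   # invertible, not diagonalizable (a shear)
            Matrix([[0, 1], [0, 0]])]   # neither invertible nor diagonalizable

for M in examples:
    # prints: diagonalizable?  invertible?
    print(M.is_diagonalizable(), M.det() != 0)
# True True
# True False
# False True
# False False
```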

Subsection 5.4.2 The Geometry of Diagonalizable Matrices

A diagonal matrix is easy to understand geometrically, as it just scales the coordinate axes:

\begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = 1\cdot\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \qquad \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = 2\cdot\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \qquad \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = 3\cdot\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.

A diagonalizable matrix is not much harder to understand geometrically. Indeed, if v_1, v_2, ..., v_n are linearly independent eigenvectors of an n × n matrix A, then A scales the v_i-direction by the eigenvalue λ_i: in other words, Av_i = λ_iv_i. Since the vectors v_1, v_2, ..., v_n form a basis of R^n, this determines the action of A on any vector in R^n.

Example

Consider the matrices

A = \begin{pmatrix} 1/2 & 3/2 \\ 3/2 & 1/2 \end{pmatrix} \qquad D = \begin{pmatrix} 2 & 0 \\ 0 & -1 \end{pmatrix} \qquad C = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.

One can verify that A = CDC^{-1}: see this example. Let v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix} and v_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}, the columns of C. These are eigenvectors of A, with corresponding eigenvalues 2 and -1.

The matrix D is diagonal: it scales the x-direction by a factor of 2 and the y-direction by a factor of -1.

[Figure: the standard basis vectors e_1, e_2 and their images De_1, De_2 under D.]

If we write a vector in terms of the basis v_1, v_2, say x = a_1v_1 + a_2v_2, then it is easy to compute Ax:

Ax = A(a_1v_1 + a_2v_2) = a_1Av_1 + a_2Av_2 = 2a_1v_1 - a_2v_2.

Here we have used the fact that v_1, v_2 are eigenvectors of A. Since the resulting vector is still expressed in terms of the basis v_1, v_2, we can visualize what A does to the vector x: it scales the “v_1-coordinate” by 2 and the “v_2-coordinate” by -1.

For instance, let x = \begin{pmatrix} 0 \\ 2 \end{pmatrix}. We see from the grid on the right in the picture below that x = v_1 - v_2, so

Ax = A(v_1 - v_2) = Av_1 - Av_2 = 2v_1 + v_2 = 2\begin{pmatrix} 1 \\ 1 \end{pmatrix} + \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \begin{pmatrix} 3 \\ 1 \end{pmatrix}.

The picture illustrates the action of D on the plane in the usual basis, and the action of A on the plane in the v 1 , v 2 -basis.

[Figure: the action of D sends y to Dy (scaling e_1 by 2 and e_2 by -1); the action of A sends x to Ax (scaling v_1 by 2 and v_2 by -1).]

Now let x = \tfrac12\begin{pmatrix} 5 \\ -3 \end{pmatrix}. We see from the grid on the right in the picture below that x = \tfrac12 v_1 + 2v_2, so

Ax = A\left(\tfrac12 v_1 + 2v_2\right) = \tfrac12 Av_1 + 2Av_2 = v_1 - 2v_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix} - 2\begin{pmatrix} 1 \\ -1 \end{pmatrix} = \begin{pmatrix} -1 \\ 3 \end{pmatrix}.

This is illustrated in the picture below.

Figure 32 The matrix A scales the v_1-direction (violet) by a factor of 2 and the v_2-direction (green) by a factor of -1. You should be able to visualize where Ax will be given the location of x.
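
These two computations can also be double-checked numerically. In the NumPy sketch below, the coordinates of x in the basis v_1, v_2 are found by solving a linear system with the matrix C, scaled by the eigenvalues, and reassembled; the result agrees with multiplying by A directly.

```python
import numpy as np

A = np.array([[0.5, 1.5],
              [1.5, 0.5]])
v1, v2 = np.array([1.0, 1.0]), np.array([1.0, -1.0])   # eigenvalues 2 and -1
C = np.column_stack([v1, v2])
eigenvalues = np.array([2.0, -1.0])

for x in (np.array([0.0, 2.0]), 0.5 * np.array([5.0, -3.0])):
    a = np.linalg.solve(C, x)          # coordinates of x in the basis v1, v2
    Ax = C @ (eigenvalues * a)         # scale each coordinate by its eigenvalue
    print(a, Ax, np.allclose(Ax, A @ x))
# [ 1. -1.]  [3. 1.]   True
# [0.5  2.]  [-1. 3.]  True
```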

In the following examples, we visualize the action of a diagonalizable matrix A in terms of its dynamics. In other words, we start with a collection of vectors (drawn as points), and we see where they move when we multiply them by A repeatedly.

Subsection 5.4.3 Algebraic and Geometric Multiplicity

In this subsection, we give a variant of the diagonalization theorem that provides another criterion for diagonalizability. It is stated in the language of multiplicities of eigenvalues.

In algebra, we define the multiplicity of a root λ_0 of a polynomial f(λ) to be the number of factors of λ - λ_0 that divide f(λ). For instance, in the polynomial

f(λ) = -λ^3 + 4λ^2 - 5λ + 2 = -(λ - 1)^2(λ - 2),

the root λ_0 = 2 has multiplicity 1, and the root λ_0 = 1 has multiplicity 2.
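
SymPy can factor this polynomial and report the multiplicity of each root directly; the variable name lam below is just an identifier standing in for λ:

```python
from sympy import symbols, factor, roots

lam = symbols('lambda')
f = -lam**3 + 4*lam**2 - 5*lam + 2

print(factor(f))    # -(lambda - 2)*(lambda - 1)**2, up to the order of the factors
print(roots(f))     # {2: 1, 1: 2}: the root 2 has multiplicity 1, the root 1 has multiplicity 2
```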

Definition

Let A be an n × n matrix, and let λ be an eigenvalue of A .

  1. The algebraic multiplicity of λ is its multiplicity as a root of the characteristic polynomial of A .
  2. The geometric multiplicity of λ is the dimension of the λ -eigenspace.

Since the λ-eigenspace of A is Nul(A - λI_n), its dimension is the number of free variables in the system of equations (A - λI_n)x = 0, i.e., the number of columns without pivots in the matrix A - λI_n.
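
Since the number of non-pivot columns equals n minus the rank of A - λI_n, the geometric multiplicity can be computed in one line. The helper name below is introduced here only for illustration:

```python
import numpy as np

def geometric_multiplicity(A, lam):
    """Dimension of the lam-eigenspace: n minus the rank of A - lam*I (rank-nullity)."""
    n = A.shape[0]
    return n - np.linalg.matrix_rank(A - lam * np.eye(n))

# The shear matrix: the eigenvalue 1 has algebraic multiplicity 2 ...
shear = np.array([[1.0, 1.0],
                  [0.0, 1.0]])
print(geometric_multiplicity(shear, 1.0))   # 1  (... but geometric multiplicity only 1)
```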

We saw in the above examples that the algebraic and geometric multiplicities need not coincide. However, they do satisfy the following fundamental inequality, the proof of which is beyond the scope of this text.

Let A be a square matrix and let λ be an eigenvalue of A. Then

1 ≤ (the geometric multiplicity of λ) ≤ (the algebraic multiplicity of λ).

In particular, if the algebraic multiplicity of λ is equal to 1, then so is the geometric multiplicity.

If A has an eigenvalue λ with algebraic multiplicity 1, then the λ-eigenspace is a line.

We can use the theorem to give another criterion for diagonalizability (in addition to the diagonalization theorem).

Diagonalization Theorem, Variant

Let A be an n × n matrix. The following are equivalent:

  1. A is diagonalizable.
  2. The sum of the geometric multiplicities of the eigenvalues of A is equal to n.
  3. The sum of the algebraic multiplicities of the eigenvalues of A is equal to n, and for each eigenvalue, the geometric multiplicity equals the algebraic multiplicity.

The first part of the third statement simply says that the characteristic polynomial of A factors completely into linear polynomials over the real numbers: in other words, there are no complex (non-real) roots. The second part of the third statement says in particular that for any diagonalizable matrix, the algebraic and geometric multiplicities coincide.

Let A be a square matrix and let λ be an eigenvalue of A. If the algebraic multiplicity of λ does not equal the geometric multiplicity, then A is not diagonalizable.

The examples at the beginning of this subsection illustrate the theorem. Here we give some general consequences for diagonalizability of 2 × 2 and 3 × 3 matrices.

Diagonalizability of 2 × 2 Matrices

Let A be a 2 × 2 matrix. There are four cases:

  1. A has two different eigenvalues. In this case, each eigenvalue has algebraic and geometric multiplicity equal to one. This implies A is diagonalizable. For example:
    A = \begin{pmatrix} 1 & 7 \\ 0 & 2 \end{pmatrix}.
  2. A has one eigenvalue λ of algebraic and geometric multiplicity 2. To say that the geometric multiplicity is 2 means that Nul(A - λI_2) = R^2, i.e., that every vector in R^2 is in the null space of A - λI_2. This implies that A - λI_2 is the zero matrix, so that A is the diagonal matrix λI_2. In particular, A is diagonalizable. For example:
    A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
  3. A has one eigenvalue λ of algebraic multiplicity 2 and geometric multiplicity 1. In this case, A is not diagonalizable, by part 3 of the theorem. For example, a shear:
    A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.
  4. A has no eigenvalues. This happens when the characteristic polynomial has no real roots. In particular, A is not diagonalizable. For example, a rotation:
    A = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}.
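
The four cases above can be checked with SymPy: eigenvals() returns each eigenvalue with its algebraic multiplicity, and is_diagonalizable(reals_only=True) tests diagonalizability over the real numbers. This is an illustrative check of the example matrices, not part of the classification itself.

```python
from sympy import Matrix

cases = {
    "1. two different eigenvalues":          Matrix([[1, 7], [0, 2]]),
    "2. one eigenvalue, geometric mult. 2":  Matrix([[1, 0], [0, 1]]),
    "3. one eigenvalue, geometric mult. 1":  Matrix([[1, 1], [0, 1]]),
    "4. no real eigenvalues":                Matrix([[1, -1], [1, 1]]),
}

for description, A in cases.items():
    # eigenvals() maps each eigenvalue (possibly complex) to its algebraic multiplicity;
    # is_diagonalizable(reals_only=True) prints True for cases 1 and 2, False for 3 and 4.
    print(description, A.eigenvals(), A.is_diagonalizable(reals_only=True))
```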