The Spectral Theorem

Linear Algebra
Author

Quasar

Published

July 23, 2024

Spectral Theorem

Every real symmetric matrix is orthogonally diagonalizable.

Theorem 1 (Spectral Theorem) Every real symmetric matrix is orthogonally diagonalizable.

Let \(A\) be an \(n \times n\) real symmetric matrix. Then,

  1. The eigenvalues of \(A\) are real.
  2. There exists an orthonormal basis \(\{\mathbf{q}_1,\ldots,\mathbf{q}_n\}\) for \(\mathbb{R}^n\) consisting of eigenvectors of \(A\). That is, there is an orthogonal matrix \(Q\) and a diagonal matrix \(\Lambda\) so that \(A = Q\Lambda Q^{-1} = Q\Lambda Q^{T}\).
Tip 1: Spectral values

The term spectrum refers to the eigenvalues of a matrix or, more generally, of a linear operator. In physics, the spectral energy lines of atoms (e.g. the Balmer lines of the hydrogen atom) are characterized as eigenvalues of the governing quantum-mechanical Schrödinger operator.
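
Before diving into the proof, the theorem is easy to check numerically. Below is a minimal sketch using NumPy (purely illustrative, not part of the argument): `np.linalg.eigh` is designed for symmetric matrices and returns real eigenvalues together with an orthonormal matrix of eigenvectors \(Q\), so that \(A = Q\Lambda Q^{T}\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random real symmetric matrix A = (M + M^T) / 2.
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2

# eigh assumes symmetry: eigenvalues come back real, and the
# columns of Q form an orthonormal eigenbasis of R^5.
eigvals, Q = np.linalg.eigh(A)
Lam = np.diag(eigvals)

print(np.allclose(Q.T @ Q, np.eye(5)))   # Q is orthogonal
print(np.allclose(Q @ Lam @ Q.T, A))     # A = Q Lam Q^T
```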

Proof.

Claim. The eigenvalues of \(A\) are real.

\[ \begin{align*} \langle A\mathbf{x}, \mathbf{y} \rangle &= (A \mathbf{x})' \mathbf{y}\\ &= \mathbf{x}'A' \mathbf{y}\\ &= \langle \mathbf{x},A'\mathbf{y}\rangle \end{align*} \]

Since, for a symmetric matrix \(A\), \(A = A'\), it follows that:

\[ \langle A\mathbf{x},\mathbf{y}\rangle = \langle \mathbf{x}, A\mathbf{y} \rangle \]

Or, using dot-product notation, we can write:

\[ (A\mathbf{x})\cdot \mathbf{y} = \mathbf{x}\cdot (A\mathbf{y}) \tag{1}\]
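
This symmetry identity is the workhorse of the proof. Here is a quick numerical spot check of Equation 1 (again just a sketch, with a randomly generated symmetric \(A\)):

```python
import numpy as np

rng = np.random.default_rng(1)

M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                        # symmetric
x = rng.standard_normal(4)
y = rng.standard_normal(4)

# (Ax) . y == x . (Ay) whenever A = A^T
print(np.allclose((A @ x) @ y, x @ (A @ y)))
```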

Suppose \(\mathbf{v}\neq\mathbf{0}\) is a vector in \(\mathbb{C}^n\) and \(\lambda\) is a (possibly complex) scalar satisfying:

\[ A \mathbf{v} = \lambda \mathbf{v} \tag{2}\]

We can now take the complex conjugate of the eigenvalue equation. Remember that \(A\) is a real matrix, so \(\bar{A} = A\). Thus, the conjugated version of the eigenvalue equation reads:

\[ \overline{(A\mathbf{v})}=\overline{A}\overline{\mathbf{v}} = A\overline{\mathbf{v}} = \overline{\lambda \mathbf{v}} = \overline{\lambda}\overline{\mathbf{v}} \tag{3}\]

Using the eigenvalue equation (Equation 2), we can write:

\[ (A \mathbf{v})\cdot \overline{\mathbf{v}} = (\lambda \mathbf{v}) \cdot \overline{\mathbf{v}} = \lambda (\mathbf{v} \cdot \overline{\mathbf{v}}) \]

Alternatively, using Equation 1 and Equation 3, we have:

\[ (A \mathbf{v})\cdot \overline{\mathbf{v}} = \mathbf{v} \cdot (A\overline{\mathbf{v}}) = \mathbf{v} \cdot (\overline{\lambda} \overline{\mathbf{v}}) = \overline{\lambda} (\mathbf{v} \cdot \overline{\mathbf{v}}) \]

Consequently,

\[ (\lambda - \overline{\lambda})(\mathbf{v}\cdot \overline{\mathbf{v}}) = 0 \]

Since \(\mathbf{v} \neq \mathbf{0}\), we have \(\mathbf{v}\cdot \overline{\mathbf{v}} = \sum_{i}|v_i|^2 > 0\), so \(\lambda = \overline{\lambda}\). Therefore, \(\lambda \in \mathbb{R}\).
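
As a sanity check on this claim, one can hand a symmetric matrix to a general-purpose (complex-capable) eigenvalue routine and watch the imaginary parts vanish up to rounding. A small sketch:

```python
import numpy as np

rng = np.random.default_rng(2)

M = rng.standard_normal((6, 6))
A = (M + M.T) / 2                        # real symmetric

# eigvals makes no symmetry assumption and is allowed to return
# complex values; for symmetric A the imaginary parts are ~0.
lam = np.linalg.eigvals(A)
print(np.max(np.abs(np.imag(lam))))      # 0 up to floating-point noise
```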

Claim. \(A\) is orthogonally diagonalizable.

We proceed by induction on \(n\).

For \(n=1\), \(A\) and \(v\) are scalars, so \(Av = \lambda v\) with \(\lambda = A\). Thus, any non-zero scalar \(v\) forms a basis of \(\mathbb{R}\), and \(A = P\Lambda P^{-1}\), where \(P=I\) and \(\Lambda = A\).

Inductive hypothesis. Every \(k \times k\) real symmetric matrix is orthogonally diagonalizable for \(k=1,2,3,\ldots,n-1\).

Induction step. Let \(A \in \mathbb{R}^{n \times n}\) be symmetric. We want to prove that \(A\) is orthogonally diagonalizable. We break the argument into three steps.

Every square matrix \(A\) has at least one (possibly complex) eigenvalue. Suppose \(\lambda_{1}\) is an eigenvalue of the matrix \(A\) with a corresponding eigenvector \(\mathbf{v}_1\). By the first claim, \(\lambda_{1}\in\mathbb{R}\), so \(\mathbf{v}_1\) can be chosen with real entries. We can normalize \(\mathbf{v}_1\) as \(\mathbf{q}_{1} = \mathbf{v}_1/||\mathbf{v}_1||\), which is still an eigenvector with eigenvalue \(\lambda_{1}\): if \(A\mathbf{v}_1 = \lambda_1 \mathbf{v}_1\), then \(A (\mathbf{v}_1/||\mathbf{v}_1||) = \lambda_1 (\mathbf{v}_1/||\mathbf{v}_1||)\), that is, \(A \mathbf{q}_1 = \lambda_1 \mathbf{q}_1\).

Now, we can extend \(\{\mathbf{q}_{1}\}\) to a basis \(\{\mathbf{q}_{1},\mathbf{w}_{2},\ldots,\mathbf{w}_{n}\}\) of \(\mathbb{R}^n\). By the Gram-Schmidt orthogonalization algorithm, given the basis \(\{\mathbf{q}_{1},\mathbf{w}_{2},\ldots,\mathbf{w}_{n}\}\), we can find a corresponding orthonormal basis \(\{\mathbf{q}_{1},\mathbf{q}_{2},\ldots,\mathbf{q}_{n}\}\) of \(\mathbb{R}^n\).
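
As a concrete sketch of this step, the snippet below extends an arbitrary unit vector `q1` to an orthonormal basis. A QR factorization is used as a stand-in for explicit Gram-Schmidt (the Q factor of a full-rank matrix is precisely its orthonormalized column set):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5

# Any unit vector q1 ...
v = rng.standard_normal(n)
q1 = v / np.linalg.norm(v)

# ... extended to a basis {q1, w2, ..., wn}: random extra columns are
# linearly independent almost surely.  QR then orthonormalizes them
# (a stand-in for running Gram-Schmidt by hand).
P, _ = np.linalg.qr(np.column_stack([q1, rng.standard_normal((n, n - 1))]))
if P[:, 0] @ q1 < 0:                     # QR may flip the sign of the first column
    P[:, 0] = -P[:, 0]

print(np.allclose(P.T @ P, np.eye(n)))   # columns are orthonormal
print(np.allclose(P[:, 0], q1))          # and the first one is q1
```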

Now, we assemble these basis vectors as the column vectors of a matrix \(P\):

\[ \begin{align*} P & =\left[\begin{array}{cccc} \mathbf{q}_{1} & \mathbf{q}_{2} & \ldots & \mathbf{q}_{n}\end{array}\right] \end{align*} \]

By definition, \(P\) is an orthogonal matrix. So, \(P^{-1} = P^T\).

Define

\[ \begin{align*} B & =P^{-1}AP \end{align*} \]

Step I. \(B\) is symmetric.

We have:

\[ \begin{align*} B^{T} & =(P^{-1}AP)^{T}\\ & =(P^{T}AP)^{T} & \{P^{-1}=P^{T}\}\\ & =P^{T}A^{T}(P^{T})^{T}\\ & =P^{T}A^{T}P\\ & =P^{T}AP & \{A\text{ is symmetric}\}\\ & =B \end{align*} \]

We now write \(B\) in block form to see what structure it must have, with the hope that it turns out to be (at least partly) diagonal.

Step II. The structure of \(B\).

The way we do this is to consider \(B\) post-multiplied by \(\mathbf{e}_{1}\): the product \(B\mathbf{e}_{1}\) is the first column of \(B\). Since \(B=P^{T}AP\), we can write:

\[ \begin{align*} P^{T}AP\mathbf{e}_{1} & =P^{T}A\mathbf{q}_{1} \end{align*} \]

Now, remember that \(\mathbf{q}_{1}\) is the normalized eigenvector corresponding to the eigenvalue \(\lambda_{1}\). So, \(A\mathbf{q}_{1}=\lambda_{1}\mathbf{q}_{1}\). That means, this is equal to:

\[\begin{align*} P^{T}A\mathbf{q}_{1} & =P^{T}\lambda_{1}\mathbf{q}_{1}\\ & =\lambda_{1}P^{T}\mathbf{q}_{1}\\ & =\lambda_{1}\left[\begin{array}{c} \mathbf{q}_{1}^{T}\\ \mathbf{q}_{2}^{T}\\ \vdots\\ \mathbf{q}_{n}^{T} \end{array}\right]\mathbf{q}_{1}\\ & =\lambda_{1}\left[\begin{array}{c} \mathbf{q}_{1}^{T}\mathbf{q}_{1}\\ \mathbf{q}_{2}^{T}\mathbf{q}_{1}\\ \vdots\\ \mathbf{q}_{n}^{T}\mathbf{q}_{1} \end{array}\right]\\ & =\lambda_{1}\left[\begin{array}{c} 1\\ 0\\ \vdots\\ 0 \end{array}\right]\\ & =\left[\begin{array}{c} \lambda_{1}\\ 0\\ \vdots\\ 0 \end{array}\right] \end{align*}\]

This is the first column of the matrix \(B\). Since \(B=B^{T}\), the first row should also be

\[ \begin{bmatrix} \lambda_1 & 0 & 0 & \ldots & 0 \end{bmatrix} \]

So, we can write the matrix \(B\) in the block form:

\[\begin{align*} B & =\left[\begin{array}{cc} \lambda_{1} & 0_{1\times n-1}\\ 0_{n-1\times1} & C \end{array}\right] \end{align*}\]

The first row and the first column already have the required diagonal structure.
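
Continuing the numerical sketch: if `P` is built as above, with a unit eigenvector `q1` of `A` as its first column and the remaining columns orthonormalized, then `B = P.T @ A @ P` shows exactly this block structure (up to floating-point noise):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5

M = rng.standard_normal((n, n))
A = (M + M.T) / 2                        # real symmetric

# One eigenpair (lambda_1, q1); eigh already returns unit eigenvectors.
eigvals, eigvecs = np.linalg.eigh(A)
lam1, q1 = eigvals[0], eigvecs[:, 0]

# P: first column q1, remaining columns orthonormalized via QR.
P, _ = np.linalg.qr(np.column_stack([q1, rng.standard_normal((n, n - 1))]))
if P[:, 0] @ q1 < 0:
    P[:, 0] = -P[:, 0]

B = P.T @ A @ P
e1 = np.eye(n)[:, 0]

print(np.allclose(B[:, 0], lam1 * e1))   # first column is (lambda_1, 0, ..., 0)^T
print(np.allclose(B, B.T))               # B is symmetric, so C = B[1:, 1:] is too
```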

Step III. Applying the induction hypothesis.

Since \(B\) is symmetric, the block \(C\) is an \((n-1)\times(n-1)\) real symmetric matrix. By the induction hypothesis, there exists an orthogonal matrix \(Q\) such that \(D=Q^{-1}CQ = Q^T C Q\) is diagonal.

Now, define the matrix \(R\) as:

\[\begin{equation} R:=P\left[\begin{array}{cc} 1 & 0_{1\times n-1}\\ 0_{n-1\times1} & Q \end{array}\right] \end{equation}\]

Claim. \(R\) is orthogonal and \(R^{-1}AR\) is diagonal.

  1. We have:

\[\begin{align*} R^{-1} & =\left[\begin{array}{cc} 1 & 0_{1\times n-1}\\ 0_{n-1\times1} & Q^{-1} \end{array}\right]P^{-1} & \{\text{Reverse order law}\}\\ & =\left[\begin{array}{cc} 1 & 0_{1\times n-1}\\ 0_{n-1\times1} & Q^{T} \end{array}\right]P^{T} & \{P\text{ and }Q\text{ are orthogonal}\} \end{align*}\]

But,

\[\begin{align*} R^{T} & =\left[\begin{array}{cc} 1 & 0_{1\times n-1}\\ 0_{n-1\times1} & Q^{T} \end{array}\right]P^{T} \end{align*}\]

So,

\[\begin{align*} R^{T} & =R^{-1} \end{align*}\]

Thus, \(R\) is orthogonal.

  2. Next, let’s compute \(R^{-1}AR\).

\[\begin{align*} R^{-1}AR & =R^{T}AR & \{R\text{ is orthogonal}\}\\ & =\left[\begin{array}{cc} 1 & 0_{1\times n-1}\\ 0_{n-1\times1} & Q^{T} \end{array}\right]P^{T}AP\left[\begin{array}{cc} 1 & 0_{1\times n-1}\\ 0_{n-1\times1} & Q \end{array}\right]\\ & =\left[\begin{array}{cc} 1 & 0_{1\times n-1}\\ 0_{n-1\times1} & Q^{T} \end{array}\right]B\left[\begin{array}{cc} 1 & 0_{1\times n-1}\\ 0_{n-1\times1} & Q \end{array}\right]\\ & =\left[\begin{array}{cc} 1 & 0_{1\times n-1}\\ 0_{n-1\times1} & Q^{T} \end{array}\right]\left[\begin{array}{cc} \lambda_{1} & 0_{1\times n-1}\\ 0_{n-1\times1} & C \end{array}\right]\left[\begin{array}{cc} 1 & 0_{1\times n-1}\\ 0_{n-1\times1} & Q \end{array}\right]\\ & =\left[\begin{array}{cc} \lambda_{1} & 0_{1\times n-1}\\ 0_{n-1\times1} & Q^{T}C \end{array}\right]\left[\begin{array}{cc} 1 & 0_{1\times n-1}\\ 0_{n-1\times1} & Q \end{array}\right]\\ & =\left[\begin{array}{cc} \lambda_{1} & 0_{1\times n-1}\\ 0_{n-1\times1} & Q^{T}CQ \end{array}\right] \end{align*}\]

Since \(Q^{T}CQ\) is diagonal, it follows that \(R^{-1}AR\) is diagonal. This completes the proof. \(\blacksquare\)
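
The induction argument is constructive, so it can be mirrored almost line for line in code. The sketch below (the helper name `spectral_decompose` is ours, and the routine is illustrative rather than numerically robust; in practice one would simply call `np.linalg.eigh`) recurses on the trailing block \(C\), embeds the resulting \(Q\) into \(R\), and returns an orthogonal matrix together with the eigenvalues.

```python
import numpy as np

def spectral_decompose(A):
    """Orthogonally diagonalize a real symmetric matrix A by mirroring the
    inductive proof.  Returns (R, d) with R orthogonal and A ~= R diag(d) R^T.
    Illustrative only -- not numerically robust."""
    n = A.shape[0]
    if n == 1:                                    # base case: A is a scalar
        return np.eye(1), np.array([A[0, 0]])

    # One eigenpair (lambda_1, q1); the eigenvalue is real, so keep real parts.
    w, V = np.linalg.eig(A)
    lam1 = w[0].real
    q1 = V[:, 0].real
    q1 = q1 / np.linalg.norm(q1)

    # Extend q1 to an orthonormal basis: orthogonal P with first column q1.
    rng = np.random.default_rng()
    P, _ = np.linalg.qr(np.column_stack([q1, rng.standard_normal((n, n - 1))]))
    if P[:, 0] @ q1 < 0:
        P[:, 0] = -P[:, 0]

    B = P.T @ A @ P                               # block diag(lambda_1, C)
    C = B[1:, 1:]

    Q, d = spectral_decompose(C)                  # induction hypothesis

    # R = P [[1, 0], [0, Q]] is orthogonal and R^T A R is diagonal.
    block = np.eye(n)
    block[1:, 1:] = Q
    R = P @ block
    return R, np.concatenate(([lam1], d))


rng = np.random.default_rng(5)
M = rng.standard_normal((6, 6))
A = (M + M.T) / 2

R, d = spectral_decompose(A)
print(np.allclose(R.T @ R, np.eye(6)))            # R is orthogonal
print(np.allclose(R @ np.diag(d) @ R.T, A))       # A = R D R^T
```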