Eigenvalues and Eigenvectors

An eigenvalue and an eigenvector for a linear transformation are a real or complex value λ and a nonzero vector x such that:

(1)
$$ \begin{align} T \,\vec{x} = \lambda \vec{x} \end{align} $$

Recognizing that a vector is an eigenvector allows us to replace the n² multiplications of a matrix-vector product with n multiplications (scaling the vector by λ).
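A quick NumPy check (the matrix is an arbitrary example): applying the matrix to one of its eigenvectors gives the same result as scaling that eigenvector by the eigenvalue:

import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])
vals, vecs = np.linalg.eig(A)

# For the first eigenpair, A @ x (n^2 multiplications) equals
# lam * x (n multiplications).
lam, x = vals[0], vecs[:, 0]
print(np.allclose(A @ x, lam * x))   # True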

In the case where T is defined on a finite dimensional space, we can choose a coordinate system in which T is represented by the matrix A. We can find eigenvalues by solving

(2)
$$ \begin{align} (A - \lambda I ) \vec{x} = 0 \end{align} $$

Nonzero solutions exist exactly when det(A - λI) = 0. Expanding this determinant yields a degree-n polynomial in λ, the characteristic polynomial, where n is the dimension of the domain of T. The Fundamental Theorem of Algebra guarantees that this polynomial has a root, so T has at least one (possibly complex valued) eigenvalue.

Finding Eigenvalues and Eigenvectors

When called with two output arguments, the MATLAB eig function returns two matrices: the first has the eigenvectors as its columns, and the second has the eigenvalues on its diagonal.


>> [V, D] = eig([1 2 3; 4 5 6; 7 8 9])

V =

   -0.2320   -0.7858    0.4082
   -0.5253   -0.0868   -0.8165
   -0.8187    0.6123    0.4082


D =

   16.1168         0         0
         0   -1.1168         0
         0         0   -0.0000

The R eigen function returns a dictionary type which R calls a list:


> A = matrix(seq(1, 9), nrow=3, byrow=T)
> A
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9

> eigen(A)$values
[1]  1.611684e+01 -1.116844e+00 -1.303678e-15

> eigen(A)$vectors
           [,1]        [,2]       [,3]
[1,] -0.2319707 -0.78583024  0.4082483
[2,] -0.5253221 -0.08675134 -0.8164966
[3,] -0.8186735  0.61232756  0.4082483

> names(eigen(A))
[1] "values"  "vectors"

NumPy:


>>> import numpy as np
>>> A = np.array([[1,2,3],[4,5,6],[7,8,9]])

>>> np.linalg.eigvals(A)
array([  1.61168440e+01,  -1.11684397e+00,  -1.30367773e-15])

>>> np.linalg.eig(A)[1]
array([[-0.23197069, -0.78583024,  0.40824829],
       [-0.52532209, -0.08675134, -0.81649658],
       [-0.8186735 ,  0.61232756,  0.40824829]])

The Mathematica Eigenvectors function returns the eigenvectors as rows of a matrix.


> A := {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}}

> Eigenvalues[A] // N
{16.1168, -1.11684, 0.}

> Eigenvectors[A] // N
{{0.283349, 0.641675, 1.}, {-1.28335, -0.141675, 1.}, {1., -2., 1.}}

An example of using Pari/GP to compute the eigenvalues and eigenvectors with 1000 significant digits:


? \p 1000

? [vals, vecs] = mateigen([1, 2, 3; 4, 5, 6; 7, 8, 9], flag=1)

Diagonal Matrix

The diagonal entries of a matrix are the entries where the row and column indices are equal. A diagonal matrix has zeros in all of its off-diagonal entries.

The eigenvalues of a diagonal matrix are the values on the diagonal. Moreover the eigenvectors are the standard basis vectors, and are thus an orthonormal basis for the vector space.

The question of when a coordinate system exists so that a linear transformation can be represented by a diagonal matrix is answered by the Spectral Theorem.

Matrices A and B are similar if there exists an invertible P such that A = PBP⁻¹.

A matrix is diagonalizable if it is similar to a diagonal matrix.

To multiply two n × n matrices, one must in general perform n³ scalar multiplications. If one matrix is diagonal, only n² scalar multiplications are needed, and if both are diagonal, only n scalar multiplications.

Eigendecomposition: if A has n linearly independent eigenvectors, then A = VDV⁻¹, where the columns of V are the eigenvectors and D is the diagonal matrix of the corresponding eigenvalues. V is the change-of-basis matrix which diagonalizes A.
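A NumPy sketch (arbitrary test matrix) reconstructing a diagonalizable matrix from its eigenvalues and eigenvectors:

import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])
vals, V = np.linalg.eig(A)
D = np.diag(vals)

# A = V D V^-1 when the eigenvectors are linearly independent.
print(np.allclose(A, V @ D @ np.linalg.inv(V)))   # True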

Exponentiation

If A = VDV⁻¹, then Aᵏ = VDᵏV⁻¹, and Dᵏ is computed by simply raising each diagonal entry to the k-th power.
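A NumPy sketch under the same assumption that A is diagonalizable, computing a matrix power from the eigendecomposition:

import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])
vals, V = np.linalg.eig(A)
k = 5

# A^k = V D^k V^-1; D^k only requires powers of the diagonal entries.
Ak = V @ np.diag(vals ** k) @ np.linalg.inv(V)
print(np.allclose(Ak, np.linalg.matrix_power(A, k)))   # True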

Change of Basis Invariants

An invertible matrix can be regarded as a change of basis. Eigenvalues, but not eigenvectors, are invariants under change of basis.

The trace and determinant of a matrix are also invariant under a change of basis. These are not independent invariants, however: the trace is the sum of the eigenvalues, and the determinant is the product of the eigenvalues.

For k in 0, 1, ..., n we can define the elementary symmetric polynomial of degree k as

(3)
$$ \begin{align} e_k(x_1, ..., x_n) = \sum_{1 \leq j_1 < j_2 < \cdots < j_k \leq n} x_{j_1} \cdots x_{j_k} \end{align} $$

Then the trace is e₁ of the eigenvalues, and the determinant is eₙ of the eigenvalues. The other elementary symmetric polynomials can be used to construct more invariant functions.
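A NumPy sketch (random test matrix) checking these two identities:

import numpy as np

A = np.random.randn(4, 4)
vals = np.linalg.eigvals(A)

# Trace = sum of the eigenvalues (e_1), determinant = product (e_n).
# Any imaginary parts cancel, so the real parts carry the values.
print(np.isclose(np.trace(A), vals.sum().real))
print(np.isclose(np.linalg.det(A), vals.prod().real))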

Inverses

If T has zero as an eigenvalue, then there is a nonzero x such that

(4)
$$ \begin{align} T\,\vec{x} = 0 \end{align} $$

In other words, the null space of T has dimension greater than zero, and T is not invertible.

If A is invertible and λ is an eigenvalue of A (necessarily nonzero), then λ⁻¹ is an eigenvalue of A⁻¹ with the same eigenvectors:

(5)
$$ \begin{align} A^{-1} x = \lambda^{-1} \lambda A^{-1} x = \lambda^{-1} A^{-1} \lambda x = \lambda^{-1} A^{-1} A x = \lambda^{-1} x \end{align} $$

Roots of Polynomials

Computing the characteristic polynomial and finding its roots is not a good way, as far as I know, to numerically find eigenvalues. See the QR Algorithm, below.

Finding the characteristic polynomial will probably use Gaussian elimination, which has O(n³) complexity.

The Fundamental Theorem of Algebra guarantees that there are n possibly complex roots of the polynomial, counting multiplicity.

To find roots, one can use the quadratic formula if the polynomial has degree 2. There are closed-form formulas for degrees 3 and 4, but they are impractical.

If the matrix entries are known exactly and are rational, then we can put the characteristic polynomial in a form where all the coefficients are integers. The Rational Root Test gives a finite number of possibilities for any rational roots of the polynomial. If a rational root is written in lowest terms as r = ±p/q, then p evenly divides the constant coefficient of the polynomial and q divides the leading coefficient. The Rational Root Test does not guarantee that there are any rational roots, however.
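A Python sketch of the Rational Root Test (the example polynomial is arbitrary, and the coefficients are assumed to be integers with nonzero constant and leading terms):

from fractions import Fraction

def divisors(n):
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """coeffs[k] is the coefficient of x^k; constant and leading terms nonzero."""
    candidates = {Fraction(s * p, q)
                  for p in divisors(coeffs[0])
                  for q in divisors(coeffs[-1])
                  for s in (1, -1)}
    def value(r):
        total = Fraction(0)
        for c in reversed(coeffs):          # Horner's rule
            total = total * r + c
        return total
    return sorted(r for r in candidates if value(r) == 0)

# 2x^3 - 3x^2 - 3x + 2 has rational roots -1, 1/2, and 2.
print(rational_roots([2, -3, -3, 2]))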

It is also worth remembering that if a polynomial has real coefficients, the conjugate of any complex root is also a root.

If a polynomial has real coefficients, and if we can find x₁ and x₂ such that p(x₁) > 0 and p(x₂) < 0, then we know there is a root between x₁ and x₂, and we can use bisection to find it.

Descartes' rule of signs can be used to put an upper bound on the number of real roots. The number of positive real roots is no greater than the number of sign changes between successive nonzero coefficients. The number of negative real roots is no greater than the number of sign changes of the coefficients of the polynomial p(-x).

If the coefficients of a polynomial p are denoted a₀ through aₙ, where a₀ is the constant term and aₙ is the leading coefficient, then all roots of the polynomial can be found in the disk of radius ρ centered at the origin where:

(6)
$$ \begin{align} \rho = 1 + |a_n|^{-1} \max_{0\leq k<n} |a_k| \end{align} $$

If we compute ρ for the polynomial

(7)
$$ \begin{align} s(z) = z^n p(1/z) \end{align} $$

then all of the nonzero roots of p are outside of the disk of radius 1/ρ, since the nonzero roots of p are the reciprocals of the roots of s.
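A NumPy sketch (arbitrary example polynomial with nonzero constant term) computing the outer radius from p and the inner radius from s:

import numpy as np

def outer_radius(coeffs):
    """coeffs[k] is the coefficient of x^k; returns rho bounding all roots."""
    coeffs = np.asarray(coeffs, dtype=float)
    return 1 + np.max(np.abs(coeffs[:-1])) / np.abs(coeffs[-1])

# p(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
p = [-6.0, 11.0, -6.0, 1.0]
s = p[::-1]                      # s(z) = z^n p(1/z): coefficients reversed

print(outer_radius(p))           # all roots have modulus <= this
print(1 / outer_radius(s))       # nonzero roots have modulus >= this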

A fast and general purpose technique for finding the roots is to use Newton's method to find a root, and then use polynomial division to create a new polynomial containing just the unfound roots. Newton's method is not guaranteed to converge to a root, so one may occasionally need to give up and pick a new starting point.
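A Python sketch of this approach, assuming for simplicity that all roots are real (the example polynomial is arbitrary, and there are no safeguards against non-convergence):

import numpy as np

def newton_root(coeffs, x0, steps=100):
    """Newton's method on the polynomial with coefficients coeffs (ascending order)."""
    p = np.polynomial.Polynomial(coeffs)
    dp = p.deriv()
    x = x0
    for _ in range(steps):
        x = x - p(x) / dp(x)
    return x

def deflate(coeffs, r):
    """Synthetic division by (x - r); assumes r is (approximately) a root."""
    n = len(coeffs) - 1
    b = [0.0] * n                      # quotient coefficients, ascending order
    b[n - 1] = coeffs[n]
    for k in range(n - 1, 0, -1):
        b[k - 1] = coeffs[k] + r * b[k]
    return b

def real_roots(coeffs):
    roots = []
    c = list(coeffs)
    while len(c) > 2:
        r = newton_root(c, x0=np.random.randn())
        roots.append(r)
        c = deflate(c, r)              # remove the root just found
    roots.append(-c[0] / c[1])         # the remaining factor is linear
    return roots

# p(x) = (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6
print(sorted(real_roots([-6.0, 11.0, -6.0, 1.0])))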

Spectral Theorem

Self-adjoint (aka Hermitian) operators are equal to their adjoint.

In the case of a matrix, the adjoint is the conjugate transpose.

In the case of an operator on an inner product space, for any w in the space, T*w is defined as the unique vector such that

(8)
$$ \begin{align} \langle Tv, w \rangle = \langle v, T^*w \rangle \end{align} $$

for all v. This uses the fact (the Riesz representation theorem) that for any linear functional φ there is a unique vector v such that, for every vector u,

(9)
$$ \begin{align} \phi(u) = \langle u, v \rangle \end{align} $$

A normal operator commutes with its adjoint.

The complex spectral theorem states that a linear transformation T on a complex inner-product space V has a set of eigenvectors which form an orthonormal basis of V if and only if T is normal.

The real spectral theorem (V is a real inner-product space) states that the orthonormal basis of eigenvectors exists if and only if T is self-adjoint.

Positive Operators and Isometries

Real isometries are also called orthogonal matrices.

Complex isometries are also called unitary matrices.

The adjoint of an isometry is its inverse: the transpose in the real case, and the conjugate transpose in the complex case.

Polar decomposition: every operator T on a finite dimensional inner product space can be written as T = S√(T*T), where S is an isometry. For matrices, this reads A = UP with U unitary and P positive semidefinite.

Generalized Eigenvector

The linear transformation

(10)
$$ \begin{align} T(w, v) = (v, 0) \end{align} $$

has only one eigenvalue, zero. Moreover the eigenvectors are the nonzero vectors of the form (w, 0), which together with the zero vector form a one-dimensional subspace. Thus the eigenvectors of T do not span the domain of T.

A nonzero vector x is a generalized eigenvector of a transformation T if, for some eigenvalue λ and some positive integer j:

(11)
$$ \begin{align} (T-\lambda I)^j \vec{x} = 0 \end{align} $$

For example, with T as above, the vector (0, 1) is not an eigenvector, since T(0, 1) = (1, 0) is not a multiple of (0, 1); but (T - 0I)²(0, 1) = T(1, 0) = (0, 0), so (0, 1) is a generalized eigenvector for the eigenvalue 0.

The generalized eigenvectors of T corresponding to an eigenvalue λ form a subspace (together with the zero vector), and for an operator on a finite dimensional complex vector space, the domain of T can be written as a direct sum of those subspaces. In particular, the generalized eigenvectors of such a T always span the domain of T.

Power Method

This MATLAB code implements the power method for a randomly generated matrix A.

A = randn(10);          % random 10 x 10 matrix
x = randn(10, 1);       % random starting vector
n = 1000;               % number of iterations

for i = 1:n
    y = A * x;
    r = y(1) / x(1);    % eigenvalue estimate: ratio of corresponding components
    x = y / norm(y, 1); % normalize to keep the iterate from overflowing
end

The final values of r and x approximate the eigenvalue of largest magnitude and an associated eigenvector.

The algorithm makes two assumptions: that there is a single eigenvalue of largest modulus, and that the eigenvectors span the vector space; that is, the matrix is diagonalizable.

If a single eigenvalue with largest modulus exists, it is called the principal eigenvalue, and any eigenvectors associated with it the principal eigenvectors.

With these two assumptions, the initial randomly generated vector can be expressed as a linear combination of the eigenvectors, and with probability one it has a nonzero component along the principal eigenvector. Repeated multiplication by A makes that component dominate, since its eigenvalue has the largest modulus. If the component happens to be zero, the iteration instead converges (in exact arithmetic) to an eigenvector of one of the smaller eigenvalues.

If there are multiple eigenvalues with the same largest modulus, then the algorithm might not converge.

The inverse power method applies the same iteration to A⁻¹, in practice by solving a linear system at each step rather than forming the inverse; it converges to the eigenvalue of smallest modulus. Applying it to (A - σI)⁻¹ finds the eigenvalue closest to the shift σ.
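A NumPy sketch of the shifted inverse power method (the test matrix, shift, and iteration count are arbitrary choices, and there are no convergence checks):

import numpy as np

A = np.random.randn(10, 10)
A = A + A.T                     # symmetric, so the eigenvalues are real
sigma = 0.5                     # shift: we want the eigenvalue closest to sigma
x = np.random.randn(10)

for _ in range(1000):
    y = np.linalg.solve(A - sigma * np.eye(10), x)   # one step of the power method
    x = y / np.linalg.norm(y)                        # applied to (A - sigma I)^-1

# The Rayleigh quotient recovers the eigenvalue of A closest to the shift.
print(x @ A @ x)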

QR Algorithm

A matrix is upper triangular if its entries aᵢⱼ are zero whenever i > j.

Schur's Theorem states that every square matrix is unitarily similar to an upper triangular matrix. That is, there exist a unitary matrix U and an upper triangular matrix T such that

(12)
$$ \begin{align} A = UTU^* \end{align} $$
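The QR algorithm itself repeatedly factors the current iterate as QR and replaces it with RQ, which is unitarily similar to it; under suitable conditions the iterates approach an upper triangular matrix whose diagonal entries are the eigenvalues. A minimal NumPy sketch of the unshifted iteration (symmetric test matrix; practical implementations add shifts and deflation):

import numpy as np

A = np.random.randn(5, 5)
A = A + A.T                      # symmetric: real eigenvalues

T = A.copy()
for _ in range(500):
    Q, R = np.linalg.qr(T)
    T = R @ Q                    # similar to T: R Q = Q^T (Q R) Q

# The diagonal of T approximates the eigenvalues of A.
print(np.sort(np.diag(T)))
print(np.sort(np.linalg.eigvalsh(A)))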

Gershgorin's Theorem

The eigenvalues of an n × n matrix A are contained in the union of the following n disks Dᵢ:

(13)
$$ \begin{align} D_i = \bigg\{ z \in \mathbb{C}: |z-a_{ii}| \leq \sum_{\substack{j = 1 \\ j \neq i}}^n |a_{ij}| \bigg\} \;\;\;(1 \leq i \leq n) \end{align} $$
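A NumPy sketch (arbitrary test matrix) that computes the disk centers and radii and lists the eigenvalues for comparison:

import numpy as np

A = np.array([[10.0, 1.0, 0.5],
              [0.2, 5.0, 0.3],
              [1.0, 0.1, -4.0]])

centers = np.diag(A)
radii = np.sum(np.abs(A), axis=1) - np.abs(centers)   # off-diagonal row sums

# Every eigenvalue lies in at least one disk |z - center| <= radius.
for c, r in zip(centers, radii):
    print("disk: center", c, "radius", r)
print(np.linalg.eigvals(A))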

Singular Value Decomposition

Any m × n matrix A can be factored as

(14)
$$ \begin{align} A = PDQ \end{align} $$

where P is m × m and unitary, D is m × n and diagonal, and Q is n × n and unitary.

The values on the diagonal of D are the singular values. They are non-negative, and customarily they appear in descending order on the diagonal. If this order is observed, then D is uniquely determined, although P and Q in general are not.

For a real, symmetric matrix with non-negative eigenvalues, the singular values and the eigenvalues coincide, but in general this is not the case.

The values on the diagonal of D are equal to the square roots of the eigenvalues of AAᵀ:

(15)
$$ \begin{align} AA^T = PDQ(PDQ)^T = PDQQ^TD^TP^T = PDD^TP^T \end{align} $$

We can read the rank of the matrix off from the SVD, since it is the number of nonzero singular values.

The SVD also gives the range and the null space of A: the columns of P corresponding to nonzero singular values are an orthonormal basis for the range, and the columns of Q* corresponding to zero singular values are an orthonormal basis for the null space.
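A NumPy sketch of this, using np.linalg.svd, which returns the factors in the order U (the P above), the vector of singular values, and Vh (the Q above); the rank-deficient test matrix is arbitrary:

import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])      # rank 2

U, s, Vh = np.linalg.svd(A)
tol = 1e-10
rank = np.sum(s > tol)

range_basis = U[:, :rank]            # orthonormal basis for the range of A
null_basis = Vh[rank:].conj().T      # orthonormal basis for the null space of A

print(rank)
print(np.allclose(A @ null_basis, 0))   # True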

How to compute the SVD decomposition.

Finding the Singular Value Decomposition

MATLAB:


>> [U, S, V] = svd(randn(4))

U =

    0.4940    0.8364    0.2332   -0.0461
   -0.5897    0.2749    0.1147   -0.7507
   -0.3209    0.4271   -0.7951    0.2870
   -0.5525    0.2062    0.5479    0.5933


S =

    4.2043         0         0         0
         0    2.3052         0         0
         0         0    0.3534         0
         0         0         0    0.1403


V =

   -0.1562    0.7390   -0.3115    0.5766
   -0.5436    0.3502   -0.2381   -0.7247
    0.2314    0.5265    0.7977   -0.1812
   -0.7915   -0.2324    0.4582    0.3309

R:


> A = matrix(rnorm(16), nrow=4)
> svd(A)
$d
[1] 4.06701690 1.37408238 1.21791164 0.04339907

$u
          [,1]        [,2]         [,3]       [,4]
[1,] 0.0109744 -0.10666316  0.916172128 -0.3861750
[2,] 0.8414392  0.47057833  0.139871033  0.2257703
[3,] 0.3319563 -0.06808106 -0.375515751 -0.8626459
[4,] 0.4262214 -0.87323764 -0.007255794  0.2360905

$v
           [,1]         [,2]       [,3]        [,4]
[1,]  0.1273354  0.652847181 -0.7466289 -0.01102775
[2,]  0.2683684  0.176505427  0.1863908  0.92848408
[3,]  0.1702112 -0.736618433 -0.6182398  0.21494401
[4,] -0.9395702  0.005447525 -0.1599478  0.30264630

NumPy:


>>> np.linalg.svd(np.random.randn(16).reshape(4, 4))
(array([[ 0.42188321, -0.776728  , -0.28222992, -0.37290004],
        [ 0.59844734, -0.08941838,  0.73437841,  0.30749551],
        [ 0.59708118,  0.62280516, -0.22016383, -0.45512162],
        [-0.32767271, -0.02849019,  0.57668234, -0.74783446]]),
 array([ 3.7186746 ,  2.76172207,  1.02523947,  0.23191521]),
 array([[ 0.13491856,  0.13298406,  0.47767336,  0.85786968],
        [-0.05355081,  0.47772678, -0.79237368,  0.37557074],
        [ 0.36729537, -0.80343326, -0.37784435,  0.27716918],
        [ 0.91870746,  0.32952596,  0.03472403, -0.21490342]]))

Mathematica:


> SingularValueDecomposition[{{1, 2, 3}, {4, 5, 6}, {7, 8, 9}}] // N
{{{0.214837, 0.887231, 0.408248}, {0.520587, 0.249644, -0.816497}, {0.826338, -0.387943, 0.408248}},
 {{16.8481, 0., 0.}, {0., 1.06837, 0.}, {0., 0., 0.}},
 {{0.479671, -0.776691, 0.408248}, {0.572368, -0.0756865, -0.816497}, {0.665064, 0.625318, 0.408248}}}


Pseudoinverse

The pseudoinverse A⁺ of a matrix A is always defined, even if the matrix is not square or not invertible. It is the unique matrix with these four properties:

  • AA⁺A = A
  • A⁺AA⁺ = A⁺
  • (AA⁺)* = AA⁺
  • (A⁺A)* = A⁺A

Using the SVD to compute the pseudoinverse: if A = PDQ, then A⁺ = Q*D⁺P*, where D⁺ is obtained by transposing D and replacing each nonzero singular value with its reciprocal.
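A NumPy sketch of this construction, using the same matrix as the MATLAB example below; the tolerance for treating a singular value as zero is an arbitrary choice:

import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 0.0],
              [0.0, 0.0, 0.0]])

U, s, Vh = np.linalg.svd(A)
tol = 1e-10
s_plus = np.array([1.0 / x if x > tol else 0.0 for x in s])   # invert nonzero singular values

A_plus = Vh.conj().T @ np.diag(s_plus) @ U.conj().T
print(np.allclose(A_plus, np.linalg.pinv(A)))   # True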

Using MATLAB to get a pseudoinverse:


>> pinv([1 2 3; 4 5 0; 0 0 0])

ans =

   -0.0397    0.1111         0
    0.0317    0.1111         0
    0.3254   -0.1111         0

Principal Component Analysis

An SVD can be used to approximate a matrix A with a matrix of lower rank. If the desired rank is r, then in the decomposition A = PDQ one computes A' = PD'Q, where D' is obtained by setting all but the first r diagonal values of D to zero.

This yields the closest matrix to A of rank at most r according to the Frobenius norm, which is the square root of the sum of the squares of the matrix entries (the Eckart-Young theorem).
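A NumPy sketch of this truncation (random test matrix and an arbitrary target rank r = 2):

import numpy as np

A = np.random.randn(6, 5)
r = 2

U, s, Vh = np.linalg.svd(A, full_matrices=False)
s_trunc = np.where(np.arange(len(s)) < r, s, 0.0)   # keep only the first r singular values

A_r = U @ np.diag(s_trunc) @ Vh
print(np.linalg.matrix_rank(A_r))                    # r
print(np.linalg.norm(A - A_r, 'fro'))                # Frobenius-norm error of the approximation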

CUR Decomposition

Latent Factors

Page Rank

Hubs and Authorities