Jordan normal form

In linear algebra, a Jordan normal form, also known as a Jordan canonical form (JCF),^[1]^[2] is an upper triangular matrix of a particular form called a Jordan matrix representing a linear operator on a finite-dimensional vector space with respect to some basis. Such a matrix has each non-zero off-diagonal entry equal to 1, immediately above the main diagonal (on the superdiagonal), and with identical diagonal entries to the left and below them.

Let V be a vector space over a field K. Then a basis with respect to which the matrix has the required form exists if and only if all eigenvalues of the matrix lie in K, or equivalently if the characteristic polynomial of the operator splits into linear factors over K. This condition is always satisfied if K is algebraically closed (for instance, if it is the field of complex numbers). The diagonal entries of the normal form are the eigenvalues (of the operator), and the number of times each eigenvalue occurs is called the algebraic multiplicity of the eigenvalue.^[3]^[4]^[5]

If the operator is originally given by a square matrix M, then its Jordan normal form is also called the Jordan normal form of M. Any square matrix has a Jordan normal form if the field of coefficients is extended to one containing all the eigenvalues of the matrix. In spite of its name, the normal form for a given M is not entirely unique, as it is a block diagonal matrix formed of Jordan blocks, the order of which is not fixed; it is conventional to group blocks for the same eigenvalue together, but no ordering is imposed among the eigenvalues, nor among the blocks for a given eigenvalue, although the latter could for instance be ordered by weakly decreasing size.^[3]^[4]^[5]

The Jordan–Chevalley decomposition is particularly simple with respect to a basis for which the operator takes its Jordan normal form. The diagonal form for diagonalizable matrices, for instance normal matrices, is a special case of the Jordan normal form.^[6]^[7]^[8]

The Jordan normal form is named after Camille Jordan, who first stated the Jordan decomposition theorem in 1870.^[9]

Overview

Notation

Some textbooks have the ones on the subdiagonal; that is, immediately below the main diagonal instead of on the superdiagonal. The eigenvalues are still on the main diagonal.^[10]^[11]

Motivation

An n × n matrix A is diagonalizable if and only if the sum of the dimensions of the eigenspaces is n or, equivalently, if and only if A has n linearly independent eigenvectors. Not all matrices are diagonalizable; matrices that are not diagonalizable are called defective matrices. Consider the following matrix:

A=\left[{\begin{array}{*{20}{r}}5&4&2&1\\[2pt]0&1&-1&-1\\[2pt]-1&-1&3&0\\[2pt]1&1&-1&2\end{array}}\right].

Including multiplicity, the eigenvalues of A are λ = 1, 2, 4, 4. The dimension of the eigenspace corresponding to the eigenvalue 4 is 1 (and not 2), so A is not diagonalizable. However, there is an invertible matrix P such that J = P⁻¹AP, where

J={\begin{bmatrix}1&0&0&0\\[2pt]0&2&0&0\\[2pt]0&0&4&1\\[2pt]0&0&0&4\end{bmatrix}}.

The matrix $J$ is almost diagonal. This is the Jordan normal form of A. The section Example below fills in the details of the computation.

Complex matrices

In general, a square complex matrix A is similar to a block diagonal matrix

J={\begin{bmatrix}J_{1}&\;&\;\\\;&\ddots &\;\\\;&\;&J_{p}\end{bmatrix}}

where each block J_i is a square matrix of the form

J_{i}={\begin{bmatrix}\lambda _{i}&1&\;&\;\\\;&\lambda _{i}&\ddots &\;\\\;&\;&\ddots &1\\\;&\;&\;&\lambda _{i}\end{bmatrix}}.

So there exists an invertible matrix P such that P⁻¹AP = J, where the only non-zero entries of J are on the diagonal and the superdiagonal. J is called the Jordan normal form of A. Each J_i is called a Jordan block of A. In a given Jordan block, every entry on the superdiagonal is 1.

Assuming this result, we can deduce the following properties:

Counting multiplicities, the eigenvalues of J, and therefore of A, are the diagonal entries.
Given an eigenvalue λ_i, its geometric multiplicity is the dimension of ker(A − λ_iI), where I is the identity matrix, and it is the number of Jordan blocks corresponding to λ_i.^[12]
The sum of the sizes of all Jordan blocks corresponding to an eigenvalue λ_i is its algebraic multiplicity.^[12]
A is diagonalizable if and only if, for every eigenvalue λ of A, its geometric and algebraic multiplicities coincide. In particular, the Jordan blocks in this case are 1 × 1 matrices; that is, scalars.
The Jordan block corresponding to λ is of the form λI + N, where N is a nilpotent matrix defined by N_{ij} = δ_{i,j−1} (where δ is the Kronecker delta). The nilpotency of N can be exploited when calculating f(A) where f is a complex analytic function. For example, in principle the Jordan form could give a closed-form expression for the exponential exp(A).
The number of Jordan blocks corresponding to λ_i of size at least j is dim ker(A − λ_iI)^j − dim ker(A − λ_iI)^{j−1}. Thus, the number of Jordan blocks of size exactly j is
$2\dim \ker(A-\lambda _{i}I)^{j}-\dim \ker(A-\lambda _{i}I)^{j+1}-\dim \ker(A-\lambda _{i}I)^{j-1}$
Given an eigenvalue λ_i, its multiplicity in the minimal polynomial is the size of its largest Jordan block.
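The block-counting property above can be tried on the matrix A from the Motivation section. The following is a minimal sketch, assuming exact rational arithmetic via Python's `fractions` module; the helper names `rank`, `kernel_dims` and `block_counts` are illustrative and not taken from any library.

```python
from fractions import Fraction

A = [[5, 4, 2, 1],
     [0, 1, -1, -1],
     [-1, -1, 3, 0],
     [1, 1, -1, 2]]

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rank(M):
    """Rank by forward Gaussian elimination over exact rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def kernel_dims(A, lam, max_power):
    """Return [dim ker (A - lam*I)^j for j = 0, ..., max_power]."""
    n = len(A)
    B = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # (A - lam*I)^0
    dims = []
    for _ in range(max_power + 1):
        dims.append(n - rank(power))
        power = matmul(power, B)
    return dims

def block_counts(A, lam):
    """Map block size j to the number of Jordan blocks of that size for lam."""
    n = len(A)
    d = kernel_dims(A, lam, n + 1)
    counts = {}
    for j in range(1, n + 1):
        c = 2 * d[j] - d[j + 1] - d[j - 1]
        if c:
            counts[j] = c
    return counts

print(block_counts(A, 4))  # one block of size 2
print(block_counts(A, 1))  # one block of size 1
```

Exact rationals are used here because, as the Numerical analysis section notes, rank decisions in floating point are fragile near repeated eigenvalues.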

Example

Consider the matrix $A$ from the example in the previous section. The Jordan normal form is obtained by some similarity transformation:

P^{-1}AP=J;

that is,

AP=PJ.

Let $P$ have column vectors $p_{i}$ , $i=1,\ldots ,4$ , then

A{\begin{bmatrix}p_{1}&p_{2}&p_{3}&p_{4}\end{bmatrix}}={\begin{bmatrix}p_{1}&p_{2}&p_{3}&p_{4}\end{bmatrix}}{\begin{bmatrix}1&0&0&0\\0&2&0&0\\0&0&4&1\\0&0&0&4\end{bmatrix}}={\begin{bmatrix}p_{1}&2p_{2}&4p_{3}&p_{3}+4p_{4}\end{bmatrix}}.

We see that

(A-1I)p_{1}=0

(A-2I)p_{2}=0

(A-4I)p_{3}=0

(A-4I)p_{4}=p_{3}.

For $i=1,2,3$ we have $p_{i}\in \ker(A-\lambda _{i}I)$ , that is, $p_{i}$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda _{i}$ . For $i=4$ , multiplying both sides by $(A-4I)$ gives

(A-4I)^{2}p_{4}=(A-4I)p_{3}.

But $(A-4I)p_{3}=0$ , so

(A-4I)^{2}p_{4}=0.

Thus, $p_{4}\in \ker(A-4I)^{2}.$

Vectors such as $p_{4}$ are called generalized eigenvectors of A.
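The chain relations can be checked with integer arithmetic. This sketch assumes the concrete choices p₃ = (1, 0, −1, 1)^T and p₄ = (1, 0, 0, 0)^T, which are the vectors x and y computed in the worked example of this article; the helper `matvec` is illustrative.

```python
A = [[5, 4, 2, 1],
     [0, 1, -1, -1],
     [-1, -1, 3, 0],
     [1, 1, -1, 2]]

def matvec(M, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

n = len(A)
# B = A - 4I
B = [[A[i][j] - (4 if i == j else 0) for j in range(n)] for i in range(n)]

p3 = [1, 0, -1, 1]   # ordinary eigenvector for the eigenvalue 4
p4 = [1, 0, 0, 0]    # generalized eigenvector

step1 = matvec(B, p4)      # (A - 4I) p4, expected to equal p3
step2 = matvec(B, step1)   # (A - 4I)^2 p4, expected to vanish

print(step1 == p3, step2 == [0, 0, 0, 0])
```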

Example: Obtaining the normal form

This example shows how to calculate the Jordan normal form of a given matrix.

Consider the matrix

A=\left[{\begin{array}{rrrr}5&4&2&1\\0&1&-1&-1\\-1&-1&3&0\\1&1&-1&2\end{array}}\right]

which is mentioned in the beginning of the article.

The characteristic polynomial of A is

{\begin{aligned}\chi (\lambda )&=\det(\lambda I-A)\\&=\lambda ^{4}-11\lambda ^{3}+42\lambda ^{2}-64\lambda +32\\&=(\lambda -1)(\lambda -2)(\lambda -4)^{2}.\,\end{aligned}}

This shows that the eigenvalues are 1, 2, 4 and 4, according to algebraic multiplicity. The eigenspace corresponding to the eigenvalue 1 can be found by solving the equation Av = λv. It is spanned by the column vector v = (−1, 1, 0, 0)^T. Similarly, the eigenspace corresponding to the eigenvalue 2 is spanned by w = (1, −1, 0, 1)^T. Finally, the eigenspace corresponding to the eigenvalue 4 is also one-dimensional (even though this is a double eigenvalue) and is spanned by x = (1, 0, −1, 1)^T. So, the geometric multiplicity (that is, the dimension of the eigenspace of the given eigenvalue) of each of the three eigenvalues is one. Therefore, the two eigenvalues equal to 4 correspond to a single Jordan block, and the Jordan normal form of the matrix A is the direct sum

J=J_{1}(1)\oplus J_{1}(2)\oplus J_{2}(4)={\begin{bmatrix}1&0&0&0\\0&2&0&0\\0&0&4&1\\0&0&0&4\end{bmatrix}}.

There are three Jordan chains. Two have length one: {v} and {w}, corresponding to the eigenvalues 1 and 2, respectively. There is one chain of length two corresponding to the eigenvalue 4. To find this chain, calculate

\ker(A-4I)^{2}=\operatorname {span} \,\left\{{\begin{bmatrix}1\\0\\0\\0\end{bmatrix}},\left[{\begin{array}{r}1\\0\\-1\\1\end{array}}\right]\right\}

where I is the 4 × 4 identity matrix. Pick a vector in the above span that is not in the kernel of A − 4I; for example, y = (1,0,0,0)^T. Now, (A − 4I)y = x and (A − 4I)x = 0, so {y, x} is a chain of length two corresponding to the eigenvalue 4.

The transition matrix P such that P⁻¹AP = J is formed by putting these vectors next to each other as follows

P=\left[{\begin{array}{c|c|c|c}v&w&x&y\end{array}}\right]=\left[{\begin{array}{rrrr}-1&1&1&1\\1&-1&0&0\\0&0&-1&0\\0&1&1&0\end{array}}\right].

A computation shows that the equation P⁻¹AP = J indeed holds.

P^{-1}AP=J={\begin{bmatrix}1&0&0&0\\0&2&0&0\\0&0&4&1\\0&0&0&4\end{bmatrix}}.
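One way to confirm the result without inverting P is to check the equivalent identity AP = PJ together with det P ≠ 0. The sketch below does this in exact integer arithmetic; the helpers `matmul` and `det` are illustrative, and the cofactor expansion is only meant for small matrices such as this one.

```python
A = [[5, 4, 2, 1],
     [0, 1, -1, -1],
     [-1, -1, 3, 0],
     [1, 1, -1, 2]]

P = [[-1, 1, 1, 1],
     [1, -1, 0, 0],
     [0, 0, -1, 0],
     [0, 1, 1, 0]]

J = [[1, 0, 0, 0],
     [0, 2, 0, 0],
     [0, 0, 4, 1],
     [0, 0, 0, 4]]

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        if entry:
            minor = [row[:j] + row[j + 1:] for row in M[1:]]
            total += (-1) ** j * entry * det(minor)
    return total

AP = matmul(A, P)
PJ = matmul(P, J)
print(AP == PJ, det(P) != 0)
```

Since P is invertible, AP = PJ is the same statement as P⁻¹AP = J.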

If we had interchanged the order in which the chains appear in P, that is, changed the order of v, w and {x, y} as units, the Jordan blocks would be interchanged accordingly. The resulting Jordan forms are nevertheless equivalent: they differ only in the order of the blocks.

Generalized eigenvectors

Given an eigenvalue λ, every corresponding Jordan block gives rise to a Jordan chain of linearly independent vectors p_i, i = 1, ..., b, where b is the size of the Jordan block. The generator, or lead vector, p_b of the chain is a generalized eigenvector such that (A − λI)^b p_b = 0. The vector p₁ = (A − λI)^{b−1} p_b is an ordinary eigenvector corresponding to λ. In general, p_i is a preimage of p_{i−1} under A − λI. So the lead vector generates the chain via multiplication by A − λI.^[13]^[2] Therefore, the statement that every square matrix A can be put in Jordan normal form is equivalent to the claim that the underlying vector space has a basis composed of Jordan chains.

A proof

We give a proof by induction that any complex-valued square matrix A may be put in Jordan normal form. Since the underlying vector space can be shown^[14] to be the direct sum of invariant subspaces associated with the eigenvalues, A can be assumed to have just one eigenvalue λ. The 1 × 1 case is trivial. Let A be an n × n matrix. The range of A − λI, denoted by Ran(A − λI), is an invariant subspace of A. Also, since λ is an eigenvalue of A, the dimension of Ran(A − λI), r, is strictly less than n, so, by the inductive hypothesis, Ran(A − λI) has a basis {p₁, ..., p_r} composed of Jordan chains.

Next consider the kernel, that is, the subspace ker(A − λI). If

\operatorname {Ran} (A-\lambda I)\cap \ker(A-\lambda I)=\{0\},

the desired result follows immediately from the rank–nullity theorem. (This would be the case, for example, if A were Hermitian.)

Otherwise, if

Q=\operatorname {Ran} (A-\lambda I)\cap \ker(A-\lambda I)\neq \{0\},

let the dimension of Q be s ≤ r. Each vector in Q is an eigenvector, so Ran(A − λI) must contain s Jordan chains corresponding to s linearly independent eigenvectors. Therefore the basis {p₁, ..., p_r} must contain s vectors, say {p_{r−s+1}, ..., p_r}, that are lead vectors of these Jordan chains. We can "extend the chains" by taking the preimages of these lead vectors. (This is the key step.) Let q_i be such that

\;(A-\lambda I)q_{i}=p_{i}{\mbox{ for }}i=r-s+1,\ldots ,r.

The set {q_i}, being preimages of the linearly independent set {p_i} under A − λI, is also linearly independent. Clearly no non-trivial linear combination of the q_i can lie in ker(A − λI), for {p_i}_{i=r−s+1, ..., r} is linearly independent. Furthermore, no non-trivial linear combination of the q_i can belong to Ran(A − λI), because it would then be a linear combination of the basis vectors p₁, ..., p_r, and this linear combination would have a contribution of basis vectors not in ker(A − λI), since otherwise it would belong to ker(A − λI). The action of A − λI on both linear combinations would then produce an equality between a non-trivial linear combination of lead vectors and a linear combination of non-lead vectors, which would contradict the linear independence of (p₁, ..., p_r).

Finally, we can pick any linearly independent set {z₁, ..., z_t} whose projection spans

\ker(A-\lambda I)/Q.

Each z_i forms a Jordan chain of length 1. By construction, the union of the three sets {p₁, ..., p_r}, {q_{r−s +1}, ..., q_r}, and {z₁, ..., z_t} is linearly independent, and its members combine to form Jordan chains. Finally, by the rank–nullity theorem, the cardinality of the union is n. In other words, we have found a basis composed of Jordan chains, and this shows A can be put in Jordan normal form.

Uniqueness

It can be shown that the Jordan normal form of a given matrix A is unique up to the order of the Jordan blocks.

Knowing the algebraic and geometric multiplicities of the eigenvalues is not sufficient to determine the Jordan normal form of A. Assuming the algebraic multiplicity m(λ) of an eigenvalue λ is known, the structure of the Jordan form can be ascertained by analyzing the ranks of the powers (A − λI)^k for k = 1, ..., m(λ). To see this, suppose an n × n matrix A has only one eigenvalue λ. So m(λ) = n. The smallest integer k₁ such that

(A-\lambda I)^{k_{1}}=0

is the size of the largest Jordan block in the Jordan form of A. (This number k₁ is also called the index of λ. See discussion in a following section.) The rank of

(A-\lambda I)^{k_{1}-1}

is the number of Jordan blocks of size k₁. Similarly, the rank of

(A-\lambda I)^{k_{1}-2}

is twice the number of Jordan blocks of size k₁ plus the number of Jordan blocks of size k₁ − 1. The general case is similar.
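The rank bookkeeping can be illustrated on a nilpotent matrix N = A − λI that is already in Jordan form. The sketch below assumes, purely for illustration, block sizes 3, 3 and 2. For such a matrix every power N^k has at most one non-zero entry in each row and column, so its rank equals its number of non-zero entries; this shortcut is specific to matrices in Jordan form and would not work for a general matrix.

```python
def nilpotent_jordan(sizes):
    """Direct sum of nilpotent Jordan blocks (eigenvalue 0) of the given sizes."""
    n = sum(sizes)
    N = [[0] * n for _ in range(n)]
    start = 0
    for b in sizes:
        for i in range(start, start + b - 1):
            N[i][i + 1] = 1
        start += b
    return N

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def ranks_of_powers(N):
    """Return [rank N^0, rank N^1, ...] by counting non-zero entries."""
    n = len(N)
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    ranks = []
    for _ in range(n + 2):
        ranks.append(sum(1 for row in power for x in row if x != 0))
        power = matmul(power, N)
    return ranks

def recover_block_sizes(ranks):
    """Map block size j to the number of blocks of exactly that size."""
    counts = {}
    for j in range(1, len(ranks) - 1):
        exactly_j = ranks[j - 1] - 2 * ranks[j] + ranks[j + 1]
        if exactly_j:
            counts[j] = exactly_j
    return counts

N = nilpotent_jordan([3, 3, 2])
ranks = ranks_of_powers(N)
print(ranks[:4])                   # ranks of N^0, N^1, N^2, N^3
print(recover_block_sizes(ranks))  # block sizes recovered from the ranks alone
```

Here k₁ = 3, the rank of N² is 2 (the number of blocks of size 3), and the rank of N is 5 = 2·2 + 1, matching the two statements above.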

This can be used to show the uniqueness of the Jordan form. Let J₁ and J₂ be two Jordan normal forms of A. Then J₁ and J₂ are similar and have the same spectrum, including algebraic multiplicities of the eigenvalues. The procedure outlined in the previous paragraph can be used to determine the structure of these matrices. Since the rank of a matrix is preserved by similarity transformation, there is a bijection between the Jordan blocks of J₁ and J₂. This proves the uniqueness part of the statement.

Real matrices

If A is a real matrix, its Jordan form can still be non-real. Instead of representing it with complex eigenvalues and ones on the superdiagonal, as discussed above, there exists a real invertible matrix P such that P⁻¹AP = J is a real block diagonal matrix with each block being a real Jordan block.^[15] A real Jordan block is either identical to a complex Jordan block (if the corresponding eigenvalue $\lambda _{i}$ is real), or is a block matrix itself, consisting of 2×2 blocks (for non-real eigenvalue $\lambda _{i}=a_{i}+ib_{i}$ with given algebraic multiplicity) of the form

C_{i}=\left[{\begin{array}{rr}a_{i}&-b_{i}\\b_{i}&a_{i}\\\end{array}}\right]

and describe multiplication by $\lambda _{i}$ in the complex plane. The superdiagonal blocks are 2×2 identity matrices and hence in this representation the matrix dimensions are larger than the complex Jordan form. The full real Jordan block is given by

J_{i}={\begin{bmatrix}C_{i}&I&&\\&C_{i}&\ddots &\\&&\ddots &I\\&&&C_{i}\end{bmatrix}}.

This real Jordan form is a consequence of the complex Jordan form. For a real matrix the nonreal eigenvectors and generalized eigenvectors can always be chosen to form complex conjugate pairs. Taking the real and imaginary part (linear combination of the vector and its conjugate), the matrix has this form with respect to the new basis.
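The remark that C_i describes multiplication by λ_i in the complex plane can be checked directly. In this small sketch the sample values λ = 3 + 2i and z = 1 − 4i are arbitrary illustrative choices, and the helper names are not from any library.

```python
def real_block(lam):
    """The 2 x 2 real block [[a, -b], [b, a]] for lam = a + ib."""
    a, b = lam.real, lam.imag
    return [[a, -b], [b, a]]

def apply_block(C, z):
    """Apply C to the vector (Re z, Im z) and read the result as a complex number."""
    x, y = z.real, z.imag
    return complex(C[0][0] * x + C[0][1] * y, C[1][0] * x + C[1][1] * y)

lam = 3 + 2j
z = 1 - 4j
print(apply_block(real_block(lam), z), lam * z)  # the two values agree
```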

Matrices with entries in a field

Jordan reduction can be extended to any square matrix M whose entries lie in a field K. The result states that any M can be written as a sum D + N where D is semisimple, N is nilpotent, and DN = ND. This is called the Jordan–Chevalley decomposition. Whenever K contains the eigenvalues of M, in particular when K is algebraically closed, the normal form can be expressed explicitly as the direct sum of Jordan blocks.

Similar to the case when K is the complex numbers, knowing the dimensions of the kernels of (M − λI)^k for 1 ≤ k ≤ m, where m is the algebraic multiplicity of the eigenvalue λ, allows one to determine the Jordan form of M. We may view the underlying vector space V as a K[x]-module by regarding the action of x on V as application of M and extending by K-linearity. Then the polynomials (x − λ)^k are the elementary divisors of M, and the Jordan normal form is concerned with representing M in terms of blocks associated to the elementary divisors.

The proof of the Jordan normal form is usually carried out as an application to the ring K[x] of the structure theorem for finitely generated modules over a principal ideal domain, of which it is a corollary.

Consequences

One can see that the Jordan normal form is essentially a classification result for square matrices, and as such several important results from linear algebra can be viewed as its consequences.

Spectral mapping theorem

Using the Jordan normal form, direct calculation gives a spectral mapping theorem for the polynomial functional calculus: Let A be an n × n matrix with eigenvalues λ₁, ..., λ_n, then for any polynomial p, p(A) has eigenvalues p(λ₁), ..., p(λ_n).
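As an illustration, take the Jordan form J of the example matrix from this article and the polynomial p(z) = z² − 3z + 2, which is an arbitrary choice made for this sketch. Because J is upper triangular, so is p(J), and its eigenvalues can be read off the diagonal.

```python
J = [[1, 0, 0, 0],
     [0, 2, 0, 0],
     [0, 0, 4, 1],
     [0, 0, 0, 4]]

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def p(z):
    """The sample polynomial p(z) = z^2 - 3z + 2."""
    return z * z - 3 * z + 2

n = len(J)
J2 = matmul(J, J)
# p(J) = J^2 - 3J + 2I, assembled entry by entry
pJ = [[J2[i][j] - 3 * J[i][j] + (2 if i == j else 0) for j in range(n)]
      for i in range(n)]

diagonal = [pJ[i][i] for i in range(n)]
expected = [p(lam) for lam in (1, 2, 4, 4)]
print(diagonal, expected)
```

Since p(A) = P p(J) P⁻¹ for any similarity A = P J P⁻¹, the same eigenvalues belong to p(A).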

Characteristic polynomial

The characteristic polynomial of $A$ is $p_{A}(\lambda )=\det(\lambda I-A)$ . Similar matrices have the same characteristic polynomial. Therefore, ${\textstyle p_{A}(\lambda )=p_{J}(\lambda )=\prod _{i}(\lambda -\lambda _{i})^{m_{i}}}$ , where $\lambda _{i}$ is the ith root of ${\textstyle p_{J}}$ and $m_{i}$ is its multiplicity, because this is clearly the characteristic polynomial of the Jordan form of A.

Cayley–Hamilton theorem

The Cayley–Hamilton theorem asserts that every matrix A satisfies its characteristic equation: if $p_{A}$ is the characteristic polynomial of $A$ , then $p_{A}(A)=0$ . This can be shown via direct calculation in the Jordan form, since if $\lambda _{i}$ is an eigenvalue of multiplicity $m_{i}$ , then every Jordan block $J_{i}$ belonging to it clearly satisfies $(J_{i}-\lambda _{i}I)^{m_{i}}=0$ . As the diagonal blocks do not affect each other, the ith diagonal block of $(A-\lambda _{i}I)^{m_{i}}$ is $(J_{i}-\lambda _{i}I)^{m_{i}}=0$ ; hence ${\textstyle p_{A}(A)=\prod _{i}(A-\lambda _{i}I)^{m_{i}}=0}$ .

The Jordan form can be assumed to exist over a field extending the base field of the matrix, for instance over the splitting field of $p_{A}$ ; this field extension does not change the matrix $p_{A}(A)$ in any way.
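For the example matrix of this article, whose characteristic polynomial is λ⁴ − 11λ³ + 42λ² − 64λ + 32, the theorem can be checked in exact integer arithmetic. The sketch below evaluates the polynomial at A by Horner's rule; the helper `poly_of_matrix` is an illustrative name.

```python
A = [[5, 4, 2, 1],
     [0, 1, -1, -1],
     [-1, -1, 3, 0],
     [1, 1, -1, 2]]

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_of_matrix(coeffs, M):
    """Evaluate a polynomial at M by Horner's rule; coeffs run from the leading term down."""
    n = len(M)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = matmul(result, M)
        for i in range(n):
            result[i][i] += c
    return result

char_poly = [1, -11, 42, -64, 32]   # coefficients of the characteristic polynomial
chi_of_A = poly_of_matrix(char_poly, A)
print(all(x == 0 for row in chi_of_A for x in row))
```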

Minimal polynomial

The minimal polynomial P of a square matrix A is the unique monic polynomial of least degree, m, such that P(A) = 0. Alternatively, the set of polynomials that annihilate a given A forms an ideal I in C[x], the principal ideal domain of polynomials with complex coefficients. The monic element that generates I is precisely P.

Let λ₁, ..., λ_q be the distinct eigenvalues of A, and s_i be the size of the largest Jordan block corresponding to λ_i. It is clear from the Jordan normal form that the minimal polynomial of A has degree ${\textstyle \sum _{i}s_{i}}$ .

While the Jordan normal form determines the minimal polynomial, the converse is not true. This leads to the notion of elementary divisors. The elementary divisors of a square matrix A are the characteristic polynomials of its Jordan blocks. The factors of the minimal polynomial P are the elementary divisors of the largest degree corresponding to distinct eigenvalues.

The degree of an elementary divisor is the size of the corresponding Jordan block, therefore the dimension of the corresponding invariant subspace. If all elementary divisors are linear, A is diagonalizable.
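For the example matrix of this article the largest blocks have sizes 1, 1 and 2 for the eigenvalues 1, 2 and 4, so the minimal polynomial should be (x − 1)(x − 2)(x − 4)², of degree 4. The following sketch checks that dropping one factor (x − 4) no longer annihilates A; the helpers `shifted` and `is_zero` are illustrative.

```python
A = [[5, 4, 2, 1],
     [0, 1, -1, -1],
     [-1, -1, 3, 0],
     [1, 1, -1, 2]]

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shifted(M, lam):
    """Return M - lam*I."""
    n = len(M)
    return [[M[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]

def is_zero(M):
    return all(x == 0 for row in M for x in row)

# (A - I)(A - 2I)(A - 4I): every eigenvalue appears with exponent 1
candidate = matmul(matmul(shifted(A, 1), shifted(A, 2)), shifted(A, 4))
# (A - I)(A - 2I)(A - 4I)^2: exponent 2 for the eigenvalue with a 2 x 2 block
minimal = matmul(candidate, shifted(A, 4))

print(is_zero(candidate), is_zero(minimal))
```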

Invariant subspace decompositions

The Jordan form of an n × n matrix A is block diagonal, and therefore gives a decomposition of the n-dimensional Euclidean space into invariant subspaces of A. Every Jordan block J_i corresponds to an invariant subspace X_i. Symbolically, we put

\mathbb {C} ^{n}=\bigoplus _{i=1}^{k}X_{i}

where each X_i is the span of the corresponding Jordan chain, and k is the number of Jordan chains.

One can also obtain a slightly different decomposition via the Jordan form. Given an eigenvalue λ_i, the size of its largest corresponding Jordan block s_i is called the index of λ_i and denoted by ν(λ_i). (Therefore, the degree of the minimal polynomial is the sum of all indices.) Define a subspace Y_i by

Y_{i}=\ker(\lambda _{i}I-A)^{\nu (\lambda _{i})}.

This gives the decomposition

\mathbb {C} ^{n}=\bigoplus _{i=1}^{l}Y_{i}

where l is the number of distinct eigenvalues of A. Intuitively, we glob together the Jordan block invariant subspaces corresponding to the same eigenvalue. In the extreme case where A is a multiple of the identity matrix we have k = n and l = 1.

The projection onto Y_i and along all the other Y_j ( j ≠ i ) is called the spectral projection of A at λ_i and is usually denoted by P(λ_i ; A). Spectral projections are mutually orthogonal in the sense that P(λ_i ; A) P(λ_j ; A) = 0 if i ≠ j. Also they commute with A and their sum is the identity matrix. Replacing every λ_i in the Jordan matrix J by one and zeroing all other entries gives P(λ_i ; J). Moreover, if A = U J U⁻¹ for an invertible matrix U, then P(λ_i ; A) = U P(λ_i ; J) U⁻¹. Spectral projections are not confined to finite dimensions. See below for their application to compact operators, and in holomorphic functional calculus for a more general discussion.

Comparing the two decompositions, notice that, in general, l ≤ k. When A is normal, the subspaces X_i in the first decomposition are one-dimensional and mutually orthogonal. This is the spectral theorem for normal operators. The second decomposition generalizes more easily for general compact operators on Banach spaces.

It might be of interest here to note some properties of the index, ν(λ). More generally, for a complex number λ, its index can be defined as the least non-negative integer ν(λ) such that

\ker(A-\lambda I)^{\nu (\lambda )}=\ker(A-\lambda I)^{m},\;\forall m\geq \nu (\lambda ).

So ν(λ) > 0 if and only if λ is an eigenvalue of A. In the finite-dimensional case, ν(λ) ≤ the algebraic multiplicity of λ.

Plane (flat) normal form

The Jordan form is used to find a normal form of matrices up to conjugacy such that normal matrices make up an algebraic variety of a low fixed degree in the ambient matrix space.

Sets of representatives of matrix conjugacy classes for Jordan normal form or rational canonical forms in general do not constitute linear or affine subspaces in the ambient matrix spaces.

Vladimir Arnold posed^[16] a problem: Find a canonical form of matrices over a field for which the set of representatives of matrix conjugacy classes is a union of affine linear subspaces (flats). In other words, map the set of matrix conjugacy classes injectively back into the initial set of matrices so that the image of this embedding—the set of all normal matrices, has the lowest possible degree—it is a union of shifted linear subspaces.

It was solved for algebraically closed fields by Peteris Daugulis.^[17] The construction of a uniquely defined plane normal form of a matrix starts by considering its Jordan normal form.

Matrix functions

Iteration of the Jordan chain motivates various extensions to more abstract settings. For finite matrices, one gets matrix functions; this can be extended to compact operators and the holomorphic functional calculus, as described further below.

The Jordan normal form is the most convenient for computation of matrix functions (though it may not be the best choice for computer computations). Let f(z) be an analytic function of a complex argument. Applying the function to an n×n Jordan block J with eigenvalue λ results in an upper triangular matrix:

f(J)={\begin{bmatrix}f(\lambda )&f'(\lambda )&{\tfrac {f''(\lambda )}{2}}&\cdots &{\tfrac {f^{(n-1)}(\lambda )}{(n-1)!}}\\0&f(\lambda )&f'(\lambda )&\cdots &{\tfrac {f^{(n-2)}(\lambda )}{(n-2)!}}\\\vdots &\vdots &\ddots &\ddots &\vdots \\0&0&0&f(\lambda )&f'(\lambda )\\0&0&0&0&f(\lambda )\end{bmatrix}},

so that the elements of the k-th superdiagonal of the resulting matrix are ${\tfrac {f^{(k)}(\lambda )}{k!}}$ . For a matrix of general Jordan normal form the above expression shall be applied to each Jordan block.
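For f = exp every derivative equals exp itself, so the k-th superdiagonal of exp(J) should be e^λ / k!. The sketch below compares this with a truncated power series for a 3 × 3 Jordan block; the eigenvalue λ = 0.5 and the number of series terms are arbitrary illustrative choices, and the comparison is only up to floating-point rounding.

```python
import math

def jordan_block(lam, size):
    """Jordan block with eigenvalue lam and ones on the superdiagonal."""
    return [[lam if i == j else (1.0 if j == i + 1 else 0.0) for j in range(size)]
            for i in range(size)]

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_by_series(M, terms=40):
    """Truncated series sum of M^k / k! for k < terms."""
    n = len(M)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, M)]
        total = [[t + x for t, x in zip(trow, xrow)]
                 for trow, xrow in zip(total, term)]
    return total

def exp_by_formula(lam, size):
    """Entries f^(k)(lam) / k! on the k-th superdiagonal, with f = exp."""
    return [[math.exp(lam) / math.factorial(j - i) if j >= i else 0.0
             for j in range(size)] for i in range(size)]

lam, size = 0.5, 3
series = exp_by_series(jordan_block(lam, size))
formula = exp_by_formula(lam, size)
max_error = max(abs(series[i][j] - formula[i][j])
                for i in range(size) for j in range(size))
print(max_error)  # small, limited by floating-point rounding
```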

The following example shows the application to the power function f(z) = zⁿ:

{\begin{bmatrix}\lambda _{1}&1&0&0&0\\0&\lambda _{1}&1&0&0\\0&0&\lambda _{1}&0&0\\0&0&0&\lambda _{2}&1\\0&0&0&0&\lambda _{2}\end{bmatrix}}^{n}={\begin{bmatrix}\lambda _{1}^{n}&{\tbinom {n}{1}}\lambda _{1}^{n-1}&{\tbinom {n}{2}}\lambda _{1}^{n-2}&0&0\\0&\lambda _{1}^{n}&{\tbinom {n}{1}}\lambda _{1}^{n-1}&0&0\\0&0&\lambda _{1}^{n}&0&0\\0&0&0&\lambda _{2}^{n}&{\tbinom {n}{1}}\lambda _{2}^{n-1}\\0&0&0&0&\lambda _{2}^{n}\end{bmatrix}},

where the binomial coefficients are defined as ${\textstyle {\binom {n}{k}}=\prod _{i=1}^{k}{\frac {n+1-i}{i}}}$ . For integer positive n it reduces to standard definition of the coefficients. For negative n the identity ${\textstyle {\binom {-n}{k}}=(-1)^{k}{\binom {n+k-1}{k}}}$ may be of use.
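For a positive integer exponent the formula can be compared with repeated multiplication. This sketch uses a single 3 × 3 Jordan block with λ = 2 and n = 5, both arbitrary illustrative choices, and relies on `math.comb` (Python 3.8 or later).

```python
from math import comb

def jordan_block(lam, size):
    """Jordan block with eigenvalue lam and ones on the superdiagonal."""
    return [[lam if i == j else (1 if j == i + 1 else 0) for j in range(size)]
            for i in range(size)]

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(M, n):
    """M^n for a non-negative integer n, by repeated multiplication."""
    size = len(M)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, M)
    return result

def power_formula(lam, size, n):
    """Entry (i, j) is C(n, j - i) * lam^(n - (j - i)) on and above the diagonal."""
    return [[comb(n, j - i) * lam ** (n - (j - i)) if 0 <= j - i <= n else 0
             for j in range(size)] for i in range(size)]

direct = matpow(jordan_block(2, 3), 5)
closed = power_formula(2, 3, 5)
print(direct == closed)
print(closed)  # [[32, 80, 80], [0, 32, 80], [0, 0, 32]]
```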

Compact operators

A result analogous to the Jordan normal form holds for compact operators on a Banach space. One restricts to compact operators because every point x in the spectrum of a compact operator T is an eigenvalue; the only exception is when x is the limit point of the spectrum. This is not true for bounded operators in general. To give some idea of this generalization, we first reformulate the Jordan decomposition in the language of functional analysis.

Holomorphic functional calculus

Let X be a Banach space, L(X) be the bounded operators on X, and σ(T) denote the spectrum of T ∈ L(X). The holomorphic functional calculus is defined as follows:

Fix a bounded operator T. Consider the family Hol(T) of complex functions that are holomorphic on some open set G containing σ(T). Let Γ = {γ_i} be a finite collection of Jordan curves such that σ(T) lies in the inside of Γ. We define f(T) by

f(T)={\frac {1}{2\pi i}}\int _{\Gamma }f(z)(z-T)^{-1}\,dz.

The open set G could vary with f and need not be connected. The integral is defined as the limit of the Riemann sums, as in the scalar case. Although the integral makes sense for continuous f, we restrict to holomorphic functions to apply the machinery from classical function theory (for example, the Cauchy integral formula). The assumption that σ(T) lie in the inside of Γ ensures f(T) is well defined; it does not depend on the choice of Γ. The functional calculus is the mapping Φ from Hol(T) to L(X) given by

\;\Phi (f)=f(T).

We will require the following properties of this functional calculus:

Φ extends the polynomial functional calculus.
The spectral mapping theorem holds: σ(f(T)) = f(σ(T)).
Φ is an algebra homomorphism.

The finite-dimensional case

In the finite-dimensional case, σ(T) = {λ_i} is a finite discrete set in the complex plane. Let e_i be the function that is 1 in some open neighborhood of λ_i and 0 elsewhere. By property 3 of the functional calculus, the operator

e_{i}(T)

is a projection. Moreover, let ν_i be the index of λ_i and

f(z)=(z-\lambda _{i})^{\nu _{i}}.

The spectral mapping theorem tells us

f(T)e_{i}(T)=(T-\lambda _{i})^{\nu _{i}}e_{i}(T)

has spectrum {0}. By property 1, f(T) can be directly computed in the Jordan form, and by inspection, we see that the operator f(T)e_i(T) is the zero matrix.

By property 3, f(T) e_i(T) = e_i(T) f(T). So e_i(T) is precisely the projection onto the subspace

\operatorname {Ran} e_{i}(T)=\ker(T-\lambda _{i})^{\nu _{i}}.

The relation

\sum _{i}e_{i}=1

implies

\mathbb {C} ^{n}=\bigoplus _{i}\;\operatorname {Ran} e_{i}(T)=\bigoplus _{i}\ker(T-\lambda _{i})^{\nu _{i}}

where the index i runs through the distinct eigenvalues of T. This is the invariant subspace decomposition

\mathbb {C} ^{n}=\bigoplus _{i}Y_{i}

given in a previous section. Each e_i(T) is the projection onto the subspace spanned by the Jordan chains corresponding to λ_i and along the subspaces spanned by the Jordan chains corresponding to λ_j for j ≠ i. In other words, e_i(T) = P(λ_i;T). This explicit identification of the operators e_i(T) in turn gives an explicit form of holomorphic functional calculus for matrices:

For all f ∈ Hol(T),

f(T)=\sum _{\lambda _{i}\in \sigma (T)}\sum _{k=0}^{\nu _{i}-1}{\frac {f^{(k)}(\lambda _{i})}{k!}}(T-\lambda _{i})^{k}e_{i}(T).

Notice that the expression of f(T) is a finite sum because, on each neighborhood of λ_i, we have chosen the Taylor series expansion of f centered at λ_i.

Poles of an operator

Let T be a bounded operator and let λ be an isolated point of σ(T). (As stated above, when T is compact, every point in its spectrum is an isolated point, except possibly the limit point 0.)

The point λ is called a pole of operator T with order ν if the resolvent function R_T defined by

R_{T}(\lambda )=(\lambda -T)^{-1}

has a pole of order ν at λ.

We will show that, in the finite-dimensional case, the order of an eigenvalue coincides with its index. The result also holds for compact operators.

Consider the annular region A centered at the eigenvalue λ with sufficiently small radius ε such that the intersection of the open disc B_ε(λ) and σ(T) is {λ}. The resolvent function R_T is holomorphic on A. Extending a result from classical function theory, R_T has a Laurent series representation on A:

R_{T}(z)=\sum _{-\infty }^{\infty }a_{m}(\lambda -z)^{m}

where

a_{-m}=-{\frac {1}{2\pi i}}\int _{C}(\lambda -z)^{m-1}(z-T)^{-1}dz

and C is a small circle centered at λ.

By the previous discussion on the functional calculus,

a_{-m}=-(\lambda -T)^{m-1}e_{\lambda }(T)

where $e_{\lambda }$ is 1 on $B_{\varepsilon }(\lambda )$ and 0 elsewhere.

But we have shown that the smallest positive integer m such that

a_{-m}\neq 0

and

a_{-l}=0\;\;\forall \;l>m

is precisely the index of λ, ν(λ). In other words, the function R_T has a pole of order ν(λ) at λ.

Numerical analysis

If the matrix A has multiple eigenvalues, or is close to a matrix with multiple eigenvalues, then its Jordan normal form is very sensitive to perturbations. Consider for instance the matrix

A={\begin{bmatrix}1&1\\\varepsilon &1\end{bmatrix}}.

If ε = 0, then the Jordan normal form is simply

{\begin{bmatrix}1&1\\0&1\end{bmatrix}}.

However, for ε ≠ 0, the Jordan normal form is

{\begin{bmatrix}1+{\sqrt {\varepsilon }}&0\\0&1-{\sqrt {\varepsilon }}\end{bmatrix}}.

This ill-conditioning makes it very hard to develop a robust numerical algorithm for the Jordan normal form, as the result depends critically on whether two eigenvalues are deemed to be equal. For this reason, the Jordan normal form is usually avoided in numerical analysis; the stable Schur decomposition^[18] or pseudospectra^[19] are better alternatives.
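The sensitivity in the 2 × 2 example can be made concrete. For ε ≥ 0 the eigenvalues solve (1 − x)² = ε, so they are 1 ± √ε, and a perturbation of size ε separates them by 2√ε. The sketch below prints that gap for a few sample values of ε chosen only for illustration.

```python
import math

def eigenvalues(eps):
    """Eigenvalues of [[1, 1], [eps, 1]] for eps >= 0, from (1 - x)^2 = eps."""
    root = math.sqrt(eps)
    return 1 + root, 1 - root

def gap(eps):
    """Distance between the two eigenvalues."""
    larger, smaller = eigenvalues(eps)
    return larger - smaller

for eps in (0.0, 1e-16, 1e-10, 1e-4):
    print(f"eps = {eps:g}: eigenvalue gap = {gap(eps):g}")
```

A perturbation of 10⁻¹⁰ already moves the eigenvalues apart by about 2 × 10⁻⁵, which is why deciding whether two computed eigenvalues are "equal" is so delicate.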

Notes

[1] Shilov defines the term Jordan canonical form and in a footnote says that Jordan normal form is synonymous. These terms are sometimes shortened to Jordan form. (Shilov) The term Classical canonical form is also sometimes used in the sense of this article. (James & James, 1976)
[2] Holt & Rumynin (2009, p. 9)
[3] Beauregard & Fraleigh (1973, pp. 310–316)
[4] Golub & Van Loan (1996, p. 355)
[5] Nering (1970, pp. 118–127)
[6] Beauregard & Fraleigh (1973, pp. 270–274)
[7] Golub & Van Loan (1996, p. 353)
[8] Nering (1970, pp. 113–118)
[9] Brechenmacher, "Histoire du théorème de Jordan de la décomposition matricielle (1870-1930). Formes de représentation et méthodes de décomposition", Thesis, 2007
[10] Cullen (1966, p. 114)
[11] Franklin (1968, p. 122)
[12] Horn & Johnson (1985, §3.2.1)
[13] Bronson (1970, pp. 189, 194)
[14] Roe Goodman and Nolan R. Wallach, Representations and Invariants of Classical Groups, Cambridge UP 1998, Appendix B.1.
[15] Horn & Johnson (1985, Theorem 3.4.5)
[16] Arnold, Vladimir I, ed. (2004). Arnold's problems. Springer-Verlag Berlin Heidelberg. p. 127. doi:10.1007/b138219. ISBN 978-3-540-20748-1.
[17] Peteris Daugulis (2012). "A parametrization of matrix conjugacy orbit sets as unions of affine planes". Linear Algebra and Its Applications. 436 (3): 709–721. arXiv:1110.0907. doi:10.1016/j.laa.2011.07.032. S2CID 119649768.
[18] See Golub & Van Loan (2014), §7.6.5; or Golub & Wilkinson (1976) for details.
[19] See Golub & Van Loan (2014), §7.9

References

Beauregard, Raymond A.; Fraleigh, John B. (1973), A First Course In Linear Algebra: with Optional Introduction to Groups, Rings, and Fields, Boston: Houghton Mifflin Co., ISBN 0-395-14017-X
Bronson, Richard (1970), Matrix Methods: An Introduction, New York: Academic Press, LCCN 70097490
Cullen, Charles G. (1966), Matrices and Linear Transformations, Reading: Addison-Wesley, LCCN 66021267
Dunford, N.; Schwartz, J. T. (1958), Linear Operators, Part I: General Theory, Interscience
Finkbeiner II, Daniel T. (1978), Introduction to Matrices and Linear Transformations (3rd ed.), W. H. Freeman and Company
Franklin, Joel N. (1968), Matrix Theory, Englewood Cliffs: Prentice-Hall, LCCN 68016345
Golub, Gene H.; Van Loan, Charles F. (1996), Matrix Computations (3rd ed.), Baltimore: Johns Hopkins University Press, ISBN 0-8018-5414-8
Golub, Gene H.; Wilkinson, J. H. (1976). "Ill-conditioned eigensystems and the computation of the Jordan normal form". SIAM Review. 18 (4): 578–619. doi:10.1137/1018113.
Holt, Derek; Rumynin, Dmitriy (2009), Algebra I – Advanced Linear Algebra (MA251) Lecture Notes (PDF)
Horn, Roger A.; Johnson, Charles R. (1985), Matrix Analysis, Cambridge University Press, ISBN 978-0-521-38632-6
James, Glenn; James, Robert C. (1976), Mathematics Dictionary (2nd ed.), Van Nostrand Reinhold
MacLane, Saunders; Birkhoff, Garrett (1967), Algebra, Macmillan Publishers
Michel, Anthony N.; Herget, Charles J. (1993), Applied Algebra and Functional Analysis, Dover Publications
Nering, Evar D. (1970), Linear Algebra and Matrix Theory (2nd ed.), New York: Wiley, LCCN 76091646
Shafarevich, I. R.; Remizov, A. O. (2012), Linear Algebra and Geometry, Springer, ISBN 978-3-642-30993-9
Shilov, Georgi E. (1977), Linear Algebra, Dover Publications
Jordan Canonical Form article at mathworld.wolfram.com
