Trace (linear algebra)

From Wikipedia, the free encyclopedia

In linear algebra, the trace of an n-by-n square matrix A is defined to be the sum of the elements on the main diagonal (the diagonal from the upper left to the lower right) of A, i.e.,

\mathrm{tr}(A) = a_{11} + a_{22} + \dots + a_{nn}=\sum_i a_{i i} \,

where a_{ij} represents the entry in the ith row and jth column of A. Equivalently, the trace of a matrix is the sum of its eigenvalues (counted with algebraic multiplicity), which makes it invariant under a change of basis.
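As a minimal illustration of the definition, the following Python sketch sums the diagonal entries of a matrix stored as a list of rows. The list-of-rows representation and the function name trace are assumptions of this example, not part of the definition.

```python
def trace(matrix):
    """Return the sum of the main-diagonal entries of a square matrix."""
    n = len(matrix)
    # The trace is only defined when every row has as many entries as there are rows.
    if any(len(row) != n for row in matrix):
        raise ValueError("trace is only defined for square matrices")
    return sum(matrix[i][i] for i in range(n))


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

print(trace(A))  # 1 + 5 + 9 = 15
```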

The use of the term trace arises from the German term Spur (cognate with the English spoor), which, as a function in mathematics, is often abbreviated to "Sp".

Properties

The trace is a linear map. That is,

tr(A + B) = tr(A) + tr(B)
tr(rA) = r tr(A)

for all square matrices A and B, and all scalars r. Note that the trace is only defined for a square matrix (i.e. n×n).
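The two linearity identities can be spot-checked numerically. The sketch below does so for one pair of 2×2 integer matrices; the helper names add and scale are hypothetical, and a single example illustrates the identities without proving them.

```python
def trace(m):
    """Sum of the diagonal entries of a square matrix (list of rows)."""
    return sum(m[i][i] for i in range(len(m)))


def add(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def scale(r, a):
    """Multiply every entry of the matrix a by the scalar r."""
    return [[r * x for x in row] for row in a]


A = [[1, 2], [3, 4]]
B = [[0, -1], [5, 7]]
r = 3

additive = trace(add(A, B)) == trace(A) + trace(B)
homogeneous = trace(scale(r, A)) == r * trace(A)
print(additive, homogeneous)
```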

Since the principal diagonal is invariant under transposition, a matrix and its transpose have the same trace:

tr(A) = tr(A^T).

If A is an m×n matrix and B is an n×m matrix, then both products AB and BA are square, and

tr(AB) = tr(BA).

We prove this by invoking the definition of matrix multiplication:

\mathrm{tr}(AB) = \sum_{i=1}^m (AB)_{ii} = \sum_{i=1}^m \sum_{j=1}^n A_{ij} B_{ji} = \sum_{j=1}^n \sum_{i=1}^m B_{ji} A_{ij} = \sum_{j=1}^n (BA)_{jj} = \mathrm{tr}(BA).
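To make the rectangular case concrete, the following sketch multiplies a 2×3 matrix by a 3×2 matrix in both orders. The products have different sizes (2×2 and 3×3), yet their traces agree. The matrices are arbitrary choices for illustration.

```python
def matmul(a, b):
    """Product of an m-by-n matrix a and an n-by-p matrix b (lists of rows)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


A = [[1, 2, 3],
     [4, 5, 6]]      # 2-by-3
B = [[1, 0],
     [0, 1],
     [2, -1]]        # 3-by-2

AB = matmul(A, B)    # 2-by-2
BA = matmul(B, A)    # 3-by-3

print(trace(AB), trace(BA))
```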

In particular, when both A and B are n by n, the trace vanishes on the derived algebra: tr([A,B]) = 0, and the trace gives a map of Lie algebras \mathfrak{gl}_{n} \to k (where k is the scalar field, with the commutative Lie algebra structure).
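The vanishing of the trace on commutators can be seen on a small example. In this sketch the helper commutator is a hypothetical name for [A, B] = AB − BA; the bracket itself is nonzero, but its trace is zero.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


def commutator(a, b):
    """Return [a, b] = ab - ba for square matrices of the same size."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(row_ab, row_ba)]
            for row_ab, row_ba in zip(ab, ba)]


A = [[1, 2], [3, 4]]
B = [[0, 1], [-1, 5]]

bracket = commutator(A, B)
print(bracket, trace(bracket))
```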

Using the commutativity of trace, we can deduce that the trace of a product of square matrices is equal to the trace of any cyclic permutation of the product, a fact known as the cyclic property of the trace. For example, with three matrices A, B, and C, shaped so that ABC, CAB, and BCA all exist,

tr(ABC) = tr(CAB) = tr(BCA).

However, even if A, B, and C are square matrices of the same dimension, the trace of their product does depend on the order of the factors; i.e., not all permutations of the three letters are allowed. An example is


A = \left(\begin{array}{cc}1&0\\1&0\end{array}\right)\qquad
B = \left(\begin{array}{cc}2&1\\1&0\end{array}\right)\qquad
C = \left(\begin{array}{cc}0&2\\1&1\end{array}\right).

Then \mathrm{tr}(ABC)=\mathrm{tr}\left(\begin{array}{cc}1&5\\1&5\end{array}\right)=6 and \mathrm{tr}(BAC)=\mathrm{tr}\left(\begin{array}{cc}0&6\\0&2\end{array}\right)=2.
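The example can be reproduced with a few lines of Python. The sketch below uses the matrices A, B, and C given above; the helper product is a hypothetical convenience for multiplying several matrices in order.

```python
from functools import reduce


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


def product(*matrices):
    """Multiply the given matrices from left to right."""
    return reduce(matmul, matrices)


A = [[1, 0], [1, 0]]
B = [[2, 1], [1, 0]]
C = [[0, 2], [1, 1]]

# Cyclic permutations of ABC share one trace ...
cyclic = [trace(product(A, B, C)), trace(product(C, A, B)), trace(product(B, C, A))]
# ... while the non-cyclic ordering BAC gives a different value.
swapped = trace(product(B, A, C))
print(cyclic, swapped)
```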

The cyclic property holds for any number of factors: for four or more matrices, any cyclic permutation is still allowed; thus, for example, tr(ABCDE) = tr(EABCD).

However, if products of three symmetric matrices are considered, any permutation is allowed. (Proof: tr(ABC) = tr(A^T B^T C^T) = tr((CBA)^T) = tr(CBA).) For more than three factors this is not true.
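The statement for three symmetric matrices can be checked by enumerating all six orderings. The matrices in this sketch are arbitrary symmetric examples chosen for illustration.

```python
from functools import reduce
from itertools import permutations


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


# Three symmetric 2-by-2 matrices (each equals its own transpose).
A = [[1, 2], [2, 3]]
B = [[0, 1], [1, 4]]
C = [[2, -1], [-1, 5]]

# Trace of the product for every ordering of the three factors.
traces = {trace(reduce(matmul, order)) for order in permutations([A, B, C])}
print(traces)
```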

The trace is similarity-invariant, which means that A and P^{-1}AP (P invertible) have the same trace, though there exist matrices which have the same trace but are not similar. This can be verified using the cyclic property above:

tr(P^{-1}AP) = tr(PP^{-1}A) = tr(A)
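A small exact check of similarity invariance is sketched below. It uses Python's fractions module so that the inverse is computed without rounding; the helper inverse_2x2 is a hypothetical name and only handles the 2×2 case.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


def inverse_2x2(p):
    """Exact inverse of an invertible 2-by-2 integer matrix."""
    (a, b), (c, d) = p
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


A = [[1, 2], [3, 4]]
P = [[2, 1], [5, 3]]

similar = matmul(matmul(inverse_2x2(P), A), P)   # P^{-1} A P
print(trace(similar), trace(A))
```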

Given a linear map f : V → V (where V is a finite-dimensional vector space), we can define the trace of this map as the trace of a matrix representation of f: choose a basis for V, describe f as a matrix relative to this basis, and take the trace of this square matrix. The result does not depend on the basis chosen, since different bases give rise to similar matrices; this allows for a basis-independent definition of the trace of a linear map.

Such a definition can be given using the canonical isomorphism between the space End(V) of linear maps on V and V ⊗ V*, where V* is the dual space of V. Let v be in V and let f be in V*. Then the trace of the decomposable element v ⊗ f is defined to be f(v); the trace of a general element is defined by linearity. Using an explicit basis for V and the corresponding dual basis for V*, one can show that this gives the same definition of the trace as given above.

If A and B are positive semi-definite matrices of the same order then:

 0 \leq \mathrm{tr}(AB)^n \leq \mathrm{tr}(A)^n \mathrm{tr}(B)^n

Eigenvalue relationships

If A is a square n-by-n matrix with real or complex entries and if λ_1, ..., λ_n are the (complex, not necessarily distinct) eigenvalues of A, listed according to their algebraic multiplicities, then

tr(A) = ∑_i λ_i.

This follows from the fact that, over the complex numbers, A is always similar to its Jordan form, an upper triangular matrix having λ_1, ..., λ_n on the main diagonal.
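For 2×2 matrices the relationship can be checked directly, since the eigenvalues are the roots of the characteristic polynomial λ² − tr(A)λ + det(A). The sketch below solves that quadratic with cmath so that complex and repeated eigenvalues are handled; the function name eigenvalues_2x2 is hypothetical and the approach is not meant for larger matrices.

```python
import cmath


def eigenvalues_2x2(m):
    """Both roots of the characteristic polynomial of a 2-by-2 matrix."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


examples = [
    [[2, 1], [1, 2]],    # two distinct real eigenvalues
    [[0, -1], [1, 0]],   # a complex-conjugate pair
    [[3, 1], [0, 3]],    # one eigenvalue of algebraic multiplicity 2
]

for matrix in examples:
    lam1, lam2 = eigenvalues_2x2(matrix)
    print(lam1, lam2, trace(matrix))
```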

Derivatives

The trace is the derivative of the determinant: it is the Lie algebra analog of the (Lie group) map of the determinant.

This is made precise in Jacobi's formula for the derivative of the determinant (see under determinant).

As a particular case, \operatorname{tr}=\operatorname{det}'_I: the trace is the derivative of the determinant at the identity.
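For a 2×2 matrix this can be made explicit: det(I + tA) = 1 + t·tr(A) + t²·det(A), so the difference quotient (det(I + tA) − 1)/t differs from tr(A) by exactly t·det(A). The sketch below verifies this with exact rational arithmetic; the matrix A and the helper name difference_quotient are assumptions of the example.

```python
from fractions import Fraction


def det_2x2(m):
    """Determinant of a 2-by-2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


def difference_quotient(a, t):
    """Return (det(I + t*a) - det(I)) / t for a 2-by-2 matrix a."""
    shifted = [[(1 if i == j else 0) + t * a[i][j] for j in range(2)]
               for i in range(2)]
    return (det_2x2(shifted) - 1) / t


A = [[2, 7], [-1, 3]]

# The quotient approaches tr(A) as the step t shrinks.
for t in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    print(t, difference_quotient(A, t), trace(A))
```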

From this (or from the connection between the trace and the eigenvalues), one can derive a connection between the trace function, the exponential map between a Lie algebra and its Lie group (or concretely, the matrix exponential function), and the determinant:

det(exp(A)) = exp(tr(A)).
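The identity can be checked numerically for a small matrix. The sketch below approximates the matrix exponential by a truncated Taylor series, which is adequate only for small matrices with modest entries and is not a production-quality method; the function name expm and the number of terms are assumptions of this example.

```python
import math


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


def det_2x2(m):
    """Determinant of a 2-by-2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c


def expm(a, terms=40):
    """Truncated Taylor series I + a + a^2/2! + ... for the matrix exponential."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # current term a^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[r + t for r, t in zip(row_r, row_t)]
                  for row_r, row_t in zip(result, term)]
    return result


A = [[1.0, 2.0], [0.5, -0.3]]

lhs = det_2x2(expm(A))      # det(exp(A))
rhs = math.exp(trace(A))    # exp(tr(A))
print(lhs, rhs)
```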

For example, consider the one-parameter family of linear transformations given by rotation through angle θ,


R_{\theta} = \left(\begin{array}{cc}\cos \theta & -\sin \theta\\\sin \theta&\cos \theta\end{array}\right)

These transformations all have determinant 1, so they preserve area. The derivative of this family at θ = 0 is the antisymmetric matrix


A = \left(\begin{array}{cc}0 & -1\\1&0\end{array}\right)

which clearly has trace zero, indicating that this matrix represents an infinitesimal transformation which preserves area.
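The rotation example can be reproduced numerically. The sketch below approximates the derivative of R_θ at θ = 0 by a central difference (the step size h is an arbitrary choice) and confirms that the result is close to the antisymmetric matrix A above, with trace close to zero.

```python
import math


def rotation(theta):
    """Matrix of the rotation of the plane through the angle theta."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


def det_2x2(m):
    """Determinant of a 2-by-2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c


def derivative_at_zero(h=1e-6):
    """Central-difference approximation of d/dtheta R_theta at theta = 0."""
    plus, minus = rotation(h), rotation(-h)
    return [[(p - m) / (2 * h) for p, m in zip(row_p, row_m)]
            for row_p, row_m in zip(plus, minus)]


D = derivative_at_zero()
print(D, trace(D))
print(det_2x2(rotation(0.7)))  # every rotation has determinant close to 1
```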

A related characterization of the trace applies to linear vector fields. Given a matrix A, define a vector field F on R^n by F(x) = Ax. The components of this vector field are all linear functions (given by the rows of A). The divergence div F is a constant function, whose value is equal to tr(A). By the divergence theorem, one can interpret this in terms of flows: if F(x) represents the velocity of a fluid at the location x, and U is a region in R^n, the net flow of the fluid out of U is given by tr(A) · vol(U), where vol(U) is the volume of U.
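The claim that div F is constant and equal to tr(A) can be checked by finite differences. The sketch below estimates the divergence of F(x) = Ax at two sample points; the matrix, the points, and the step size h are arbitrary choices for illustration.

```python
def field(a, x):
    """Evaluate the linear vector field F(x) = a x."""
    return [sum(a[i][j] * x[j] for j in range(len(x))) for i in range(len(a))]


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


def divergence(a, x, h=1e-3):
    """Central-difference estimate of sum_i dF_i/dx_i at the point x."""
    total = 0.0
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        total += (field(a, forward)[i] - field(a, backward)[i]) / (2 * h)
    return total


A = [[2, -1, 0],
     [4, 3, 1],
     [0, 5, -7]]

for point in ([0.0, 0.0, 0.0], [1.0, -2.0, 3.5]):
    print(point, divergence(A, point), trace(A))
```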

The trace is a linear operator, hence it commutes with the differential:

 {\rm d}  {\rm tr} ( {\mathbf X} ) = {\rm tr}({\rm d} {\mathbf X})

Applications

The trace is used to define characters of group representations. Two representations A(x) and B(x) of a group are equivalent (for example, in the case of complex representations of a finite group) if tr A(x) = tr B(x) for every group element x.

The trace also plays a central role in the distribution of quadratic forms.

Lie algebra

A matrix whose trace is zero is said to be traceless or tracefree, and these matrices form the simple Lie algebra sl_n, which is the Lie algebra of the special linear group of matrices with determinant 1. The special linear group consists of the matrices which do not change volume, while the special linear algebra consists of the matrices which infinitesimally do not change volume.

Inner product

For an m-by-n matrix A with complex (or real) entries and * being the conjugate transpose, we have

tr(A*A) ≥ 0

with equality only if A = 0. The assignment

\langle A, B\rangle = \operatorname{tr}(A^*B)

yields an inner product on the space of all complex (or real) m-by-n matrices.

If m = n then the norm induced by the above inner product is called the Frobenius norm of a square matrix. Indeed it is simply the Euclidean norm if the matrix is considered as a vector of length n^2.
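The inner product tr(A*B) can be compared with the entrywise sum of conj(a_{ij}) b_{ij}, which is what makes the induced norm a Euclidean norm on the entries. The sketch below does this for two 2×3 complex matrices; the helper names inner, entrywise, and frobenius_norm are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


def conj_transpose(a):
    """Conjugate transpose A* of a matrix given as a list of rows."""
    return [[a[i][j].conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]


def inner(a, b):
    """Inner product <a, b> = tr(a* b)."""
    return trace(matmul(conj_transpose(a), b))


def entrywise(a, b):
    """Sum of conj(a_ij) * b_ij over all entries."""
    return sum(x.conjugate() * y
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


def frobenius_norm(a):
    """Norm induced by the inner product above."""
    return math.sqrt(inner(a, a).real)


A = [[1 + 2j, 0, 3],
     [-1j, 2, 1 - 1j]]
B = [[2, 1j, 1],
     [1 + 1j, 0, -1]]

print(inner(A, B), entrywise(A, B))
print(inner(A, A), frobenius_norm(A))
```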

Generalization

The concept of trace of a matrix is generalised to the trace class of compact operators on Hilbert spaces, and the analog of the Frobenius norm is called the Hilbert-Schmidt norm.

The partial trace is another generalization of the trace that is operator-valued.

If A is a general associative algebra over a field k, then a trace on A is often defined to be any map tr: A → k which vanishes on commutators: tr([a,b]) = 0 for all a,b in A. Such a trace is not uniquely defined; it can always at least be modified by multiplication by a nonzero scalar.

A supertrace is the generalization of a trace to the setting of superalgebras.

The operation of tensor contraction generalizes the trace to arbitrary tensors.

See also