Unitary Matrices and Hermitian Matrices

Recall that the conjugate of a complex number $a + bi$ is $a - bi$ . The conjugate of $a + bi$ is denoted $\overline{a + bi}$ or $(a + bi)^*$ .

In this section, I'll use $\overline{(\hphantom{x}\vphantom{x})}$ for complex conjugation of numbers or matrices. I want to use $(\hphantom{x})^*$ to denote an operation on matrices, the conjugate transpose.

Thus,

$$\conjugate{3 + 4i} = 3 - 4i, \quad \conjugate{5 - 6i} = 5 + 6i, \quad \conjugate{7i} = -7i, \quad \conjugate{10} = 10.$$

Complex conjugation satisfies the following properties:

(a) If $z \in \complex$ , then $z = \conjugate{z}$ if and only if z is a real number.

(b) If $z_1, z_2 \in \complex$ , then

$$\conjugate{z_1 + z_2} = \conjugate{z_1} + \conjugate{z_2}.$$

(c) If $z_1, z_2 \in \complex$ , then

$$\conjugate{z_1\cdot z_2} = \conjugate{z_1} \cdot \conjugate{z_2}.$$

The proofs are easy; just write out the complex numbers (e.g. $z_1 = a + bi$ and $z_2 = c + di$ ) and compute.
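
If you'd like a numerical sanity check as well, here's a short snippet using Python's built-in complex numbers (the particular values are my own arbitrary choices):

    # Check properties (a)-(c) of complex conjugation on sample values.
    z1 = 3 + 4j
    z2 = 5 - 6j

    print(z1.conjugate())                                            # (3-4j)
    print((z1 + z2).conjugate() == z1.conjugate() + z2.conjugate())  # True
    print((z1 * z2).conjugate() == z1.conjugate() * z2.conjugate())  # True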

The conjugate of a matrix A is the matrix $\conjugate{A}$ obtained by conjugating each element: That is,

$$(\conjugate{A})_{ij} = \conjugate{A_{ij}}.$$

You can check that if A and B are matrices and $k \in \complex$ , then

$$\conjugate{k A + B} = \conjugate{k} \cdot \conjugate{A} + \conjugate{B} \quad\hbox{and}\quad \conjugate{A B} = \conjugate{A} \cdot \conjugate{B}.$$

You can prove these results by looking at individual elements of the matrices and using the properties of conjugation of numbers given above.
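
Here's what such a check looks like numerically. This is a small Python sketch using NumPy (not part of these notes), with an arbitrary scalar k and arbitrary matrices A and B:

    import numpy as np

    k = 2 - 1j
    A = np.array([[1 + 2j, 3 + 0j], [0j, 4 - 1j]])
    B = np.array([[2j, 1 - 1j], [5 + 0j, 3 + 3j]])

    # conj(kA + B) = conj(k) conj(A) + conj(B)
    print(np.allclose(np.conj(k * A + B), np.conj(k) * np.conj(A) + np.conj(B)))  # True
    # conj(AB) = conj(A) conj(B)
    print(np.allclose(np.conj(A @ B), np.conj(A) @ np.conj(B)))                   # True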

Definition. If A is a complex matrix, $A^*$ is the conjugate transpose of A:

$$A^* = \conjugate{A^T}.$$

Note that the conjugation and transposition can be done in either order: That is, $\conjugate{A^T} = (\conjugate{A})^T$ . To see this, consider the $(i, j)^{\rm th}$ element of the matrices:

$$[\conjugate{(A^T)}]_{ij} = \conjugate{(A^T)_{ij}} = \conjugate{A_{ji}} = (\conjugate{A})_{ji} = [(\conjugate{A})^T]_{ij}.$$


Example. If

$$A = \left[\matrix{ 1 + 2i & 2 - i & 3i \cr 4 & -2 + 7i & 6 + 6i \cr}\right], \quad\hbox{then}\quad A^* = \left[\matrix{ 1 - 2i & 4 \cr 2 + i & -2 - 7i \cr -3i & 6 - 6i \cr}\right].$$

Since the complex conjugate of a real number is the real number, if B is a real matrix, then $B^* = B^T$ .
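
To check a computation like the one above in Python/NumPy, you can conjugate and transpose in either order, just as in the proof:

    import numpy as np

    A = np.array([[1 + 2j, 2 - 1j, 3j],
                  [4 + 0j, -2 + 7j, 6 + 6j]])

    print(A.conj().T)                            # the matrix A* displayed above
    print(np.allclose(A.conj().T, A.T.conj()))   # True: order doesn't matter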


Remark. Most people call $A^*$ the adjoint of A --- though, unfortunately, the word "adjoint" has already been used for the transpose of the matrix of cofactors in the determinant formula for $A^{-1}$ . (Sometimes people try to get around this by using the term "classical adjoint" to refer to the transpose of the matrix of cofactors.) In modern mathematics, the word "adjoint" refers to a property of $A^*$ that I'll prove below. This property generalizes to other things which you might see in more advanced courses.

The $(\hphantom{x})^*$ operation is sometimes called the Hermitian conjugate --- but this has always sounded ugly to me, so I won't use this terminology.

Since this is an introduction to linear algebra, I'll usually refer to $A^*$ as the conjugate transpose, which at least has the virtue of saying what the thing is.

Proposition. Let U and V be complex matrices, and let $k \in \complex$ .

(a) $(U^*)^* = U$ .

(b) $(k U + V)^* = \conjugate{k} U^* + V^*$ .

(c) $(U V)^* = V^* U^*$ .

(d) If $u, v \in \complex^n$ , their dot product is given by

$$u \cdot v = v^* u.$$

Proof. I'll prove (a), (c), and (d).

For (a), I use the fact noted above that $\conjugate{(\hphantom{x})}$ and $(\hphantom{x})^T$ can be done in either order, along with the facts that

$$\conjugate{\conjugate{A}} = A \quad\hbox{and}\quad (A^T)^T = A.$$

I have

$$(U^*)^* = \conjugate{[\conjugate{(U^T)}]^T} = \conjugate{\conjugate{[(U^T)^T]}} = \conjugate{\conjugate{U}} = U.$$

This proves (a).

For (c), I have

$$(U V)^* = \conjugate{(U V)^T} = \conjugate{V^T U^T} = \conjugate{V^T} \cdot \conjugate{U^T} = V^* \cdot U^*.$$

For (d), recall that the dot product of complex vectors $u = (u_1, u_2, \ldots, u_n)$ and $v = (v_1, v_2, \ldots, v_n)$ is

$$u \cdot v = u_1 \conjugate{v_1} + u_2 \conjugate{v_2} + \cdots + u_n \conjugate{v_n}.$$

Notice that you take the complex conjugates of the components of v before multiplying!

This can be expressed as the matrix multiplication

$$u \cdot v = \left[\matrix{ \conjugate{v_1} & \conjugate{v_2} & \cdots & \conjugate{v_n} \cr}\right] \left[\matrix{u_1 \cr u_2 \cr \vdots \cr u_n \cr}\right] = v^* u.\quad\halmos$$
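
One caution if you verify complex dot products with NumPy: its np.vdot(a, b) conjugates the first argument, while the convention here conjugates the second. So the $u \cdot v$ of these notes corresponds to np.vdot(v, u). A small illustration with arbitrary vectors:

    import numpy as np

    u = np.array([2 + 1j, 1 - 1j])
    v = np.array([3j, 4 + 0j])

    print(np.vdot(v, u))      # (7-10j): conjugates v, matching u . v here
    print(np.conj(v) @ u)     # (7-10j): the same number, computed as v* u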


Example. In this example, use the complex dot product.

(a) Compute $(1 + 3 i, 2 + i) \cdot (4 - 5 i, 2 + 3 i)$ .

(b) Find $\|(2 + i, 3 - 5 i)\|$ .

(c) Find a nonzero vector $(a, b)$ which is orthogonal to $(1 + 8 i, 2 - 3 i)$ .

(a)

$$(1 + 3 i, 2 + i) \cdot (4 - 5 i, 2 + 3 i) = \left[\matrix{4 + 5 i & 2 - 3 i \cr}\right] \left[\matrix{1 + 3 i \cr 2 + i \cr}\right] = (4 + 5 i)(1 + 3 i) + (2 - 3 i)(2 + i) = -4 + 13 i.$$

It's a common notational abuse to write the number "$-4 + 13 i$ " instead of writing it as a $1 \times 1$ matrix "$[-4 + 13 i]$ ".

(b)

$$\|(2 + i, 3 - 5 i)\|^2 = (2 + i, 3 - 5 i) \cdot (2 + i, 3 - 5 i) = (2 - i)(2 + i) + (3 + 5 i)(3 - 5 i) = 4 + 1 + 9 + 25 = 39.$$

Hence, $\|(2 + i, 3 - 5 i)\| = \sqrt{39}$ .

The following formula is evident from this example:

$$\|(a + b i, c + d i)\| = \sqrt{a^2 + b^2 + c^2 + d^2}.$$

This extends in the obvious way to vectors in $\complex^n$ .
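
NumPy's norm function uses exactly this formula for complex vectors, so computations like the one in part (b) are easy to check:

    import numpy as np

    w = np.array([2 + 1j, 3 - 5j])
    print(np.linalg.norm(w))             # 6.2449...  (that is, sqrt(39))
    print(np.sqrt(np.vdot(w, w).real))   # the same value, computed as w . w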

(c) I need

$$(a, b) \cdot (1 + 8 i, 2 - 3 i) = 0.$$

In matrix form, this is

$$\left[\matrix{1 - 8 i & 2 + 3 i \cr}\right] \left[\matrix{a \cr b \cr}\right] = 0.$$

Note that the vector $(1 + 8 i, 2 - 3 i)$ was conjugated and transposed.

Doing the matrix multiplication,

$$(1 - 8 i) a + (2 + 3 i) b = 0.$$

I can get a solution $(a, b)$ by switching the numbers $1 - 8 i$ and $2 + 3 i$ and negating one of them: $(a, b) = (2 + 3 i, -1 + 8 i)$ .


There are two points about the equation $u \cdot v = v^* u$ which might be confusing. First, why is it necessary to conjugate and transpose v? The reason for the conjugation goes back to the need for inner products to be positive definite (so $u \cdot u$ is a nonnegative real number).

The reason for the transpose is that I'm using the convention that vectors are column vectors. So if u and v are n-dimensional column vectors and I want the product to be a number --- i.e. a $1 \times 1$ matrix --- I have to multiply an n-dimensional row vector ($1 \times n$ ) and an n-dimensional column vector ($n \times 1$ ). To get the row vector, I have to transpose the column vector.

Finally, why do u and v switch places in going from the left side to the right side? The reason for writing $v^* u$ instead of $u^* v$ is that inner products are defined to be linear in the first variable. If you use $u^* v$ , you get a product which is linear in the second variable.

Of course, none of this makes any difference if you're dealing with real numbers. So if x and y are vectors in $\real^n$ , you can write

$$x \cdot y = x^T y \quad\hbox{or}\quad x \cdot y = y^T x.$$

Definition. A complex matrix U is unitary if $U U^* = I$ .

Notice that if U happens to be a real matrix, $U^* = U^T$ , and the equation says $U U^T = I$ --- that is, U is orthogonal. In other words, unitary is the complex analog of orthogonal.

By the same kind of argument I gave for orthogonal matrices, $U U^* = I$ implies $U^*U = I$ --- that is, $U^*$ is $U^{-1}$ .

Proposition. Let U be a unitary matrix.

(a) U preserves inner products: $x \cdot y = (U x) \cdot (U y)$ . Consequently, it also preserves lengths: $\|U x\| = \|x\|$ .

(b) An eigenvalue of U must have length 1.

(c) The columns of a unitary matrix form an orthonormal set.

Proof. (a)

$$(U x) \cdot (U y) = (U y)^* (U x) = y^* U^* U x = y^* I x = y^* x = x \cdot y.$$

Since U preserves inner products, it also preserves lengths of vectors, and the angles between them. For example,

$$\|x\|^2 = x\cdot x = (U x)\cdot (U x) = \|U x\|^2, \quad\hbox{so}\quad \|x\| = \|U x\|.$$

(b) Suppose x is an eigenvector corresponding to the eigenvalue $\lambda$ of U. Then $U x = \lambda x$ , so

$$\|U x\| = \|\lambda x\| = |\lambda|\|x\|.$$

But U preserves lengths, so $\|U x\| = \|x\|$ . Since x is an eigenvector, $x \ne 0$ , so $\|x\| \ne 0$ ; hence $|\lambda| = 1$ .

(c) Suppose

$$U = \left[\matrix{ \uparrow & \uparrow & & \uparrow \cr c_1 & c_2 & \cdots & c_n \cr \downarrow & \downarrow & & \downarrow \cr}\right].$$

Then $U^* U = I$ means

$$\left[\matrix{ \leftarrow & \overline{c_1}^T & \rightarrow \cr \leftarrow & \overline{c_2}^T & \rightarrow \cr & \vdots & \cr \leftarrow & \overline{c_n}^T & \rightarrow \cr}\right] \left[\matrix{ \uparrow & \uparrow & & \uparrow \cr c_1 & c_2 & \cdots & c_n \cr \downarrow & \downarrow & & \downarrow \cr}\right] = \left[\matrix{ 1 & 0 & 0 & \cdots & 0 \cr 0 & 1 & 0 & \cdots & 0 \cr 0 & 0 & 1 & \cdots & 0 \cr \vdots & \vdots & \vdots & & \vdots \cr 0 & 0 & 0 & \cdots & 1 \cr}\right].$$

Here $\overline{c_k}^T$ is the complex conjugate of the $k^{\rm th}$ column $c_k$ , transposed to make it a row vector. If you look at the dot products of the rows of $U^*$ and the columns of U, and note that the result is I, you see that the equation above exactly expresses the fact that the columns of U are orthonormal.

For example, take the first row $\overline{c_1}^T$ . Its products with the columns $c_1$ , $c_2$ , and so on give the first row of the identity matrix, so

$$c_1 \cdot c_1 = 1, \quad c_1 \cdot c_2 = 0, \ldots, c_1 \cdot c_n = 0.$$

This says that $c_1$ has length 1 and is perpendicular to the other columns. Similar statements hold for $c_2$ , ..., $c_n$ .
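
If you want to see all three parts of the proposition at once, here is a NumPy sketch. I use the fact that the Q factor in a QR factorization of a (generic) complex matrix is unitary, which gives a convenient test matrix:

    import numpy as np

    rng = np.random.default_rng(0)
    M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
    U, _ = np.linalg.qr(M)     # U is unitary

    print(np.allclose(U.conj().T @ U, np.eye(3)))    # True: orthonormal columns
    x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    print(np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x)))   # True: lengths preserved
    print(np.allclose(np.abs(np.linalg.eigvals(U)), 1))           # True: |eigenvalue| = 1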


Example. Find c and d so that the following matrix is unitary:

$$\left[\matrix{ \dfrac{1}{\sqrt{7}} (1 + 2 i) & c \cr \noalign{\vskip2pt} \dfrac{1}{\sqrt{7}} (1 - i) & d \cr}\right].$$

I want the columns to be orthogonal, so their complex dot product should be 0. First, I'll find a vector that is orthogonal to the first column. I may ignore the factor of $\dfrac{1}{\sqrt{7}}$ ; I need

$$\eqalign{ (a, b) \cdot (1 + 2 i, 1 - i) & = 0 \cr \left[\matrix{1 - 2 i & 1 + i \cr}\right] \left[\matrix{a \cr b \cr}\right] & = 0 \cr}$$

This gives

$$(1 - 2 i) a + (1 + i) b = 0.$$

I may take $a = 1 + i$ and $b = -1 + 2 i$ . Then

$$\|(1 + i, -1 + 2 i)\| = \sqrt{7}.$$

So I need to divide each of a and b by $\sqrt{7}$ to get a unit vector. Thus,

$$(c, d) = \left(\dfrac{1}{\sqrt{7}} (1 + i), \dfrac{1}{\sqrt{7}} (-1 + 2 i)\right).\quad\halmos$$
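
As a quick NumPy check, the completed matrix really does satisfy $U U^* = I$:

    import numpy as np

    U = np.array([[1 + 2j, 1 + 1j],
                  [1 - 1j, -1 + 2j]]) / np.sqrt(7)

    print(np.allclose(U @ U.conj().T, np.eye(2)))   # True: U is unitary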


Proposition. (Adjointness) Let $A \in M(n, \complex)$ and let $u, v \in \complex^n$ . Then

$$A u \cdot v = u \cdot A^* v.$$

Proof.

$$u \cdot A^* v = (A^* v)^* u = v^* (A^*)^* u = v^* A u = A u \cdot v.\quad\halmos$$
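
Numerically, adjointness looks like this, again encoding $x \cdot y$ as np.vdot(y, x) and using random test data:

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
    u = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    v = rng.standard_normal(3) + 1j * rng.standard_normal(3)

    lhs = np.vdot(v, A @ u)             # A u . v
    rhs = np.vdot(A.conj().T @ v, u)    # u . A* v
    print(np.isclose(lhs, rhs))         # True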

Remark. If $(\cdot, \cdot)$ is any inner product on a vector space V and $T: V \to V$ is a linear transformation, the adjoint $T^*$ of T is the linear transformation which satisfies

$$(T(u), v) = (u, T^*(v)) \quad\hbox{for all}\quad u, v \in V.$$

(This definition assumes that there is such a transformation.) This explains why, in the special case of the complex inner product, the matrix $A^*$ is called the adjoint. It also explains the term self-adjoint in the next definition.

Corollary. (Adjointness) Let $A \in M(n, \real)$ and let $u, v \in \real^n$ . Then

$$A u \cdot v = u \cdot A^T v.$$

Proof. This follows from adjointness in the complex case, because $A^* = A^T$ for a real matrix.

Definition. A complex matrix A is Hermitian (or self-adjoint) if $A^* = A$ .

Note that a Hermitian matrix is automatically square.

For real matrices, $A^* = A^T$ , and the definition above is just the definition of a symmetric matrix.


Example. Here are examples of Hermitian matrices:

$$\left[\matrix{ -4 & 2 + 3i \cr 2 - 3i & 17 \cr}\right], \quad \left[\matrix{ 5 & 6i & 2 \cr -6i & 0.87 & 1 - 5i \cr 2 & 1 + 5i & 42 \cr}\right].$$

It is no accident that the diagonal entries are real numbers --- see the result that follows.


Here's a table of the correspondences between the real and complex cases:

   Real Case                          Complex Case
   -------------------------------    --------------------------------
   $u \cdot v = u^T v = v^T u$        $u \cdot v = v^* u$
   Transpose $()^T$                   Conjugate transpose $()^*$
   Orthogonal matrix $A A^T = I$      Unitary matrix $U U^* = I$
   Symmetric matrix $A = A^T$         Hermitian matrix $H = H^*$

Proposition. Let A be a Hermitian matrix.

(a) The diagonal elements of A are real numbers, and elements on opposite sides of the main diagonal are conjugates.

(b) The eigenvalues of a Hermitian matrix are real numbers.

(c) Eigenvectors of A corresponding to different eigenvalues are orthogonal.

Proof. (a) Since $A = A^*$ , I have $A_{ij} = \conjugate{A_{ji}}$ . This shows that elements on opposite sides of the main diagonal are conjugates.

Taking $i = j$ , I have

$$A_{ii} = \conjugate{A_{ii}}.$$

But a complex number is equal to its conjugate if and only if it's a real number, so $A_{ii}$ is real.

(b) Suppose A is Hermitian and $\lambda$ is an eigenvalue of A with eigenvector v. Then

$$\lambda(v \cdot v) = (\lambda v) \cdot v = (Av)\cdot v = v \cdot A^*v = v \cdot Av = v \cdot (\lambda v) = \conjugate{\lambda}(v \cdot v).$$

Since v is an eigenvector, $v \ne 0$ , so $v \cdot v \ne 0$ . Canceling it gives $\lambda = \conjugate{\lambda}$ --- but a number that equals its complex conjugate must be real.

(c) Suppose $\mu$ is an eigenvalue of A with eigenvector u and $\lambda$ is an eigenvalue of A with eigenvector v. Then

$$\mu(u \cdot v) = (\mu u) \cdot v = A u \cdot v = u \cdot A^*v = u \cdot Av = u \cdot (\lambda v) = \conjugate{\lambda}(u \cdot v) = \lambda(u \cdot v).$$

(The last equality holds because $\lambda$ is real, by part (b).) Now $u \cdot v \ne 0$ would imply $\mu = \lambda$ , so if the eigenvalues are different, then $u \cdot v = 0$ .


Example. Let

$$A = \left[\matrix{ 1 & 2 - i \cr 2 + i & -3 \cr}\right].$$

Show that the eigenvalues are real, and that eigenvectors for different eigenvalues are orthogonal.

The matrix is Hermitian. The characteristic polynomial is

$$x^2 + 2 x - 8 = (x + 4)(x - 2).$$

The eigenvalues are real numbers: -4 and 2.

For -4, the eigenvector matrix is

$$A + 4 I = \left[\matrix{ 5 & 2 - i \cr 2 + i & 1 \cr}\right].$$

$(2 - i, -5)$ is an eigenvector.

For 2, the eigenvector matrix is

$$A - 2 I = \left[\matrix{ -1 & 2 - i \cr 2 + i & -5 \cr}\right].$$

$(2 - i, 1)$ is an eigenvector.

Note that

$$(2 - i, -5) \cdot (2 - i, 1) = (2 + i)(2 - i) + (1)(-5) = 5 - 5 = 0.$$

Thus, the eigenvectors are orthogonal.
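
NumPy has a routine designed for Hermitian matrices, np.linalg.eigh; it returns real eigenvalues (in increasing order) together with orthonormal eigenvectors. Checking the example above:

    import numpy as np

    A = np.array([[1 + 0j, 2 - 1j],
                  [2 + 1j, -3 + 0j]])

    evals, evecs = np.linalg.eigh(A)
    print(evals)                                              # [-4.  2.]: real
    print(np.isclose(np.vdot(evecs[:, 0], evecs[:, 1]), 0))   # True: orthogonal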


Since real symmetric matrices are Hermitian, the previous results apply to them as well. I'll restate the previous result for the case of a symmetric matrix.

Corollary. Let A be a symmetric matrix.

(a) The elements on opposite sides of the main diagonal are equal.

(b) The eigenvalues of a symmetric matrix are real numbers.

(c) Eigenvectors of A corresponding to different eigenvalues are orthogonal.


Example. Consider the symmetric matrix

$$A = \left[\matrix{3 & 2 \cr 2 & 6 \cr}\right].$$

The characteristic polynomial is $x^2 - 9x + 14 = (x - 7)(x - 2)$ .

Note that the eigenvalues are real numbers.

For $\lambda = 7$ , an eigenvector is $(1, 2)$ .

For $\lambda = 2$ , an eigenvector is $(-2, 1)$ .

Since $(1, 2) \cdot (-2, 1) = 0$ , the eigenvectors are orthogonal.


Example. A $2 \times 2$ real symmetric matrix A has eigenvalues 1 and 3.

$(2, -3)$ is an eigenvector corresponding to the eigenvalue 1.

(a) Find an eigenvector corresponding to the eigenvalue 3.

Let $(a, b)$ be an eigenvector corresponding to the eigenvalue 3.

Since eigenvectors for different eigenvalues of a symmetric matrix must be orthogonal, I have

$$(2, -3) \cdot (a, b) = 0, \quad\hbox{or}\quad 2 a - 3 b = 0.$$

So, for example, $(a, b) = (3, 2)$ is a solution.

(b) Find A.

From (a), a diagonalizing matrix and the corresponding diagonal matrix are

$$P = \left[\matrix{ 2 & 3 \cr -3 & 2 \cr}\right] \quad\hbox{and}\quad D = \left[\matrix{ 1 & 0 \cr 0 & 3 \cr}\right].$$

Now $P^{-1} A P = D$ , so

$$A = P D P^{-1} = \left[\matrix{ 2 & 3 \cr -3 & 2 \cr}\right] \left[\matrix{ 1 & 0 \cr 0 & 3 \cr}\right] \left(\dfrac{1}{13}\right) \left[\matrix{ 2 & -3 \cr 3 & 2 \cr}\right] = \dfrac{1}{13} \left[\matrix{ 31 & 12 \cr 12 & 21 \cr}\right].$$

Note that the result is indeed symmetric.
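
The diagonalization arithmetic can be reproduced in a few lines of NumPy:

    import numpy as np

    P = np.array([[2.0, 3.0],
                  [-3.0, 2.0]])
    D = np.diag([1.0, 3.0])

    A = P @ D @ np.linalg.inv(P)
    print(13 * A)     # [[31. 12.] [12. 21.]], matching the answer above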


Example. Let $p, q, r, s \in \real$ , and consider the $2 \times 2$ Hermitian matrix

$$A = \left[\matrix{p & q + r i \cr q - r i & s \cr}\right].$$

Compute the characteristic polynomial of A, and show directly that the eigenvalues must be real numbers.

$$|A - xI| = \left|\matrix{p - x & q + r i \cr q - r i & s - x \cr}\right| = (x - p)(x - s) - (q + r i)(q - r i) = x^2 - (p + s)x + [p s - (q^2 + r^2)].$$

The discriminant is

$$(p + s)^2 - 4(1)[p s - (q^2 + r^2)] = (p^2 + 2 p s + s^2) - 4 p s + 4(q^2 + r^2) = (p^2 - 2 p s + s^2) + 4(q^2 + r^2) = (p - s)^2 + 4(q^2 + r^2).$$

Since this is a sum of squares, it can't be negative. Hence, the roots of the characteristic polynomial --- the eigenvalues --- must be real numbers.
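
If you have SymPy available, the characteristic polynomial and the discriminant computation can be checked symbolically; here is a sketch:

    from sympy import symbols, I, Matrix, discriminant, expand, simplify

    x, p, q, r, s = symbols('x p q r s', real=True)
    A = Matrix([[p, q + r*I],
                [q - r*I, s]])

    cp = A.charpoly(x).as_expr()
    print(expand(cp))   # x**2 - p*x - s*x + p*s - q**2 - r**2
    # The discriminant equals (p - s)^2 + 4(q^2 + r^2):
    print(simplify(discriminant(cp, x) - ((p - s)**2 + 4*(q**2 + r**2))))   # 0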

