Definition. Let V be a vector space over F, where $F = \mathbb{R}$ or $F = \mathbb{C}$. An inner product on V is a function $\langle \cdot, \cdot \rangle : V \times V \to F$ which satisfies:

(a) (Linearity) $\langle a x + b y, z \rangle = a \langle x, z \rangle + b \langle y, z \rangle$, for $a, b \in F$ and $x, y, z \in V$.

(b) (Symmetry) $\langle x, y \rangle = \overline{\langle y, x \rangle}$, for $x, y \in V$. ("$\bar{x}$" denotes the complex conjugate of x.)

(c) (Positive-definiteness) If $x \ne 0$, then $\langle x, x \rangle \in \mathbb{R}$ and $\langle x, x \rangle > 0$.

A vector space with an inner product is an inner product space. If $F = \mathbb{R}$, V is a real inner product space; if $F = \mathbb{C}$, V is a complex inner product space.
Notation. There are various notations for inner products. You may see "$\langle x, y \rangle$" or "$(x, y)$" or "$\langle x \mid y \rangle$", for instance. Some specific inner products come with established notation. For example, the dot product, which I'll discuss below, is denoted "$x \cdot y$".
Proposition. Let V be an inner product space over F, where $F = \mathbb{R}$ or $F = \mathbb{C}$. Let $x, y, z \in V$, and let $a, b \in F$.

(a) $\langle 0, y \rangle = 0$ and $\langle x, 0 \rangle = 0$.

(b) $\langle x, a y \rangle = \bar{a} \langle x, y \rangle$. In particular, if $F = \mathbb{R}$, then $\langle x, a y \rangle = a \langle x, y \rangle$.

(c) If $F = \mathbb{R}$, then $\langle x, a y + b z \rangle = a \langle x, y \rangle + b \langle x, z \rangle$.

Proof.

(a) By linearity,
$$\langle 0, y \rangle = \langle 0 + 0, y \rangle = \langle 0, y \rangle + \langle 0, y \rangle, \quad\text{so}\quad \langle 0, y \rangle = 0.$$
By symmetry, $\langle x, 0 \rangle = \overline{\langle 0, x \rangle} = 0$ as well.

(b)
$$\langle x, a y \rangle = \overline{\langle a y, x \rangle} = \overline{a \langle y, x \rangle} = \bar{a}\,\overline{\langle y, x \rangle} = \bar{a} \langle x, y \rangle.$$
If $F = \mathbb{R}$, then $\bar{a} = a$, and so
$$\langle x, a y \rangle = a \langle x, y \rangle.$$

(c) As in the proof of (b), I have $\langle x, a y + b z \rangle = \overline{\langle a y + b z, x \rangle} = \bar{a} \langle x, y \rangle + \bar{b} \langle x, z \rangle$, so when $F = \mathbb{R}$ the conjugates have no effect and
$$\langle x, a y + b z \rangle = a \langle x, y \rangle + b \langle x, z \rangle.$$
Remarks. Why include complex conjugation in the symmetry axiom? Suppose the symmetry axiom had read
$$\langle x, y \rangle = \langle y, x \rangle \quad\text{for all } x, y \in V.$$
Then for $F = \mathbb{C}$ and $x \ne 0$, linearity in both slots would give
$$\langle i x, i x \rangle = i^2 \langle x, x \rangle = -\langle x, x \rangle < 0.$$
This contradicts $\langle i x, i x \rangle > 0$. That is, I can't have both pure symmetry and positive definiteness.
Example. Suppose u, v, and w are vectors in a real inner product space V, and that the inner products of u, v, and w with one another are known.

(a) Compute $\langle u + v, u + w \rangle$.

(b) Compute $\langle 2 u - w, 3 v \rangle$.

(a) Using the linearity and symmetry properties, I have
$$\langle u + v, u + w \rangle = \langle u, u \rangle + \langle u, w \rangle + \langle u, v \rangle + \langle v, w \rangle.$$
Notice that this "looks like" the polynomial multiplication you learned in basic algebra:
$$(u + v)(u + w) = u^2 + u w + u v + v w.$$

(b)
$$\langle 2 u - w, 3 v \rangle = 6 \langle u, v \rangle - 3 \langle v, w \rangle.$$
Example. Let $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ be vectors in $\mathbb{R}^n$. The dot product on $\mathbb{R}^n$ is given by
$$x \cdot y = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n.$$
It's easy to verify that the axioms for an inner product hold. For example, suppose $x \ne 0$. Then at least one of $x_1$, ..., $x_n$ is nonzero, so
$$x \cdot x = x_1^2 + x_2^2 + \cdots + x_n^2 > 0.$$
This proves that the dot product is positive-definite.
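Here's a quick numerical illustration of the dot product in Python (the vectors are arbitrary sample choices, not part of the proof):

```python
import numpy as np

# Two arbitrary sample vectors in R^3.
x = np.array([1.0, -2.0, 3.0])
y = np.array([4.0, 0.0, -1.0])

# x . y = x1*y1 + x2*y2 + x3*y3
print(np.dot(x, y))   # 1*4 + (-2)*0 + 3*(-1) = 1

# x . x is a sum of squares, so it's positive for x != 0.
print(np.dot(x, x))   # 1 + 4 + 9 = 14
```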
I can use an inner product to define lengths and angles. Thus, an inner product introduces (metric) geometry into vector spaces.
Definition. Let V be an inner product space, and let $x, y \in V$.

(a) The length of x is $\|x\| = \sqrt{\langle x, x \rangle}$.

(b) The distance between x and y is $\|x - y\|$.

(c) The angle between x and y is the smallest positive real number $\theta$ satisfying
$$\cos \theta = \frac{\langle x, y \rangle}{\|x\| \, \|y\|}.$$

Remark. The definition of the angle between x and y wouldn't make sense if the expression $\dfrac{\langle x, y \rangle}{\|x\| \, \|y\|}$ was greater than 1 or less than -1, since I'm asserting that it's the cosine of an angle. In fact, the Cauchy-Schwarz inequality (which I'll prove below) will show that
$$-1 \le \frac{\langle x, y \rangle}{\|x\| \, \|y\|} \le 1.$$
Proposition. Let V be a real inner product space, $x, y \in V$, $a \in \mathbb{R}$.

(a) $\|x\|^2 = \langle x, x \rangle$.

(b) $\|a x\| = |a| \, \|x\|$. ("$|a|$" denotes the absolute value of a.)

(c) $\|x\| = 0$ if and only if $x = 0$.

(d) (Cauchy-Schwarz inequality) $|\langle x, y \rangle| \le \|x\| \, \|y\|$.

(e) (Triangle inequality) $\|x + y\| \le \|x\| + \|y\|$.
Proof. (a) Squaring $\|x\| = \sqrt{\langle x, x \rangle}$ gives
$$\|x\|^2 = \langle x, x \rangle.$$

(b) Since $\langle a x, a x \rangle = a^2 \langle x, x \rangle$,
$$\|a x\| = \sqrt{\langle a x, a x \rangle} = \sqrt{a^2 \langle x, x \rangle} = |a| \sqrt{\langle x, x \rangle} = |a| \, \|x\|.$$

(c) $\|x\| = 0$ implies $\langle x, x \rangle = 0$, and hence $x = 0$ by positive-definiteness. Conversely, if $x = 0$, then $\langle x, x \rangle = \langle 0, 0 \rangle = 0$, so $\|x\| = 0$.

(d) If $x = 0$, then
$$\langle x, y \rangle = 0 \quad\text{and}\quad \|x\| \, \|y\| = 0.$$
Hence, $|\langle x, y \rangle| \le \|x\| \, \|y\|$. The same is true if $y = 0$.

Thus, I may assume that $x \ne 0$ and $y \ne 0$.
The major part of the proof comes next, and it involves a trick. Don't feel bad if you wouldn't have thought of this yourself: Try to follow along and understand the steps.
For any $a, b \in \mathbb{R}$, positive-definiteness and linearity give
$$0 \le \langle a x + b y, a x + b y \rangle = a^2 \langle x, x \rangle + 2 a b \langle x, y \rangle + b^2 \langle y, y \rangle = a^2 \|x\|^2 + 2 a b \langle x, y \rangle + b^2 \|y\|^2.$$
The trick is to pick "nice" values for a and b. I will set $a = \|y\|$ and $b = -\|x\|$. (A rationale for this is that I want the expression $\|x\| \, \|y\|$ to appear in the inequality.)

I get
$$0 \le \|y\|^2 \|x\|^2 - 2 \|x\| \, \|y\| \, \langle x, y \rangle + \|x\|^2 \|y\|^2 = 2 \|x\|^2 \|y\|^2 - 2 \|x\| \, \|y\| \, \langle x, y \rangle.$$
Since $x \ne 0$ and $y \ne 0$, I have $\|x\| > 0$ and $\|y\| > 0$. So I can divide the inequality by $2 \|x\| \, \|y\|$ to obtain
$$\langle x, y \rangle \le \|x\| \, \|y\|.$$
In the last inequality, x and y are arbitrary vectors. So the inequality is still true if x is replaced by $-x$. If I replace x with $-x$, then $\langle -x, y \rangle = -\langle x, y \rangle$ and $\|-x\| = \|x\|$, and the inequality becomes
$$-\langle x, y \rangle \le \|x\| \, \|y\|.$$
Since $\|x\| \, \|y\|$ is greater than or equal to both $\langle x, y \rangle$ and $-\langle x, y \rangle$, I have
$$|\langle x, y \rangle| \le \|x\| \, \|y\|.$$
(e) Using part (d),
$$\|x + y\|^2 = \langle x + y, x + y \rangle = \|x\|^2 + 2 \langle x, y \rangle + \|y\|^2 \le \|x\|^2 + 2 \|x\| \, \|y\| + \|y\|^2 = (\|x\| + \|y\|)^2.$$
Hence, $\|x + y\| \le \|x\| + \|y\|$.
Example. $\mathbb{R}^n$ is an inner product space using the standard dot product of vectors. The cosine of the angle between nonzero vectors x and y in $\mathbb{R}^n$ is
$$\cos \theta = \frac{x \cdot y}{\|x\| \, \|y\|}.$$
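For a concrete instance, here's a short Python computation of the angle between two sample vectors in $\mathbb{R}^3$ (the vectors are arbitrary illustrative choices):

```python
import numpy as np

# Two arbitrary sample vectors in R^3.
x = np.array([1.0, 2.0, 2.0])
y = np.array([2.0, -1.0, 2.0])

cos_theta = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
theta = np.arccos(cos_theta)   # angle in radians, in [0, pi]

print(cos_theta)   # 4 / (3 * 3) = 0.444...
print(theta)       # about 1.11 radians
```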
Example. Let $C[a, b]$ denote the real vector space of continuous functions on the interval $[a, b]$. Define an inner product on $C[a, b]$ by
$$\langle f, g \rangle = \int_a^b f(x) g(x)\,dx.$$
Note that $f(x) g(x)$ is integrable, since it's continuous on a closed interval.

The verification that this gives an inner product relies on standard properties of Riemann integrals. For example, if $f, g, h \in C[a, b]$ and $r, s \in \mathbb{R}$,
$$\langle r f + s g, h \rangle = \int_a^b (r f(x) + s g(x)) h(x)\,dx = r \int_a^b f(x) h(x)\,dx + s \int_a^b g(x) h(x)\,dx = r \langle f, h \rangle + s \langle g, h \rangle.$$
Given that this is a real inner product, I may apply the preceding proposition to produce some useful results. For example, the Cauchy-Schwarz inequality says that
$$\left| \int_a^b f(x) g(x)\,dx \right| \le \left( \int_a^b f(x)^2\,dx \right)^{1/2} \left( \int_a^b g(x)^2\,dx \right)^{1/2}.$$
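Here is a small numerical sketch in Python that checks this instance of Cauchy-Schwarz for two sample functions; the interval $[0, 1]$ and the functions $f(x) = x$, $g(x) = e^x$ are illustrative choices, not the ones from the text:

```python
import numpy as np
from scipy.integrate import quad

# Inner product <f, g> = integral of f(x) g(x) over [a, b]; here [a, b] = [0, 1].
def inner(f, g, a=0.0, b=1.0):
    value, _ = quad(lambda x: f(x) * g(x), a, b)
    return value

f = lambda x: x           # sample continuous functions
g = lambda x: np.exp(x)

lhs = abs(inner(f, g))
rhs = np.sqrt(inner(f, f)) * np.sqrt(inner(g, g))
print(lhs, rhs, lhs <= rhs)   # Cauchy-Schwarz: |<f, g>| <= ||f|| ||g||
```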
Definition. A set of vectors S in an inner product space V is orthogonal if for $x, y \in S$ with $x \ne y$,
$$\langle x, y \rangle = 0.$$
An orthogonal set S is orthonormal if $\|x\| = 1$ for all $x \in S$.
If you've seen dot products in a multivariable calculus course, you
know that vectors in $\mathbb{R}^3$ whose dot product is 0 are
perpendicular. With this interpretation, the vectors in an orthogonal
set are mutually perpendicular. The vectors in an orthonormal set are
mutually perpendicular unit vectors.
Notation. If I is an index set, the Kronecker delta (or $\delta_{ij}$, for $i, j \in I$) is defined by
$$\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \ne j \end{cases}$$
With this notation, a set $\{x_i \mid i \in I\}$ is orthonormal if
$$\langle x_i, x_j \rangle = \delta_{ij} \quad\text{for all } i, j \in I.$$
Note that the $n \times n$ matrix whose $(i, j)$-th component is $\delta_{ij}$ is the $n \times n$ identity matrix.
Example. The standard basis for $\mathbb{R}^n$ is
$$e_1 = (1, 0, \ldots, 0), \quad e_2 = (0, 1, \ldots, 0), \quad \ldots, \quad e_n = (0, 0, \ldots, 1).$$
It's clear that relative to the dot product on $\mathbb{R}^n$, each of these vectors has length 1, and each pair of the vectors has dot product 0. Hence, the standard basis is an orthonormal set relative to the dot product on $\mathbb{R}^n$.
Example. Consider the following set of vectors in $\mathbb{R}^2$:
$$\left\{ \left( \frac{3}{5}, \frac{4}{5} \right), \ \left( -\frac{4}{5}, \frac{3}{5} \right) \right\}.$$
I have
$$\left( \frac{3}{5}, \frac{4}{5} \right) \cdot \left( \frac{3}{5}, \frac{4}{5} \right) = 1, \quad \left( -\frac{4}{5}, \frac{3}{5} \right) \cdot \left( -\frac{4}{5}, \frac{3}{5} \right) = 1, \quad \left( \frac{3}{5}, \frac{4}{5} \right) \cdot \left( -\frac{4}{5}, \frac{3}{5} \right) = 0.$$
It follows that the set is orthonormal relative to the dot product on $\mathbb{R}^2$.
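A quick way to check a computation like this numerically is to put the vectors in the rows of a matrix and form the matrix of pairwise dot products; for an orthonormal set this matrix is the identity (its $(i, j)$ entry is the Kronecker delta). A short Python sketch:

```python
import numpy as np

# The vectors from the example above, as the rows of a matrix.
S = np.array([[ 3/5, 4/5],
              [-4/5, 3/5]])

# For an orthonormal set, the matrix of pairwise dot products is the identity.
print(S @ S.T)   # [[1, 0], [0, 1]] up to rounding
```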
Example. Let $C[0, 2\pi]$ denote the complex-valued continuous functions on $[0, 2\pi]$. Define an inner product by
$$\langle f, g \rangle = \frac{1}{2\pi} \int_0^{2\pi} f(x) \overline{g(x)}\,dx.$$
Let $f_n(x) = e^{i n x}$ for $n \in \mathbb{Z}$. Then
$$\langle f_m, f_n \rangle = \frac{1}{2\pi} \int_0^{2\pi} e^{i m x} e^{-i n x}\,dx = \frac{1}{2\pi} \int_0^{2\pi} e^{i (m - n) x}\,dx = \begin{cases} 1 & \text{if } m = n \\ 0 & \text{if } m \ne n \end{cases}$$
It follows that the following set is orthonormal in $C[0, 2\pi]$ relative to this inner product:
$$S = \{ f_n(x) = e^{i n x} \mid n \in \mathbb{Z} \}.$$
Proposition. Let $S = \{x_1, x_2, \ldots, x_n\}$ be an orthogonal set of vectors, with $x_i \ne 0$ for all i. Then S is independent.

Proof. Suppose
$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0.$$
Take the inner product of both sides with $x_1$:
$$\langle a_1 x_1 + a_2 x_2 + \cdots + a_n x_n, x_1 \rangle = \langle 0, x_1 \rangle = 0.$$
Since S is orthogonal,
$$\langle x_i, x_1 \rangle = 0 \quad\text{for } i \ne 1.$$
The equation becomes
$$a_1 \langle x_1, x_1 \rangle = 0.$$
But by positive-definiteness, $\langle x_1, x_1 \rangle > 0$, since $x_1 \ne 0$. Therefore, $a_1 = 0$.

Similarly, taking the inner product of both sides of the original equation with $x_2$, ..., $x_n$ shows $a_j = 0$ for all j. Therefore, S is independent.
An orthonormal set consists of vectors of length 1, so the vectors are obviously nonzero. Hence, an orthonormal set is independent, and forms a basis for the subspace it spans. A basis which is an orthonormal set is called an orthonormal basis.
It is very easy to find the components of a vector relative to an orthonormal basis.
Proposition. Let $\mathcal{B} = \{u_i \mid i \in I\}$ be an orthonormal basis for V, and let $x \in V$. Then
$$x = \sum_{i \in I} \langle x, u_i \rangle u_i.$$
Note: In fact, the sum above is a finite sum --- that is, only finitely many terms are nonzero.

Proof. Since $\mathcal{B}$ is a basis, there are elements $u_1, u_2, \ldots, u_n \in \mathcal{B}$ and scalars $a_1, a_2, \ldots, a_n$ such that
$$x = a_1 u_1 + a_2 u_2 + \cdots + a_n u_n.$$
Take the inner product of both sides with $u_1$. Then
$$\langle x, u_1 \rangle = \langle a_1 u_1 + a_2 u_2 + \cdots + a_n u_n, u_1 \rangle = a_1 \langle u_1, u_1 \rangle + a_2 \langle u_2, u_1 \rangle + \cdots + a_n \langle u_n, u_1 \rangle.$$
As in the proof of the last proposition, all the inner product terms on the right vanish, except that $\langle u_1, u_1 \rangle = 1$ by orthonormality. Thus,
$$\langle x, u_1 \rangle = a_1.$$
Taking the inner product of both sides of the original equation with $u_2$, ..., $u_n$ shows
$$\langle x, u_i \rangle = a_i \quad\text{for all } i.$$
Example. Here is an orthonormal basis for $\mathbb{R}^2$:
$$\left\{ u_1 = \left( \frac{3}{5}, \frac{4}{5} \right), \ u_2 = \left( -\frac{4}{5}, \frac{3}{5} \right) \right\}.$$
To express $(1, 2)$ in terms of this basis, take the dot product of the vector with each element of the basis:
$$(1, 2) \cdot u_1 = \frac{3}{5} + \frac{8}{5} = \frac{11}{5}, \quad (1, 2) \cdot u_2 = -\frac{4}{5} + \frac{6}{5} = \frac{2}{5}.$$
Hence,
$$(1, 2) = \frac{11}{5} u_1 + \frac{2}{5} u_2.$$
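Here's a short Python check of this computation (using the same illustrative basis and vector as above):

```python
import numpy as np

# Orthonormal basis of R^2 and the vector to expand.
u1 = np.array([ 3/5, 4/5])
u2 = np.array([-4/5, 3/5])
x = np.array([1.0, 2.0])

# Components relative to an orthonormal basis are just inner products.
a1 = np.dot(x, u1)   # 11/5
a2 = np.dot(x, u2)   # 2/5

print(a1, a2)
print(a1 * u1 + a2 * u2)   # reconstructs x = (1, 2)
```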
Example. Let $C[0, 2\pi]$ denote the complex inner product space of complex-valued continuous functions on $[0, 2\pi]$, where the inner product is defined by
$$\langle f, g \rangle = \frac{1}{2\pi} \int_0^{2\pi} f(x) \overline{g(x)}\,dx.$$
I noted earlier that the following set is orthonormal:
$$S = \{ f_n(x) = e^{i n x} \mid n \in \mathbb{Z} \}.$$
Suppose I try to compute the "components" of $f(x) = x$ relative to this orthonormal set by taking inner products --- that is, using the approach of the preceding example.

For $n \ne 0$, integration by parts gives
$$\langle f, f_n \rangle = \frac{1}{2\pi} \int_0^{2\pi} x e^{-i n x}\,dx = \frac{i}{n}.$$
Suppose $n = 0$. Then
$$\langle f, f_0 \rangle = \frac{1}{2\pi} \int_0^{2\pi} x\,dx = \pi.$$
There are infinitely many nonzero components! Of course, the reason this does not contradict the earlier result is that $f(x) = x$ may not lie in the span of S. S is orthonormal, hence independent, but it is not a basis for $C[0, 2\pi]$.

In fact, since each $e^{i n x}$ satisfies $e^{i n (x + 2\pi)} = e^{i n x}$, a finite linear combination of elements of S must be periodic --- and $f(x) = x$ is not.

It is still reasonable to ask whether (or in what sense) $f(x) = x$ can be represented by the infinite sum
$$\pi + \sum_{n \ne 0} \frac{i}{n} e^{i n x}.$$
For example, it is reasonable to ask whether the series converges to f at each point of $[0, 2\pi]$, or whether it converges uniformly. The answers to these kinds of questions would require an excursion into the theory of Fourier series.
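As a numerical sanity check on the components computed above (under the conventions used here), this Python sketch approximates $\langle f, f_n \rangle = \frac{1}{2\pi} \int_0^{2\pi} x e^{-i n x}\,dx$ by integrating the real and imaginary parts separately; the predicted values are $\pi$ for $n = 0$ and $i/n$ for $n \ne 0$:

```python
import numpy as np
from scipy.integrate import quad

def component(n):
    # <f, f_n> = (1/(2*pi)) * integral over [0, 2*pi] of x * exp(-i*n*x) dx,
    # split into real part x*cos(n*x) and imaginary part -x*sin(n*x).
    re, _ = quad(lambda x: x * np.cos(n * x), 0, 2 * np.pi)
    im, _ = quad(lambda x: -x * np.sin(n * x), 0, 2 * np.pi)
    return (re + 1j * im) / (2 * np.pi)

print(component(0))   # approximately pi
print(component(3))   # approximately (1/3)i
```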
Since it's so easy to find the components of a vector relative to an orthonormal basis, it's of interest to have an algorithm which converts a given basis to an orthonormal one.
The Gram-Schmidt algorithm converts a basis to an orthonormal basis by "straightening out" the vectors one by one.
The picture shows the first step in the straightening process. Given vectors $v_1$ and $v_2$, I want to replace $v_2$ with a vector perpendicular to $v_1$. I can do this by taking the component of $v_2$ perpendicular to $v_1$, which is
$$v_2' = v_2 - \frac{\langle v_2, v_1 \rangle}{\langle v_1, v_1 \rangle} v_1.$$
Lemma. (Gram-Schmidt algorithm) Let $\{v_1, v_2, \ldots, v_n\}$ be a set of nonzero vectors in an inner product space V. Suppose $v_1$, ..., $v_{n-1}$ are pairwise orthogonal. Let
$$v_n' = v_n - \sum_{k=1}^{n-1} \frac{\langle v_n, v_k \rangle}{\langle v_k, v_k \rangle} v_k.$$
Then $v_n'$ is orthogonal to $v_1$, ..., $v_{n-1}$.

Proof. Let $1 \le j \le n - 1$. Then
$$\langle v_n', v_j \rangle = \langle v_n, v_j \rangle - \sum_{k=1}^{n-1} \frac{\langle v_n, v_k \rangle}{\langle v_k, v_k \rangle} \langle v_k, v_j \rangle.$$
Now $\langle v_k, v_j \rangle = 0$ for $k \ne j$, because the set is orthogonal. Hence, the right side collapses to
$$\langle v_n', v_j \rangle = \langle v_n, v_j \rangle - \frac{\langle v_n, v_j \rangle}{\langle v_j, v_j \rangle} \langle v_j, v_j \rangle = 0.$$
Suppose that I start with an independent set $\{v_1, v_2, \ldots, v_n\}$. Apply the Gram-Schmidt procedure to the set, beginning with $v_1' = v_1$. This produces an orthogonal set $\{v_1', v_2', \ldots, v_n'\}$. In fact, $\{v_1', v_2', \ldots, v_n'\}$ is a nonzero orthogonal set, so it is independent as well.

To see that each $v_k'$ is nonzero, suppose
$$v_k' = v_k - \sum_{j=1}^{k-1} \frac{\langle v_k, v_j' \rangle}{\langle v_j', v_j' \rangle} v_j' = 0.$$
Then
$$v_k = \sum_{j=1}^{k-1} \frac{\langle v_k, v_j' \rangle}{\langle v_j', v_j' \rangle} v_j'.$$
This contradicts the independence of $\{v_1, v_2, \ldots, v_n\}$, because it expresses $v_k$ as a linear combination of $v_1'$, ..., $v_{k-1}'$, which (as the next paragraph shows) lie in the span of $v_1$, ..., $v_{k-1}$.
In general, if the algorithm is applied iteratively to a set of vectors, the span is preserved at each stage. That is,
$$\operatorname{span}(v_1', v_2', \ldots, v_k') = \operatorname{span}(v_1, v_2, \ldots, v_k) \quad\text{for each } k.$$
This is true at the start, since $v_1' = v_1$. Assume inductively that
$$\operatorname{span}(v_1', \ldots, v_{k-1}') = \operatorname{span}(v_1, \ldots, v_{k-1}).$$
Consider the equation
$$v_k' = v_k - \sum_{j=1}^{k-1} \frac{\langle v_k, v_j' \rangle}{\langle v_j', v_j' \rangle} v_j'.$$
It expresses $v_k'$ as a linear combination of $v_k, v_1', \ldots, v_{k-1}'$, and by the inductive assumption each $v_j'$ lies in $\operatorname{span}(v_1, \ldots, v_{k-1})$. Hence,
$$v_k' \in \operatorname{span}(v_1, \ldots, v_k), \quad\text{so}\quad \operatorname{span}(v_1', \ldots, v_k') \subseteq \operatorname{span}(v_1, \ldots, v_k).$$
Conversely,
$$v_k = v_k' + \sum_{j=1}^{k-1} \frac{\langle v_k, v_j' \rangle}{\langle v_j', v_j' \rangle} v_j' \in \operatorname{span}(v_1', \ldots, v_k').$$
It follows that $\operatorname{span}(v_1, \ldots, v_k) \subseteq \operatorname{span}(v_1', \ldots, v_k')$, so $\operatorname{span}(v_1', \ldots, v_k') = \operatorname{span}(v_1, \ldots, v_k)$, by induction.
To summarize: If you apply Gram-Schmidt to a set of vectors, the algorithm produces a new set of vectors with the same span as the old set. If the original set was independent, the new set is independent (and orthogonal) as well.
So, for example, if Gram-Schmidt is applied to a basis for an inner product space, it will produce an orthogonal basis for the space.
Finally, you can always produce an orthonormal set from an orthogonal set (of nonzero vectors) --- merely divide each vector in the orthogonal set by its length.
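Here is a minimal Python sketch of the procedure just described; the input vectors in the usage line are arbitrary illustrative choices:

```python
import numpy as np

def gram_schmidt(vectors):
    """Return an orthonormal list with the same span as the given
    independent list of vectors."""
    orthogonal = []
    for v in vectors:
        w = np.array(v, dtype=float)
        # Subtract the projection of v onto each vector already constructed.
        for u in orthogonal:
            w = w - (np.dot(w, u) / np.dot(u, u)) * u
        orthogonal.append(w)
    # Divide each vector by its length to get an orthonormal set.
    return [u / np.linalg.norm(u) for u in orthogonal]

# Example: an independent set in R^3.
for u in gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]]):
    print(np.round(u, 4))
```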
Example. (Gram-Schmidt) Apply Gram-Schmidt to a set of vectors $\{v_1, v_2, v_3\}$ in $\mathbb{R}^3$ (relative to the usual dot product). Following the algorithm, take
$$w_1 = v_1, \quad w_2 = v_2 - \frac{v_2 \cdot w_1}{w_1 \cdot w_1} w_1, \quad w_3 = v_3 - \frac{v_3 \cdot w_1}{w_1 \cdot w_1} w_1 - \frac{v_3 \cdot w_2}{w_2 \cdot w_2} w_2.$$
(A common mistake here is to project $v_3$ onto $v_1$, $v_2$, ... . I need to project onto the vectors that have already been orthogonalized. That is why I projected onto $w_1$ and $w_2$ rather than $v_1$ and $v_2$.)

The algorithm has produced the orthogonal set $\{w_1, w_2, w_3\}$. In this example the lengths of these vectors are 5, 5, and 9. For example,
$$\|w_1\| = \sqrt{w_1 \cdot w_1} = 5.$$
The corresponding orthonormal set is
$$\left\{ \frac{1}{5} w_1, \ \frac{1}{5} w_2, \ \frac{1}{9} w_3 \right\}.$$
Example. (Gram-Schmidt) Find an orthonormal basis (relative to the usual dot product) for the subspace spanned by three independent vectors $v_1$, $v_2$, $v_3$.

I'll use $u_1$, $u_2$, $u_3$ to denote the orthonormal basis.

To simplify the computations, you should fix the vectors so they're mutually perpendicular first. Then you can divide each by its length to get vectors of length 1.

First,
$$w_1 = v_1.$$
Next,
$$w_2 = v_2 - \frac{v_2 \cdot w_1}{w_1 \cdot w_1} w_1.$$
You can check that $w_1 \cdot w_2 = 0$, so the first two are perpendicular.

Finally,
$$w_3 = v_3 - \frac{v_3 \cdot w_1}{w_1 \cdot w_1} w_1 - \frac{v_3 \cdot w_2}{w_2 \cdot w_2} w_2.$$
If at any point you wind up with a vector with fractions, it's a good idea to clear the fractions before continuing. Since multiplying a vector by a number doesn't change its direction, it remains perpendicular to the vectors already constructed.

Thus, I'll multiply the last vector by 9 and use $9 w_3$ in its place.

Thus, the orthogonal set is
$$\{ w_1, w_2, 9 w_3 \}.$$
In this example the lengths of the first two vectors are 3 and 9. Dividing each vector in the orthogonal set by its length gives an orthonormal basis:
$$u_1 = \frac{w_1}{\|w_1\|}, \quad u_2 = \frac{w_2}{\|w_2\|}, \quad u_3 = \frac{9 w_3}{\|9 w_3\|}.$$
Recall that when an n-dimensional vector is interpreted as a matrix, it is taken to be an $n \times 1$ matrix: that is, an n-dimensional column vector
$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$$
If I need an n-dimensional row vector, I'll take the transpose. Thus,
$$x^T = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}$$
Lemma. Let A be an invertible $n \times n$ matrix with entries in $\mathbb{R}$. Let
$$\langle x, y \rangle = x^T A^T A y \quad\text{for } x, y \in \mathbb{R}^n.$$
Then $\langle \cdot, \cdot \rangle$ defines an inner product on $\mathbb{R}^n$.
Proof. I have to check linearity, symmetry, and positive definiteness.

First, if $x, y, z \in \mathbb{R}^n$ and $a, b \in \mathbb{R}$, then
$$\langle a x + b y, z \rangle = (a x + b y)^T A^T A z = (a x^T + b y^T) A^T A z = a\, x^T A^T A z + b\, y^T A^T A z = a \langle x, z \rangle + b \langle y, z \rangle.$$
This proves that the function is linear in the first slot.

Next,
$$\langle y, x \rangle = y^T A^T A x = \left( x^T A^T A y \right)^T = x^T A^T A y = \langle x, y \rangle.$$
The second equality comes from the fact that $(B C)^T = C^T B^T$ for matrices. The third equality comes from the fact that $x^T A^T A y$ is a $1 \times 1$ matrix, so it equals its transpose.

This proves that the function is symmetric.

Finally,
$$\langle x, x \rangle = x^T A^T A x = (A x)^T (A x).$$
Now $A x$ is an $n \times 1$ vector --- I'll label its components this way:
$$A x = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}$$
Then
$$\langle x, x \rangle = (A x)^T (A x) = u_1^2 + u_2^2 + \cdots + u_n^2 \ge 0.$$
That is, the inner product of a vector with itself is a nonnegative number. All that remains is to show that if the inner product of a vector with itself is 0, then the vector is $0$.

Using the notation above, suppose
$$\langle x, x \rangle = u_1^2 + u_2^2 + \cdots + u_n^2 = 0.$$
Then $u_1 = u_2 = \cdots = u_n = 0$, because a nonzero $u_i$ would produce a positive number on the right side of the equation.

So
$$A x = 0.$$
Finally, I'll use the fact that A is invertible:
$$x = A^{-1} (A x) = A^{-1} \cdot 0 = 0.$$
This proves that the function is positive definite, so it's an inner product.
Example. The previous lemma provides lots of examples of inner products on $\mathbb{R}^n$ besides the usual dot product. All I have to do is take an invertible matrix A and form $A^T A$, defining the inner product as above.

For example, this real matrix is invertible:
$$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$$
Now
$$A^T A = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}$$
(Notice that $A^T A$ will always be symmetric.) The inner product defined by this matrix is
$$\langle x, y \rangle = x^T A^T A y = x_1 y_1 + x_1 y_2 + x_2 y_1 + 2 x_2 y_2.$$
For example, under this inner product,
$$\langle (1, 2), (3, -1) \rangle = (1)(3) + (1)(-1) + (2)(3) + 2 (2)(-1) = 4.$$
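Here is a short Python sketch of this construction, using the same illustrative matrix as above; the lemma's claims can be spot-checked numerically:

```python
import numpy as np

# An invertible matrix A; M = A^T A defines the inner product <x, y> = x^T M y.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
M = A.T @ A

def inner(x, y):
    return x @ M @ y

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

print(inner(x, y))                         # 4.0
print(np.isclose(inner(x, y), inner(y, x)))  # symmetry
print(inner(x, x) > 0)                     # positive-definiteness for x != 0
```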
Definition. A matrix A in $M(n, \mathbb{R})$ is orthogonal if $A A^T = I$.

Proposition. Let A be an orthogonal matrix.

(a) $\det A = \pm 1$.

(b) $A^T = A^{-1}$ --- in other words, $A^T A = I$.

(c) The rows of A form an orthonormal set. The columns of A form an orthonormal set.

(d) A preserves dot products --- and hence, lengths and angles --- in the sense that
$$(A x) \cdot (A y) = x \cdot y \quad\text{for all } x, y \in \mathbb{R}^n.$$
Proof. (a) If A is orthogonal,
$$1 = \det I = \det (A A^T) = (\det A)(\det A^T) = (\det A)^2.$$
Therefore, $\det A = \pm 1$.

(b) Since $(\det A)^2 = 1$, the determinant is certainly nonzero, so A is invertible. Hence,
$$A^T = A^{-1} (A A^T) = A^{-1} I = A^{-1}.$$
But $A^{-1} A = I$, so $A^T A = I$ as well.

(c) The equation $A A^T = I$ implies that the rows of A form an orthonormal set of vectors. Likewise, $A^T A = I$ shows that the same is true for the columns of A.
(d) The ordinary dot product of vectors $x$ and $y$ in $\mathbb{R}^n$ can be written as a matrix multiplication:
$$x \cdot y = x^T y.$$
(Remember the convention that vectors are column vectors.)

Suppose A is orthogonal. Then
$$(A x) \cdot (A y) = (A x)^T (A y) = x^T A^T A y = x^T I y = x^T y = x \cdot y.$$
In other words, orthogonal matrices preserve dot products. It follows
that orthogonal matrices will also preserve lengths of
vectors and angles between vectors, because these are
defined in terms of dot products.
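Here's a quick numerical illustration in Python, using a rotation matrix (an arbitrary illustrative choice of orthogonal matrix):

```python
import numpy as np

# A rotation by 30 degrees is an orthogonal matrix.
t = np.pi / 6
A = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])

print(np.allclose(A @ A.T, np.eye(2)))   # A A^T = I
print(np.linalg.det(A))                  # determinant is +1 or -1

# A preserves dot products: (Ax) . (Ay) = x . y.
x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])
print(np.dot(A @ x, A @ y), np.dot(x, y))   # both equal 1
```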
Example. Find real numbers a and b such that the following matrix is orthogonal:
$$A = \begin{bmatrix} 0.6 & a \\ 0.8 & b \end{bmatrix}$$
Since the columns of A must form an orthonormal set, I must have
$$(0.6, 0.8) \cdot (a, b) = 0 \quad\text{and}\quad (a, b) \cdot (a, b) = 1.$$
(Note that $(0.6, 0.8) \cdot (0.6, 0.8) = 0.36 + 0.64 = 1$ already.) The first equation gives
$$0.6 a + 0.8 b = 0.$$
The easy way to get a solution is to swap 0.6 and 0.8 and negate one of them; thus, $a = -0.8$ and $b = 0.6$.

Since $(-0.8, 0.6) \cdot (-0.8, 0.6) = 0.64 + 0.36 = 1$, I'm done. (If the a and b I chose had made $(a, b) \cdot (a, b) \ne 1$, then I'd simply divide $(a, b)$ by its length.)
Example. Orthogonal matrices represent rotations of the plane about the origin or reflections across a line through the origin.

Rotations are represented by matrices
$$R(\theta) = \begin{bmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{bmatrix}$$
You can check that this works by considering the effect of multiplying the standard basis vectors $(1, 0)$ and $(0, 1)$ by this matrix.

Multiplying a vector by the following matrix product reflects the vector across the line L that makes an angle $\theta$ with the x-axis:
$$R(\theta) \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} R(-\theta)$$
Reading from right to left, the first matrix rotates everything by $-\theta$ radians, so L coincides with the x-axis. The second matrix reflects everything across the x-axis. The third matrix rotates everything by $\theta$ radians. Hence, a given vector is rotated by $-\theta$ and reflected across the x-axis, after which the reflected vector is rotated by $\theta$. The net effect is to reflect across L.
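Here's a small Python sketch of this construction; the choice $\theta = \pi/4$ (the line $y = x$) is an illustrative example:

```python
import numpy as np

def rotation(t):
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

def reflection(t):
    # Rotate by -t, reflect across the x-axis, then rotate back by t.
    flip_x = np.array([[1.0, 0.0],
                       [0.0, -1.0]])
    return rotation(t) @ flip_x @ rotation(-t)

M = reflection(np.pi / 4)                # reflection across the line y = x
print(np.allclose(M @ M.T, np.eye(2)))   # reflection matrices are orthogonal
print(M @ np.array([1.0, 0.0]))          # (1, 0) reflects to (0, 1)
```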
Many problems involving transformations can be handled this way: use transformations to reduce the general problem to a special case.
Send comments about this page to: Bruce.Ikenaga@millersville.edu.
Copyright 2014 by Bruce Ikenaga