Orthogonality principle and error minimization

The orthogonality principle is commonly used as a foundation for solving the following general type of problem:

Given an inner product space V, let W=\operatorname{span}(w_{1},w_{2},\ldots,w_{n}), where w_{1},w_{2},\ldots,w_{n}\in V. Find a vector \widehat{y}\in W, that is, a linear combination of the vectors w_{i}, that approximates a given vector y\in V.

This type of problem arises in signal processing, for example, when a signal lies in a space that is unknown or difficult to work with, and we wish to estimate it by a signal in a space we can work with. Solving this problem implicitly requires minimizing the approximation error.

Geometrically, the problem can be visualized in the simple case where W is spanned by a single vector:

[Image: projection of the vector y onto the subspace W, with the error vector e orthogonal to W]

We want to find the closest approximation to the vector y by a vector \widehat{y} in the subspace W. From the geometric picture, it is easily seen that the best approximation, i.e. the one with the smallest error, occurs when the error vector e=y-\widehat{y} is orthogonal to every vector in W. This is the foundation of the orthogonality principle.
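
As a small, concrete illustration (not part of the original article; the vectors y and w are arbitrary examples), the following Python sketch projects y onto the span of a single vector w under the ordinary Euclidean inner product and checks that the resulting error is orthogonal to w:

    import numpy as np

    # Arbitrary example vectors: y is the vector to approximate,
    # w spans the one-dimensional subspace W.
    y = np.array([3.0, 4.0])
    w = np.array([1.0, 0.0])

    # Best approximation of y in span{w}: the orthogonal projection onto w.
    y_hat = (np.dot(y, w) / np.dot(w, w)) * w

    # The error vector is orthogonal to w, so this inner product is ~0.
    e = y - y_hat
    print(y_hat)         # [3. 0.]
    print(np.dot(e, w))  # 0.0 (up to floating-point rounding)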

A solution to error minimization problems

While there are various ways to approach error minimization problems, the following is one way to arrive at a solution.

We wish to write a vector v as the sum of an approximation and an error,

v=\widehat{v}+e\,

where

\widehat{v}=\sum_{i=1}^{n}c_{i}p_{i}

is the approximation of v as a linear combination of vectors in \operatorname{span}(p_{1},p_{2},\ldots,p_{n}), and e=v-\widehat{v} is the error. We therefore want to solve for the coefficients c_{i}, so that the approximation can be written in known terms.

By the orthogonality principle, the squared norm of the error vector, \left\Vert e\right\Vert ^{2}, is minimized when, for j=1,2,\ldots,n,

\left\langle v-\sum_{i=1}^{n}c_{i}p_{i},p_{j}\right\rangle =0.

Expanding this equation and using the linearity of the inner product, we obtain


\left\langle v,p_{j}\right\rangle =\left\langle \sum_{i=1}^{n}c_{i}p_{i},p_{j}\right\rangle


\left\langle v,p_{j}\right\rangle =\sum_{i=1}^{n}c_{i}\left\langle p_{i},p_{j}\right\rangle

or in matrix form


\begin{bmatrix}
\left\langle v,p_{1}\right\rangle \\
\left\langle v,p_{2}\right\rangle \\
\vdots\\
\left\langle v,p_{n}\right\rangle \end{bmatrix}
=
\begin{bmatrix}
\left\langle p_{1},p_{1}\right\rangle  & \left\langle p_{2},p_{1}\right\rangle  & \cdots & \left\langle p_{n},p_{1}\right\rangle \\
\left\langle p_{1},p_{2}\right\rangle  & \left\langle p_{2},p_{2}\right\rangle  & \cdots & \left\langle p_{n},p_{2}\right\rangle \\
\vdots & \vdots & \ddots & \vdots\\
\left\langle p_{1},p_{n}\right\rangle  & \left\langle p_{2},p_{n}\right\rangle  & \cdots & \left\langle p_{n},p_{n}\right\rangle \end{bmatrix}
\begin{bmatrix}
c_{1}\\
c_{2}\\
\vdots\\
c_{n}\end{bmatrix}

If the Gram matrix is invertible, the coefficients are given by

\begin{bmatrix}
c_{1}\\
c_{2}\\
\vdots\\
c_{n}\end{bmatrix}
=
\begin{bmatrix}
\left\langle p_{1},p_{1}\right\rangle  & \left\langle p_{2},p_{1}\right\rangle  & \cdots & \left\langle p_{n},p_{1}\right\rangle \\
\left\langle p_{1},p_{2}\right\rangle  & \left\langle p_{2},p_{2}\right\rangle  & \cdots & \left\langle p_{n},p_{2}\right\rangle \\
\vdots & \vdots & \ddots & \vdots\\
\left\langle p_{1},p_{n}\right\rangle  & \left\langle p_{2},p_{n}\right\rangle  & \cdots & \left\langle p_{n},p_{n}\right\rangle \end{bmatrix}^{-1}
\begin{bmatrix}
\left\langle v,p_{1}\right\rangle \\
\left\langle v,p_{2}\right\rangle \\
\vdots\\
\left\langle v,p_{n}\right\rangle \end{bmatrix}
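
As a numerical sketch (the vectors v, p_1, and p_2 below are arbitrary examples with the Euclidean inner product, not data from the article), the Gram matrix and the vector of inner products \left\langle v,p_{j}\right\rangle can be assembled directly and the system solved for the coefficients:

    import numpy as np

    # Arbitrary example: approximate v by a combination of p_1, p_2 in R^3,
    # using the Euclidean inner product.
    v = np.array([1.0, 2.0, 4.0])
    p = [np.array([1.0, 0.0, 1.0]),
         np.array([0.0, 1.0, 1.0])]
    n = len(p)

    # Gram matrix G[j, i] = <p_i, p_j>; cross-correlation vector b[j] = <v, p_j>.
    G = np.array([[np.dot(p[i], p[j]) for i in range(n)] for j in range(n)])
    b = np.array([np.dot(v, p[j]) for j in range(n)])

    # Solve G c = b (assumes the p_i are linearly independent, so G is invertible).
    c = np.linalg.solve(G, b)

    # The error of the resulting approximation is orthogonal to every p_j.
    v_hat = sum(c[i] * p[i] for i in range(n))
    e = v - v_hat
    print(c)                                    # [1.333..., 2.333...]
    print([np.dot(e, p[j]) for j in range(n)])  # both ~0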

The inner product must be defined so that the entries of the Gram matrix and of the cross-correlation vector, i.e. the inner products \left\langle p_{i},p_{j}\right\rangle and \left\langle v,p_{j}\right\rangle, are known quantities; only then can the coefficients c_{i} actually be computed.
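
For example, when the vectors are functions and the inner product is an integral, the same linear system applies once the integrals can be evaluated. The sketch below (the function e^t and the basis {1, t} are an arbitrary illustration, not from the article) approximates the L2 inner products numerically and solves for the coefficients:

    import numpy as np

    # Arbitrary example: approximate v(t) = e^t on [0, 1] by c_1 * 1 + c_2 * t,
    # using the L2 inner product <f, g> = integral of f(t) g(t) dt,
    # evaluated here with the trapezoidal rule on a fine grid.
    t = np.linspace(0.0, 1.0, 10001)

    def inner(f, g):
        return np.trapz(f * g, t)

    v = np.exp(t)
    p = [np.ones_like(t), t]
    n = len(p)

    G = np.array([[inner(p[i], p[j]) for i in range(n)] for j in range(n)])
    b = np.array([inner(v, p[j]) for j in range(n)])
    c = np.linalg.solve(G, b)

    print(c)  # approximately [0.873, 1.690]: the best L2 affine fit to e^t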
