Orthogonal Matrix: Definition and Example
In this post, we introduce orthonormal bases and orthogonal matrices and discuss their properties.
An orthogonal matrix is a square matrix whose rows and columns are vectors that are orthogonal to each other and of unit length. We can also say that its rows and its columns each form an orthonormal basis.
Orthonormal Basis
A set of vectors V = \{v_1, v_2, \ldots, v_n\} forms an orthonormal basis if all vectors are orthogonal to each other and each vector is of unit length. Remember, unit length means that the vector's length equals 1.
Since the dot product of a vector with itself equals its squared length, we can formalize the unit-length condition by requiring that each vector v_i dotted with itself results in 1:
v_i \cdot v_i = 1
Since the vectors in V are orthogonal to each other, the dot product of any vector v_i with any other vector v_j (where i \neq j) needs to result in zero:
v_i \cdot v_j = 0
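To make these two conditions concrete, here is a minimal sketch in Python using NumPy; the helper name is_orthonormal and the tolerance are my own choices for illustration, not standard library functions:

```python
import numpy as np

def is_orthonormal(vectors, tol=1e-10):
    """Return True if every vector has unit length and all
    pairs of distinct vectors are orthogonal."""
    for i, v in enumerate(vectors):
        # Condition 1: v_i . v_i = 1 (unit length)
        if abs(np.dot(v, v) - 1) > tol:
            return False
        # Condition 2: v_i . v_j = 0 for j != i (orthogonality)
        for w in vectors[i + 1:]:
            if abs(np.dot(v, w)) > tol:
                return False
    return True
```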
Let's do an example and show that the following vectors v_i and v_j form an orthonormal basis.
v_i = \begin{bmatrix} \frac{3}{5} \\ \frac{4}{5} \end{bmatrix} \,and\, v_j = \begin{bmatrix} \frac{4}{5} \\ -\frac{3}{5} \end{bmatrix}
v_i \cdot v_i = \begin{bmatrix} \frac{3}{5} \\ \frac{4}{5} \end{bmatrix} \cdot \begin{bmatrix} \frac{3}{5} \\ \frac{4}{5} \end{bmatrix} = \frac{9}{25} + \frac{16}{25} = 1
We see that the dot product of v_i with itself is indeed 1. Since v_i \cdot v_i equals the squared length of v_i, the length of v_i must also be 1, as required for a unit vector.
|v_i| = \sqrt{ \left( \frac{3}{5} \right)^2 + \left( \frac{4}{5} \right)^2 } = \sqrt{1} = 1
v_i \cdot v_j = \begin{bmatrix} \frac{3}{5} \\ \frac{4}{5} \end{bmatrix} \cdot \begin{bmatrix} \frac{4}{5} \\ -\frac{3}{5} \end{bmatrix} = \frac{3}{5} \cdot \frac{4}{5} + \frac{4}{5} \cdot \left(-\frac{3}{5}\right) = 0
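We can verify these computations numerically. Here is a quick sketch with NumPy; the floating-point results equal 1 and 0 only up to rounding:

```python
import numpy as np

v_i = np.array([3/5, 4/5])
v_j = np.array([4/5, -3/5])

print(np.dot(v_i, v_i))      # ~1.0: v_i dotted with itself
print(np.linalg.norm(v_i))   # ~1.0: the length of v_i
print(np.dot(v_i, v_j))      # ~0.0: v_i and v_j are orthogonal
```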
Properties of an Orthogonal Matrix
In an orthogonal matrix, the columns and rows are vectors that form an orthonormal basis. This means it has the following features:
- it is a square matrix
- all vectors need to be orthogonal
- all vectors need to be of unit length (1)
- all vectors need to be linearly independent of each other
- the determinant equals +1 or -1
The vectors v_i and v_j from our example form an orthogonal matrix:
\begin{bmatrix} \frac{3}{5} & \frac{4}{5} \\ \frac{4}{5} & -\frac{3}{5} \end{bmatrix}
Furthermore, the inverse of an orthogonal matrix is its transpose.
A^T = A^{-1}
We can demonstrate this by showing that an orthogonal matrix A multiplied by its transpose results in the identity matrix. Our example matrix is symmetric, so it equals its own transpose:
\begin{bmatrix} \frac{3}{5} & \frac{4}{5} \\ \frac{4}{5} & -\frac{3}{5} \end{bmatrix} \begin{bmatrix} \frac{3}{5} & \frac{4}{5} \\ \frac{4}{5} & -\frac{3}{5} \end{bmatrix} = \begin{bmatrix} 1 & 0\\ 0 & 1\\ \end{bmatrix}
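We can check this, along with the determinant property from the list above, numerically. A short NumPy sketch, again exact only up to floating-point rounding:

```python
import numpy as np

A = np.array([[3/5,  4/5],
              [4/5, -3/5]])

print(A @ A.T)                              # ~identity matrix
print(np.allclose(A.T, np.linalg.inv(A)))   # True: A^T equals A^{-1}
print(np.linalg.det(A))                     # ~-1.0: determinant is +1 or -1
```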
Orthonormal Matrix vs Orthogonal Matrix
Understandably, there is a bit of confusion about the terminology. Let’s briefly recap what the terms orthonormal and orthogonal mean when dealing with vectors.
Orthogonal means that two vectors are perpendicular.
Orthonormal means that two vectors are perpendicular and of unit length.
If we were to extend this to matrices, we would have to say that an orthogonal matrix consists of vectors perpendicular to each other, while an orthonormal matrix consists of perpendicular vectors that are also of unit length. However, the concept of a matrix whose vectors are perpendicular but not of unit length is not very useful. I’m not going to discuss this any further at this point because it would bring us into the land of complex numbers and unitary matrices. It is simply too far off-topic for now, but if you are interested in it, I recommend you check out this discussion on math StackExchange.
This post is part of a series on linear algebra for machine learning. To read other posts in this series, go to the index.