Matrix multiplication

Published on 2024/02/24

Let A \in \R^{N \times M} be an N \times M matrix, let A_m denote its m-th column vector, and let a_n denote its n-th row, written as a column vector.

\begin{align*} A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1M} \\ a_{21} & a_{22} & \dots & a_{2M} \\ \vdots & \vdots & \ddots & \vdots \\ a_{N1} & a_{N2} & \dots & a_{NM} \\ \end{pmatrix} ,\quad A_m = \begin{pmatrix} a_{1m} \\ a_{2m} \\ \vdots \\ a_{Nm} \end{pmatrix} \in \R^N ,\quad a_n = \begin{pmatrix} a_{n1} \\ a_{n2} \\ \vdots \\ a_{nM} \end{pmatrix} \in \R^M \end{align*}

Using these column and row vectors, A can also be written as:

\begin{align*} A = \begin{pmatrix} A_1 & A_2 & \dots & A_M \end{pmatrix} = \begin{pmatrix} a_1^\top \\ a_2^\top \\ \vdots \\ a_N^\top \end{pmatrix} \end{align*}

Let's consider the matrix-vector product A e_m, where e_m is the standard unit vector whose m-th element is 1 and whose other elements are all 0.

\begin{align*} A e_m &= A_m ,\quad e_m = \begin{pmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix} \in \R^M \end{align*}

We can therefore regard the matrix A as a list of the destinations of the unit vectors \set{e_m}_{m=1}^M: the m-th column A_m is where A sends e_m.
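
The column-picking property can be checked numerically. The following is a minimal, dependency-free Python sketch; the helper names `matvec` and `unit_vector` and the example matrix are illustrative assumptions, not part of the original article.

```python
# Illustrative sketch: a matrix is stored as a list of rows.

def matvec(A, x):
    """Return A x for an N x M matrix A (list of rows) and a length-M vector x."""
    return [sum(a_nm * x_m for a_nm, x_m in zip(row, x)) for row in A]

def unit_vector(m, size):
    """Return e_m in R^size, using the 1-based index m as in the text."""
    return [1 if i == m - 1 else 0 for i in range(size)]

A = [[1, 2, 3],
     [4, 5, 6]]  # N = 2 rows, M = 3 columns

for m in (1, 2, 3):
    column_m = [row[m - 1] for row in A]   # A_m, read directly from A
    assert matvec(A, unit_vector(m, 3)) == column_m

print(matvec(A, unit_vector(2, 3)))  # [2, 5], the second column of A
```

Note that e_m has M = 3 elements here, one per column of A, while the result A e_m has N = 2 elements.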


Next, let's consider A x.

\begin{align*} A x &= \sum_{m=1}^M x_m A_m ,\quad x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_M \end{pmatrix} \in \R^M ,\quad x_m \in \R ,\quad A_m \in \R^N \end{align*}

This equation says that A x is a weighted sum of the column directions A_m, where the weight of A_m is x_m.
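
As a rough check of this column view, the sketch below builds A x by accumulating one weighted column at a time and compares it with the usual row-by-row product. The function names `matvec` and `column_combination` and the numbers are hypothetical, chosen only for illustration.

```python
# Illustrative sketch: a matrix is stored as a list of rows.

def matvec(A, x):
    """Reference product: each output element is a row of A times x."""
    return [sum(a_nm * x_m for a_nm, x_m in zip(row, x)) for row in A]

def column_combination(A, x):
    """Return sum_m x_m * A_m, adding one weighted column at a time."""
    N = len(A)
    result = [0] * N
    for m, x_m in enumerate(x):
        for n in range(N):
            result[n] += x_m * A[n][m]   # n-th element of x_m * A_m
    return result

A = [[1, 2, 3],
     [4, 5, 6]]
x = [1, 0, -1]

# 1 * (1, 4) + 0 * (2, 5) + (-1) * (3, 6) = (-2, -2)
print(column_combination(A, x))  # [-2, -2]
assert column_combination(A, x) == matvec(A, x)
```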


From another point of view, A x is a list of the inner products a_n^\top x, one for each row of A.

\begin{align*} A x &= \begin{pmatrix} a_1^\top x \\ a_2^\top x \\ \vdots \\ a_N^\top x \\ \end{pmatrix} \in \R^N ,\quad a_n = \begin{pmatrix}a_{n1} \\ \vdots \\ a_{nM}\end{pmatrix} \in \R^M \end{align*}

where a_n is the n-th row vector of A.
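
The row view can be sketched the same way: each element of A x is one inner product between a row of A and x. The helper name `dot` and the example values below are assumptions made for this illustration.

```python
# Illustrative sketch: a matrix is stored as a list of rows.

def dot(u, v):
    """Inner product of two vectors of equal length."""
    return sum(u_i * v_i for u_i, v_i in zip(u, v))

A = [[1, 2, 3],
     [4, 5, 6]]   # rows a_1 = (1, 2, 3) and a_2 = (4, 5, 6)
x = [1, 0, -1]

# One inner product per row, so the result has N = 2 elements.
y = [dot(a_n, x) for a_n in A]

print(y)  # [-2, -2]
assert len(y) == len(A)
```

With the same A and x, the weighted sum of columns 1 * (1, 4) + 0 * (2, 5) - 1 * (3, 6) gives the same vector (-2, -2), so the two views agree.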


Let A \in \R^{N \times N} be an invertible square matrix. Then A^{-1} sends each column A_n back to the unit vector e_n:

\begin{align*} A^{-1} A_n &= e_n \\ A^{-1} \begin{pmatrix} A_1 & A_2 & \dots & A_N \end{pmatrix} &= \begin{pmatrix} e_1 & e_2 & \dots & e_N \end{pmatrix} \end{align*}

More generally, A^{-1} recovers the weights of any linear combination of the columns:

\begin{align*} A^{-1} \left( \sum_{n=1}^N x_n A_n \right) = x ,\quad x \in \R^{N} \end{align*}

because

\begin{align*} A x &= \sum_{n=1}^N x_n A_n \\ A^{-1} A x &= A^{-1} \left( \sum_{n=1}^N x_n A_n \right) \\ x &= A^{-1} \left( \sum_{n=1}^N x_n A_n \right) \end{align*}

If A^{-1} exists, every y \in \R^N can be represented as a linear combination of the columns of A: y = x_1 A_1 + \dots + x_N A_N = A x.

The weights x = \set{x_n}_{n=1}^N are obtained as A^{-1} y, because

\begin{align*} y &= A x , \\ A^{-1} y &= A^{-1} A x , \\ A^{-1} y &= x . \end{align*}
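
To make the role of A^{-1} concrete, here is a small sketch for the 2 x 2 case, assuming the closed-form inverse via the determinant and exact arithmetic with `fractions.Fraction`. The helpers `matvec` and `inverse_2x2` and the example matrix are hypothetical; a general N x N inverse would need a different method.

```python
from fractions import Fraction

# Illustrative sketch: a matrix is stored as a list of rows.

def matvec(A, x):
    """Return A x for a matrix A (list of rows) and a vector x."""
    return [sum(a_nm * x_m for a_nm, x_m in zip(row, x)) for row in A]

def inverse_2x2(A):
    """Closed-form inverse of a 2 x 2 matrix, computed exactly."""
    (a, b), (c, d) = A
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("A is singular, so A^{-1} does not exist")
    return [[d / det, -b / det],
            [-c / det, a / det]]

A = [[2, 1],
     [1, 3]]               # columns A_1 = (2, 1) and A_2 = (1, 3)
A_inv = inverse_2x2(A)

# A^{-1} sends each column of A back to a unit vector.
assert matvec(A_inv, [2, 1]) == [1, 0]
assert matvec(A_inv, [1, 3]) == [0, 1]

# The weights of y in the column basis are x = A^{-1} y.
y = [5, 10]
x = matvec(A_inv, y)
assert x == [1, 3]          # y = 1 * A_1 + 3 * A_2
assert matvec(A, x) == y
```

Exact fractions are used here only so that the equality checks are not affected by floating-point rounding.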
