Mathematics for Machine Learning - Day 3

Pour Terra · 10 July, 2024

[Image: Matrix Singularity Meme]

Inverse and Transpose

Don't feel bad! Even a matrix, an object widely used in mathematics, engineering, and computer science, can still be single. But how can a matrix be single?

It's actually called a singular, or non-invertible, matrix. A matrix is defined as singular when it doesn't have an inverse, either because its determinant is zero or because of its lack of personality and its reliance on materialistic tendencies instead of working on itself!
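To make the "single" idea concrete, here is a minimal sketch that tests a 2x2 matrix for singularity through its determinant. The helper names `det2` and `is_singular` and the second example matrix are my own illustrations, not something from the book.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    return a * d - b * c


def is_singular(m):
    """A square matrix has no inverse exactly when its determinant is zero."""
    return det2(m) == 0


regular = [[2, 4], [3, 1]]  # the matrix A used in this post
single = [[1, 2], [2, 4]]   # hypothetical example: second row is twice the first

print(det2(regular), is_singular(regular))  # -10 False
print(det2(single), is_singular(single))    # 0 True
```

The exact comparison with zero is fine for integer entries; with floating-point entries a small tolerance would be the safer choice.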

Inverse

Take matrix A for example:

$$A = \begin{pmatrix} 2 & 4 \\ 3 & 1 \end{pmatrix}$$

Matrix A contains two rows and two columns. For a 2x2 matrix, the inverse formula is the following, where A' is the adjugate of A:

$$A^{-1} = \frac{1}{\det(A)} A'$$

We can calculate the inverse as below:

$$A^{-1} = \frac{1}{(2 \cdot 1) - (4 \cdot 3)} \begin{pmatrix} 1 & -4 \\ -3 & 2 \end{pmatrix}$$

Then we evaluate the determinant:

$$A^{-1} = \frac{1}{-10} \begin{pmatrix} 1 & -4 \\ -3 & 2 \end{pmatrix}$$

Finally, we divide the matrix by the determinant!

$$A^{-1} = \begin{pmatrix} -0.1 & 0.4 \\ 0.3 & -0.2 \end{pmatrix}$$
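The same calculation can be sketched in a few lines of Python. This is only an illustration under my own assumptions: `inverse_2x2` and `matmul` are hypothetical helper names, and `fractions.Fraction` is used so the arithmetic stays exact instead of picking up floating-point rounding.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Inverse of a 2x2 matrix: adjugate divided by determinant."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular, no inverse exists")
    adjugate = [[d, -b], [-c, a]]
    return [[Fraction(x, det) for x in row] for row in adjugate]


def matmul(x, y):
    """Plain matrix product of two nested-list matrices."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


A = [[2, 4], [3, 1]]
A_inv = inverse_2x2(A)

print([[float(v) for v in row] for row in A_inv])  # [[-0.1, 0.4], [0.3, -0.2]]
print(matmul(A, A_inv) == [[1, 0], [0, 1]])        # True
```

Multiplying A by the result and getting the identity matrix back is a quick sanity check that the inverse is right.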

What was that? I don't understand anything

Here, let's go step by step. Our goal is to prove this equation:

$$A \cdot A' = \det(A) \cdot I$$

This equation is just the inverse formula with the determinant moved to the other side, next to the identity matrix. This step is also useful for understanding the basics of a 2x2 adjugate matrix (A').

Given:

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$$

The adjugate of A is:

$$A' = \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}$$

The determinant of A is:

$$\det(A) = a_{11}a_{22} - a_{12}a_{21}$$

Now, compute A · A':

$$A \cdot A' = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}$$

Perform the matrix multiplication:

$$= \begin{pmatrix} (a_{11}a_{22} - a_{12}a_{21}) & (a_{11}(-a_{12}) + a_{12}a_{11}) \\ (a_{21}a_{22} - a_{22}a_{21}) & (a_{21}(-a_{12}) + a_{22}a_{11}) \end{pmatrix}$$

Simplify the elements:

$$= \begin{pmatrix} \det(A) & 0 \\ 0 & \det(A) \end{pmatrix}$$

Thus:

$$A \cdot A' = \det(A) \cdot I$$
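If the algebra feels abstract, the identity can also be spot-checked numerically. The sketch below is not a proof, only a sanity check on random integer matrices; the helper names (`adjugate_2x2`, `det2`, `matmul2`) and the random test setup are my own assumptions.

```python
import random


def adjugate_2x2(m):
    """Swap the diagonal entries and negate the off-diagonal ones."""
    (a, b), (c, d) = m
    return [[d, -b], [-c, a]]


def det2(m):
    (a, b), (c, d) = m
    return a * d - b * c


def matmul2(x, y):
    """Product of two 2x2 matrices."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


random.seed(0)  # fixed seed so the run is repeatable
for _trial in range(1000):
    A = [[random.randint(-9, 9) for _col in range(2)] for _row in range(2)]
    d = det2(A)
    # A * A' should equal det(A) * I, whether or not A is singular
    assert matmul2(A, adjugate_2x2(A)) == [[d, 0], [0, d]]

print("A * A' == det(A) * I held for every sampled matrix")
```

Note that the identity also holds when the determinant is zero; it is only the final division by the determinant that fails for a singular matrix.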

Properties

These are the properties of both the transpose and the inverse. Remember, these are the same as for a normal matrix (aside from equation 2), so just remember the normal matrix properties and you're all set!

$$A A^{-1} = I = A^{-1} A$$

$$(AB)^{-1} = B^{-1} A^{-1}$$

$$(A + B)^{-1} \neq A^{-1} + B^{-1}$$

$$(A^T)^T = A$$

$$(A + B)^T = A^T + B^T$$

$$(AB)^T = B^T A^T$$
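As a rough check of the list above, the properties can be tested on a concrete pair of 2x2 matrices. The matrix B and all helper names here are my own choices for illustration, and a single example only demonstrates the properties, it does not prove them.

```python
from fractions import Fraction


def matmul(x, y):
    return [
        [sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]


def transpose(x):
    return [list(col) for col in zip(*x)]


def inverse(m):
    """2x2 inverse via adjugate / determinant, in exact arithmetic."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(d) / det, Fraction(-b) / det],
            [Fraction(-c) / det, Fraction(a) / det]]


I = [[1, 0], [0, 1]]
A = [[2, 4], [3, 1]]
B = [[1, 2], [0, 1]]  # hypothetical second matrix, chosen to be invertible

checks = {
    "A A^-1 = I = A^-1 A": matmul(A, inverse(A)) == I == matmul(inverse(A), A),
    "(AB)^-1 = B^-1 A^-1": inverse(matmul(A, B)) == matmul(inverse(B), inverse(A)),
    "(A+B)^-1 != A^-1 + B^-1": inverse(add(A, B)) != add(inverse(A), inverse(B)),
    "(A^T)^T = A": transpose(transpose(A)) == A,
    "(A+B)^T = A^T + B^T": transpose(add(A, B)) == add(transpose(A), transpose(B)),
    "(AB)^T = B^T A^T": transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)),
}

for name, holds in checks.items():
    print(f"{name}: {holds}")
```

Every line should print `True`, including the inequality, since the check there is that the two sides differ.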

Symmetric Matrices

$$\text{If } A = A^T \text{, then } A \text{ is a symmetric matrix.}$$

$$A^T_{mn} = A_{nm} \implies m = n$$

$$\text{If } A = A^T \text{ then } A^{-1} = A^{-T}$$

and

$$A^{-T} = (A^{-1})^T = (A^T)^{-1}$$
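Here is a small sketch of what these statements look like on actual numbers. The matrices S and R and the helper names are hypothetical examples of mine; the only claim is that they behave as the formulas above say.

```python
from fractions import Fraction


def transpose(m):
    return [list(col) for col in zip(*m)]


def is_symmetric(m):
    """A matrix is symmetric when it equals its own transpose."""
    return m == transpose(m)


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(d) / det, Fraction(-b) / det],
            [Fraction(-c) / det, Fraction(a) / det]]


S = [[2, 1], [1, 3]]        # symmetric and invertible (determinant 5)
R = [[1, 2, 3], [4, 5, 6]]  # 2x3, so its transpose has a different shape
A = [[2, 4], [3, 1]]        # invertible but not symmetric

print(is_symmetric(S))               # True
print(is_symmetric(R))               # False
print(is_symmetric(inverse_2x2(S)))  # True
# (A^-1)^T equals (A^T)^-1 for any invertible A, symmetric or not
print(transpose(inverse_2x2(A)) == inverse_2x2(transpose(A)))  # True
```

The non-square matrix R can never pass the test, which is the point of the m = n condition.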

Specific Case: Multiplication by a scalar

Though these properties are the same as for multiplying with a matrix (associativity and distributivity), it's best to drill them down one more time.

Given:

$$C := \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$$

Consider (λ + ψ)C:

$$(\lambda + \psi)C = \begin{pmatrix} (\lambda + \psi) & 2(\lambda + \psi) \\ 3(\lambda + \psi) & 4(\lambda + \psi) \end{pmatrix}$$

Distribute λ and ψ:

$$= \begin{pmatrix} \lambda + \psi & 2\lambda + 2\psi \\ 3\lambda + 3\psi & 4\lambda + 4\psi \end{pmatrix}$$

Separate terms:

$$= \begin{pmatrix} \lambda & 2\lambda \\ 3\lambda & 4\lambda \end{pmatrix} + \begin{pmatrix} \psi & 2\psi \\ 3\psi & 4\psi \end{pmatrix}$$

Factor out λ and ψ:

$$= \lambda \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} + \psi \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$$

Express in terms of C:

$$= \lambda C + \psi C$$
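The derivation can be replayed with concrete numbers. In this sketch the values chosen for `lam` and `psi` and the helper names `scale` and `add` are arbitrary picks of mine; any other scalars should behave the same way.

```python
def scale(s, m):
    """Multiply every entry of matrix m by the scalar s."""
    return [[s * x for x in row] for row in m]


def add(x, y):
    """Entry-wise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


C = [[1, 2], [3, 4]]
lam, psi = 3, -5  # arbitrary example scalars

# Distributivity: (lam + psi) C == lam C + psi C
lhs = scale(lam + psi, C)
rhs = add(scale(lam, C), scale(psi, C))
print(lhs)         # [[-2, -4], [-6, -8]]
print(lhs == rhs)  # True

# Associativity: (lam * psi) C == lam (psi C)
print(scale(lam * psi, C) == scale(lam, scale(psi, C)))  # True
```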

Compact Representation of System of Linear Equations

System of Linear Equations

This section shows that we can rewrite a large system of equations in a more compact form. This takes away the burden of seeing so many numbers everywhere, so we can focus on the problem at hand, which in this case is finding x.

$$\begin{aligned} (1) \quad & 2x_1 + 3x_2 + 5x_3 = 1 \\ (2) \quad & 4x_1 - 2x_2 - 7x_3 = 8 \\ (3) \quad & 9x_1 + 5x_2 - 3x_3 = 2 \end{aligned}$$

The matrix form:

$$\begin{pmatrix} 2 & 3 & 5 \\ 4 & -2 & -7 \\ 9 & 5 & -3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 8 \\ 2 \end{pmatrix}$$

which is equal to:

$$\sum_{j=1}^{3} A_{ij} x_j = B_i \quad \text{for } i = 1, 2, 3$$
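To see the compact form at work, the sketch below stores the system as a matrix A and a vector B, and implements the sum over j directly as a matrix-vector product. The `solve` function uses Gauss-Jordan elimination, which is my own addition for illustration and is not covered in this post; all function names are hypothetical.

```python
from fractions import Fraction


def matvec(A, x):
    """Row i of the result is sum_j A[i][j] * x[j]."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination in exact arithmetic."""
    n = len(A)
    # Augmented matrix [A | b]
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a non-zero entry in this column to use as pivot
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        pivot_value = M[col][col]
        M[col] = [v / pivot_value for v in M[col]]
        # Eliminate this column from every other row
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


A = [[2, 3, 5], [4, -2, -7], [9, 5, -3]]
B = [1, 8, 2]

x = solve(A, B)
print([str(v) for v in x])
print(matvec(A, x) == B)  # True: plugging x back in reproduces the right-hand side
```

Plugging the solution back into `matvec` checks all three original equations at once, which is exactly what the compact notation buys us.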

Acknowledgement

I can't overstate this: I'm truly grateful for this book being open-sourced for everyone. Many people will be able to learn and understand machine learning on a fundamental level. Whether changing careers, demystifying AI, or just learning in general, this book offers immense value even for a fledgling composer such as myself. So, Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong, thank you for this book.

Source: Deisenroth, M. P., Faisal, A. A., & Ong, C. S. (2020). Mathematics for Machine Learning. Cambridge: Cambridge University Press. https://mml-book.com

Made with TERRA