Mathematics for Machine Learning - Day 5

Pour Terra, 12 July 2024

[Image: Machine Learning Math Meme]

Disclaimer

Today's topic will still be split between today and tomorrow. As you might tell from the meme, I don't know if I'm a genius, an idiot, or that's just how math maths. Usually it's the second one.

Today we'll discuss

  1. Elementary Transformations
  2. An example of using row-echelon form (REF) to help us find the particular and general solutions
  3. A missing connection to the previous discussion

Tomorrow, I'll write about what row-echelon form (REF) even is and how it differs from using Gaussian elimination to create an upper triangular matrix. Honestly, I'm just curious because that's what REF looks like to me...

Elementary Transformations

Basically, elementary transformations turn the matrix into a simpler form without changing the solution set of the system. We can transform it in a few different ways:

  1. Exchange of two equations (rows in the matrix representing the system of equations)
  2. Multiplication or division of an equation (row) by a nonzero constant
  3. Addition or subtraction of two equations (rows), including adding a multiple of one row to another
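
As a minimal sketch of these three operations (purely illustrative, not code from the book), here they are as small Python functions acting on a matrix stored as a list of rows. The function names `swap_rows`, `scale_row`, and `add_multiple`, and the tiny two-equation system, are made up for this example.

```python
from fractions import Fraction

def swap_rows(m, i, j):
    """Return a copy of m with rows i and j exchanged."""
    out = [row[:] for row in m]
    out[i], out[j] = out[j], out[i]
    return out

def scale_row(m, i, c):
    """Return a copy of m with row i multiplied by the nonzero constant c."""
    if c == 0:
        raise ValueError("scaling by zero would throw an equation away")
    out = [row[:] for row in m]
    out[i] = [c * v for v in out[i]]
    return out

def add_multiple(m, target, source, c=1):
    """Return a copy of m with c * row[source] added to row[target]."""
    out = [row[:] for row in m]
    out[target] = [t + c * s for t, s in zip(out[target], out[source])]
    return out

# Hypothetical toy system: x + y = 3 and 2x + 4y = 10 (solution x = 1, y = 2).
m = [[Fraction(1), Fraction(1), Fraction(3)],
     [Fraction(2), Fraction(4), Fraction(10)]]
m = add_multiple(m, 1, 0, -2)         # R2 - 2R1 -> R2, row 2 becomes 0 2 | 4
m = scale_row(m, 1, Fraction(1, 2))   # (1/2)R2 -> R2, row 2 becomes 0 1 | 2
m = add_multiple(m, 0, 1, -1)         # R1 - R2 -> R1, row 1 becomes 1 0 | 1
print([[int(v) for v in row] for row in m])
```

Each step changes how the system looks but not which (x, y) solves it, which is the whole point of elementary transformations.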

Example

System of equations

$$
\begin{aligned}
-2x_1 + 4x_2 - 2x_3 - 1x_4 + 4x_5 &= -3 \\
4x_1 - 8x_2 + 3x_3 - 3x_4 + 1x_5 &= 2 \\
1x_1 - 2x_2 + 1x_3 - 1x_4 + 1x_5 &= 0 \\
1x_1 - 2x_2 + 0x_3 - 3x_4 + 4x_5 &= a
\end{aligned}
$$

Matrix-vector multiplication form

Let's separate the coefficients, the variables, and the constants!

$$
\begin{pmatrix}
-2 & 4 & -2 & -1 & 4 \\
4 & -8 & 3 & -3 & 1 \\
1 & -2 & 1 & -1 & 1 \\
1 & -2 & 0 & -3 & 4
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix}
=
\begin{pmatrix} -3 \\ 2 \\ 0 \\ a \end{pmatrix}
$$

Augmented matrix form

Finally, a new UI design for the matrix. It looks more elegant this way.

$$
\left[ \begin{array}{ccccc|c}
-2 & 4 & -2 & -1 & 4 & -3 \\
4 & -8 & 3 & -3 & 1 & 2 \\
1 & -2 & 1 & -1 & 1 & 0 \\
1 & -2 & 0 & -3 & 4 & a
\end{array} \right]
$$
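
In code, the augmented matrix is just the coefficient matrix with the right-hand side glued on as an extra column. A small sketch in plain Python, where the helper name `augmented` is hypothetical and the unknown a is passed in as an ordinary number:

```python
# Coefficient matrix A of the system above, one inner list per equation.
A = [
    [-2,  4, -2, -1, 4],
    [ 4, -8,  3, -3, 1],
    [ 1, -2,  1, -1, 1],
    [ 1, -2,  0, -3, 4],
]

def augmented(a):
    """Append the right-hand side (-3, 2, 0, a) to A as a sixth column."""
    b = [-3, 2, 0, a]
    return [row + [rhs] for row, rhs in zip(A, b)]

# a = 0 is an arbitrary sample value, only there to have something to print.
for row in augmented(a=0):
    print(row)
```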

Now what?

Now we'll use elementary transformations to bring the augmented matrix into REF!

Before we continue, remember! R stands for row. The notation "expression -> Ry" means the result of the expression on the left is written into row Ry, and "Rx <-> Ry" means the two rows are swapped. Now let's continue.

Swap R1 and R3

$$
\overset{R_1 \leftrightarrow R_3}{
\left[ \begin{array}{ccccc|c}
1 & -2 & 1 & -1 & 1 & 0 \\
4 & -8 & 3 & -3 & 1 & 2 \\
-2 & 4 & -2 & -1 & 4 & -3 \\
1 & -2 & 0 & -3 & 4 & a
\end{array} \right]}
$$

Add multiples of R1 to the other rows

$$
\overset{\begin{array}{l} R_2 - 4R_1 \rightarrow R_2 \\ R_3 + 2R_1 \rightarrow R_3 \\ R_4 - R_1 \rightarrow R_4 \end{array}}{
\left[ \begin{array}{ccccc|c}
1 & -2 & 1 & -1 & 1 & 0 \\
0 & 0 & -1 & 1 & -3 & 2 \\
0 & 0 & 0 & -3 & 6 & -3 \\
0 & 0 & -1 & -2 & 3 & a
\end{array} \right]}
$$

Subtract R2 and R3 from R4

$$
\overset{R_4 - R_2 - R_3 \rightarrow R_4}{
\left[ \begin{array}{ccccc|c}
1 & -2 & 1 & -1 & 1 & 0 \\
0 & 0 & -1 & 1 & -3 & 2 \\
0 & 0 & 0 & -3 & 6 & -3 \\
0 & 0 & 0 & 0 & 0 & a - 2 + 3
\end{array} \right]}
$$

Scaling

$$
\overset{\begin{array}{l} -R_2 \rightarrow R_2 \\ -\frac{1}{3}R_3 \rightarrow R_3 \end{array}}{
\left[ \begin{array}{ccccc|c}
1 & -2 & 1 & -1 & 1 & 0 \\
0 & 0 & 1 & -1 & 3 & -2 \\
0 & 0 & 0 & 1 & -2 & 1 \\
0 & 0 & 0 & 0 & 0 & a + 1
\end{array} \right]}
$$
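
To double-check the arithmetic, here is a sketch that replays the same row operations in Python with exact fractions. Since a is unknown, the right-hand side is stored in two columns: the constant part and the coefficient of a. That storage trick and the helper names are assumptions of this sketch, not something taken from the book.

```python
from fractions import Fraction as F

# Each row: five coefficients, then the right-hand side as (constant, coefficient of a).
# So -3 is stored as (-3, 0) and the unknown a is stored as (0, 1).
M = [
    [-2,  4, -2, -1, 4, -3, 0],
    [ 4, -8,  3, -3, 1,  2, 0],
    [ 1, -2,  1, -1, 1,  0, 0],
    [ 1, -2,  0, -3, 4,  0, 1],
]
M = [[F(v) for v in row] for row in M]

def add_multiple(m, target, source, c):
    """In place: row[target] += c * row[source]."""
    m[target] = [t + c * s for t, s in zip(m[target], m[source])]

def scale(m, i, c):
    """In place: row[i] *= c, with c nonzero."""
    m[i] = [c * v for v in m[i]]

M[0], M[2] = M[2], M[0]        # R1 <-> R3
add_multiple(M, 1, 0, -4)      # R2 - 4R1 -> R2
add_multiple(M, 2, 0, 2)       # R3 + 2R1 -> R3
add_multiple(M, 3, 0, -1)      # R4 - R1  -> R4
add_multiple(M, 3, 1, -1)      # R4 - R2  -> R4
add_multiple(M, 3, 2, -1)      # R4 - R3  -> R4
scale(M, 1, -1)                # -R2      -> R2
scale(M, 2, F(-1, 3))          # -(1/3)R3 -> R3

for row in M:
    print([str(v) for v in row])
```

The last row comes out as all zeros on the left with (1, 1) on the right, meaning 1 + 1·a, which matches the a + 1 in the matrix above.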

Okay Terra, you made me read all of this complex mumbo jumbo, what do I do now?

First off, you need to relax. Secondly, the last row says 0 = a + 1, so the system only has a solution when a = -1!
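
Spelled out as equations (a sketch of the reasoning, read straight off the last matrix), the four rows say:

$$
\begin{aligned}
x_1 - 2x_2 + x_3 - x_4 + x_5 &= 0 \\
x_3 - x_4 + 3x_5 &= -2 \\
x_4 - 2x_5 &= 1 \\
0 &= a + 1
\end{aligned}
$$

The last equation has no unknowns left in it, so it can only hold when a = -1. For any other value of a, the system has no solution at all.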

But... I need to confess, the authors didn't provide a step-by-step solution for their answers :( But I did create one for my solution :D so you be the judge of whether I'm correct by working through it yourselves as well!

Particular solution

Remember how in the previous post we used a kind of identity matrix? Well, there isn't one here, and since the pages I'm reading only give the solution without the steps, I'll show you how I did it :D

Find x such that Ax = B, where A is the coefficient part of the REF above and B is its right-hand side with a = -1:

$$
B = \begin{pmatrix} 0 \\ -2 \\ 1 \\ 0 \end{pmatrix}
$$

and assume that (careful: this a is a new unknown, not the a from the system above):

$$
x_1 = a, \quad x_2 = b, \quad x_3 = c, \quad x_4 = d, \quad x_5 = e
$$

We can write Ax = B as a combination of the columns of A, where each unknown multiplies its own column:

$$
\begin{pmatrix} 0 \\ -2 \\ 1 \\ 0 \end{pmatrix}
= a \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}
+ b \begin{pmatrix} -2 \\ 0 \\ 0 \\ 0 \end{pmatrix}
+ c \begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \end{pmatrix}
+ \underbrace{d \begin{pmatrix} -1 \\ -1 \\ 1 \\ 0 \end{pmatrix}
+ e \begin{pmatrix} 1 \\ 3 \\ -2 \\ 0 \end{pmatrix}}_{\text{Let's focus on this part}}
$$

Let's assume that d and e are equal to 1 and add both of these to see what we get:

$$
1 \begin{pmatrix} -1 \\ -1 \\ 1 \\ 0 \end{pmatrix}
+ 1 \begin{pmatrix} 1 \\ 3 \\ -2 \\ 0 \end{pmatrix}
= \begin{pmatrix} 0 \\ 2 \\ -1 \\ 0 \end{pmatrix}
$$

Notice that the sum is exactly the negative of the vector we're looking for?

$$
\begin{pmatrix} 0 \\ 2 \\ -1 \\ 0 \end{pmatrix}
= -1 \begin{pmatrix} 0 \\ -2 \\ 1 \\ 0 \end{pmatrix}
$$

Finding d and e

Assume (here x and y are just names for columns 4 and 5 of the REF, not the solution vector):

$$
x = \begin{pmatrix} -1 \\ -1 \\ 1 \\ 0 \end{pmatrix}
$$

and

$$
y = \begin{pmatrix} 1 \\ 3 \\ -2 \\ 0 \end{pmatrix}
$$

Given:

$$
\begin{aligned}
dx + ey &= -(x + y) \\
dx + x + ey + y &= 0 \\
x(d+1) + y(e+1) &= 0
\end{aligned}
$$

Since x and y are linearly independent (neither one is a multiple of the other), the only way to fulfill this equation is to have both (d+1) = 0 and (e+1) = 0.

So we get both:

$$
d = -1 \quad \text{and} \quad e = -1
$$

Particular Solution Result

Great! Now we can conclude that, with a = b = c = 0, the particular solution is:

$$
x \text{ (My answer)} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ -1 \\ -1 \end{pmatrix}
$$

Though... there's a small caveat. This isn't the author's answer, it's mine. I tried finding their method in the text, and maybe I skipped a page or two, but I think it only comes after going in-depth on row-echelon form, so I'm not sure which method they used to solve it.

$$
x \text{ (Author's answer)} = \begin{pmatrix} 2 \\ 0 \\ -1 \\ 1 \\ 0 \end{pmatrix}
$$
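
Both vectors can be tested directly against the original system. A minimal sketch in plain Python, assuming a = -1 and using a hypothetical helper `matvec`:

```python
# Original coefficient matrix and right-hand side, with a = -1 plugged in.
A = [
    [-2,  4, -2, -1, 4],
    [ 4, -8,  3, -3, 1],
    [ 1, -2,  1, -1, 1],
    [ 1, -2,  0, -3, 4],
]
b = [-3, 2, 0, -1]

def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(c * v for c, v in zip(row, x)) for row in A]

mine = [0, 0, 0, -1, -1]
book = [2, 0, -1, 1, 0]

print(matvec(A, mine) == b)   # True
print(matvec(A, book) == b)   # True

# The difference of two particular solutions should solve Ax = 0.
diff = [p - q for p, q in zip(book, mine)]
print(diff, matvec(A, diff))  # [2, 0, -1, 2, 1] [0, 0, 0, 0]
```

Both checks pass, which fits with the fact that a particular solution is not unique: any two of them differ by a solution of Ax = 0.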

General Solutions

I won't go into depth on this one, considering my answer for Ax = 0 is also different. For more context, I'm using the same method as yesterday, with the lambdas coming from column 4 and column 5 (since those two columns contain the most non-zero entries), and the corresponding entry becomes -1.

I'm really not sure if this is just math being math or I'm doing something wrong!

My answer
$$
\left\{ x \in \mathbb{R}^5 \mid x =
\begin{pmatrix} 0 \\ 0 \\ 0 \\ -1 \\ -1 \end{pmatrix}
+ \lambda_1 \begin{pmatrix} 1 \\ 0 \\ -\frac{5}{2} \\ 0 \\ \frac{1}{2} \end{pmatrix}
+ \lambda_2 \begin{pmatrix} -2 \\ 0 \\ 5 \\ 2 \\ -1 \end{pmatrix},
\; \lambda_1, \lambda_2 \in \mathbb{R} \right\}
$$

Author's answer
$$
\left\{ x \in \mathbb{R}^5 \mid x =
\begin{pmatrix} 2 \\ 0 \\ -1 \\ 1 \\ 0 \end{pmatrix}
+ \lambda_1 \begin{pmatrix} 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}
+ \lambda_2 \begin{pmatrix} 2 \\ 0 \\ -1 \\ 2 \\ 1 \end{pmatrix},
\; \lambda_1, \lambda_2 \in \mathbb{R} \right\}
$$
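
One way to test a general solution is to check that every direction vector v (the vectors multiplied by the lambdas) satisfies Av = 0 for the original coefficient matrix. The sketch below does that in plain Python with exact fractions; `matvec`, `candidates`, and the labels are illustrative names only.

```python
from fractions import Fraction as F

A = [
    [-2,  4, -2, -1, 4],
    [ 4, -8,  3, -3, 1],
    [ 1, -2,  1, -1, 1],
    [ 1, -2,  0, -3, 4],
]

def matvec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(c * x for c, x in zip(row, v)) for row in A]

# Direction vectors copied from the two answers above.
candidates = {
    "my lambda_1":   [1, 0, F(-5, 2), 0, F(1, 2)],
    "my lambda_2":   [-2, 0, 5, 2, -1],
    "book lambda_1": [2, 1, 0, 0, 0],
    "book lambda_2": [2, 0, -1, 2, 1],
}

results = {name: matvec(A, v) for name, v in candidates.items()}
for name, Av in results.items():
    in_null_space = all(c == 0 for c in Av)
    print(name, [str(c) for c in Av], "solves Ax = 0:", in_null_space)
```

With these inputs, the two direction vectors from the book give the zero vector, while the two listed under "My answer" do not. So the mismatch seems to sit in the direction vectors rather than in the particular solution.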

Ending note:

That's the end. I spent too much time racking my brain over the answer, so I might as well post it, since someone might be able to tell me what's going on here.


Acknowledgement

I can't overstate this: I'm truly grateful for this book being open-sourced for everyone. Many people will be able to learn and understand machine learning on a fundamental level. Whether changing careers, demystifying AI, or just learning in general, this book offers immense value even for a fledgling composer such as myself. So, Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong, thank you for this book.

Source: Deisenroth, M. P., Faisal, A. A., & Ong, C. S. (2020). Mathematics for Machine Learning. Cambridge: Cambridge University Press. https://mml-book.com

Made with TERRA