Mathematics for Machine Learning - Day 16

Pour Terra, 24 July 2024

Matrix transformation meme

Another day, another question at the bottom.

As always, if there's a question, that means I stumbled upon something different. This time, however, it isn't the solution itself, but the way to get there. Since I've read to the end of the section, this might be a solution I'll use until proven otherwise.

I'll break down the steps with so much scrutiny that even my mental state will join the fun... the mental-breakdown type of fun :D. So, let's begin.

Transformation matrix

The transformation matrix is much like the last topic, but where the previous day was about vectors, this one is about matrices.

So why do we need this if it's just going from vectors to matrices?

I'll explain the method that works in the book's example, and then I'll explain what's really going on.

Explanation from the book

Consider two vector spaces, each with a corresponding ordered basis.

$$V \text{ a vector space with ordered basis } B = (b_1, \dots, b_n)$$

$$W \text{ a vector space with ordered basis } C = (c_1, \dots, c_m)$$

We're also considering a linear mapping

$$\Phi : V \to W \quad \text{for } j \in \{1, \dots, n\}$$

Which will result in the following unique representation

$$\Phi(b_j) = \alpha_{1j} c_1 + \dots + \alpha_{mj} c_m = \sum_{i=1}^{m} \alpha_{ij} c_i$$

As I've said, this is the unique representation with respect to C. We then call the m×n matrix whose elements are given by

$$A_{\Phi}(i, j) = \alpha_{ij}$$

the transformation matrix of Φ with respect to the ordered bases B of V and C of W.

Example

Consider a homomorphism Φ : V → W with ordered bases B of V and C of W.

$$\Phi : V \to W, \quad B = (b_1, b_2, b_3), \quad C = (c_1, c_2, c_3, c_4)$$
With
$$\Phi(b_1) = c_1 - c_2 + 3c_3 - c_4$$
$$\Phi(b_2) = 2c_1 + c_2 + 7c_3 + 2c_4$$
$$\Phi(b_3) = 3c_2 + c_3 + 4c_4$$
The transformation matrix with respect to B and C satisfies the following
$$\Phi(b_k) = \sum_{i=1}^{4} \alpha_{ik} c_i \quad \text{for } k = 1, 2, 3$$
and is given as :
$$A_\Phi = [\alpha_1, \alpha_2, \alpha_3] = \begin{bmatrix} 1 & 2 & 0 \\ -1 & 1 & 3 \\ 3 & 7 & 1 \\ -1 & 2 & 4 \end{bmatrix} \text{ where } j = 1, 2, 3$$
You get all that?

Honestly, I sure don't. So let's go into even more depth than the book intended.

So in order to build the transformation matrix, we need to express the results we want in terms of the target basis, which is why the coefficients are with respect to C instead of B.
$$\Phi(b_1) = c_1 - c_2 + 3c_3 - c_4$$
$$\Phi(b_2) = 2c_1 + c_2 + 7c_3 + 2c_4$$
$$\Phi(b_3) = 3c_2 + c_3 + 4c_4$$
If we lay them out in a transposed, vector-like way, it'll look something like this:
$$[\Phi(b_1), \Phi(b_2), \Phi(b_3)] = \begin{bmatrix} c_1 & 2c_1 & 0 \\ -c_2 & c_2 & 3c_2 \\ 3c_3 & 7c_3 & c_3 \\ -c_4 & 2c_4 & 4c_4 \end{bmatrix}$$

Then just remove the basis vectors and you have the same matrix as before.
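The column-stacking step above can be sketched in numpy (a minimal check of my own, not from the book): read the coefficients off the three equations, then stack each coordinate list as a column.

```python
import numpy as np

# Coordinates of each Phi(b_j) with respect to C = (c_1, ..., c_4),
# read straight off the three equations above.
phi_b1 = [1, -1, 3, -1]   # Phi(b_1) =  c1 -  c2 + 3c3 -  c4
phi_b2 = [2,  1, 7,  2]   # Phi(b_2) = 2c1 +  c2 + 7c3 + 2c4
phi_b3 = [0,  3, 1,  4]   # Phi(b_3) =       3c2 +  c3 + 4c4

# "Removing the basis vectors" and stacking each coefficient list
# as a COLUMN gives the transformation matrix A_Phi.
A_phi = np.column_stack([phi_b1, phi_b2, phi_b3])
print(A_phi.shape)   # (4, 3): m = 4 basis vectors in C, n = 3 in B
```

The shape makes the m×n convention concrete: one column per source basis vector, one row per target basis vector.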

But Terra, I get how it's made now, but how do you use it?

So say the coordinate vector we're trying to get is W. Given there are three variables, we can write it like this. Remember, A_Φ is the transformation matrix and V is the coordinate vector beforehand.

$$W = A_{\Phi} V$$
Then we expand the formula
$$W = \begin{bmatrix} 1 & 2 & 0 \\ -1 & 1 & 3 \\ 3 & 7 & 1 \\ -1 & 2 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$$
And that's it, just multiply the coordinate vector by the transformation matrix and you're all set!
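Here's a small numpy sketch of that multiplication (my own sanity check, with coefficients taken from the example's equations). Note the fourth row's first entry is -1, coming from the -c₄ in Φ(b₁).

```python
import numpy as np

# The transformation matrix built from the example's equations.
A_phi = np.array([
    [ 1, 2, 0],
    [-1, 1, 3],
    [ 3, 7, 1],
    [-1, 2, 4],
])

# Coordinates of some v in V with respect to B = (b_1, b_2, b_3).
x = np.array([1, 0, 0])

# Coordinates of Phi(v) with respect to C: a plain matrix-vector product.
w = A_phi @ x
print(w)   # [ 1 -1  3 -1]
```

Choosing x = (1, 0, 0) picks out the first column, i.e. the C-coordinates of Φ(b₁), which is a handy way to check the matrix was built correctly.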

A question.

I haven't found the answer and probably never will, considering it took me way longer than I imagined. So I'll just leave it here :D

The equation for finding the transformation matrix:

$$\Phi(b_k) = \sum_{i=1}^{4} \alpha_{ik} c_i \quad \text{for } k = 1, 2, 3$$

The solution I thought of and that hasn't been disproven (a.k.a. not scientifically proven):

Just... transpose it.

Here's the formula

$$\Phi(b_1) = c_1 - c_2 + 3c_3 - c_4$$
$$\Phi(b_2) = 2c_1 + c_2 + 7c_3 + 2c_4$$
$$\Phi(b_3) = 3c_2 + c_3 + 4c_4$$

Split it up and remove the basis vectors:

$$\begin{bmatrix} 1 & -1 & 3 & -1 \\ 2 & 1 & 7 & 2 \\ 0 & 3 & 1 & 4 \end{bmatrix}$$

Notice that each row here corresponds to a column of the result from before?

$$\begin{bmatrix} 1 & -1 & 3 & -1 \\ 2 & 1 & 7 & 2 \\ 0 & 3 & 1 & 4 \end{bmatrix}^T = \begin{bmatrix} 1 & 2 & 0 \\ -1 & 1 & 3 \\ 3 & 7 & 1 \\ -1 & 2 & 4 \end{bmatrix}$$

So change the rows into columns and vice versa!

$$\begin{bmatrix} 1 & 2 & 0 \\ -1 & 1 & 3 \\ 3 & 7 & 1 \\ -1 & 2 & 4 \end{bmatrix}$$
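The transpose idea can be checked in numpy too (again just my own sketch; note the (4,1) entry is -1, from the -c₄ in Φ(b₁)): write one row per equation, then transpose so each equation's coefficients become a column, which is exactly what the sum formula builds column by column.

```python
import numpy as np

# One ROW per equation: the coefficients of Phi(b_1), Phi(b_2), Phi(b_3).
rows = np.array([
    [1, -1, 3, -1],
    [2,  1, 7,  2],
    [0,  3, 1,  4],
])

# Transposing turns each equation's row into a column of A_Phi.
A_phi = rows.T

# The transformation matrix we expect from the example.
expected = np.array([
    [ 1, 2, 0],
    [-1, 1, 3],
    [ 3, 7, 1],
    [-1, 2, 4],
])
print((A_phi == expected).all())   # True
```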

Seriously, what's math?? What am I doing wrong? I know the people from Cambridge who wrote this are much smarter than I am, so they must have realized someone would think this is just a transposed version. So why :(


Acknowledgement

I can't overstate this: I'm truly grateful that this book is open-sourced for everyone. Many people will be able to learn and understand machine learning on a fundamental level. Whether you're changing careers, demystifying AI, or just learning in general, this book offers immense value, even for a fledgling composer such as myself. So, Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong, thank you for this book.

Sources:
Axler, Sheldon. (2015). Linear Algebra Done Right. Springer.
Deisenroth, M. P., Faisal, A. A., & Ong, C. S. (2020). Mathematics for Machine Learning. Cambridge University Press. https://mml-book.com

Made with TERRA