CS5062 Machine Learning: Practical 01-vectors,Matrix

CS5062 Machine Learning: Practical 01

In this practical, we will work on defining vectors and matrices in Python, vector and matrix calculation, inverse matrices, and Singular Value Decomposition (SVD).

Please read over the whole notebook. It contains several exercises.

Exercise 1: Create row vectors and column vectors in Python using the numpy library

Firstly, we need to create row vectors
$$a = (1, 0, 2)$$
$$b = (3, 2, 1)$$

Secondly, we need to create column vectors
$$a^T = \left( \begin{array}{c} 1 \\ 0 \\ 2 \end{array} \right)$$

$$b^T = \left( \begin{array}{c} 3 \\ 2 \\ 1 \end{array} \right)$$

Create row vectors a = [1, 0, 2] and b = [3, 2, 1]

import numpy as np
#"Put your code here"
a=np.array([1,0,2])
b=np.array([3,2,1])
print("row vectors:")
print(a)
print(b)

Create column vectors of a and b

#"Put your code here"
aT=np.array([[1],
            [0],
            [2]])

bT=np.array([[3],
            [2],
            [1]])
print("column vectors:")
print(aT)
print(bT)

Exercise 2: Show the dimensions of vectors $a$, $a^T$, $b$, and $b^T$

A row vector has only one dimension in Python (it is not directly treated as a matrix). We can see this when calling shape: the second dimension is left empty, as in (3,).

Show the dimensions of vectors a, a^T, b, and b^T

# "Put your code here"
print(a.shape)
print(aT.shape)
print(b.shape)
print(bT.shape)
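
To illustrate the remark about one-dimensional shapes, here is a minimal sketch (not part of the original practical) of one way to turn a 1-D array into an explicit column or row matrix. The names `same`, `a_col`, and `a_row` are introduced here for illustration only.

```python
import numpy as np

a = np.array([1, 0, 2])          # 1-D array, shape (3,)

# Transposing a 1-D array changes nothing, because there is no second axis to swap.
same = a.T

# reshape(-1, 1) builds an explicit 3x1 column; reshape(1, -1) builds a 1x3 row.
a_col = a.reshape(-1, 1)
a_row = a.reshape(1, -1)

print(a.shape, same.shape)       # (3,) (3,)
print(a_col.shape, a_row.shape)  # (3, 1) (1, 3)
```

This is why the column vectors in Exercise 1 are written with nested brackets: the inner brackets create the second axis.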

Exercise 3: Add vectors and compute the inner product.

Hint: For the inner product, you need to use the dot function

Add Vector a and b, which are obtained in Exercise 1

#"Put your code here"
c= a+b
print(c)

Compute the inner product of a and b

#"Put your code here"
d=a.dot(b)   # in NumPy, dot computes the dot product of vectors or matrices
print(d)     # d = (1⋅3) + (0⋅2) + (2⋅1) = 3 + 0 + 2 = 5
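
As a quick check of the arithmetic, the sketch below computes the inner product of the Exercise 1 vectors element by element and compares it with three NumPy spellings that are equivalent for 1-D arrays. The variable names are illustrative.

```python
import numpy as np

a = np.array([1, 0, 2])
b = np.array([3, 2, 1])

# Inner product written out element by element: 1*3 + 0*2 + 2*1
by_hand = sum(x * y for x, y in zip(a, b))

# Three equivalent NumPy spellings for 1-D arrays
d_dot = a.dot(b)
d_at = a @ b
d_inner = np.inner(a, b)

print(by_hand, d_dot, d_at, d_inner)  # 5 5 5 5
```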

Exercise 4: Create matrices in Python using the numpy library.

Create Matrix A and B:

$$A = \left( \begin{array}{ccc} 2 & 1 & 3 \\ 1 & 1 & 2 \end{array} \right)$$

$$B = \left( \begin{array}{cc} 2 & 1 \\ 1 & 2 \\ 5 & 2 \end{array} \right)$$

Then, print the dimensions of Matrix A and B.

Create Matrix A and B

# "Put your code here"
A=np.array([[2,1,3],[1,1,2]])  # a matrix is a nested list: one inner [] per row
B=np.array([[2,1],
            [1,2],
            [5,2]])

Print the dimensions of Matrix A and B

# "Put your code here"
A.shape,B.shape

Exercise 5: Transpose, Add and Multiply matrices.

Let's transpose Matrix $B$ to Matrix $B^T$.

Transpose Matrix B

# "Put your code here"
BT=B.transpose()
print(BT)

Let's add Matrix $A$ and Matrix $B^T$ to derive Matrix $C$. That is,

$$C = A + B^T$$

Show Matrix $C$.

matrix addition A + B^T

# "Put your code here"
C=A+BT
print(C)
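
The transpose is needed because matrix addition is element-wise and requires matching shapes. The following sketch, using the Exercise 4 matrices, shows that adding $A$ and $B$ directly raises an error, while adding $A$ and $B^T$ works. The flag `addition_failed` is a name made up for this illustration.

```python
import numpy as np

A = np.array([[2, 1, 3], [1, 1, 2]])    # shape (2, 3)
B = np.array([[2, 1], [1, 2], [5, 2]])  # shape (3, 2)

# A + B fails: shapes (2, 3) and (3, 2) are neither equal nor broadcastable.
try:
    A + B
    addition_failed = False
except ValueError:
    addition_failed = True

# After transposing B the shapes agree and the sum is element-wise.
C = A + B.T
print(addition_failed)  # True
print(C)
```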

Let's multiply $C$ by Matrix $A^T$ to derive Matrix $D$. That is,

$$D = C \times A^T$$

Show Matrix $D$.

matrix product C * A^T

# "Put your code here"
AT=A.transpose()
print(AT)
D=C.dot(AT) 
print(D)
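
A common source of confusion is that `*` on NumPy arrays is the element-wise product, not the matrix product. The sketch below contrasts the two, with $C$ entered by hand as the value of $A + B^T$ for the Exercise 4 matrices. The name `elementwise` is illustrative.

```python
import numpy as np

A = np.array([[2, 1, 3], [1, 1, 2]])   # shape (2, 3)
C = np.array([[4, 2, 8], [2, 3, 4]])   # shape (2, 3), the value of A + B^T

# Matrix product: (2, 3) times (3, 2) gives (2, 2).
D = C @ A.T            # same result as C.dot(A.T)

# Element-wise product: both operands must have the same shape, result is (2, 3).
elementwise = C * A

print(D)
print(elementwise)
```

The inner dimensions must agree for the matrix product, which is why $C$ is multiplied by $A^T$ rather than by $A$.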

Let's multiply matrices with vectors.

Firstly, we multiply Matrix $A$ by Vector $a$ to derive Matrix $E$. That is, $E = A \times a^T$

Secondly, we multiply Vector $a$ by Matrix $B$ to derive Matrix $F$. That is, $F = a \times B$

Thirdly, show Matrices $E$ and $F$

this corresponds to A * a^T

# "Put your code here" 
E=A.dot(aT)

this corresponds to a * B

#"Put your code here" 
F=a.dot(B)

Print Matrices E, F

# "Put your code here"
print(a)
print(B)
print(E)
print(F)
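
The shapes of $E$ and $F$ differ because one product uses the 2-D column `aT` and the other uses the 1-D array `a`. A small sketch with the Exercise 1 and Exercise 4 values makes this visible; `E_flat` is a name added for illustration.

```python
import numpy as np

A = np.array([[2, 1, 3], [1, 1, 2]])    # shape (2, 3)
B = np.array([[2, 1], [1, 2], [5, 2]])  # shape (3, 2)
a = np.array([1, 0, 2])                 # 1-D, shape (3,)
aT = a.reshape(-1, 1)                   # explicit column, shape (3, 1)

E = A.dot(aT)       # 2-D operand in, 2-D result out: shape (2, 1)
F = a.dot(B)        # 1-D operand in, 1-D result out: shape (2,)
E_flat = A.dot(a)   # same numbers as E, but as a 1-D array

print(E.shape, F.shape, E_flat.shape)  # (2, 1) (2,) (2,)
```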

Exercise 6: The inverse of a matrix

Firstly, compute Matrix $AA$, where
$$AA = A \times A^T$$

Secondly, compute the inverse of Matrix $AA$

Hint: you can use the linalg module of numpy to compute the inverse of a matrix

import numpy.linalg as linalg

A * A^T ... we can only invert square matrices

# "Put your code here"
AA=A.dot(A.transpose())

the inverse of Matrix AA

# "Put your code here"
AAinv=linalg.inv(AA)
AA,AAinv

Thirdly, multiplying Matrix $AA$ by its inverse $(AA)^{-1}$ must result in the identity matrix. That is,
$$AA \times (AA)^{-1} = I$$

Fourthly, print your results.

AA * AA^(-1)

# "Put your code here"
AA.dot(AAinv),AAinv.dot(AA)
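
Because of floating-point rounding, the printed product is usually only approximately the identity. One way to check it, sketched here with the Exercise 4 matrix $A$, is to compare against `np.eye` within a tolerance using `np.allclose`. The names `product` and `is_identity` are illustrative.

```python
import numpy as np

A = np.array([[2, 1, 3], [1, 1, 2]])
AA = A @ A.T                 # 2x2, so it can be inverted if it is non-singular
AAinv = np.linalg.inv(AA)

product = AA @ AAinv
# Exact equality (==) can fail because of rounding; allclose compares within a tolerance.
is_identity = np.allclose(product, np.eye(2))
print(is_identity)  # True
```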

Note: Computing the inverse of a matrix is tricky, and it is hard to get a numerically accurate result. Whenever we need the product of an inverse with a vector or another matrix, such as $(\boldsymbol{AA})^{-1}\boldsymbol{h}$, it is better to use methods tailored to that task, which are numerically more stable.

import numpy.linalg as linalg
# compute AA^-1*h in a more stable way using linalg.solve.
h = np.array([1, 2])
out1 = linalg.solve(AA, h)

out1
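
The sketch below compares the two routes on the same small system, with $AA$ typed in by hand as the value of $A \times A^T$ for the Exercise 4 matrix. On a well-behaved 2x2 matrix both give the same answer; the advantage of `solve` shows up on larger or ill-conditioned systems, which this example does not attempt to demonstrate. The names `x_solve`, `x_inv`, and `residual` are illustrative.

```python
import numpy as np

AA = np.array([[14.0, 9.0], [9.0, 6.0]])   # A * A^T for the A of Exercise 4
h = np.array([1.0, 2.0])

x_solve = np.linalg.solve(AA, h)           # solves AA x = h directly
x_inv = np.linalg.inv(AA).dot(h)           # forms the inverse first, then multiplies

# Both routes agree here; the residual AA x - h shows how well x satisfies the system.
residual = AA.dot(x_solve) - h
print(x_solve)
print(np.allclose(x_solve, x_inv))         # True
```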

Exercise 7: More exercises for vector and matrix calculations

Compute:

$(\boldsymbol{A} \boldsymbol{a} - \boldsymbol{b})^T(\boldsymbol{A} \boldsymbol{a} - \boldsymbol{b})$,
$(\boldsymbol{C} \boldsymbol{b})^T\boldsymbol{C}$,
$(\boldsymbol{C}^T \boldsymbol{C})^{-1}\boldsymbol{C}^T \boldsymbol{a}$,

where
$$A = \left( \begin{array}{ccc} 1 & 0 & 1 \\ 2 & 3 & 1 \end{array} \right)$$

$$C = \left( \begin{array}{cc} 1 & 0 \\ 2 & 3 \\ 1 & 5 \end{array} \right)$$

$$a = \left( \begin{array}{c} 1 \\ 2 \\ 1 \end{array} \right)$$

$$b = \left( \begin{array}{c} 2 \\ 2 \end{array} \right)$$

Check your result also in terms of the dimensionalities of the resulting matrices. That is an easy way of spotting an error. Always use the linalg.solve method instead of the linalg.inv method if possible.

Create Matrix A, C

# "Put your code here"
A=np.array([[1,0,1],[2,3,1]])
C=np.array([[1,0],[2,3],[1,5]])

# Create Vector a, b
# "Put your code here"
a=np.array([[1],[2],[1]])
b=np.array([[2],[2]])

# Calculate (Aa-b)^T * (Aa-b)
# "Put your code here"
temp=(A.dot(a) -b)
sol1=temp.transpose().dot(temp)

# Calculate (Cb)^T * C
# "Put your code here"
temp1=C.dot(b).transpose()
sol2=temp1.dot(C)

Calculate (C^T * C)^(-1) * C^T * a

# "Put your code here"
temp2=C.transpose()
#sol3=linalg.inv(temp2.dot(C)).dot(temp2.dot(a))
sol3=linalg.solve(temp2.dot(C),temp2.dot(a))

Print results of three above equations

# "Put your code here"

sol1,sol2,sol3
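
The dimension check suggested in this exercise can be sketched as follows, with the expected shape of each intermediate product written next to it. The last lines add a cross-check that is not part of the original practical: for a matrix $C$ with full column rank, $(C^T C)^{-1} C^T a$ is the least-squares solution of $Cx = a$, so `np.linalg.lstsq` should agree with it. The names `r` and `x_lstsq` are illustrative.

```python
import numpy as np

A = np.array([[1, 0, 1], [2, 3, 1]])      # (2, 3)
C = np.array([[1, 0], [2, 3], [1, 5]])    # (3, 2)
a = np.array([[1], [2], [1]])             # (3, 1)
b = np.array([[2], [2]])                  # (2, 1)

r = A @ a - b                              # (2, 1)
sol1 = r.T @ r                             # (1, 2)(2, 1) -> (1, 1)
sol2 = (C @ b).T @ C                       # (1, 3)(3, 2) -> (1, 2)
sol3 = np.linalg.solve(C.T @ C, C.T @ a)   # (2, 2) system  -> (2, 1)

# (C^T C)^{-1} C^T a is also the least-squares solution of C x = a.
x_lstsq, *_ = np.linalg.lstsq(C, a, rcond=None)

print(sol1.shape, sol2.shape, sol3.shape)  # (1, 1) (1, 2) (2, 1)
print(np.allclose(sol3, x_lstsq))          # True
```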

Exercise 8: Singular Value Decomposition (SVD)

Conduct SVD on Matrix $G$ and obtain Matrices U, D, and V, where

$$G = \left( \begin{array}{cc} 7 & 2 \\ 3 & 4 \\ 5 & 3 \end{array} \right)$$

Hint: The singular value decomposition can be done with the linalg.svd() function from NumPy

Create Matrix G

# "Put your code here"
G=np.array([[7,2],[3,4],[5,3]])

Compute Matrices U, D and V

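# "Put your code here"
# linalg.svd returns the singular values D as a 1-D array and the third factor already transposed (V holds V^T)
U, D, V=linalg.svd(G)
U,D,V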

Show the dimensions of Matrices U, D, and V

The shape of Matrix U, D, and V

# "Put your code here"
U.shape,D.shape,V.shape
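
To confirm that the three factors really reproduce $G$, they can be multiplied back together. The sketch below does this in two ways: with the full SVD, where the singular values have to be placed into a 3x2 matrix first, and with the reduced SVD obtained through `full_matrices=False`. The names `Vt`, `Sigma`, `G_full`, and `G_reduced` are chosen for this illustration.

```python
import numpy as np

G = np.array([[7, 2], [3, 4], [5, 3]])           # shape (3, 2)

# Full SVD: U is 3x3, D holds the 2 singular values, Vt is 2x2 (already transposed).
U, D, Vt = np.linalg.svd(G)

# To multiply the factors back together, place D on the diagonal of a 3x2 matrix.
Sigma = np.zeros(G.shape)
Sigma[:len(D), :len(D)] = np.diag(D)
G_full = U @ Sigma @ Vt

# The reduced ("economy") SVD drops the unused columns of U, so no padding is needed.
Ur, Dr, Vtr = np.linalg.svd(G, full_matrices=False)
G_reduced = Ur @ np.diag(Dr) @ Vtr

print(U.shape, D.shape, Vt.shape)    # (3, 3) (2,) (2, 2)
print(Ur.shape)                      # (3, 2)
print(np.allclose(G, G_full), np.allclose(G, G_reduced))  # True True
```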

(Optional) Exercise 9: Apply the SVD on images

Let's start by loading an image in Python and converting it to a NumPy array. We will convert it to grayscale to have one value per pixel. The shape of the matrix corresponds to the dimensions of the image, filled with intensity values: 1 cell per pixel.

from PIL import Image # import the Image module from the PIL library, used to load and process images

import matplotlib.pyplot as plt # import matplotlib.pyplot, the matplotlib submodule for drawing images and charts

Load "test_svd.png" image

# "Put your code here"
img=Image.open("test_svd.png")  # load the image file "test_svd.png" into img; you can create your own local test_svd.png

convert image to grayscale

# "Put your code here"
imggray=img.convert('LA')
# Convert to grayscale: 'L' stores one 8-bit gray value (0-255) per pixel and 'A' is the alpha
# (transparency) channel, so 'LA' yields a grayscale image that keeps its alpha channel.

convert to numpy array

# "Put your code here"
imgmat=np.array(list(imggray.getdata(band=0)),float)
# imggray.getdata(band=0) extracts the first channel (the gray channel, without the alpha channel) as a flat sequence of pixel values.
# list(...) turns those pixel values into a Python list.
# np.array(..., float) converts the list to a NumPy array of floats for the later computations.

Reshape according to original image dimensions

# "Put your code here"
imgmat.shape=(imggray.size[1],imggray.size[0])
# imggray.size[0] is the image width and imggray.size[1] is the image height.
# Setting imgmat.shape=(height, width) reshapes the NumPy array to the original image dimensions:
# the pixel data starts out as a 1-D array and is rearranged here into the 2-D (height, width) grid of pixels.

plt.figure(figsize=(9,6))
# Create a new figure and set its size to 9x6 inches.

plt.imshow(imgmat,cmap='gray')
# Display the image: imshow() draws the 2-D NumPy array imgmat, and cmap='gray' selects the grayscale colormap.

plt.show() # Show the figure on screen.

# Overall flow of the code:
# 1. Load the "test_svd.png" image.
# 2. Convert the image to grayscale mode.
# 3. Convert the grayscale pixel data to a NumPy array.
# 4. Reshape the NumPy array to match the image dimensions.
# 5. Display the resulting grayscale image with Matplotlib.
# In short, the code reads an image from a file, converts it to grayscale, stores it as a NumPy array, and displays it.
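
The width and height order is easy to mix up, so here is a small sketch on a synthetic image (no file needed) showing that PIL reports `(width, height)` while NumPy reports `(rows, columns)`. It also uses `np.asarray` on a plain `'L'` mode image as an alternative to the `getdata` and reshape steps; this is a different route from the one taken in the practical, shown only for comparison. The array `pixels` is made-up data.

```python
import numpy as np
from PIL import Image

# Synthetic 6-pixel-wide, 4-pixel-high grayscale image, so no file is needed.
pixels = np.arange(24, dtype=np.uint8).reshape(4, 6)
img = Image.fromarray(pixels)            # a 2-D uint8 array becomes an 'L' mode image

# np.asarray reads the pixel grid directly as (height, width).
imgmat = np.asarray(img.convert('L'), dtype=float)

print(img.size)       # (6, 4): PIL reports (width, height)
print(imgmat.shape)   # (4, 6): NumPy reports (rows, columns)
```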

Compute Matrix $U$, $D$, and $V$ by Singular Value Decomposition (SVD).

Hint: You can use the np.linalg.svd() function to obtain Matrix $U$, $D$, and $V$

Compute Matrix U, D, and V

# "Put your code here"
U,D,V=np.linalg.svd(imgmat)

Reconstruct the image from two singular values

# Provide your code here
# "Put your code here"
reconsting=np.matrix(U[:,:2]) * np.diag(D[:2]) * np.matrix(V[:2,:])
plt.imshow(reconsting,cmap='gray')
plt.show()

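This code uses singular value decomposition (SVD) to build a low-rank approximation of the image. It factors the original image matrix into three matrices, keeps the first two singular values with their singular vectors, and uses them to reconstruct a simplified version of the image. Each line is explained below:

1. `U, D, V = np.linalg.svd(imgmat)`

Perform the SVD:
- np.linalg.svd() decomposes the image matrix $A$ held in imgmat into three factors $U$, $D$, and $V^T$, where:
  - U: the matrix of left singular vectors, of shape $m \times m$. Its columns form an orthonormal basis, and the leading ones span the column space of the image matrix.
  - D: the singular values, returned as a vector (not a matrix) in descending order. It is converted into a diagonal matrix later.
  - V: the matrix of right singular vectors, of shape $n \times n$. NumPy returns it already transposed, so the variable V holds $V^T$ and its rows are the right singular vectors; the leading ones span the row space of the image matrix.
The SVD formula: $A = U \cdot \Sigma \cdot V^T$.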

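2. `reconsting = np.matrix(U[:, :2]) * np.diag(D[:2]) * np.matrix(V[:2, :])`

Low-rank reconstruction of the image:
- U[:, :2]: keeps the first two columns of $U$ (the first two left singular vectors), which capture the two most important directions in the data.
- D[:2]: keeps the first two singular values; np.diag(D[:2]) turns them into a $2 \times 2$ diagonal matrix.
- V[:2, :]: keeps the first two rows of the variable V (the first two right singular vectors).
- np.matrix(...): converts U and V to matrix objects so that * performs matrix multiplication. NumPy arrays support matrix multiplication as well, through dot or @; np.matrix is used here so that the expression reads more directly, although NumPy's documentation no longer recommends this class.
Reconstruction: the image matrix is rebuilt from the first two singular values and their singular vectors. This low-rank approximation keeps the most important information and discards the minor information carried by the smaller singular values, which reduces the complexity of the image.

Formula: the reconstructed matrix is $U[:, :2] \cdot \Sigma[:2, :2] \cdot V[:2, :]$, the approximation built from the first two singular values and vectors.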

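3. `plt.imshow(reconsting, cmap='gray')`

Display the reconstructed image:
- imshow() displays the image reconstructed from the low-rank approximation.
- reconsting is the reconstructed image matrix.
- cmap='gray': displays the image in grayscale.

4. `plt.show()`

Show the figure: renders the drawn image on screen.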

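Purpose of the low-rank approximation:

By keeping only the two largest singular values and their singular vectors, the reconstructed image is a simplified version of the original.
The technique can be used for image compression or denoising, because the smaller singular values typically capture fine detail and noise, while the main information is concentrated in the large singular values.

Visual result:

The reconstructed image is blurrier than the original because only the first two singular values and their vectors are kept. Although sharpness is reduced, the main structure of the image is preserved.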
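
How much is lost by truncating can be read off the discarded singular values: in the Frobenius norm, the error of the rank-$k$ reconstruction equals the square root of the sum of their squares. The sketch below checks this on a small random matrix that stands in for an image; the matrix `M`, the seed, and the choice `k = 2` are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.random((8, 6))                   # hypothetical stand-in for an image matrix

U, D, Vt = np.linalg.svd(M, full_matrices=False)

k = 2
M_k = U[:, :k] @ np.diag(D[:k]) @ Vt[:k, :]   # rank-k reconstruction

# Frobenius-norm error equals the root of the sum of the squared discarded singular values.
error = np.linalg.norm(M - M_k)
predicted = np.sqrt(np.sum(D[k:] ** 2))

print(np.linalg.matrix_rank(M_k))        # 2
print(np.isclose(error, predicted))      # True
```

This sketch uses plain arrays with `@` instead of `np.matrix`; the product is the same.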

Draw the reconstructed image using different numbers of singular values, e.g. 5, 10, 15, 20, 30, and 50.

reconstructed image with different number of singular values

# "Put your code here"
for i in   [5,10,15,20,30,50,60,70,80,90]:
    reconstimg=np.matrix(U[:,:i]) * np.diag(D[:i]) * np.matrix(V[:i,:])
    plt.imshow(reconstimg,cmap='gray')
    title ="n = %s" % i
    plt.title(title)
    plt.show()

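This code reconstructs the image several times with SVD, each time keeping a different number of singular values and singular vectors, to show how image quality changes as that number grows. It iterates over different values of $n$, reconstructing with between 5 and 90 singular values. A line-by-line explanation follows:

1. `for i in [5, 10, 15, 20, 30, 50, 60, 70, 80, 90]:`

Iterate over different numbers of singular values $i$:
- This is a loop in which i is the number of singular values. Each iteration reconstructs the image with a different number of them.
- [5, 10, 15, 20, 30, 50, 60, 70, 80, 90] is a list of the values of $i$, meaning that 5, 10, 15, and so on singular values are kept in turn.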

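2. `reconstimg = np.matrix(U[:, :i]) * np.diag(D[:i]) * np.matrix(V[:i, :])`

Reconstruct the image from the first $i$ singular values:
- U[:, :i]: selects the first $i$ columns of $U$, the first $i$ left singular vectors, which capture the main information in the matrix.
- D[:i]: selects the first $i$ singular values; np.diag() turns them into an $i \times i$ diagonal matrix.
- V[:i, :]: selects the first $i$ rows of the variable V, the first $i$ right singular vectors.
- np.matrix(...): converts U and V to matrix objects so that * performs matrix multiplication.
Reconstruction formula: $A \approx U[:, :i] \cdot \Sigma[:i, :i] \cdot V[:i, :]$, the approximation of the image built from the first $i$ singular values and singular vectors.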

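3. `plt.imshow(reconstimg, cmap='gray')`

Display the reconstructed image:
- imshow() displays the image reconstructed from $i$ singular values.
- reconstimg is the image matrix reconstructed in the current iteration.
- cmap='gray': displays the image in grayscale.

4. `title = "n = %s" % i`

Build the title text:
- Creates a string title that states the number of singular values $i$ used in the current iteration.
- title = "n = %s" % i formats the current value of $i$ into the string "n = %s".
- For example, when $i = 5$ the title is "n = 5".

5. `plt.title(title)`

Set the title of the figure:
- plt.title() sets the title generated in the previous step as the title of the current image.

6. `plt.show()`

Show the image:
- plt.show() displays the current reconstruction. Each iteration produces a new figure showing the image reconstructed from a different number of singular values $i$.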

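What the code does overall:

In each iteration the code builds a low-rank approximation of the image from the first $i$ singular values and singular vectors and displays the reconstruction.
As $i$ increases (more singular values are kept), the reconstructed image becomes sharper, because more singular values retain more of the information in the image.
- When $i$ is small (such as 5), the image is very blurry, because only a small amount of the main information is kept.
- As $i$ increases, the details of the image are gradually recovered; once $i$ approaches the rank of the matrix, the image is very similar to the original.

Visual result:

Each iteration outputs one image, with the number of singular values kept shown in its title.
These images show how the quality improves step by step as $i$ increases. This demonstrates the importance of the singular values and how an image can be compressed by keeping only a few of them.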
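
To put rough numbers on the compression idea, a rank-$k$ reconstruction of an $m \times n$ matrix needs $k(m + n + 1)$ stored values (k columns of $U$, k singular values, k rows of $V^T$) instead of $m \cdot n$. The sketch below tabulates this count next to the relative reconstruction error for a random matrix standing in for an image. The helper `rank_k_storage`, the 60x40 size, and the seed are all assumptions made for this illustration.

```python
import numpy as np

def rank_k_storage(m, n, k):
    """Numbers stored for a rank-k factorisation: k columns of U, k values, k rows of V^T."""
    return k * (m + n + 1)

rng = np.random.default_rng(1)
M = rng.random((60, 40))                 # hypothetical 60x40 "image"
U, D, Vt = np.linalg.svd(M, full_matrices=False)

errors = []
for k in [5, 10, 20, 40]:
    M_k = U[:, :k] @ np.diag(D[:k]) @ Vt[:k, :]
    rel_error = np.linalg.norm(M - M_k) / np.linalg.norm(M)
    errors.append(rel_error)
    # columns: k, values stored by the factorisation, values in the full matrix, relative error
    print(k, rank_k_storage(60, 40, k), 60 * 40, round(rel_error, 4))
```

Note that the factorisation only saves space while $k(m + n + 1) < m \cdot n$; for this 60x40 example that means $k \le 23$, and at $k = 40$ the factors take more room than the matrix itself. Random data also compresses far worse than a natural image, whose singular values usually fall off much faster.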
