Andrew Ng Machine Learning Exercise 03: Multi-class Classification and Neural Network Forward Propagation


Multi-class classification with logistic regression (the one-vs-all strategy)
- Plotting the images
- Results
Neural network
- Forward propagation
- Digit recognition


Multi-class classification with logistic regression (the one-vs-all strategy)

Handwritten digit recognition

Plotting the images


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.io as sio
def plot_an_image(x) :
    pick_one=np.random.randint(5000)
    image=x[pick_one,:]  # pick one random row, i.e. one image
    fig,ax=plt.subplots(figsize=(1,1))
    ax.imshow(image.reshape(20,20).T,cmap="gray_r")
    plt.xticks([])
    plt.yticks([])
    plt.show()

def plot_100_image(x):
    sample_index=np.random.choice(len(x),100)
    image=x[sample_index,:]  # randomly pick 100 rows, i.e. 100 images
    fig, ax = plt.subplots(figsize=(8,8),nrows=10,ncols=10,sharey=True,sharex=True)
    plt.xticks([])
    plt.yticks([])
    for r in range(10):
        for c in range(10):
            ax[r,c].imshow(image[10*r+c].reshape(20,20).T,cmap="gray_r")
    plt.show()

data=sio.loadmat("E:/学习/研究生阶段/python-learning/吴恩达机器学习课后作业/code/ex3-neural network/ex3data1.mat")

raw_x=data["X"]
raw_y=data["y"]

plot_100_image(raw_x)

Results


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.io as sio
from scipy.optimize import minimize
def plot_an_image(x) :
    pick_one=np.random.randint(5000)
    image=x[pick_one,:]  # pick one random row, i.e. one image
    fig,ax=plt.subplots(figsize=(1,1))
    ax.imshow(image.reshape(20,20).T,cmap="gray_r")
    plt.xticks([])
    plt.yticks([])
    plt.show()

def plot_100_image(x):
    sample_index=np.random.choice(len(x),100)
    image=x[sample_index,:]  # randomly pick 100 rows, i.e. 100 images
    fig, ax = plt.subplots(figsize=(8,8),nrows=10,ncols=10,sharey=True,sharex=True)
    plt.xticks([])
    plt.yticks([])
    for r in range(10):
        for c in range(10):
            ax[r,c].imshow(image[10*r+c].reshape(20,20).T,cmap="gray_r")
    plt.show()



"""
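Sigmoid activation
"""
def sigmoid(z):
    return 1/(1+np.exp(-z))


"""
Regularized cost function.
Note: theta must be the first parameter, because scipy.optimize.minimize
treats the first argument of fun as the variable to optimize and passes
the remaining values through args.
"""
def cost_function(theta,x,y,lamda):
    y_=sigmoid(x @ theta)

    reg=theta[1:]@theta[1:]*(lamda/(2*len(x)))  # regularization, skipping the bias term theta[0]
    return np.sum(-(y*np.log(y_)+(1-y)*np.log(1-y_))/len(x))+reg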

"""
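Regularized gradient vector
"""


def gradient_reg(theta,x,y,lamda):
    reg = theta[1:] * (lamda / len(x))  # regularization term for theta[1:]
    reg = np.insert(reg, 0, values=0, axis=0)  # prepend 0 so the bias term is not regularized and the shapes match
    first=x.T @ (sigmoid(x @ theta)-y)/len(x)
    return first+reg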
"""
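Gradient descent (an alternative to scipy's minimize; not called below)
alpha: learning rate
inters: number of iterations
lamda: regularization strength
"""
def gradientDescent(x,y,theta,alpha,inters,lamda):
    costs = []
    for i in range(inters):

        reg = theta[1:] * (lamda / len(x))  # regularization term for theta[1:]
        reg = np.insert(reg, 0, values=0, axis=0)  # prepend 0 so the bias term is not regularized and the shapes match
        theta=theta-alpha*(x.T @ (sigmoid(x @ theta)-y)/len(x)+reg)  # include reg in the update


        cost=cost_function(theta,x,y,lamda)  # theta comes first, matching the signature
        costs.append(cost)
        # if i%1000==0:
        #     print(cost)

    return theta,costs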


def one_vs_all(x,y,lamda,k):
    n=x.shape[1]
    theta_all=np.zeros((k,n))
    for i in range(1,k+1):
        theta_i=np.zeros(n,)
        res=minimize(fun=cost_function,
                     x0=theta_i,
                     args=(x,y==i,lamda),
                     method="TNC",
                     jac=gradient_reg)
        theta_all[i-1,:]=res.x
    return theta_all

"""

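Prediction function: pick the class whose classifier gives the highest score
(labels are 1-based, hence the +1)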
"""

def predict(x,theta_finall):
    h=sigmoid(x@theta_finall.T)
    h_argmax=np.argmax(h,axis=1)
    return h_argmax+1



data=sio.loadmat("E:/学习/研究生阶段/python-learning/吴恩达机器学习课后作业/code/ex3-neural network/ex3data1.mat")

raw_x=data["X"]
raw_y=data["y"]
x=np.insert(raw_x,0,1,axis=1)
y=raw_y.flatten()  # flatten to a 1-D array

# plot_100_image(raw_x)

lamda=1
k=10
theta_finall=one_vs_all(x,y, lamda, k)
y_predict=predict(x,theta_finall)
acc=np.mean(y_predict==y)
print(acc)
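
A quick way to gain confidence in a cost/gradient pair like the one above is to compare the analytic gradient with a finite-difference estimate. The sketch below is only an illustration: it redefines the two functions in a self-contained form (assuming the regularizer skips the bias term theta[0]), and the data, the helper `numerical_gradient`, and all `*_demo` names are made up for this check rather than taken from the exercise.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_function(theta, x, y, lamda):
    # Regularized logistic-regression cost; theta[0] (bias) is not penalized.
    h = sigmoid(x @ theta)
    reg = theta[1:] @ theta[1:] * lamda / (2 * len(x))
    return np.sum(-(y * np.log(h) + (1 - y) * np.log(1 - h))) / len(x) + reg

def gradient_reg(theta, x, y, lamda):
    # Analytic gradient of cost_function with respect to theta.
    reg = np.insert(theta[1:] * lamda / len(x), 0, 0)
    return x.T @ (sigmoid(x @ theta) - y) / len(x) + reg

def numerical_gradient(f, theta, eps=1e-5):
    # Hypothetical helper: central finite differences, one coordinate at a time.
    grad = np.zeros_like(theta)
    for j in range(len(theta)):
        step = np.zeros_like(theta)
        step[j] = eps
        grad[j] = (f(theta + step) - f(theta - step)) / (2 * eps)
    return grad

# Synthetic data: 50 samples, 4 features plus a leading column of ones.
rng = np.random.default_rng(0)
x_demo = np.insert(rng.normal(size=(50, 4)), 0, 1, axis=1)
y_demo = (rng.random(50) > 0.5).astype(float)
theta_demo = rng.normal(scale=0.1, size=5)

analytic = gradient_reg(theta_demo, x_demo, y_demo, 1.0)
numeric = numerical_gradient(
    lambda t: cost_function(t, x_demo, y_demo, 1.0), theta_demo
)
max_diff = np.max(np.abs(analytic - numeric))
print(max_diff)  # should be very small if cost and gradient agree
```

If the regularization term in the cost penalized a different slice of theta than the gradient does, the two gradients would visibly disagree, which makes this a cheap test to run before handing the functions to minimize.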

Neural network

Forward propagation

Digit recognition


import numpy as np
import matplotlib.pyplot as plt
import scipy.io as sio

def sigmoid(z):
    return 1/(1+np.exp(-z))



data=sio.loadmat("E:/学习/研究生阶段/python-learning/吴恩达机器学习课后作业/code/ex3-neural network/ex3data1.mat")

raw_x=data["X"]
raw_y=data["y"]
x=np.insert(raw_x,0,1,axis=1)
y=raw_y.flatten()  # flatten to a 1-D array
theta=sio.loadmat("E:/学习/研究生阶段/python-learning/吴恩达机器学习课后作业/code/ex3-neural network/ex3weights.mat")
theta1=theta["Theta1"]  # weights from the input layer to the hidden layer
theta2=theta["Theta2"]  # weights from the hidden layer to the output layer


a1=x
z2=x@theta1.T
a2=sigmoid(z2)
a2=np.insert(a2,0,1,axis=1)
z3=a2 @ theta2.T
a3=sigmoid(z3)


y_pred=np.argmax(a3,axis=1)
y_pred=y_pred+1
acc=np.mean(y_pred==y)
print(acc)
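
To make the shapes in the forward pass easier to follow, here is a minimal sketch that wraps the same steps in a function and runs them on random inputs. It assumes the layer sizes used in this exercise (400 input pixels, 25 hidden units, 10 classes, so weights of shape (25, 401) and (10, 26)); the weights here are random placeholders, not the trained ones from ex3weights.mat, and `forward` and the `*_demo` names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward(x, theta1, theta2):
    # x: (m, 400) raw pixels, without the bias column.
    a1 = np.insert(x, 0, 1, axis=1)           # (m, 401): add input bias unit
    a2 = sigmoid(a1 @ theta1.T)               # (m, 25): hidden activations
    a2 = np.insert(a2, 0, 1, axis=1)          # (m, 26): add hidden bias unit
    a3 = sigmoid(a2 @ theta2.T)               # (m, 10): one score per class
    return a3

# Random placeholder inputs and weights with the assumed shapes.
rng = np.random.default_rng(0)
x_demo = rng.random((5, 400))
theta1_demo = rng.normal(scale=0.1, size=(25, 401))
theta2_demo = rng.normal(scale=0.1, size=(10, 26))

a3_demo = forward(x_demo, theta1_demo, theta2_demo)
labels_demo = np.argmax(a3_demo, axis=1) + 1  # 1-based labels, as in the script
print(a3_demo.shape)
print(labels_demo)
```

With random weights the predicted labels are meaningless; the point is only that each matrix product lines up, and that dropping either bias insertion would raise a shape mismatch error.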

I don't fully understand the forward-propagation code above, so here is an AI-generated explanation of it.


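Your code uses the pre-loaded weights (theta1 and theta2) to run predictions through a simple neural network and computes the prediction accuracy. The network has an input layer, one hidden layer, and an output layer. Below is a step-by-step explanation, followed by a few things worth checking:

Importing libraries and defining sigmoid: this part is correct.
Loading the data: scipy.io.loadmat is the standard way to read MATLAB .mat files.
Processing the inputs and labels:
- A 1 is inserted at the front of every row of raw_x to add the bias term, which is correct.
- raw_y is flattened into a one-dimensional array. This matters for the comparison at the end: raw_y is loaded as a column of shape (5000, 1) while y_pred has shape (5000,), so without flattening, y_pred==y would broadcast into a 5000 by 5000 matrix and the accuracy would be wrong.
Loading the weights: theta1 and theta2 are loaded correctly.
Forward propagation:
- z2 is computed correctly; it is the linear combination from the input layer to the hidden layer.
- a2 is the sigmoid activation of z2, with a column of 1s inserted at the front as the hidden layer's bias unit. This step is required: the first column of theta2 holds the weights for that bias unit, so a2 needs one more column than there are hidden units for a2 @ theta2.T to be well-defined.
- z3 and a3 are computed correctly; they are the step from the hidden layer to the output layer.
Prediction and accuracy:
- np.argmax(a3, axis=1) picks the predicted class, which is correct because each row of a3 holds one sample's output score for every class.
- The data comes from MATLAB, where indexing starts at 1, while np.argmax returns 0-based indices, so y_pred=y_pred+1 is needed to line the predictions up with the labels. In this course's dataset the labels run from 1 to 10, with 10 standing for the digit 0.
- acc=np.mean(y_pred==y) is correct.
Suggestions:

- Confirm that the labels in y really are 1-based (for example with np.unique(y)). If they were 0-based, either y or y_pred would have to be shifted so that the two use the same convention.
- Keep the np.insert(a2, 0, 1, axis=1) line; removing it would make the shapes of a2 and theta2 incompatible.
- Check that the dimensions of theta1 and theta2 match the network architecture: the number of columns of theta1 should equal the number of input features plus one (the bias), the number of rows of theta2 should equal the number of output classes, and the number of columns of theta2 should equal the number of hidden units plus one.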
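
The reason for flattening y before comparing it with the predictions can be seen on a tiny made-up example. The arrays below are invented for illustration (they are not from ex3data1.mat); `y_column` simply mimics the (m, 1) shape that loadmat returns for a MATLAB column vector.

```python
import numpy as np

y_column = np.array([[1], [2], [10]])    # shape (3, 1), like a loaded MATLAB column
y_pred = np.array([1, 2, 3])             # shape (3,), like the output of np.argmax(...) + 1

# Comparing (3,) with (3, 1) broadcasts to a (3, 3) matrix of all pairs.
wrong = (y_pred == y_column)
# Flattening first gives the intended element-by-element comparison.
right = (y_pred == y_column.flatten())

print(wrong.shape, np.mean(wrong))
print(right.shape, np.mean(right))
```

The first mean averages over every pair of prediction and label, so it is not an accuracy at all; only the flattened comparison counts matches sample by sample.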