机器学习 - 训练模型

接着这一篇博客做进一步说明:
机器学习 - 选择模型

为了解决测试和预测之间的差距,可以通过更新 internal parameters, the weights set randomly use nn.Parameter() and bias set randomly use torch.randn().

Much of the time you won't know what the ideal parameters are for a model.

Instead, it is much more fun to write code to see if the model can try and figure them out itself. That is a loss function as well as and optimizer.

Function What does it do? Where does it live in PyTorch? Common values
Loss function Measures how wrong your models predictions (e.g. y_preds) are compared to the truth labels (e.g. y_test). Lower the better.差距越小越好 PyTorch has plenty of built-in loss functions in torch.nn Mean absolute error (MAE) for regression problems (torch.nn.L1Loss()). Binary cross entropy for binary classification problems (torch.nn.BCELoss())
Optimizer Tells your model how to update its internal parameters to best lower the loss. You can find various optimization function implementations in torch.optim Stochastic gradient descent (torch.optim.SGD()). Adam.optimizer (torch.optim.Adam())

介绍 MAE: Mean absolute error 也称为平均绝对误差,是一种用于衡量预测值与真实值之间差异的损失函数。MAE计算的是预测值与真实值之间的绝对差值的平均值,即平均误差的绝对值。在 PyTorch 中可以使用 torch.nn.L1Loss 来计算MAE.

介绍Stochastic gradient descent:

这是一种常用的优化算法,用于训练神经网络模型。它是梯度下降算法的变种,在每次更新参数时都使用随机样本的梯度估计来更新参数。SGD的基本思想是通过最小化损失函数来调整模型参数,使得模型的预测结果与真实标签尽可能接近。在每次迭代中,SGD随机选择一小批样本 (mini-batch) 来计算损失函数关于参数的梯度,并使用该梯度来更新参数。由于每次更新只使用了一部分样本,因此SGD通常具有更快的收敛速度和更低的计算成本。在 PyTorch 中,可以使用 torch.optim.SGD(params, lr) 来实现,其中

  • params is the target model parameters you'd like to optimize (e.g. the weights and bias values we randomly set before).
  • lr is the learning rate you'd like the optimizer to update the parameters at, higher means the optimizer will try larger updates (these can sometimes be too large and the optimizer will fail to work), lower means the optimizer will try smaller updates (these can sometimes be too small and the optimizer will take too long to find the ideal values). Common starting values for the learning rate are 0.01, 0.001, 0.0001.

介绍 Adam 优化器:

Adam优化器是一种常用的优化算法,它结合了动量法和自适应学习率调整的特性,能够高效地优化神经网络模型的参数。Adam优化器的基本思想是在梯度下降的基础上引入了动量项和自适应学习率调整项。动量项可以帮助优化器在更新参数时保持方向性,从而加速收敛;而自适应学习率调整项可以根据参数的历史梯度来调整学习率,从而在不同参数上使用不同的学习率,使得参数更新更加稳健。

介绍学习率:

学习率是在训练神经网络时控制参数更新步长的一个超参数。它决定了每次参数更新时,参数沿着梯度方向更新的程度。学习率越大,参数更新的步长越大;学习率越小,参数更新的步长越小。选择合适的学习率通常是训练神经网络时需要调节的一个重要超参数。如果学习率过大,可能导致参数更新过大,导致模型不稳定甚至发散;如果学习率过小,可能导致模型收敛速度过慢,训练时间变长。

代码如下:

python 复制代码
import torch

# Create the loss function 
loss_fn = nn.L1Loss()  # MAE loss is same as L1Loss

# Create the optimizer
optimizer = torch.optim.SGD(params = model_0.parameters(),
                            lr = 0.01)

现在创造一个optimization loop

The training loop involves the model going through the training data and learning the relationships between the features and labels.

The testing loop involves going through the testing data and evaluating how good the patterns are that the model learned on the training data (the model never sees the testing data during training).

Each of these is called a "loop" because we want our model to look (loop through) at each sample in each dataset. 所以,得用 for 循环来实现。

PyTorch training loop

Number Step name What does it do? Code example
1 Forward pass The model goes through all of the training data once, performing its forward() function calculations. model(x_train)
2 Calculate the loss The model's outputs (predictions) are compared to the ground truth and evaluated to see how wrong they are. loss = loss_fn(y_pred, y_train)
3 Zero gradients The optimizers gradients are set to zero (they are accumulated by default) so they can be recalculated for the specific training step. optimizer.zero_grad()
4 Perform backpropagation on the loss Computes the gradient of the loss with respect for every model parameter to be updated (each parameter with requires_grad=True). This is known as backpropagation, hence "backwards". loss.backward()
5 Update the optimizer (gradient descent) Update the parameters with requires_grad=True with respect to the loss gradients in order to improve them. optimizer.step()

PyTorch testing loop

As for the testing loop (evaluating the model), the typical steps include:

Number Step name What does it do? Code example
1 Forward pass The model goes through all of the training data once, performing its forward() function calculations. model(x_test)
2 Calculate the loss The model's outputs (predictions) are compared to the ground truth and evaluated to see how wrong they are. loss = loss_fn(y_pred, y_test)
3 Calculate evaluation metrics (optional) Alongside the loss value you may want to calculate other evaluation metrics such as accuracy on the test set. Custom functions

下面是代码实现

python 复制代码
# Create the loss function
# 那你。L1Loss() 是用于计算平均绝对误差 (MAE) 的损失函数。
loss_fn = nn.L1Loss()  # MAE loss is same as L1Loss

# Create the optimizer
# torch.optim.SGD() 是用于创建随机梯度下降优化器的函数。
# parameters() 返回一个包含了模型中所有需要进行梯度更新的参数的迭代器
optimizer = torch.optim.SGD(params = model_0.parameters(),
                            lr = 0.01)

# Set the number of epochs (how many times the model will pass over the training data)
epochs = 200

# Create empty loss lists to track values
train_loss_values = []
test_loss_values = []
epoch_count = []

for epoch in range(epochs):
  ### Training 

  # Put model in training mode (this is the default state of a model)
  # train() 函数通常用于将模型设置为训练模式
  model_0.train()

  # 1. Forward pass on train data using the forward() method inside
  y_pred = model_0(X_train)

  # 2. Calculate the loss (how different are our models predictions to the ground truth)
  loss = loss_fn(y_pred, y_train)

  # 3. Zero grad of the optimizer
  optimizer.zero_grad() 

  # 4. Loss backwards
  loss.backward()

  # 5. Progress the optimizer
  # step() 用于执行一步参数更新操作。
  optimizer.step() 

  ### Testing

  # Put the model in evaluation mode
  model_0.eval() 

  with torch.inference_mode():
    # 1. Forward pass on test data 
    test_pred = model_0(X_test)

    # 2. Calculate loss on test data 
    test_loss = loss_fn(test_pred, y_test.type(torch.float))  # predictions come in torch.float datatype, so comparisons need to be done with tensors of the same type 

    # Print out 
    if epoch % 10 == 0:
      epoch_count.append(epoch)
      # detach() 方法用于将这个张量从计算图中分离出来,目的是为了避免在将张量转换为numpy数组时保留计算图的依赖关系,减少内存占用并加速代码的执行
      train_loss_values.append(loss.detach().numpy())
      test_loss_values.append(test_loss.detach().numpy())
      print(f"Epoch: {epoch} | MAE Train Loss: {loss} | MAE Test Loss: {test_loss}")


plt.plot(epoch_count, train_loss_values, label="Train loss")
plt.plot(epoch_count, test_loss_values, label="Test loss")
plt.title("Training and test loss curves")
plt.ylabel("Loss")
plt.xlabel("Epochs")
plt.legend()

print("The model learned the following values for weights and bias: ")
print(model_0.state_dict())
print("\nAnd the original values for weights and bias are: ")
print(f"weights: {weight}, bias: {bias}")

# 结果如下:
Epoch: 0 | MAE Train Loss: 0.008932482451200485 | MAE Test Loss: 0.005023092031478882
Epoch: 10 | MAE Train Loss: 0.008932482451200485 | MAE Test Loss: 0.005023092031478882
Epoch: 20 | MAE Train Loss: 0.008932482451200485 | MAE Test Loss: 0.005023092031478882
Epoch: 30 | MAE Train Loss: 0.008932482451200485 | MAE Test Loss: 0.005023092031478882
Epoch: 40 | MAE Train Loss: 0.008932482451200485 | MAE Test Loss: 0.005023092031478882
Epoch: 50 | MAE Train Loss: 0.008932482451200485 | MAE Test Loss: 0.005023092031478882
Epoch: 60 | MAE Train Loss: 0.008932482451200485 | MAE Test Loss: 0.005023092031478882
Epoch: 70 | MAE Train Loss: 0.008932482451200485 | MAE Test Loss: 0.005023092031478882
Epoch: 80 | MAE Train Loss: 0.008932482451200485 | MAE Test Loss: 0.005023092031478882
Epoch: 90 | MAE Train Loss: 0.008932482451200485 | MAE Test Loss: 0.005023092031478882
Epoch: 100 | MAE Train Loss: 0.008932482451200485 | MAE Test Loss: 0.005023092031478882
Epoch: 110 | MAE Train Loss: 0.008932482451200485 | MAE Test Loss: 0.005023092031478882
Epoch: 120 | MAE Train Loss: 0.008932482451200485 | MAE Test Loss: 0.005023092031478882
Epoch: 130 | MAE Train Loss: 0.008932482451200485 | MAE Test Loss: 0.005023092031478882
Epoch: 140 | MAE Train Loss: 0.008932482451200485 | MAE Test Loss: 0.005023092031478882
Epoch: 150 | MAE Train Loss: 0.008932482451200485 | MAE Test Loss: 0.005023092031478882
Epoch: 160 | MAE Train Loss: 0.008932482451200485 | MAE Test Loss: 0.005023092031478882
Epoch: 170 | MAE Train Loss: 0.008932482451200485 | MAE Test Loss: 0.005023092031478882
Epoch: 180 | MAE Train Loss: 0.008932482451200485 | MAE Test Loss: 0.005023092031478882
Epoch: 190 | MAE Train Loss: 0.008932482451200485 | MAE Test Loss: 0.005023092031478882
The model learned the following values for weights and bias: 
OrderedDict([('weights', tensor([0.6990])), ('bias', tensor([0.3093]))])

And the original values for weights and bias are: 
weights: 0.7, bias: 0.3

Loss is the measure of how wrong your model is. Loss 的值越低,效果越好。

都看到这了,点个赞支持下呗~

相关推荐
DisonTangor1 小时前
MiniMax 开源一个为极致编码与智能体工作流打造的迷你模型——MiniMax-M2
人工智能·语言模型·开源·aigc
Giser探索家3 小时前
无人机桥梁巡检:以“空天地”智慧之力守护交通生命线
大数据·人工智能·算法·安全·架构·无人机
不会学习的小白O^O3 小时前
双通道深度学习框架可实现从无人机激光雷达点云中提取橡胶树冠
人工智能·深度学习·无人机
恒点虚拟仿真3 小时前
虚拟仿真实训破局革新:打造无人机飞行专业实践教学新范式
人工智能·无人机·ai教学·虚拟仿真实训·无人机飞行·无人机专业虚拟仿真·无人机飞行虚拟仿真
丁浩6664 小时前
Python机器学习---6.集成学习与随机森林
python·随机森林·机器学习
鲜枣课堂4 小时前
华为最新光通信架构AI-OTN,如何应对AI浪潮?
人工智能·华为·架构
格林威5 小时前
AOI在新能源电池制造领域的应用
人工智能·数码相机·计算机视觉·视觉检测·制造·工业相机
dxnb225 小时前
Datawhale25年10月组队学习:math for AI+Task5解析几何
人工智能·学习
DooTask官方号5 小时前
DooTask 1.3.38 版本更新:MCP 服务器与 AI 工具深度融合,开启任务管理新体验
运维·服务器·人工智能·开源软件·dootask
救救孩子把5 小时前
30-机器学习与大模型开发数学教程-3-4 矩阵的逆与伪逆
线性代数·机器学习·矩阵