| Step | Pseudocode | Description |
| --- | --- | --- |
| Initialization | `w = 0` (weight vector); `b = 0` (bias term); `learning_rate = α` (learning rate); `max_iterations = N` (maximum number of iterations) | Initialize the model parameters and hyperparameters |
| Sigmoid function | `def sigmoid(z): return 1 / (1 + exp(-z))` | Map the linear output into the probability range (0, 1) |
| Forward propagation | `z = X @ w + b` (linear combination); `y_hat = sigmoid(z)` (predicted probability); `return y_hat` | Compute the model's predictions (see the first sketch after the table) |
| Loss computation | `loss = -sum(y * log(y_hat) + (1 - y) * log(1 - y_hat)) / m` (`m` is the number of samples) | Compute the cross-entropy loss |
| Gradient computation | `dw = X.T @ (y_hat - y) / m`; `db = sum(y_hat - y) / m` | Compute the gradients of the loss with respect to the parameters (see the second sketch below) |
| Parameter update | `w = w - learning_rate * dw`; `b = b - learning_rate * db` | Update the parameters by gradient descent |
| Training loop | `for i in range(max_iterations):` `y_hat = forward(X_train)`; `loss = compute_loss(y_train, y_hat)`; `dw, db = compute_gradients(X_train, y_train, y_hat)`; `w -= learning_rate * dw`; `b -= learning_rate * db`; break once the loss has converged | Iterate the training process (see the training-loop sketch below) |
| Prediction | `y_hat = forward(X)`; `return 1 if y_hat >= 0.5 else 0` | Binary classification prediction |
| Multi-class extension | One-vs-rest strategy: for each class `c`, train a binary model separating `c` from all other classes; at prediction time pick the class with the highest predicted probability | Handle multi-class problems (see the final sketch below) |
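The sketches below flesh out the table's pseudocode as runnable NumPy code. First, the sigmoid and forward pass; the `np.clip` call and its bound of 500 are added assumptions to keep `exp` from overflowing, not part of the original pseudocode.

```python
import numpy as np

def sigmoid(z):
    # Map linear outputs into (0, 1); clipping avoids overflow in exp
    # (the bound 500 is an added assumption, not from the table).
    z = np.clip(z, -500, 500)
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, w, b):
    # Linear combination followed by the sigmoid, as in the table:
    # z = X @ w + b, y_hat = sigmoid(z)
    z = X @ w + b
    return sigmoid(z)
```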
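Next, a sketch of the loss and gradient steps. The `eps` clamp that guards against `log(0)` is an added numerical-safety assumption.

```python
import numpy as np

def compute_loss(y, y_hat, eps=1e-12):
    # Cross-entropy loss averaged over the m samples;
    # eps keeps log() away from 0 (an added assumption).
    m = y.shape[0]
    y_hat = np.clip(y_hat, eps, 1 - eps)
    return -np.sum(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat)) / m

def compute_gradients(X, y, y_hat):
    # Gradients of the cross-entropy loss with respect to w and b:
    # dw = X^T (y_hat - y) / m, db = sum(y_hat - y) / m
    m = X.shape[0]
    dw = X.T @ (y_hat - y) / m
    db = np.sum(y_hat - y) / m
    return dw, db
```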
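A sketch of the training loop and prediction function, reusing `forward`, `compute_loss`, and `compute_gradients` from the sketches above. The concrete convergence test (change in loss below `tol`) is one plausible reading of "break once the loss has converged"; the table does not specify the criterion.

```python
import numpy as np

def train(X, y, learning_rate=0.1, max_iterations=1000, tol=1e-6):
    m, n = X.shape
    w = np.zeros(n)   # weight vector, initialized to 0
    b = 0.0           # bias term, initialized to 0
    prev_loss = np.inf
    for _ in range(max_iterations):
        y_hat = forward(X, w, b)
        loss = compute_loss(y, y_hat)
        dw, db = compute_gradients(X, y, y_hat)
        w -= learning_rate * dw
        b -= learning_rate * db
        # Stop when the loss change falls below tol (an assumed criterion).
        if abs(prev_loss - loss) < tol:
            break
        prev_loss = loss
    return w, b

def predict(X, w, b, threshold=0.5):
    # Binary prediction: label 1 when the predicted probability >= 0.5
    return (forward(X, w, b) >= threshold).astype(int)
```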
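A hypothetical usage example on a small, linearly separable toy set:

```python
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
w, b = train(X, y, learning_rate=0.5, max_iterations=5000)
print(predict(X, w, b))  # should approach [0 0 1 1] on this separable data
```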
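Finally, a sketch of the one-vs-rest extension, built on the `train` and `forward` functions above; the names `train_one_vs_rest` and `predict_one_vs_rest` are hypothetical, introduced here for illustration.

```python
import numpy as np

def train_one_vs_rest(X, y, num_classes, **kwargs):
    # One-vs-rest: train one binary model per class, treating that
    # class as 1 and every other class as 0.
    models = []
    for c in range(num_classes):
        y_binary = (y == c).astype(float)
        models.append(train(X, y_binary, **kwargs))
    return models

def predict_one_vs_rest(X, models):
    # Score each sample under every class's model and pick the class
    # with the highest predicted probability.
    probs = np.stack([forward(X, w, b) for w, b in models], axis=1)
    return np.argmax(probs, axis=1)
```

As a design note, one-vs-rest trains `num_classes` independent binary models; the common alternative is softmax (multinomial) logistic regression, which normalizes class probabilities jointly in a single model.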