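Machine Learning Evaluation Metrics Explained - Intermediate Guide
This is the second article in the machine learning evaluation metrics series. It covers training-process metrics, model parameter metrics, hyperparameter tuning metrics, and model validation metrics, to give you a complete picture of how machine learning models are trained and optimized.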
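Introduction
The introductory article covered data quality, preprocessing, and basic evaluation metrics. This article goes further and covers:
- Training-process monitoring: how to track the state of model training in real time
- Model complexity: parameter count, compute cost, and related metrics
- Hyperparameter optimization: how to choose the best hyperparameters
- Model validation strategies: cross-validation, hold-out, and others
These metrics help us:
- Catch training problems early (overfitting, underfitting, vanishing gradients, and so on)
- Tune the model architecture and hyperparameters
- Assess how well the model generalizes
- Balance model performance against compute resources
📊 Chart generation: every chart in this document can be generated by running the docs/ml_metrics_visualization_complete.py script. Each metric section includes a description of its chart and a reference to the code.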
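Training-Process Metrics
1. Loss Metrics
Training Loss
Definition: the value of the loss function on the training set.
Why monitor it:
- To check whether the model is learning at all
- To spot training anomalies (a loss that does not decrease, a loss that explodes, and so on)
Python implementation: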
python
import matplotlib.pyplot as plt
import numpy as np

class LossTracker:
    """Tracks training and validation loss per epoch."""

    def __init__(self):
        self.epochs = []
        self.train_losses = []
        # Validation loss may be logged less often, so it keeps its own epoch list.
        self.val_epochs = []
        self.val_losses = []

    def update(self, epoch, train_loss, val_loss=None):
        """Record the losses for one epoch."""
        self.epochs.append(epoch)
        self.train_losses.append(train_loss)
        if val_loss is not None:
            self.val_epochs.append(epoch)
            self.val_losses.append(val_loss)

    def plot(self):
        """Plot the loss curves on a linear scale and on a log scale."""
        fig, axes = plt.subplots(1, 2, figsize=(12, 5))
        # The log-scale panel helps when the loss spans several orders of magnitude.
        panels = [(axes[0], 'linear', 'Loss', 'Loss Curve'),
                  (axes[1], 'log', 'Loss (log scale)', 'Loss Curve (Log Scale)')]
        for ax, scale, ylabel, title in panels:
            ax.plot(self.epochs, self.train_losses, label='Training Loss', color='blue')
            if self.val_losses:
                ax.plot(self.val_epochs, self.val_losses, label='Validation Loss', color='red')
            ax.set_yscale(scale)
            ax.set_xlabel('Epoch')
            ax.set_ylabel(ylabel)
            ax.set_title(title)
            ax.legend()
            ax.grid(True, alpha=0.3)
        plt.tight_layout()
        plt.show()

    def analyze(self):
        """Summarize the recent loss trend."""
        if len(self.train_losses) < 2:
            return "insufficient data"
        # Slope of a straight line fitted to the last 10 training losses (or fewer)
        recent_losses = self.train_losses[-10:]
        loss_trend = np.polyfit(range(len(recent_losses)), recent_losses, 1)[0]
        analysis = {
            'current_train_loss': self.train_losses[-1],
            'min_train_loss': min(self.train_losses),
            'loss_trend': loss_trend,
            'is_decreasing': loss_trend < 0
        }
        if self.val_losses:
            analysis['current_val_loss'] = self.val_losses[-1]
            analysis['min_val_loss'] = min(self.val_losses)
            # Heuristic: flag overfitting when validation loss exceeds training loss by 10%
            analysis['overfitting'] = self.val_losses[-1] > self.train_losses[-1] * 1.1
        return analysis
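Validation Loss
Definition: the value of the loss function on the validation set.
Key patterns to watch for:
- Training loss slightly below validation loss: the normal case
- Training loss far below validation loss: likely overfitting
- Training loss ≈ validation loss while both remain high: likely underfitting, or too little data to learn from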
Visualization: the chart for this section can be generated with:
python
from docs.ml_metrics_visualization_complete import plot_loss_function_comparison
plot_loss_function_comparison()
Types of Loss Functions
1. Classification losses
python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Cross-entropy loss
criterion_ce = nn.CrossEntropyLoss()

# Focal loss (for class imbalance)
class FocalLoss(nn.Module):
    def __init__(self, alpha=0.25, gamma=2.0):
        super().__init__()
        self.alpha = alpha
        self.gamma = gamma

    def forward(self, inputs, targets):
        # Per-sample cross-entropy; pt is the predicted probability of the true class.
        ce_loss = F.cross_entropy(inputs, targets, reduction='none')
        pt = torch.exp(-ce_loss)
        # Note: alpha is applied as a single scalar to every sample, not per class.
        focal_loss = self.alpha * (1 - pt) ** self.gamma * ce_loss
        return focal_loss.mean()

# Label smoothing loss
class LabelSmoothingLoss(nn.Module):
    def __init__(self, num_classes, smoothing=0.1):
        super().__init__()
        self.num_classes = num_classes
        self.smoothing = smoothing

    def forward(self, inputs, targets):
        log_probs = F.log_softmax(inputs, dim=1)
        with torch.no_grad():
            # Spread `smoothing` evenly over the wrong classes and put
            # the remaining probability mass on the true class.
            true_dist = torch.full_like(log_probs, self.smoothing / (self.num_classes - 1))
            true_dist.scatter_(1, targets.unsqueeze(1), 1.0 - self.smoothing)
        return torch.mean(torch.sum(-true_dist * log_probs, dim=1))
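2. Regression losses
python
import torch.nn as nn

# MSE loss
mse_loss = nn.MSELoss()
# MAE loss
mae_loss = nn.L1Loss()
# Huber loss (more robust to outliers)
huber_loss = nn.HuberLoss(delta=1.0)
# Smooth L1 loss (identical to Huber loss when beta = delta = 1.0)
smooth_l1_loss = nn.SmoothL1Loss()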
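3. Object detection losses
python
# calculate_iou_batch, calculate_giou_batch, and calculate_ciou_batch are not
# defined in this article; they are assumed to return one value per
# (predicted box, target box) pair as a tensor.

# IoU loss
def iou_loss(pred_boxes, target_boxes):
    """Compute the IoU loss."""
    iou = calculate_iou_batch(pred_boxes, target_boxes)
    return 1 - iou.mean()

# GIoU loss
def giou_loss(pred_boxes, target_boxes):
    """Compute the GIoU loss."""
    giou = calculate_giou_batch(pred_boxes, target_boxes)
    return 1 - giou.mean()

# CIoU loss
def ciou_loss(pred_boxes, target_boxes):
    """Compute the CIoU loss."""
    ciou = calculate_ciou_batch(pred_boxes, target_boxes)
    return 1 - ciou.mean()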
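2. Accuracy Metrics
Training Accuracy vs. Validation Accuracy
Why monitor it:
- To detect overfitting: training accuracy far above validation accuracy
- To detect underfitting: both training and validation accuracy are low
- To choose the number of training epochs: stop once validation accuracy no longer improves
Python implementation: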
python
import matplotlib.pyplot as plt
import numpy as np

class AccuracyTracker:
    """Tracks training and validation accuracy per epoch."""

    def __init__(self):
        self.epochs = []
        self.train_accuracies = []
        # Validation may run less often than training, so it keeps its own epoch list.
        self.val_epochs = []
        self.val_accuracies = []

    def update(self, epoch, train_acc, val_acc=None):
        self.epochs.append(epoch)
        self.train_accuracies.append(train_acc)
        if val_acc is not None:
            self.val_epochs.append(epoch)
            self.val_accuracies.append(val_acc)

    def plot(self):
        plt.figure(figsize=(10, 6))
        plt.plot(self.epochs, self.train_accuracies, 'o-', label='Training Accuracy', color='blue')
        if self.val_accuracies:
            plt.plot(self.val_epochs, self.val_accuracies, 'o-', label='Validation Accuracy', color='red')
        plt.xlabel('Epoch')
        plt.ylabel('Accuracy')
        plt.title('Accuracy Curve')
        plt.legend()
        plt.grid(True, alpha=0.3)
        plt.show()

    def get_best_epoch(self):
        """Return the epoch with the highest validation accuracy."""
        if not self.val_accuracies:
            return None
        best_idx = int(np.argmax(self.val_accuracies))
        best_epoch = self.val_epochs[best_idx]
        return {
            'epoch': best_epoch,
            'val_accuracy': self.val_accuracies[best_idx],
            'train_accuracy': self.train_accuracies[self.epochs.index(best_epoch)]
        }
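The rule "stop once validation accuracy no longer improves" is usually implemented as early stopping. The sketch below is one minimal way to do it, not part of the tracker above; the class name `EarlyStopping`, the `patience` and `min_delta` parameters, and the accuracy values are all illustrative assumptions.

```python
class EarlyStopping:
    """Signal a stop when validation accuracy has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum gain that counts as an improvement
        self.best_score = None
        self.best_epoch = None
        self.bad_epochs = 0

    def step(self, epoch, val_acc):
        """Record one epoch; return True if training should stop."""
        if self.best_score is None or val_acc > self.best_score + self.min_delta:
            self.best_score = val_acc
            self.best_epoch = epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical validation accuracies, one per epoch.
val_accuracies = [0.70, 0.78, 0.81, 0.80, 0.81, 0.79, 0.80]

stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, acc in enumerate(val_accuracies, start=1):
    if stopper.step(epoch, acc):
        stopped_at = epoch
        break

print(f"stopped at epoch {stopped_at}, best epoch {stopper.best_epoch}")
```

In practice the model checkpoint from `best_epoch` is the one to keep, since the epochs after it did not improve validation accuracy.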
Visualization: the chart for this section can be generated with:
python
from docs.ml_metrics_visualization_complete import plot_training_curves
plot_training_curves()
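3. Gradient Metrics
Gradient Norm
Definition: the L2 norm of the gradients of all parameters.
Why monitor it:
- Exploding gradients: the norm suddenly grows
- Vanishing gradients: the norm approaches 0
- Training stability: the norm should change smoothly and typically shrink as training converges
Python implementation: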
python
import matplotlib.pyplot as plt

def calculate_gradient_norm(model):
    """Compute the L2 norm of all gradients (call after loss.backward())."""
    total_norm = 0.0
    for param in model.parameters():
        if param.grad is not None:
            param_norm = param.grad.detach().norm(2)
            total_norm += param_norm.item() ** 2
    return total_norm ** 0.5

class GradientTracker:
    """Tracks the gradient norm per epoch."""

    def __init__(self):
        self.gradient_norms = []
        self.epochs = []

    def update(self, epoch, model):
        grad_norm = calculate_gradient_norm(model)
        self.epochs.append(epoch)
        self.gradient_norms.append(grad_norm)

    def plot(self):
        plt.figure(figsize=(10, 6))
        plt.plot(self.epochs, self.gradient_norms, 'o-', color='green')
        plt.axhline(y=1.0, color='red', linestyle='--', label='Reference (norm = 1)')
        plt.xlabel('Epoch')
        plt.ylabel('Gradient Norm')
        plt.title('Gradient Norm Over Training')
        plt.legend()
        plt.grid(True, alpha=0.3)
        plt.yscale('log')
        plt.show()

    def analyze(self):
        """Flag likely gradient problems. The thresholds are heuristics."""
        if len(self.gradient_norms) < 2:
            return "insufficient data"
        current_norm = self.gradient_norms[-1]
        max_norm = max(self.gradient_norms)
        min_norm = min(self.gradient_norms)
        issues = []
        if current_norm > 100:
            issues.append("Exploding gradients: gradient norm is too large")
        if current_norm < 1e-6:
            issues.append("Vanishing gradients: gradient norm is too small")
        # Guard against division by zero when a recorded norm is exactly 0
        if min_norm > 0 and max_norm / min_norm > 1000:
            issues.append("Unstable gradients: gradient norm varies too much")
        return {
            'current_norm': current_norm,
            'max_norm': max_norm,
            'min_norm': min_norm,
            'issues': issues
        }
Visualization: the chart for this section can be generated with:
python
from docs.ml_metrics_visualization_complete import plot_gradient_analysis
plot_gradient_analysis()
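Gradient Clipping
Purpose: prevent exploding gradients.
Implementation:
python
import torch

def clip_gradients(model, max_norm=1.0):
    """Rescale gradients in place so that their total L2 norm is at most max_norm.

    Call after loss.backward() and before optimizer.step().
    Returns the total norm measured before clipping.
    """
    return torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)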
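Where the clipping call sits in a training step matters: it goes after `backward()` and before `optimizer.step()`. The following is a small sketch under assumed conditions (a toy `nn.Linear` model and random data with deliberately large targets), meant only to show the call order and the effect on the gradient norm.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 1)                       # toy model (assumption)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Large targets make the loss, and therefore the gradients, large.
x = torch.randn(8, 4)
y = 1000.0 * torch.randn(8, 1)

optimizer.zero_grad()
loss = nn.functional.mse_loss(model(x), y)
loss.backward()

# clip_grad_norm_ rescales the gradients in place and returns
# the total norm measured before clipping.
max_norm = 1.0
norm_before = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
norm_after = torch.sqrt(sum(p.grad.pow(2).sum() for p in model.parameters()))

optimizer.step()
print(f"norm before: {float(norm_before):.2f}, after: {float(norm_after):.4f}")
```

Logging `norm_before` at every step gives the gradient-norm curve described earlier without a second pass over the parameters.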
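4. Learning Rate Metrics
Learning Rate Scheduling
Common strategies:
- Fixed learning rate
python
import torch
# `model` is assumed to be an existing nn.Module
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
- StepLR: lower the learning rate every fixed number of epochs
python
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
- ExponentialLR: exponential decay
python
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)
- CosineAnnealingLR: cosine annealing
python
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)
- ReduceLROnPlateau: lower the learning rate when the validation loss stops decreasing
python
# Unlike the schedulers above, this one is stepped with the monitored value:
# scheduler.step(val_loss)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode='min', factor=0.5, patience=10
)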
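To see what a schedule actually does, it helps to step it and record the learning rate. The sketch below does this for StepLR with a deliberately small `step_size=2`, so the decay shows up within a few epochs; the single dummy parameter stands in for a real model and is an assumption made for illustration.

```python
import torch

# A single dummy parameter stands in for a real model (assumption).
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.SGD([param], lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.1)

lrs = []
for epoch in range(6):
    lrs.append(optimizer.param_groups[0]['lr'])
    optimizer.step()    # the optimizer step comes first ...
    scheduler.step()    # ... then the scheduler advances by one epoch

print(lrs)  # the rate is multiplied by gamma every 2 epochs
```

The same loop works for ExponentialLR and CosineAnnealingLR; ReduceLROnPlateau differs in that its `step` call takes the monitored metric, for example `scheduler.step(val_loss)`.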
Learning rate tracking:
python
import matplotlib.pyplot as plt

class LearningRateTracker:
    """Tracks the learning rate per epoch."""

    def __init__(self):
        self.learning_rates = []
        self.epochs = []

    def update(self, epoch, optimizer):
        # Only the first parameter group is tracked
        current_lr = optimizer.param_groups[0]['lr']
        self.epochs.append(epoch)
        self.learning_rates.append(current_lr)

    def plot(self):
        plt.figure(figsize=(10, 6))
        plt.plot(self.epochs, self.learning_rates, 'o-', color='purple')
        plt.xlabel('Epoch')
        plt.ylabel('Learning Rate')
        plt.title('Learning Rate Schedule')
        plt.yscale('log')
        plt.grid(True, alpha=0.3)
        plt.show()
Visualization: the chart for this section can be generated with:
python
from docs.ml_metrics_visualization_complete import plot_learning_rate_schedules
plot_learning_rate_schedules()
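5. Weight Distribution Metrics
Weight Statistics
Why monitor them:
- To check that the weights are being updated normally
- To identify weight initialization problems
- To detect exploding or vanishing weights
Python implementation: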
python
import matplotlib.pyplot as plt
import numpy as np

def analyze_weight_distribution(model, layer_name=None):
    """Collect summary statistics for each weight tensor (optionally filtered by layer name)."""
    weight_stats = {}
    for name, param in model.named_parameters():
        if 'weight' in name and (layer_name is None or layer_name in name):
            weights = param.detach().cpu().numpy().flatten()
            weight_stats[name] = {
                'mean': np.mean(weights),
                'std': np.std(weights),
                'min': np.min(weights),
                'max': np.max(weights),
                'median': np.median(weights),
                'zero_ratio': np.sum(weights == 0) / len(weights)
            }
    return weight_stats

def plot_weight_distribution(model, layer_name):
    """Plot a histogram of the first weight tensor whose name contains layer_name."""
    for name, param in model.named_parameters():
        if layer_name in name and 'weight' in name:
            weights = param.detach().cpu().numpy().flatten()
            plt.figure(figsize=(10, 6))
            plt.hist(weights, bins=50, alpha=0.7)
            plt.xlabel('Weight Value')
            plt.ylabel('Frequency')
            plt.title(f'Weight Distribution: {name}')
            plt.grid(True, alpha=0.3)
            plt.show()
            break
Visualization: the chart for this section can be generated with:
python
from docs.ml_metrics_visualization_complete import plot_weight_distribution_analysis
plot_weight_distribution_analysis()
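Model Parameter Metrics
1. Model Complexity Metrics
Parameter Count
Definition: the total number of trainable parameters in the model.
Formula:
total parameters = Σ (parameters in each layer)
Python implementation: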
python
def count_parameters(model):
    """Count total, trainable, and non-trainable parameters."""
    total_params = sum(p.numel() for p in model.parameters())
    trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return {
        'total_params': total_params,
        'trainable_params': trainable_params,
        'non_trainable_params': total_params - trainable_params,
        'total_params_million': total_params / 1e6,
        'total_params_billion': total_params / 1e9
    }

def print_model_summary(model, input_size=(3, 224, 224)):
    """Print a model summary. input_size is currently unused."""
    total_params = count_parameters(model)
    print("=" * 60)
    print("Model Summary")
    print("=" * 60)
    print(f"Total Parameters: {total_params['total_params']:,}")
    print(f"Trainable Parameters: {total_params['trainable_params']:,}")
    print(f"Non-trainable Parameters: {total_params['non_trainable_params']:,}")
    print(f"Total Parameters (millions): {total_params['total_params_million']:.2f}M")
    print("=" * 60)
    # Per-layer breakdown
    print("\nLayer-wise Parameter Count:")
    print("-" * 60)
    for name, param in model.named_parameters():
        print(f"{name:50s} {param.numel():>15,}")
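The per-layer sum can be checked by hand on a small model. The sketch below assumes a toy two-layer network (not a model from this article): a fully connected layer with `in` inputs and `out` outputs holds `in × out` weights plus `out` biases, and freezing one layer shows how the trainable count differs from the total.

```python
import torch.nn as nn

# Toy model (assumption) whose parameter counts are easy to verify by hand.
model = nn.Sequential(
    nn.Linear(10, 20),   # 10 * 20 weights + 20 biases = 220
    nn.ReLU(),           # no parameters
    nn.Linear(20, 5),    # 20 * 5 weights + 5 biases = 105
)

# Freeze the first layer to show the trainable / non-trainable split.
for p in model[0].parameters():
    p.requires_grad = False

total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(f"total={total}, trainable={trainable}, frozen={total - trainable}")
```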
Visualization: the chart for this section can be generated with:
python
from docs.ml_metrics_visualization_complete import plot_model_complexity
plot_model_complexity()
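Further notes:
Model complexity analysis has three key dimensions:
- Parameters vs. accuracy: measures model efficiency; look for the balance point between accuracy and parameter count
- FLOPs vs. accuracy: measures compute efficiency; fewer FLOPs usually mean faster inference, although memory access and hardware also matter
- Overall efficiency comparison: weighs accuracy, parameter efficiency, and compute efficiency together
Rules of thumb (rough guides that depend on the task and the target hardware):
- Parameters < 10M: generally suitable for mobile deployment
- FLOPs < 1G: generally suitable for real-time inference
- Accuracy > 90%: sufficient for many applications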
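FLOPs (Floating Point Operations)
Definition: the number of floating-point operations needed for one forward pass.
Python implementation:
python
import torch

try:
    from thop import profile, clever_format
except ImportError:
    profile = None
    print("Please install thop: pip install thop")

def calculate_flops(model, input_size=(1, 3, 224, 224)):
    """Profile the model with thop and return formatted counts."""
    if profile is None:
        raise ImportError("thop is required: pip install thop")
    dummy_input = torch.randn(input_size)
    # Note: thop's documentation names the first return value MACs
    # (multiply-accumulate operations); a common convention is FLOPs ≈ 2 × MACs.
    flops, params = profile(model, inputs=(dummy_input,))
    flops, params = clever_format([flops, params], "%.3f")
    return {
        'flops': flops,
        'params': params
    }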
Model Size
Definition: the storage space taken up by the model file.
Python implementation:
python
import os
import torch

def calculate_model_size(model, save_path='temp_model.pth'):
    """Measure the size of the serialized state_dict on disk."""
    torch.save(model.state_dict(), save_path)
    try:
        size_bytes = os.path.getsize(save_path)
    finally:
        os.remove(save_path)  # always clean up the temporary file
    return {
        'size_mb': size_bytes / (1024 * 1024),
        'size_kb': size_bytes / 1024,
        'size_bytes': size_bytes
    }
2. Memory Usage Metrics
Forward-Pass Memory
Python implementation:
python
import numpy as np
import torch

def estimate_memory_usage(model, input_size, batch_size=1):
    """Roughly estimate memory use during training.

    input_size is the shape of one sample, without the batch dimension.
    """
    # Parameter memory
    param_memory = sum(p.numel() * p.element_size() for p in model.parameters())
    # Gradient memory (training only): same size as the parameters
    grad_memory = param_memory
    # Activation memory: a real estimate depends on the model architecture.
    # Simplification: count only the input tensor, so this is a lower bound.
    input_memory = batch_size * int(np.prod(input_size)) * 4  # float32 = 4 bytes
    total_memory = param_memory + grad_memory + input_memory
    return {
        'param_memory_mb': param_memory / (1024 ** 2),
        'grad_memory_mb': grad_memory / (1024 ** 2),
        'activation_memory_mb': input_memory / (1024 ** 2),
        'total_memory_mb': total_memory / (1024 ** 2),
        'total_memory_gb': total_memory / (1024 ** 3)
    }
3. Inference Speed Metrics
Inference Time
Python implementation:
python
import time
import torch

def measure_inference_time(model, input_size, num_runs=100, device='cuda'):
    """Measure the average time of one forward pass."""
    model = model.to(device).eval()
    dummy_input = torch.randn(input_size, device=device)
    use_cuda = torch.device(device).type == 'cuda'
    # Warm-up
    with torch.no_grad():
        for _ in range(10):
            model(dummy_input)
    # CUDA kernels run asynchronously, so synchronize before reading the clock
    if use_cuda:
        torch.cuda.synchronize()
    start_time = time.perf_counter()
    with torch.no_grad():
        for _ in range(num_runs):
            model(dummy_input)
    if use_cuda:
        torch.cuda.synchronize()
    end_time = time.perf_counter()
    avg_time = (end_time - start_time) / num_runs
    fps = 1.0 / avg_time  # forward passes per second
    return {
        'avg_inference_time_ms': avg_time * 1000,
        'fps': fps,
        'total_time_s': end_time - start_time
    }
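Throughput
Definition: the number of samples processed per unit of time.
Python implementation:
python
def measure_throughput(model, input_size, batch_sizes=(1, 4, 8, 16, 32), device='cuda'):
    """Measure throughput at several batch sizes.

    input_size includes the batch dimension, e.g. (1, 3, 224, 224);
    the batch dimension is replaced for each run.
    Relies on measure_inference_time defined above.
    """
    results = {}
    for batch_size in batch_sizes:
        batch_input_size = (batch_size,) + tuple(input_size[1:])
        time_result = measure_inference_time(model, batch_input_size, device=device)
        results[batch_size] = {
            'fps': time_result['fps'],  # batches per second
            'samples_per_second': batch_size * time_result['fps']
        }
    return results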
Hyperparameter Tuning Metrics
1. Hyperparameter Importance Analysis
Hyperparameter Sensitivity Analysis
Python implementation:
python
import matplotlib.pyplot as plt
import optuna

def hyperparameter_sensitivity_analysis(study):
    """Compute and plot hyperparameter importances for a finished Optuna study."""
    importance = optuna.importance.get_param_importances(study)
    # Visualize as a horizontal bar chart
    params = list(importance.keys())
    values = list(importance.values())
    plt.figure(figsize=(10, 6))
    plt.barh(params, values)
    plt.xlabel('Importance')
    plt.title('Hyperparameter Importance')
    plt.tight_layout()
    plt.show()
    return importance
Visualization: the chart for this section can be generated with:
python
from docs.ml_metrics_visualization_complete import plot_hyperparameter_tuning
plot_hyperparameter_tuning()
2. Hyperparameter Search Strategies
Grid Search
Python implementation:
python
from sklearn.model_selection import GridSearchCV

def grid_search_hyperparameters(estimator, param_grid, X, y, cv=5):
    """Exhaustively search every combination in param_grid."""
    grid_search = GridSearchCV(
        estimator, param_grid, cv=cv,
        scoring='accuracy', n_jobs=-1, verbose=1
    )
    grid_search.fit(X, y)
    return {
        'best_params': grid_search.best_params_,
        'best_score': grid_search.best_score_,
        'cv_results': grid_search.cv_results_
    }
Random Search
Python implementation:
python
from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import randint, uniform  # for building param_distributions

def random_search_hyperparameters(estimator, param_distributions, X, y, n_iter=100, cv=5):
    """Sample n_iter hyperparameter combinations at random."""
    random_search = RandomizedSearchCV(
        estimator, param_distributions, n_iter=n_iter,
        cv=cv, scoring='accuracy', n_jobs=-1, verbose=1, random_state=42
    )
    random_search.fit(X, y)
    return {
        'best_params': random_search.best_params_,
        'best_score': random_search.best_score_,
        'cv_results': random_search.cv_results_
    }
Bayesian Optimization
Python implementation:
python
import optuna

def bayesian_optimization(objective, n_trials=100):
    """Optimize hyperparameters with Optuna (its default sampler is TPE).

    objective(trial) must return the score to maximize.
    """
    study = optuna.create_study(direction='maximize')
    study.optimize(objective, n_trials=n_trials)
    return {
        'best_params': study.best_params,
        'best_value': study.best_value,
        'study': study
    }
3. Visualizing Hyperparameter Tuning
Hyperparameter Relationship Plots
Python implementation:
python
import optuna

def plot_hyperparameter_relationships(study):
    """Plot relationships between hyperparameters (requires plotly)."""
    fig = optuna.visualization.plot_parallel_coordinate(study)
    fig.show()
    fig = optuna.visualization.plot_contour(study)
    fig.show()
    fig = optuna.visualization.plot_slice(study)
    fig.show()
Model Validation Metrics
1. Cross-Validation Metrics
K-Fold Cross-Validation
Python implementation:
python
from sklearn.model_selection import cross_val_score, KFold

def k_fold_cross_validation(estimator, X, y, k=5, scoring='accuracy'):
    """Run K-fold cross-validation and summarize the scores."""
    kfold = KFold(n_splits=k, shuffle=True, random_state=42)
    scores = cross_val_score(estimator, X, y, cv=kfold, scoring=scoring)
    return {
        'scores': scores,
        'mean_score': scores.mean(),
        'std_score': scores.std(),
        'min_score': scores.min(),
        'max_score': scores.max()
    }
Stratified K-Fold Cross-Validation
Python implementation:
python
from sklearn.model_selection import StratifiedKFold, cross_val_score

def stratified_k_fold_cv(estimator, X, y, k=5, scoring='accuracy'):
    """Run stratified K-fold cross-validation (preserves class proportions in each fold)."""
    skfold = StratifiedKFold(n_splits=k, shuffle=True, random_state=42)
    scores = cross_val_score(estimator, X, y, cv=skfold, scoring=scoring)
    return {
        'scores': scores,
        'mean_score': scores.mean(),
        'std_score': scores.std()
    }
Visualization: the chart for this section can be generated with:
python
from docs.ml_metrics_visualization_complete import plot_cross_validation_results
plot_cross_validation_results()
2. Hold-Out Validation
Python implementation:
python
from sklearn.model_selection import train_test_split

def hold_out_validation(X, y, test_size=0.2, random_state=42):
    """Split the data once into a training set and a test set."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    return {
        'X_train': X_train,
        'X_test': X_test,
        'y_train': y_train,
        'y_test': y_test
    }
3. Time Series Cross-Validation
Python implementation:
python
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

def time_series_cross_validation(estimator, X, y, n_splits=5):
    """Cross-validate on time-ordered data: each fold trains on the past and tests on the future."""
    tscv = TimeSeriesSplit(n_splits=n_splits)
    scores = cross_val_score(estimator, X, y, cv=tscv)
    return {
        'scores': scores,
        'mean_score': scores.mean(),
        'std_score': scores.std()
    }
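What separates time series cross-validation from ordinary K-fold is the shape of the splits: the training window only grows forward, and the test indices always come after it. The sketch below prints the index splits for a made-up series of 8 samples so this is visible; the data and the choice of `n_splits=3` are illustrative.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(8).reshape(-1, 1)  # 8 time-ordered samples (made-up data)

splits = [
    (train_idx.tolist(), test_idx.tolist())
    for train_idx, test_idx in TimeSeriesSplit(n_splits=3).split(X)
]

for fold, (train_idx, test_idx) in enumerate(splits, start=1):
    print(f"fold {fold}: train={train_idx} test={test_idx}")
```

No shuffling takes place, so a model is never evaluated on samples that precede its training data, which is the leakage that shuffled K-fold would introduce here.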
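Advanced Object Detection Metrics
1. AP (Average Precision)
How It Is Computed
11-point interpolation (the Pascal VOC 2007 standard):
python
import numpy as np

def calculate_ap_11point(precision, recall):
    """Compute AP with 11-point interpolation.

    precision and recall are NumPy arrays of equal length.
    """
    ap = 0.0
    # linspace avoids the floating-point drift of np.arange(0, 1.1, 0.1)
    for t in np.linspace(0.0, 1.0, 11):
        if np.sum(recall >= t) == 0:
            p = 0
        else:
            p = np.max(precision[recall >= t])
        ap += p / 11.0
    return ap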
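All-point interpolation (used by Pascal VOC from 2010 onward; COCO instead samples precision at 101 recall thresholds):
python
import numpy as np

def calculate_ap_all_points(precision, recall):
    """Compute AP as the area under the interpolated precision-recall curve."""
    mrec = np.concatenate(([0.0], recall, [1.0]))
    mpre = np.concatenate(([0.0], precision, [0.0]))
    # Make precision monotonically non-increasing (the precision envelope)
    for i in range(mpre.size - 1, 0, -1):
        mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])
    # Find the positions where recall changes
    i = np.where(mrec[1:] != mrec[:-1])[0]
    # AP is the area under the curve
    ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])
    return ap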
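A small worked example makes the difference between the two interpolation schemes concrete. The sketch below restates both computations in compact form so it runs on its own, then applies them to a made-up ranking of four detections (TP, FP, TP, FP) against two ground-truth boxes; the function names and the data are illustrative.

```python
import numpy as np

def ap_11_point(precision, recall):
    """11-point interpolated AP."""
    ap = 0.0
    for t in np.linspace(0.0, 1.0, 11):
        mask = recall >= t
        ap += (precision[mask].max() if mask.any() else 0.0) / 11.0
    return float(ap)

def ap_all_points(precision, recall):
    """Area under the precision envelope."""
    mrec = np.concatenate(([0.0], recall, [1.0]))
    mpre = np.concatenate(([0.0], precision, [0.0]))
    for i in range(mpre.size - 1, 0, -1):
        mpre[i - 1] = max(mpre[i - 1], mpre[i])
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))

# Four detections sorted by confidence: TP, FP, TP, FP; two ground-truth boxes.
tp = np.array([1, 0, 1, 0])
cum_tp = np.cumsum(tp)
precision = cum_tp / np.arange(1, len(tp) + 1)   # [1.0, 0.5, 0.667, 0.5]
recall = cum_tp / 2.0                            # [0.5, 0.5, 1.0, 1.0]

ap11 = ap_11_point(precision, recall)
ap_all = ap_all_points(precision, recall)
print(f"11-point AP = {ap11:.4f}, all-point AP = {ap_all:.4f}")
```

Here the envelope is 1.0 up to recall 0.5 and 2/3 from there to recall 1.0, so the all-point area is 0.5 × 1.0 + 0.5 × 2/3 = 5/6, while the 11-point average is (6 × 1.0 + 5 × 2/3) / 11 = 28/33.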
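2. AR (Average Recall)
Python implementation:
python
def calculate_ar(detections, ground_truth, max_dets_list=(1, 10, 100), iou_threshold=0.5):
    """Compute recall at one IoU threshold for several maxDets limits.

    detections must be sorted by descending confidence. calculate_iou(box_a, box_b)
    is assumed to be defined elsewhere. COCO's AR additionally averages over
    IoU thresholds 0.50:0.05:0.95.
    """
    ar_dict = {}
    for max_dets in max_dets_list:
        limited_detections = detections[:max_dets]
        used = set()  # each detection may match at most one ground-truth box
        tp = 0
        for gt in ground_truth:
            for idx, det in enumerate(limited_detections):
                if idx not in used and calculate_iou(det.bbox, gt.bbox) >= iou_threshold:
                    used.add(idx)
                    tp += 1
                    break
        # Every unmatched ground-truth box is a false negative
        ar_dict[max_dets] = tp / len(ground_truth) if ground_truth else 0.0
    return ar_dict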
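3. mAP (Mean Average Precision)
Python implementation:
python
import numpy as np

def calculate_map(all_class_aps, iou_thresholds=None):
    """Compute COCO-style mAP: the mean AP over classes and IoU thresholds 0.50:0.05:0.95.

    all_class_aps holds one dict per class, mapping an IoU threshold
    (rounded to 2 decimals) to the AP at that threshold.
    """
    if iou_thresholds is None:
        iou_thresholds = np.linspace(0.5, 0.95, 10)
    map_per_iou = []
    for iou_thresh in iou_thresholds:
        # Round to avoid float-key mismatches such as 0.6000000000000001
        key = round(float(iou_thresh), 2)
        aps_at_iou = [ap_dict[key] for ap_dict in all_class_aps]
        map_per_iou.append(np.mean(aps_at_iou))
    map_value = np.mean(map_per_iou)
    return map_value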
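Loss Functions in Detail
1. Classification Loss Functions
Cross Entropy Loss
Formula:
CE = -log(p), where p is the predicted probability of the true class
Characteristics:
- The most basic classification loss
- Puts no extra focus on hard examples
- Performs poorly under class imbalance
Focal Loss
Formula:
Focal Loss = -α(1-p)^γ log(p)
Parameters:
- α: class weight (typically 0.25)
- γ: focusing parameter (typically 2.0)
Characteristics:
- Automatically focuses on hard examples
- Mitigates class imbalance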
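The effect of the (1-p)^γ factor is easiest to see numerically. The sketch below evaluates the two formulas above for an easy example (p = 0.9) and a hard one (p = 0.1) and compares how much more the hard example weighs under each loss; the probabilities are made-up values.

```python
import math

def cross_entropy(p):
    """CE = -log(p), where p is the predicted probability of the true class."""
    return -math.log(p)

def focal_loss(p, alpha=0.25, gamma=2.0):
    """Focal Loss = -alpha * (1 - p)^gamma * log(p)."""
    return -alpha * (1.0 - p) ** gamma * math.log(p)

easy, hard = 0.9, 0.1  # made-up probabilities for the true class

# How many times larger is the hard example's loss than the easy one's?
ce_ratio = cross_entropy(hard) / cross_entropy(easy)
fl_ratio = focal_loss(hard) / focal_loss(easy)

print(f"CE ratio: {ce_ratio:.1f}, focal ratio: {fl_ratio:.1f}")
```

With γ = 2 the focal ratio is the cross-entropy ratio multiplied by (0.9 / 0.1)² = 81, which is how the loss shifts attention from well-classified examples to hard ones.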
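Label Smoothing Loss
Formula (K classes, true class y, matching the implementation earlier in this article):
LS Loss = -(1-ε) log(p_y) - (ε/(K-1)) × Σ_{k≠y} log(p_k)
Parameters:
- ε: smoothing parameter (typically 0.1)
Characteristics:
- Prevents the model from becoming overconfident
- Improves generalization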
2. Regression Loss Functions
MSE Loss
Formula:
MSE = (1/n) × Σ(y - ŷ)²
MAE Loss
Formula:
MAE = (1/n) × Σ|y - ŷ|
Huber Loss
Formula:
Huber = { 0.5 × (y - ŷ)², if |y - ŷ| ≤ δ
        { δ × |y - ŷ| - 0.5 × δ², otherwise
Characteristics:
- More robust to outliers
- Combines the advantages of MSE and MAE
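To illustrate the robustness claim, the sketch below evaluates the Huber formula for a small error and for an outlier-sized error and compares each with the quadratic term. The error values are made up; δ = 1.0 matches the `nn.HuberLoss(delta=1.0)` setting used earlier.

```python
def huber(error, delta=1.0):
    """Huber loss for a single residual error = y - y_hat."""
    abs_err = abs(error)
    if abs_err <= delta:
        return 0.5 * error ** 2                  # quadratic near zero, like MSE
    return delta * abs_err - 0.5 * delta ** 2    # linear in the tails, like MAE

def half_squared(error):
    """The quadratic branch applied everywhere, for comparison."""
    return 0.5 * error ** 2

small, outlier = 0.5, 10.0  # made-up residuals

print(huber(small), half_squared(small))       # identical inside the delta band
print(huber(outlier), half_squared(outlier))   # linear vs. quadratic growth
```

For the residual of 10 the quadratic term contributes 50 while Huber contributes 9.5, so a single outlier pulls on the fit far less. The two branches meet at |error| = δ, which keeps the loss continuous.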
3. Object Detection Loss Functions
IoU Loss
Formula:
IoU Loss = 1 - IoU
GIoU Loss
Formula:
GIoU Loss = 1 - GIoU
GIoU = IoU - |C \ (A ∪ B)| / |C|, where C is the smallest box enclosing both A and B
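The formulas can be checked on two pairs of axis-aligned boxes. The sketch below is a plain-Python illustration for a single pair of boxes in (x1, y1, x2, y2) format, not the batched helper that the loss code earlier assumes; the function names and box coordinates are made up.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box; degenerate boxes have area 0."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def iou_and_giou(a, b):
    """Return (IoU, GIoU) for two axis-aligned boxes."""
    # Intersection rectangle (empty if the boxes do not overlap)
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))
    union = box_area(a) + box_area(b) - inter
    iou = inter / union if union > 0 else 0.0
    # C: the smallest box enclosing both a and b
    c_area = box_area((min(a[0], b[0]), min(a[1], b[1]),
                       max(a[2], b[2]), max(a[3], b[3])))
    giou = iou - (c_area - union) / c_area if c_area > 0 else iou
    return iou, giou

overlap_iou, overlap_giou = iou_and_giou((0, 0, 2, 2), (1, 1, 3, 3))
apart_iou, apart_giou = iou_and_giou((0, 0, 1, 1), (2, 0, 3, 1))

print(f"overlapping: IoU={overlap_iou:.4f}, GIoU={overlap_giou:.4f}")
print(f"disjoint:    IoU={apart_iou:.4f}, GIoU={apart_giou:.4f}")
```

For the disjoint pair IoU is 0 regardless of how far apart the boxes are, so the IoU loss alone gives no signal about distance. GIoU is negative there and moves toward 0 as the boxes approach each other, which gives the GIoU loss something to work with.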
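CIoU Loss
Formula:
CIoU Loss = 1 - CIoU
CIoU = IoU - (ρ²(b, b^gt) / c²) - (αv)
where ρ(b, b^gt) is the distance between the centers of the predicted and ground-truth boxes, c is the diagonal length of the smallest box enclosing both, v measures the consistency of their aspect ratios, and α is a trade-off weight.
Characteristics:
- The most comprehensive of the three IoU losses covered here
- Accounts for overlap, center distance, and aspect ratio together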
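Optimizer Metrics
1. SGD Optimizer
Characteristics:
- Simple and stable
- Requires manual learning rate tuning
- May get stuck in local optima
2. Adam Optimizer
Characteristics:
- Adaptive learning rate
- Fast convergence
- Higher memory usage (it stores two moment estimates per parameter)
3. AdamW Optimizer
Characteristics:
- An improved version of Adam
- Decoupled weight decay
- Better generalization
4. Optimizer Performance Comparison
Python implementation: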
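python
import copy
import matplotlib.pyplot as plt

def compare_optimizers(model, train_loader, optimizer_factories, num_epochs=10):
    """Compare optimizers by training identical copies of the model.

    optimizer_factories maps a name to a callable that takes model parameters and
    returns an optimizer, e.g. {'SGD': lambda params: torch.optim.SGD(params, lr=0.01)}.
    train_step(model, batch, optimizer) is assumed to be defined elsewhere and to
    return the batch loss as a float.
    """
    results = {}
    for opt_name, make_optimizer in optimizer_factories.items():
        # Each optimizer must be bound to the parameters of its own model copy;
        # an optimizer built on the original model would never update the copy.
        model_copy = copy.deepcopy(model)
        optimizer = make_optimizer(model_copy.parameters())
        train_losses = []
        for epoch in range(num_epochs):
            epoch_loss = 0
            for batch in train_loader:
                loss = train_step(model_copy, batch, optimizer)
                epoch_loss += loss
            train_losses.append(epoch_loss / len(train_loader))
        results[opt_name] = train_losses
    # Plot the loss curve of each optimizer
    plt.figure(figsize=(10, 6))
    for opt_name, losses in results.items():
        plt.plot(losses, label=opt_name)
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.title('Optimizer Comparison')
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.show()
    return results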
Visualization: the chart for this section can be generated with:
python
from docs.ml_metrics_visualization_complete import plot_optimizer_comparison
plot_optimizer_comparison()
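Regularization Metrics
1. L1 Regularization (Lasso)
Formula:
Loss = Original Loss + λ × Σ|w|
Characteristics:
- Produces sparse weights
- Acts as feature selection
2. L2 Regularization (Ridge)
Formula:
Loss = Original Loss + λ × Σw²
Characteristics:
- Keeps weights from growing too large
- Improves generalization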
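Both penalties can be computed directly from a weight vector. The sketch below uses made-up weights, a made-up λ, and a stand-in value for the original loss, only to show the arithmetic behind the two formulas.

```python
import numpy as np

weights = np.array([0.5, -1.5, 0.0, 2.0])  # made-up weights
lam = 0.01                                 # regularization strength λ (made-up)
base_loss = 0.30                           # stand-in for the original loss

l1_penalty = lam * np.sum(np.abs(weights))   # 0.01 * (0.5 + 1.5 + 0 + 2.0)
l2_penalty = lam * np.sum(weights ** 2)      # 0.01 * (0.25 + 2.25 + 0 + 4.0)

l1_loss = base_loss + l1_penalty
l2_loss = base_loss + l2_penalty
print(f"L1 total: {l1_loss:.3f}, L2 total: {l2_loss:.3f}")
```

The largest weight (2.0) accounts for half of the L1 penalty but over 60% of the L2 penalty. This is why L2 mainly shrinks large weights, whereas L1 applies the same pressure to every nonzero weight and tends to drive small ones to exactly zero.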
3. Dropout
Definition: randomly dropping a fraction of neurons during training.
Python implementation:
python
import matplotlib.pyplot as plt
import torch.nn as nn

class DropoutTracker:
    """Tracks accuracy at different dropout rates."""

    def __init__(self):
        self.train_accuracies = []
        self.val_accuracies = []
        self.dropout_rates = []

    def evaluate_dropout_rate(self, model, dropout_rate, train_loader, val_loader):
        """Record accuracy for one dropout rate.

        evaluate(model, loader) is assumed to be defined elsewhere and to return accuracy.
        Dropout is inactive in eval mode, so the model should be retrained at this
        rate before it is evaluated here.
        """
        # Set the dropout rate on every nn.Dropout module
        for module in model.modules():
            if isinstance(module, nn.Dropout):
                module.p = dropout_rate
        # Evaluate on both splits
        train_acc = evaluate(model, train_loader)
        val_acc = evaluate(model, val_loader)
        self.dropout_rates.append(dropout_rate)
        self.train_accuracies.append(train_acc)
        self.val_accuracies.append(val_acc)

    def plot(self):
        plt.figure(figsize=(10, 6))
        plt.plot(self.dropout_rates, self.train_accuracies, 'o-', label='Train')
        plt.plot(self.dropout_rates, self.val_accuracies, 'o-', label='Val')
        plt.xlabel('Dropout Rate')
        plt.ylabel('Accuracy')
        plt.title('Dropout Rate Analysis')
        plt.legend()
        plt.grid(True, alpha=0.3)
        plt.show()
Visualization: the chart for this section can be generated with:
python
from docs.ml_metrics_visualization_complete import plot_regularization_effects
plot_regularization_effects()
4. Batch Normalization
Definition: normalizing the activations within each batch.
Metrics for its effect:
- Training stability
- Convergence speed
- Final performance
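The normalization itself is only a few lines. The sketch below is a simplified NumPy version of the training-time computation for a 2-D batch of shape (batch, features); it leaves out the running statistics that real layers keep for inference, and the function name, `gamma`, `beta`, `eps`, and the random input are illustrative assumptions.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the batch dimension, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)                      # biased variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)  # eps guards against division by zero
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(64, 4))  # made-up activations

y = batch_norm(x)
print("mean per feature:", y.mean(axis=0).round(6))
print("std per feature: ", y.std(axis=0).round(4))
```

Whatever the scale of the incoming activations, each feature leaves the layer with roughly zero mean and unit variance, which is the mechanism behind the stability and convergence effects listed above.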
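Summary
Key Points
- Training-process metrics
  - Loss: training loss, validation loss
  - Accuracy: training accuracy, validation accuracy
  - Gradients: gradient norm, gradient clipping
  - Learning rate: scheduling, tracking
- Model parameter metrics
  - Parameter count: total parameters, trainable parameters
  - FLOPs: computational complexity
  - Model size: storage space
  - Memory usage: training memory, inference memory
- Hyperparameter tuning metrics
  - Search strategies: grid search, random search, Bayesian optimization
  - Hyperparameter importance: sensitivity analysis
  - Visualization: hyperparameter relationship plots
- Model validation metrics
  - Cross-validation: K-fold, stratified K-fold
  - Hold-out: training/test split
  - Time series validation: time series cross-validation
Best Practices
- Training monitoring
  - Monitor loss and accuracy in real time
  - Set up early stopping
  - Keep a record of the best model
- Model optimization
  - Balance model complexity against performance
  - Take compute resource limits into account
  - Tune the hyperparameters
- Validation strategy
  - Use cross-validation to assess generalization
  - Keep the data distribution consistent across splits
  - Avoid data leakage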
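Next Steps
The advanced article will cover:
- Post-processing metrics: NMS, threshold optimization, and so on
- Model quantization metrics: accuracy loss, compression ratio, speedup, and so on
- Deployment metrics: latency, throughput, resource usage, and so on
- End-to-end monitoring metrics: evaluating performance across the whole pipeline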