Model Evaluation Metrics Explained (Regression, Classification, Multi-class Classification)

  1. Regression Tasks
    • MSE (Mean Squared Error): the average of the squared differences between predicted and actual values. The smaller the MSE, the better the model's predictions. Because the errors are squared, large errors are penalized heavily.
    • RMSE (Root Mean Squared Error): the square root of the MSE, expressed in the same units as the target variable. It is often used in financial forecasting.
    • MAE (Mean Absolute Error): the average of the absolute differences between predicted and actual values. The smaller the MAE, the better the model's predictions.
    • R-Square (R-squared): the proportion of the variance in the target that the model explains. The closer R-Square is to 1, the better the fit.
    • MAPE (Mean Absolute Percentage Error): the average of the absolute errors expressed as a fraction of the actual values. The smaller the MAPE, the better the model's predictions. It is undefined when an actual value is 0.
  2. Binary Classification Tasks
    • Accuracy: the proportion of all samples that the model predicts correctly.
    • Precision: the proportion of samples predicted as positive that are actually positive.
    • Recall: the proportion of actual positive samples that the model predicts as positive.
    • LogLoss: logarithmic loss (cross-entropy), the average negative log-likelihood of the true labels under the predicted probabilities. The smaller the LogLoss, the better; confident wrong predictions are penalized most.
    • F1-Score: the harmonic mean of precision and recall, used as a single combined measure of the two. The higher the F1-Score, the better the model.
    • AUC (Area Under Curve): the area under the ROC curve. It equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. A value of 0.5 corresponds to random ranking; the closer the AUC is to 1, the better the model.
    • Confusion Matrix: a table of counts of actual classes against predicted classes (true positives, false positives, false negatives, true negatives), used to see which kinds of errors the model makes.
  3. Multi-class Classification Tasks
    • Accuracy: the proportion of all samples that the model predicts correctly.
    • Precision: for a given class, the proportion of samples predicted as that class that actually belong to it.
    • Recall: for a given class, the proportion of samples that actually belong to that class and are predicted as it.
    • Confusion Matrix: a table of counts of each actual class against each predicted class, used to see which classes the model confuses with one another.
    • ROC Curve: plots the true positive rate against the false positive rate at different thresholds. The closer the curve is to the upper-left corner, the better the model. For multi-class problems it is typically drawn per class, treating that class as positive and the rest as negative.
    • PR Curve: plots precision against recall at different thresholds. The closer the curve is to the upper-right corner, the better the model.
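
As a rough illustration of the ROC and PR curves listed above, the following sketch computes the points of both curves for a small binary problem using scikit-learn's `roc_curve` and `precision_recall_curve`. The labels and the `y_score` values are made-up numbers, not results from a real model. For a multi-class problem, one common approach is to repeat this per class, treating that class as positive and all others as negative.

```python
from sklearn.metrics import precision_recall_curve, roc_curve

# Hypothetical true labels and predicted positive-class scores (illustrative only)
y_true = [0, 1, 1, 0, 0, 1]
y_score = [0.2, 0.9, 0.4, 0.1, 0.6, 0.8]

# ROC curve: false positive rate vs. true positive rate at each threshold
fpr, tpr, roc_thresholds = roc_curve(y_true, y_score)
for f, t in zip(fpr, tpr):
    print(f"FPR={f:.2f}  TPR={t:.2f}")

# PR curve: precision vs. recall at each threshold
precision, recall, pr_thresholds = precision_recall_curve(y_true, y_score)
for p, r in zip(precision, recall):
    print(f"Precision={p:.2f}  Recall={r:.2f}")
```

Each printed pair is one point on the corresponding curve; plotting them (for example with matplotlib) gives the usual picture.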

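Regression

  • Mean Absolute Error (MAE): measures the average absolute difference between predicted and actual values. The smaller the value, the better the fit.
  • Mean Squared Error (MSE): the average of the squared differences between predicted and actual values, used to measure prediction accuracy. As with MAE, the smaller the value, the better the fit.
  • R-Squared (R2): evaluates how well a regression model fits. It is the proportion of the variance of the actual values that the model's predictions explain. The best possible value is 1; a model that always predicts the mean of the actual values scores 0, and a model that does worse than that scores below 0.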
```python
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Actual and predicted values
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 1.8, 8]

# MAE
mae = mean_absolute_error(y_true, y_pred)
print("MAE:", mae)

# MSE
mse = mean_squared_error(y_true, y_pred)
print("MSE:", mse)

# R2
r2 = r2_score(y_true, y_pred)
print("R2:", r2)
```
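
The overview also lists RMSE and MAPE, which the example above does not compute. A minimal sketch in plain Python, reusing the same illustrative values, shows how both follow directly from their definitions; the variable names here are chosen for this sketch only.

```python
import math

# Same illustrative values as in the example above
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 1.8, 8]

n = len(y_true)

# RMSE: square root of the mean squared error, in the units of the target
mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
rmse = math.sqrt(mse)
print("RMSE:", rmse)

# MAPE: mean of |error| / |actual|; undefined if any actual value is 0
mape = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n
print("MAPE:", mape)
```

Note how the second sample (actual -0.5, predicted 0.0) contributes a 100% error to MAPE on its own: percentage errors grow quickly when actual values are close to zero.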

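Classification

  1. Accuracy: the ratio of correctly predicted samples to the total number of samples, a measure of overall classification correctness. It ranges over [0, 1]; higher is better.
  2. Precision: of the samples predicted as positive, the proportion that are actually positive. It measures how reliable the model's positive predictions are. It ranges over [0, 1]; higher is better.
  3. Recall: of the samples that are actually positive, the proportion correctly predicted as positive. It measures the model's ability to find the true positives. It ranges over [0, 1]; higher is better.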
```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Actual and predicted labels
y_true = [0, 1, 1, 0, 0, 1]
y_pred = [0, 1, 0, 0, 1, 1]

# Accuracy
accuracy = accuracy_score(y_true, y_pred)
print("Accuracy:", accuracy)

# Precision
precision = precision_score(y_true, y_pred)
print("Precision:", precision)

# Recall
recall = recall_score(y_true, y_pred)
print("Recall:", recall)
```
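
The remaining binary metrics from the overview (F1-Score, LogLoss, AUC, and the confusion matrix) can be sketched in the same way. LogLoss and AUC need predicted probabilities rather than hard labels, so this sketch assumes a hypothetical `y_prob` list. Its values are invented for illustration and chosen so that a 0.5 threshold reproduces the predicted labels used above.

```python
from sklearn.metrics import confusion_matrix, f1_score, log_loss, roc_auc_score

# Same labels as above; y_prob holds hypothetical predicted probabilities of class 1
y_true = [0, 1, 1, 0, 0, 1]
y_prob = [0.2, 0.9, 0.4, 0.1, 0.6, 0.8]

# Turn probabilities into hard labels with a 0.5 threshold
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

# F1-Score: harmonic mean of precision and recall (uses hard labels)
f1 = f1_score(y_true, y_pred)
print("F1-Score:", f1)

# LogLoss and AUC are computed from the probabilities, not the hard labels
ll = log_loss(y_true, y_prob)
print("LogLoss:", ll)
auc = roc_auc_score(y_true, y_prob)
print("AUC:", auc)

# Confusion matrix: rows are actual classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred)
print(cm)
```

With scikit-learn's layout, `cm[0][0]` counts true negatives, `cm[0][1]` false positives, `cm[1][0]` false negatives, and `cm[1][1]` true positives.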

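Multi-class classification

  1. Evaluation metrics for multi-class tasks:
    • Categorical Accuracy: accuracy in a multi-class problem, the ratio of correctly predicted samples to the total number of samples. It ranges over [0, 1]; higher is better.
    • F1-Score: combines precision and recall as their harmonic mean, balancing the reliability of predictions against the ability to find the true members of a class. In a multi-class problem it is computed per class and then averaged; with average='macro', every class carries equal weight. It ranges over [0, 1]; higher is better.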
```python
from sklearn.metrics import accuracy_score, f1_score

# Actual and predicted labels
y_true = [0, 1, 2, 1, 0, 2]
y_pred = [0, 1, 1, 2, 0, 1]

# Categorical Accuracy
accuracy = accuracy_score(y_true, y_pred)
print("Accuracy:", accuracy)

# F1-Score
f1 = f1_score(y_true, y_pred, average='macro')
print("F1-Score:", f1)
```
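
A single averaged score can hide which classes the model struggles with. As a sketch of the per-class precision, per-class recall, and confusion matrix from the overview, the following reuses the same illustrative labels and passes `average=None` to get one score per class.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Same labels as in the example above
y_true = [0, 1, 2, 1, 0, 2]
y_pred = [0, 1, 1, 2, 0, 1]

# average=None returns one score per class instead of a single average
precision_per_class = precision_score(y_true, y_pred, average=None)
recall_per_class = recall_score(y_true, y_pred, average=None)
print("Precision per class:", precision_per_class)
print("Recall per class:", recall_per_class)

# Confusion matrix: rows are actual classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred)
print(cm)
```

In this toy data, class 0 is always predicted correctly, while both samples of class 2 are predicted as class 1, which shows up as a 2 in row 2, column 1 of the matrix.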