Model Evaluation Metrics Explained (Regression, Classification, Multi-class)

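  1. Regression Tasks
    • MSE (Mean Squared Error): the average of the squared differences between predicted and actual values. The smaller the MSE, the better the model's predictions. Squaring makes it sensitive to large errors.
    • RMSE (Root Mean Squared Error): the square root of the MSE, expressed in the same units as the target. It is commonly used in forecasting, for example in finance.
    • MAE (Mean Absolute Error): the average of the absolute differences between predicted and actual values. The smaller the MAE, the better the model's predictions.
    • R-Square (R-squared): the proportion of the variance in the data that the model explains. The closer R-Square is to 1, the better the fit.
    • MAPE (Mean Absolute Percentage Error): the average absolute error expressed as a percentage of the actual values. The smaller the MAPE, the better the model's predictions. It is undefined when an actual value is 0.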
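  2. Binary Classification Tasks
    • Accuracy: the proportion of all samples that the model predicts correctly.
    • Precision: the proportion of samples predicted as positive that are actually positive.
    • Recall: the proportion of actual positives that the model predicts as positive.
    • LogLoss: logarithmic loss (cross-entropy), the average negative log of the probability the model assigns to the true label. The smaller the LogLoss, the better the predicted probabilities.
    • F1-Score: the harmonic mean of precision and recall, used as a single combined measure. The higher the F1-Score, the better the model.
    • AUC (Area Under Curve): the area under the ROC curve, equal to the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The closer the AUC is to 1, the better the model; 0.5 corresponds to random ranking.
    • Confusion Matrix: a table of predicted versus actual classes (true positives, false positives, true negatives, false negatives), used to analyze where the model's predictions go wrong.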
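  3. Multi-class Classification Tasks
    • Accuracy: the proportion of all samples that the model predicts correctly.
    • Precision: the proportion of samples predicted as a given class that actually belong to that class.
    • Recall: the proportion of samples that actually belong to a given class that the model predicts as that class.
    • Confusion Matrix: a table of predicted versus actual classes, used to analyze which classes the model confuses with one another.
    • ROC Curve: the true positive rate plotted against the false positive rate at different thresholds; for multi-class problems it is typically drawn per class (one-vs-rest). The closer the curve is to the upper-left corner, the better the model.
    • PR Curve: precision plotted against recall at different thresholds. The closer the curve is to the upper-right corner, the better the model.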
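
LogLoss and AUC are the two metrics in the list above that work on predicted probabilities rather than hard labels. The following is a minimal pure-Python sketch of both definitions, not the scikit-learn implementation. The helper names `log_loss` and `auc` and the probabilities in `y_prob` are made up for illustration.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Average negative log-probability assigned to the true label."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

def auc(y_true, y_score):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and predicted probabilities of the positive class
y_true = [0, 1, 1, 0, 0, 1]
y_prob = [0.1, 0.9, 0.4, 0.2, 0.6, 0.8]

print("LogLoss:", log_loss(y_true, y_prob))
print("AUC:", auc(y_true, y_prob))
```

The pairwise AUC loop is quadratic in the number of samples, which is fine for a demonstration; library implementations use sorting instead.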

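Regression

  • Mean Absolute Error (MAE): the average absolute difference between predicted and true values. The smaller the value, the better the fit.
  • Mean Squared Error (MSE): the average of the squared differences between predicted and true values, used to measure prediction accuracy. As with MAE, the smaller the value, the better the fit.
  • R-Squared (R2): a measure of goodness of fit for regression models: the proportion of the variance in the true values that the model's predictions explain. Its maximum is 1, and it can be negative when the model fits worse than always predicting the mean; the closer to 1, the better the fit.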
```python
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Actual and predicted values
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 1.8, 8]

# MAE
mae = mean_absolute_error(y_true, y_pred)
print("MAE:", mae)

# MSE
mse = mean_squared_error(y_true, y_pred)
print("MSE:", mse)

# R2
r2 = r2_score(y_true, y_pred)
print("R2:", r2)
```
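
RMSE and MAPE are described earlier but not computed above. As a rough sketch, they can be written by hand on the same data, along with a hand-rolled R2 that shows the score going negative for a poor model. The helper names `rmse`, `mape`, and `r2_by_hand` and the constant prediction of 10 are made up for illustration.

```python
import math

def rmse(y_true, y_pred):
    """Square root of the mean squared difference."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (0.1 means 10%).
    Assumes no actual value is 0."""
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return sum(ratios) / len(ratios)

def r2_by_hand(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 1.8, 8]

print("RMSE:", rmse(y_true, y_pred))
print("MAPE:", mape(y_true, y_pred))
print("R2:", r2_by_hand(y_true, y_pred))

# A model that always predicts 10 fits worse than predicting the mean,
# so its R2 is below 0.
print("R2 (constant 10):", r2_by_hand(y_true, [10, 10, 10, 10]))
```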

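Classification

  1. Accuracy: the proportion of samples the classifier predicts correctly out of all samples, measuring overall classification correctness. It ranges over [0, 1]; the larger the value, the better the model.
  2. Precision: the proportion of samples predicted as positive that really are positive. It measures how reliable the model's positive predictions are. It ranges over [0, 1]; the larger the value, the better the model.
  3. Recall: the proportion of truly positive samples that are correctly predicted as positive. It measures the model's ability to find the real positives. It ranges over [0, 1]; the larger the value, the better the model.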
```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Actual and predicted labels
y_true = [0, 1, 1, 0, 0, 1]
y_pred = [0, 1, 0, 0, 1, 1]

# Accuracy
accuracy = accuracy_score(y_true, y_pred)
print("Accuracy:", accuracy)

# Precision
precision = precision_score(y_true, y_pred)
print("Precision:", precision)

# Recall
recall = recall_score(y_true, y_pred)
print("Recall:", recall)
```
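
All three metrics above, and the F1-Score, follow from the four cells of the confusion matrix. A minimal sketch that counts those cells by hand for the same labels is shown here; `confusion_counts` is a hypothetical helper, not a scikit-learn function.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary problem."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

y_true = [0, 1, 1, 0, 0, 1]
y_pred = [0, 1, 0, 0, 1, 1]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + fp + tn + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print("TP, FP, TN, FN:", tp, fp, tn, fn)
print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)
print("F1-Score:", f1)
```

This sketch assumes at least one predicted positive and one actual positive; otherwise the precision or recall denominator would be zero.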

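Multi-class Classification

  1. Evaluation metrics for multi-class tasks:
    • Categorical Accuracy: accuracy for multi-class problems, the proportion of samples predicted correctly out of all samples. It ranges over [0, 1]; the larger the value, the better the model.
    • F1-Score: the harmonic mean of precision and recall, balancing how reliable the predictions for a class are against how many true members of that class are found. For multi-class problems it is computed per class and then averaged (macro averaging weights every class equally). It ranges over [0, 1]; the larger the value, the better the model.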
```python
from sklearn.metrics import accuracy_score, f1_score

# Actual and predicted labels
y_true = [0, 1, 2, 1, 0, 2]
y_pred = [0, 1, 1, 2, 0, 1]

# Categorical Accuracy
accuracy = accuracy_score(y_true, y_pred)
print("Accuracy:", accuracy)

# F1-Score (macro: unweighted mean of the per-class F1 scores)
f1 = f1_score(y_true, y_pred, average='macro')
print("F1-Score:", f1)
```
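
To make the macro average concrete, the per-class F1 scores can be computed by hand on the same labels. This is a sketch under the assumption that a class with no true positives, false positives, or false negatives scores 0; `per_class_f1` is a hypothetical helper name.

```python
def per_class_f1(y_true, y_pred):
    """Treat each class in turn as 'positive' and compute its F1."""
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores

y_true = [0, 1, 2, 1, 0, 2]
y_pred = [0, 1, 1, 2, 0, 1]

scores = per_class_f1(y_true, y_pred)
macro_f1 = sum(scores.values()) / len(scores)

print("Per-class F1:", scores)
print("Macro F1:", macro_f1)
```

Class 2 is never predicted correctly here, so its F1 of 0 pulls the macro average well below the overall accuracy of 0.5 would suggest for the two other classes alone.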