Model Evaluation Metrics Explained (Regression, Classification, Multi-class Classification)

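  1. Regression Tasks
    • MSE (Mean Squared Error): the average of the squared differences between predicted and actual values. The smaller the MSE, the better the model's predictions.
    • RMSE (Root Mean Squared Error): the square root of the MSE, which puts the error back in the same units as the target. It is often used in financial forecasting.
    • MAE (Mean Absolute Error): the average of the absolute differences between predicted and actual values. The smaller the MAE, the better the model's predictions.
    • R-Square (R-squared): the proportion of the variance in the target that the model explains. The closer R-Square is to 1, the better the fit.
    • MAPE (Mean Absolute Percentage Error): the average of the absolute errors expressed as a fraction of the actual values. The smaller the MAPE, the better the model's predictions. It is undefined when an actual value is 0.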
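  2. Binary Classification Tasks
    • Accuracy: the proportion of all samples that the model predicts correctly.
    • Precision: the proportion of samples predicted as positive that are actually positive.
    • Recall: the proportion of actually positive samples that the model predicts as positive.
    • LogLoss: logarithmic loss, the average negative log of the probability the model assigns to each sample's true label. The smaller the LogLoss, the better the predicted probabilities.
    • F1-Score: the harmonic mean of precision and recall, used as a single summary of both. The higher the F1-Score, the better the model.
    • AUC (Area Under Curve): the area under the ROC curve. It equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The closer the AUC is to 1, the better the model.
    • Confusion Matrix: a table that counts predicted positives and negatives against the actual labels, used to see which kinds of errors the model makes.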
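  3. Multi-class Classification Tasks
    • Accuracy: the proportion of all samples that the model predicts correctly.
    • Precision: for a given class, the proportion of samples predicted as that class that actually belong to it.
    • Recall: for a given class, the proportion of samples actually in that class that the model predicts as that class.
    • Confusion Matrix: a table that counts each predicted class against each actual class, used to see which classes the model confuses.
    • ROC Curve: the true positive rate plotted against the false positive rate at different thresholds, computed per class (one-vs-rest) in the multi-class case. The closer the curve is to the upper left corner, the better the model.
    • PR Curve: precision plotted against recall at different thresholds. The closer the curve is to the upper right corner, the better the model.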
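
The threshold-dependent metrics in the list above (ROC curve, PR curve, AUC) can be made concrete with a small dependency-free sketch. The labels and scores below are made-up illustration data, and `rates_at` and `pairwise_auc` are hypothetical helper names, not library functions; in practice scikit-learn's `roc_curve`, `precision_recall_curve`, and `roc_auc_score` cover the same ground.

```python
# Illustrative data (assumed, not from a real model): true labels and predicted scores.
y_true = [0, 1, 1, 0, 0, 1]
y_score = [0.1, 0.9, 0.4, 0.2, 0.6, 0.8]


def rates_at(threshold):
    """Return (TPR, FPR, precision) when scores >= threshold count as positive."""
    tp = sum(1 for t, s in zip(y_true, y_score) if t == 1 and s >= threshold)
    fp = sum(1 for t, s in zip(y_true, y_score) if t == 0 and s >= threshold)
    fn = sum(1 for t, s in zip(y_true, y_score) if t == 1 and s < threshold)
    tn = sum(1 for t, s in zip(y_true, y_score) if t == 0 and s < threshold)
    tpr = tp / (tp + fn)  # recall, the y-axis of the ROC curve
    fpr = fp / (fp + tn)  # the x-axis of the ROC curve
    # Convention assumed here: precision is 1.0 when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    return tpr, fpr, precision


def pairwise_auc():
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Each threshold gives one point on the ROC curve (FPR, TPR) and one on the PR curve.
for threshold in (0.3, 0.5, 0.7):
    tpr, fpr, precision = rates_at(threshold)
    print(f"threshold={threshold}: TPR={tpr:.2f} FPR={fpr:.2f} precision={precision:.2f}")

auc = pairwise_auc()
print("AUC:", round(auc, 4))
```

Lowering the threshold raises recall (TPR) at the cost of more false positives, which is the trade-off both curves trace out.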

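Regression

  • Mean Absolute Error (MAE): measures the average absolute gap between the model's predictions and the true values. The smaller the value, the better the fit.
  • Mean Squared Error (MSE): the average of the squared differences between predictions and true values, used to measure prediction accuracy. As with MAE, the smaller the value, the better the fit.
  • R-Squared (R2): evaluates how well a regression model fits. It measures the proportion of the variance in the true values that the model's predictions explain. The best possible value is 1, and it can be negative when the model fits worse than always predicting the mean. The closer to 1, the better the fit.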
```python
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Actual and predicted values
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 1.8, 8]

# MAE
mae = mean_absolute_error(y_true, y_pred)
print("MAE:", mae)

# MSE
mse = mean_squared_error(y_true, y_pred)
print("MSE:", mse)

# R2
r2 = r2_score(y_true, y_pred)
print("R2:", r2)
```
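
RMSE and MAPE from the overview are not covered by the snippet above. A minimal sketch that computes them, along with R2, directly from their definitions on the same sample values follows. The helper names `rmse`, `mape`, and `r2` are made up for this illustration, and the last line uses an assumed constant prediction of 10 to show that R2 drops below 0 when a model does worse than predicting the mean.

```python
import math

y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 1.8, 8]


def rmse(actual, predicted):
    """Square root of the mean squared error; same units as the target."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


def mape(actual, predicted):
    """Mean of |error / actual|; undefined if any actual value is 0."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def r2(actual, predicted):
    """1 - SS_res / SS_tot, with SS_tot measured around the mean of the actual values."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


print("RMSE:", rmse(y_true, y_pred))
print("MAPE:", mape(y_true, y_pred))
print("R2:", r2(y_true, y_pred))
# Assumed bad model: always predict 10. R2 is negative here.
print("R2 (constant 10):", r2(y_true, [10, 10, 10, 10]))
```

Note how the second sample (actual value -0.5) dominates the MAPE: a small absolute error becomes a 100% relative error when the actual value is close to 0.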

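Classification

  1. Accuracy: the ratio of correctly predicted samples to all samples, a measure of overall classification correctness. It ranges over [0, 1]; the larger the value, the better the model performs.
  2. Precision: of the samples predicted as positive, the proportion that really are positive. It measures how reliable the model's positive predictions are. It ranges over [0, 1]; the larger the value, the better the model performs.
  3. Recall: of the samples that really are positive, the proportion correctly predicted as positive. It measures the model's ability to find the true positives. It ranges over [0, 1]; the larger the value, the better the model performs.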
```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Actual and predicted labels
y_true = [0, 1, 1, 0, 0, 1]
y_pred = [0, 1, 0, 0, 1, 1]

# Accuracy
accuracy = accuracy_score(y_true, y_pred)
print("Accuracy:", accuracy)

# Precision
precision = precision_score(y_true, y_pred)
print("Precision:", precision)

# Recall
recall = recall_score(y_true, y_pred)
print("Recall:", recall)
```
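
The remaining binary metrics from the overview (confusion matrix, F1-Score, LogLoss) can be derived by hand from the same labels. The sketch below is a plain-Python illustration rather than the scikit-learn implementation; `y_prob` is made-up probability data chosen to agree with `y_pred` at a 0.5 threshold.

```python
import math

y_true = [0, 1, 1, 0, 0, 1]
y_pred = [0, 1, 0, 0, 1, 1]
# Assumed predicted probabilities of the positive class (illustration only).
y_prob = [0.1, 0.9, 0.4, 0.2, 0.6, 0.8]

# Confusion matrix counts: each (actual, predicted) pair falls in one of four cells.
pairs = list(zip(y_true, y_pred))
tp = pairs.count((1, 1))
fp = pairs.count((0, 1))
fn = pairs.count((1, 0))
tn = pairs.count((0, 0))
print("TN, FP, FN, TP:", tn, fp, fn, tp)

# F1 is the harmonic mean of precision and recall.
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print("F1-Score:", f1)

# LogLoss: average negative log of the probability assigned to the true label.
log_loss = -sum(
    math.log(p) if t == 1 else math.log(1 - p) for t, p in zip(y_true, y_prob)
) / len(y_true)
print("LogLoss:", log_loss)
```

scikit-learn provides `confusion_matrix`, `f1_score`, and `log_loss` for the same calculations.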

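Multi-class Classification

  1. Evaluation metrics for multi-class tasks:
    • Categorical Accuracy: accuracy in a multi-class problem, the ratio of correctly predicted samples to all samples. It ranges over [0, 1]; the larger the value, the better the model performs.
    • F1-Score: combines precision and recall as their harmonic mean, balancing how reliable the predictions are against how many true samples are found. In the multi-class case it is computed per class and then averaged, for example with macro averaging. It ranges over [0, 1]; the larger the value, the better the model performs.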
```python
from sklearn.metrics import accuracy_score, f1_score

# Actual and predicted labels
y_true = [0, 1, 2, 1, 0, 2]
y_pred = [0, 1, 1, 2, 0, 1]

# Categorical Accuracy
accuracy = accuracy_score(y_true, y_pred)
print("Accuracy:", accuracy)

# F1-Score (macro: unweighted mean of the per-class F1 scores)
f1 = f1_score(y_true, y_pred, average='macro')
print("F1-Score:", f1)
```
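
To see what `average='macro'` does, the sketch below builds the confusion matrix for the same labels and averages the per-class F1 scores by hand. It is a plain-Python illustration under the assumption that a class with no correct predictions gets a score of 0; `per_class_f1` is a hypothetical helper name.

```python
y_true = [0, 1, 2, 1, 0, 2]
y_pred = [0, 1, 1, 2, 0, 1]
classes = sorted(set(y_true) | set(y_pred))

# Confusion matrix: rows are actual classes, columns are predicted classes.
matrix = [[0] * len(classes) for _ in classes]
for t, p in zip(y_true, y_pred):
    matrix[classes.index(t)][classes.index(p)] += 1
for row in matrix:
    print(row)


def per_class_f1(k):
    """F1 for class index k, treating that class as positive and all others as negative."""
    tp = matrix[k][k]
    predicted_k = sum(row[k] for row in matrix)  # column sum: predicted as class k
    actual_k = sum(matrix[k])                    # row sum: actually class k
    if tp == 0:
        return 0.0  # assumed convention when precision or recall is 0 or undefined
    precision = tp / predicted_k
    recall = tp / actual_k
    return 2 * precision * recall / (precision + recall)


f1_scores = [per_class_f1(k) for k in range(len(classes))]
macro_f1 = sum(f1_scores) / len(f1_scores)
print("Per-class F1:", f1_scores)
print("Macro F1:", macro_f1)
```

Macro averaging weights every class equally, so a class that is never predicted correctly pulls the score down regardless of how many samples it has.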