Model Evaluation Metrics: Concepts (Regression, Classification, Multi-class Classification)

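  1. Regression Tasks
    • MSE (Mean Squared Error): the average of the squared differences between predicted and actual values. The smaller the MSE, the better the model's predictions.
    • RMSE (Root Mean Squared Error): the square root of the MSE. It is expressed in the same units as the target variable, which makes it easier to interpret than MSE, and it is often used for forecasting in the financial domain.
    • MAE (Mean Absolute Error): the average of the absolute differences between predicted and actual values. The smaller the MAE, the better the model's predictions.
    • R-Square (R-squared): the proportion of the variance in the data that the model explains. The closer R-Square is to 1, the better the fit. A value of 0 matches a model that always predicts the mean, and the value can be negative for a model that does worse than that.
    • MAPE (Mean Absolute Percentage Error): the average of the absolute errors expressed as a fraction of the actual values. The smaller the MAPE, the better the model's predictions. It is undefined when an actual value is 0.
  2. Binary Classification Tasks
    • Accuracy: the proportion of all samples that the model predicts correctly.
    • Precision: the proportion of samples predicted as positive that are actually positive.
    • Recall: the proportion of actual positive samples that the model predicts as positive.
    • LogLoss: logarithmic loss, the average negative log of the probability the model assigns to the true label. It penalizes confident wrong predictions heavily. The smaller the LogLoss, the better the model's predictions.
    • F1-Score: the harmonic mean of precision and recall, used as a single summary of the two. The higher the F1-Score, the better the model performs.
    • AUC (Area Under Curve): the area under the ROC curve. It equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative sample. The closer the AUC is to 1, the better the model performs; 0.5 corresponds to random ranking.
    • Confusion Matrix: a table that counts how predicted positives and negatives line up with the actual labels (true positives, false positives, false negatives, true negatives), used to analyze where the model makes errors.
  3. Multi-class Classification Tasks
    • Accuracy: the proportion of all samples that the model predicts correctly.
    • Precision: for a given class, the proportion of samples predicted as that class that actually belong to it.
    • Recall: for a given class, the proportion of samples that actually belong to that class and are predicted as that class.
    • Confusion Matrix: a table that counts how the predicted classes line up with the actual classes, used to analyze which classes the model confuses.
    • ROC Curve: the true positive rate plotted against the false positive rate at different thresholds. In the multi-class setting it is usually drawn per class, treating that class as positive and the rest as negative. The closer the curve is to the upper-left corner, the better the model performs.
    • PR Curve: precision plotted against recall at different thresholds. The closer the curve is to the upper-right corner, the better the model performs.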
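
The probability-based metrics in the list above (LogLoss, AUC, and the points that make up an ROC curve) can be made concrete with a minimal pure-Python sketch. The function names and the four-sample labels and scores are illustrative assumptions; for real use, scikit-learn provides `log_loss` and `roc_auc_score`.

```python
import math

def log_loss_binary(y_true, y_prob, eps=1e-15):
    """Average negative log-probability assigned to the true label."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip so that log(0) never occurs
        total += -math.log(p) if y == 1 else -math.log(1 - p)
    return total / len(y_true)

def auc_by_ranking(y_true, y_score):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def roc_point(y_true, y_score, threshold):
    """(false positive rate, true positive rate) when score >= threshold means positive."""
    tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s >= threshold)
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    return fp / n_neg, tp / n_pos

# Hypothetical labels and predicted probabilities of the positive class
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

print("LogLoss:", log_loss_binary(y_true, y_score))
print("AUC:", auc_by_ranking(y_true, y_score))
print("ROC point at threshold 0.5:", roc_point(y_true, y_score, 0.5))
```

Sweeping the threshold from 1 down to 0 and collecting the resulting points traces the full ROC curve; collecting (recall, precision) pairs instead gives the PR curve.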

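Regression

  • Mean Absolute Error (MAE): the average absolute difference between the model's predictions and the true values. The smaller the value, the better the fit.
  • Mean Squared Error (MSE): the average of the squared differences between the predictions and the true values. Because each error is squared, large errors weigh more heavily than in MAE. As with MAE, the smaller the value, the better the fit.
  • R-Squared (R2): measures goodness of fit as the proportion of the variance in the true values that the model explains, computed as 1 minus the ratio of the residual sum of squares to the total sum of squares. The best value is 1, and the closer to 1 the better the fit. It can fall below 0 when the model predicts worse than the mean of the true values.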
```python
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Actual values and predicted values
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 1.8, 8]

# MAE: mean of the absolute errors
mae = mean_absolute_error(y_true, y_pred)
print("MAE:", mae)

# MSE: mean of the squared errors
mse = mean_squared_error(y_true, y_pred)
print("MSE:", mse)

# R2: 1 - (residual sum of squares / total sum of squares)
r2 = r2_score(y_true, y_pred)
print("R2:", r2)
```
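
RMSE and MAPE are described in the overview but not covered by the scikit-learn calls above. The sketch below computes them directly from their definitions, together with R2 from the residual and total sums of squares, in plain Python on the same sample data. The helper names are illustrative assumptions, not library functions.

```python
import math

def rmse(y_true, y_pred):
    """Square root of the mean squared error."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

def mape(y_true, y_pred):
    """Mean of |error| / |actual|; undefined if any actual value is 0."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Same sample data as the scikit-learn example
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 1.8, 8]

print("RMSE:", rmse(y_true, y_pred))
print("MAPE:", mape(y_true, y_pred))
print("R2:", r2(y_true, y_pred))

# A constant prediction far from the data does worse than predicting the mean,
# so R2 drops below 0.
print("R2 of a poor constant prediction:", r2(y_true, [10, 10, 10, 10]))
```

Note that `mape` returns a fraction here; multiply by 100 to report it as a percentage.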

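Classification

  1. Accuracy: the ratio of correctly predicted samples to the total number of samples, a measure of the model's overall classification correctness. It ranges over [0, 1], and a larger value means better performance.
  2. Precision: the proportion of samples predicted as positive that really are positive. It measures how reliable the model's positive predictions are. It ranges over [0, 1], and a larger value means better performance.
  3. Recall: the proportion of truly positive samples that are correctly predicted as positive. It measures the model's ability to find the actual positives. It ranges over [0, 1], and a larger value means better performance.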
```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Actual labels and predicted labels
y_true = [0, 1, 1, 0, 0, 1]
y_pred = [0, 1, 0, 0, 1, 1]

# Accuracy: correct predictions / all samples
accuracy = accuracy_score(y_true, y_pred)
print("Accuracy:", accuracy)

# Precision: true positives / predicted positives
precision = precision_score(y_true, y_pred)
print("Precision:", precision)

# Recall: true positives / actual positives
recall = recall_score(y_true, y_pred)
print("Recall:", recall)
```
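
Accuracy, precision, and recall all derive from the four confusion-matrix counts, and F1 is the harmonic mean of precision and recall. A minimal plain-Python sketch on the same labels follows; `confusion_counts` is an illustrative helper, not part of scikit-learn.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn), treating `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1  # predicted positive, actually positive
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif t == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn

# Same labels as the scikit-learn example
y_true = [0, 1, 1, 0, 0, 1]
y_pred = [0, 1, 0, 0, 1, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print("TP, FP, FN, TN:", tp, fp, fn, tn)
print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)
print("F1:", f1)
```

This sketch assumes at least one predicted positive and one actual positive; otherwise the divisions for precision or recall would fail and a convention for the empty case is needed.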

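Multi-class Classification

  1. Evaluation metrics for multi-class tasks:
    • Categorical Accuracy: accuracy in a multi-class problem, the ratio of correctly predicted samples to the total number of samples. It ranges over [0, 1], and a larger value means better performance.
    • F1-Score: the harmonic mean of precision and recall, balancing how reliable the predictions for a class are against how many of that class's samples are found. In a multi-class problem it is computed per class and then averaged; the macro average weights every class equally. It ranges over [0, 1], and a larger value means better performance.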
```python
from sklearn.metrics import accuracy_score, f1_score

# Actual labels and predicted labels
y_true = [0, 1, 2, 1, 0, 2]
y_pred = [0, 1, 1, 2, 0, 1]

# Categorical Accuracy
accuracy = accuracy_score(y_true, y_pred)
print("Accuracy:", accuracy)

# F1-Score, macro-averaged: unweighted mean of the per-class F1 scores
f1 = f1_score(y_true, y_pred, average='macro')
print("F1-Score:", f1)
```
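
To see what the macro average does, the sketch below computes F1 for each class in a one-vs-rest fashion and then takes the unweighted mean, in plain Python on the same labels. The helper names are illustrative, and returning 0 for a class with no true positives is an assumed convention.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 for one class, treating `cls` as positive and all others as negative."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    if tp == 0:
        return 0.0  # assumed convention: no true positives means F1 = 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)

# Same labels as the scikit-learn example
y_true = [0, 1, 2, 1, 0, 2]
y_pred = [0, 1, 1, 2, 0, 1]

for c in sorted(set(y_true)):
    print("F1 for class", c, ":", f1_for_class(y_true, y_pred, c))
print("Macro F1:", macro_f1(y_true, y_pred))
```

Class 2 is never predicted correctly, so its F1 of 0 pulls the macro average down even though class 0 is classified perfectly; this is the sense in which macro averaging weights every class equally.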