【工具】survex一个解释机器学习生存模型的R包

文章目录

介绍

由于其灵活性和优越的性能,机器学习模型经常补充并优于传统的统计生存模型。然而,由于缺乏用户友好的工具来解释其内部操作和预测原理,它们的广泛采用受到阻碍。为了解决这个问题,我们引入了survex R包,它提供了一个内聚框架,通过应用可解释的人工智能技术来解释任何生存模型。所提出的软件的功能包括理解和诊断生存模型,这可以导致它们的改进。通过揭示决策过程的洞察力,例如变量效应和重要性,survex能够评估模型的可靠性和检测偏差。因此,可以在诸如生物医学研究和保健应用等敏感领域促进透明度和责任。

Due to their flexibility and superior performance, machine learning models frequently complement and outperform traditional statistical survival models. However, their widespread adoption is hindered by a lack of user-friendly tools to explain their internal operations and prediction rationales. To tackle this issue, we introduce the survex R package, which provides a cohesive framework for explaining any survival model by applying explainable artificial intelligence techniques. The capabilities of the proposed software encompass understanding and diagnosing survival models, which can lead to their improvement. By revealing insights into the decision-making process, such as variable effects and importances, survex enables the assessment of model reliability and the detection of biases. Thus, transparency and responsibility may be promoted in sensitive areas, such as biomedical research and healthcare applications.

代码

案例

r 复制代码
library(survex)
library(survival)
library(ranger)

vet <- survival::veteran

cph <- coxph(Surv(time, status) ~ ., data = vet, x = TRUE, model = TRUE)
exp <- explain(cph, data = vet[, -c(3,4)], y = Surv(vet$time, vet$status))
#> Preparation of a new explainer is initiated 
#>   -> model label       :  coxph (  default  ) 
#>   -> data              :  137  rows  6  cols 
#>   -> target variable   :  137  values ( 128 events and 9 censored ) 
#>   -> times             :  50 unique time points , min = 1.5 , median survival time = 80 , max = 999 
#>   -> times             :  (  generated from y as uniformly distributed survival quantiles based on Kaplan-Meier estimator  ) 
#>   -> predict function  :  predict.coxph with type = 'risk' will be used (  default  ) 
#>   -> predict survival function  :  predictSurvProb.coxph will be used (  default  ) 
#>   -> predict cumulative hazard function  :  -log(predict_survival_function) will be used (  default  ) 
#>   -> model_info        :  package survival , ver. 3.7.0 , task survival (  default  ) 
#>   A new explainer has been created!


shap <- model_survshap(exp, veteran[c(1:4, 17:20, 110:113, 126:129), -c(3,4)])

plot(shap)

参考

  • survex: an R package for explaining machine learning survival models
相关推荐
CCPC不拿奖不改名10 小时前
“Token→整数索引” 的完整实现步骤
人工智能·python·rnn·神经网络·自然语言处理·token·josn
deephub10 小时前
多智能体强化学习(MARL)核心概念与算法概览
人工智能·机器学习·强化学习·多智能体
YangYang9YangYan10 小时前
2026高职计算机专业学习数据分析指南
学习·数据挖掘·数据分析
张小凡vip10 小时前
数据挖掘(五) -----JupyterHub 使用gitlab的账号体系进行认证
人工智能·数据挖掘·gitlab
叫我:松哥10 小时前
基于神经网络算法的多模态内容分析系统,采用Flask + Bootstrap + ECharts + LSTM-CNN + 注意力机制
前端·神经网络·算法·机器学习·flask·bootstrap·echarts
vx_bisheyuange10 小时前
基于SpringBoot的知识竞赛系统
大数据·前端·人工智能·spring boot·毕业设计
Ryan老房10 小时前
从LabelImg到TjMakeBot-标注工具的进化史
人工智能·yolo·目标检测·计算机视觉·ai
Coding茶水间10 小时前
基于深度学习的吸烟检测系统演示与介绍(YOLOv12/v11/v8/v5模型+Pyqt5界面+训练代码+数据集)
开发语言·人工智能·深度学习·yolo·目标检测·机器学习
Yuer202510 小时前
在金融决策场景中,任何不能退化的系统,本身就是系统性风险。
数据挖掘·ai量化·可控ai
Aaron_94510 小时前
VideoRAG:革新视频理解的检索增强生成技术深度解析
人工智能·音视频