Paper Title
High-Dimensional Survival Analysis: Methods and Applications
Authors
Abstract
In the era of precision medicine, time-to-event outcomes such as time to death or progression are routinely collected, along with high-throughput covariates. These high-dimensional data defy classical survival regression models, which are either infeasible to fit or likely to incur low predictability due to over-fitting. To overcome this, recent emphasis has been placed on developing novel approaches for feature selection and survival prognostication. We will review various cutting-edge methods that handle survival outcome data with high-dimensional predictors, highlighting recent innovations in machine learning approaches for survival prediction. We will cover the statistical intuitions and principles behind these methods and conclude with extensions to more complex settings, where competing events are observed. We exemplify these methods with applications to the Boston Lung Cancer Survival Cohort study, one of the largest cancer epidemiology cohorts investigating the complex mechanisms of lung cancer.