Paper title
A likelihood-based approach for multivariate categorical response regression in high dimensions
Paper authors
Paper abstract
We propose a penalized likelihood method to fit the bivariate categorical response regression model. Our method allows practitioners to estimate which predictors are irrelevant, which predictors affect only the marginal distributions of the bivariate response, and which predictors affect both the marginal distributions and the log odds ratios. To compute our estimator, we propose an efficient first-order algorithm, which we extend to the semi-supervised setting, i.e., settings where some subjects have only one response variable measured. We derive an asymptotic error bound that illustrates the performance of our estimator in high-dimensional settings. Generalizations to the multivariate categorical response regression model are proposed. Finally, simulation studies and an application in pan-cancer risk prediction demonstrate the usefulness of our method in terms of interpretability and prediction accuracy. An R package implementing the proposed method is available for download at github.com/ajmolstad/BvCategorical.
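The distinction the abstract draws between the marginal distributions and the log odds ratios can be made concrete with a small numerical sketch. The Python code below is illustrative only and is not taken from the paper or the BvCategorical package. It assumes local log odds ratios (adjacent 2x2 sub-tables), which is one common definition; the abstract does not state the paper's exact parameterization. The function names `marginals`, `local_log_odds_ratios`, and `independent_joint` are hypothetical.

```python
import math

def marginals(joint):
    """Row and column marginal distributions of a J x K joint probability table."""
    rows = [sum(r) for r in joint]
    cols = [sum(c) for c in zip(*joint)]
    return rows, cols

def local_log_odds_ratios(joint):
    """Local log odds ratios from adjacent 2x2 sub-tables (assumed definition).

    Entry (j, k) is log( p[j][k] * p[j+1][k+1] / (p[j][k+1] * p[j+1][k]) ).
    All cell probabilities must be strictly positive.
    """
    J, K = len(joint), len(joint[0])
    return [
        [
            math.log(joint[j][k] * joint[j + 1][k + 1])
            - math.log(joint[j][k + 1] * joint[j + 1][k])
            for k in range(K - 1)
        ]
        for j in range(J - 1)
    ]

def independent_joint(row_marg, col_marg):
    """Joint table of two independent responses with the given marginals."""
    return [[r * c for c in col_marg] for r in row_marg]

# Two tables with different marginals but no association between responses:
# changing only the marginals leaves every log odds ratio at zero.
indep_a = independent_joint([0.5, 0.5], [0.3, 0.7])
indep_b = independent_joint([0.2, 0.8], [0.6, 0.4])

# A table with uniform marginals but strong association between responses.
dependent = [[0.4, 0.1], [0.1, 0.4]]

lor_a = local_log_odds_ratios(indep_a)
lor_b = local_log_odds_ratios(indep_b)
lor_dep = local_log_odds_ratios(dependent)
```

In this sketch, `indep_a` and `indep_b` differ only in their marginals, so their log odds ratios are both zero up to floating-point error, while `dependent` has a nonzero log odds ratio of log(16). Under the assumed definition, a predictor that moves the response distribution from `indep_a` to `indep_b` would affect only the marginals, whereas one that moves it toward `dependent` would also affect the association.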