Paper title
Identification and estimation of causal effects in the presence of confounded principal strata
Paper authors
Paper abstract
Principal stratification has become a popular tool for addressing a broad class of causal inference questions, particularly in dealing with non-compliance and truncation-by-death problems. The causal effects within principal strata, which are determined by the joint potential values of the intermediate variable, are known as principal causal effects and are often of interest in these studies. Analyses of principal causal effects from observed data in the literature mostly rely on ignorability of the treatment assignment, which requires practitioners to accurately measure as many covariates as possible so that all possible sources of confounding are captured. However, collecting all potential confounders in observational studies is often difficult and costly, so the ignorability assumption may be questionable. In this paper, by leveraging available negative controls, which have been increasingly used to deal with uncontrolled confounding, we consider identification and estimation of causal effects when the treatment and principal strata are confounded by unobserved variables. Specifically, we show that the principal causal effects can be nonparametrically identified by invoking a pair of negative controls that are both required not to directly affect the outcome. We then relax this assumption and establish identification of principal causal effects under various semiparametric or parametric models. We also propose an estimation method for principal causal effects. Extensive simulation studies show good performance of the proposed approach, and a real data application from the National Longitudinal Survey of Young Men is used for illustration.