Paper Title

Explaining Practical Differences Between Treatment Effect Estimators with High Dimensional Asymptotics

Authors

Yadlowsky, Steve

Abstract

We revisit the classical causal inference problem of estimating the average treatment effect in the presence of fully observed confounding variables using two-stage semiparametric methods. In existing theoretical studies of methods such as G-computation, inverse propensity weighting (IPW), and two common doubly robust estimators -- augmented IPW (AIPW) and targeted maximum likelihood estimation (TMLE) -- these estimators are either bias-dominated or have similar asymptotic statistical properties. However, when applied to real datasets, they often appear to have notably different variance. We compare these methods when using a machine learning (ML) model to estimate the nuisance parameters of the semiparametric model, and highlight some of the important differences. When the outcome model estimates have little bias, which is common among some key ML models, G-computation and TMLE outperform the other estimators in both bias and variance. We show that the differences can be explained using high-dimensional statistical theory, where the number of confounders $d$ is of the same order as the sample size $n$. To make this theoretical problem tractable, we posit a generalized linear model for the effect of the confounders on the treatment assignment and outcomes. Despite making parametric assumptions, this setting is a useful surrogate for some machine learning methods used to adjust for confounding in two-stage semiparametric methods. In particular, the estimation of the first stage adds variance that does not vanish, forcing us to confront terms in the asymptotic expansion that normally are brushed aside as finite-sample defects. However, our model emphasizes differences in performance between these estimators beyond first-order asymptotics.
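To make the comparison concrete, the following is a minimal sketch of three of the two-stage estimators the abstract discusses (G-computation, IPW, and AIPW) on simulated data, using the kind of generalized linear nuisance models the paper posits: a logistic propensity model and linear outcome models. The simulated data, sample sizes, and tolerances are illustrative assumptions, not taken from the paper, and TMLE's additional targeting step is omitted for brevity.

```python
import numpy as np

# Illustrative data-generating process (not from the paper): GLM effects of
# confounders X on treatment T and outcome Y, with true ATE tau = 2.0.
rng = np.random.default_rng(0)
n, d = 2000, 10
X = rng.normal(size=(n, d))
beta = rng.normal(size=d) / np.sqrt(d)
p_true = 1 / (1 + np.exp(-X @ beta))       # true propensity scores
T = rng.binomial(1, p_true)                # treatment assignment
tau = 2.0
Y = X @ beta + tau * T + rng.normal(size=n)

# First stage: estimate the nuisance parameters.
# Propensity model e(x) = P(T=1 | X=x), fit by Newton's method.
def fit_logistic(X, t, iters=50):
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        mu = 1 / (1 + np.exp(-X @ w))
        grad = X.T @ (t - mu)
        H = (X * (mu * (1 - mu))[:, None]).T @ X + 1e-8 * np.eye(X.shape[1])
        w = w + np.linalg.solve(H, grad)
    return w

e_hat = 1 / (1 + np.exp(-X @ fit_logistic(X, T)))

# Outcome models mu_t(x) = E[Y | T=t, X=x], fit by least squares per arm.
X1 = np.column_stack([np.ones(n), X])
b1 = np.linalg.lstsq(X1[T == 1], Y[T == 1], rcond=None)[0]
b0 = np.linalg.lstsq(X1[T == 0], Y[T == 0], rcond=None)[0]
mu1, mu0 = X1 @ b1, X1 @ b0

# Second stage: plug the estimated nuisances into each estimator.
ate_gcomp = np.mean(mu1 - mu0)                                 # G-computation
ate_ipw = np.mean(T * Y / e_hat - (1 - T) * Y / (1 - e_hat))   # IPW
ate_aipw = np.mean(mu1 - mu0                                   # AIPW
                   + T * (Y - mu1) / e_hat
                   - (1 - T) * (Y - mu0) / (1 - e_hat))

print(ate_gcomp, ate_ipw, ate_aipw)  # all should be near tau = 2.0
```

In this well-specified, low-dimensional regime all three estimates concentrate around the true effect; the paper's analysis concerns the proportional regime where $d$ grows with $n$ and the first-stage estimation error no longer vanishes, so the estimators' variances separate.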
