Paper Title
Statistical Inference After Adaptive Sampling for Longitudinal Data
Paper Authors
Paper Abstract
Online reinforcement learning and other adaptive sampling algorithms are increasingly used in digital intervention experiments to optimize treatment delivery for users over time. In this work, we focus on longitudinal user data collected by a large class of adaptive sampling algorithms that are designed to optimize treatment decisions online using accruing data from multiple users. Combining or "pooling" data across users allows adaptive sampling algorithms to potentially learn faster. However, by pooling, these algorithms induce dependence between the sampled user data trajectories; we show that this can cause standard variance estimators for i.i.d. data to underestimate the true variance of common estimators on this data type. We develop novel methods to perform a variety of statistical analyses on such adaptively sampled data via Z-estimation. Specifically, we introduce the \textit{adaptive} sandwich variance estimator, a corrected sandwich estimator that leads to consistent variance estimates under adaptive sampling. Additionally, to prove our results, we develop novel theoretical tools for empirical processes on non-i.i.d., adaptively sampled longitudinal data, which may be of independent interest. This work is motivated by our efforts in designing experiments in which online reinforcement learning algorithms optimize treatment decisions, yet statistical inference is essential for conducting analyses after experiments conclude.
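As background for the abstract's terminology, the following is a minimal sketch of the standard (i.i.d.) sandwich variance estimator for a scalar Z-estimator. It is not the adaptive sandwich estimator introduced in the paper, whose form the abstract does not give; it illustrates only the baseline that the abstract says can underestimate variance when data are pooled across users. The estimating function (a no-intercept least-squares slope), the function names, and the simulated data are all assumptions made for illustration.

```python
import random


def z_estimate_slope(xs, ys):
    """Solve the estimating equation sum_i x_i * (y_i - theta * x_i) = 0."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / sxx


def sandwich_variance(xs, ys, theta):
    """Standard i.i.d. sandwich variance estimate for the slope Z-estimator.

    psi(theta; x, y) = x * (y - theta * x)
    bread = mean of d(psi)/d(theta) = -mean(x^2)
    meat  = mean of psi^2
    Var(theta_hat) is approximated by meat / bread^2 / n.
    """
    n = len(xs)
    bread = -sum(x * x for x in xs) / n
    meat = sum((x * (y - theta * x)) ** 2 for x, y in zip(xs, ys)) / n
    return meat / (bread ** 2) / n


# Hypothetical i.i.d. data: y = 2 * x + noise (illustration only).
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(500)]
ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]

theta_hat = z_estimate_slope(xs, ys)
var_hat = sandwich_variance(xs, ys, theta_hat)
print(theta_hat, var_hat)
```

This sketch treats every observation as independent. Under pooled adaptive sampling, as the abstract describes, user trajectories are dependent, which is the situation the paper's corrected estimator is designed to handle.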