Paper title
An integrated approach to test for missing not at random
Paper authors
Paper abstract
Missing data can lead to inefficiencies and biases in analyses, in particular when data are missing not at random (MNAR). It is thus vital to understand and correctly identify the missing data mechanism. Recovering missing values through a follow-up sample allows researchers to conduct hypothesis tests for MNAR, which are not possible when using only the original incomplete data. How the properties of these tests are affected by the follow-up sample design has been little explored in the literature. Our results provide comprehensive insight into the properties of one such test, based on the commonly used selection model framework. We determine conditions for recovery samples that allow the test to be applied appropriately and effectively, i.e., with known Type I error rates and optimized with respect to power. We thus provide an integrated framework for testing for the presence of MNAR and designing follow-up samples in an efficient, cost-effective way. The performance of our methodology is evaluated through simulation studies as well as on a real data sample.