论文标题
使用最佳运输的无分配关节独立性测试和鲁棒独立组件分析
Distribution-free joint independence testing and robust independent component analysis using optimal transport
论文作者
论文摘要
在本文中,我们研究了测量和测试关节独立性的问题,以收集多元随机变量。使用基于最佳运输(OT)的多元级别的新兴理论,我们提出了用于多元关节独立性的无分配测试。为此,我们介绍了等级关节距离协方差(RJDCOV)的概念,即著名距离协方差措施的高阶等级类似物,捕获了变量所有子集之间的依赖关系。 RJDCOV可以轻松地从数据中估算,而无需任何时刻假设,并且关节独立性的相关测试普遍一致。我们可以在不了解(未知)边际分布的情况下(由于无分配特性)校准测试,无论是渐近的和有限的样品。除了无分配和普遍一致外,提出的测试在统计上也有效,也就是说,它具有非平凡的渐近(Pitman)效率。我们通过计算混合替代品和konijn替代方案的测试的局部功率来证明这一点。我们还使用RJDCOV度量来开发一种用于独立组件分析(ICA)的方法,该方法易于实现和对异常值和污染的稳定性。与其他现有方法相比,进行了广泛的模拟以说明所提出的测试的功效。最后,我们使用拟议的测试来根据股票价格学习不同行业之间的高阶依赖结构。
In this paper we study the problem of measuring and testing joint independence for a collection of multivariate random variables. Using the emerging theory of optimal transport (OT) based multivariate ranks, we propose a distribution-free test for multivariate joint independence. Towards this we introduce the notion of rank joint distance covariance (RJdCov), the higher-order rank analogue of the celebrated distance covariance measure, that captures the dependencies among all the subsets of the variables. The RJdCov can be easily estimated from the data without any moment assumptions and the associated test for joint independence is universally consistent. We can calibrate the test without any knowledge of the (unknown) marginal distributions (due to the distribution-free property), both asymptotically and in finite samples. In addition to being distribution-free and universally consistent, the proposed test is also statistically efficient, that is, it has non-trivial asymptotic (Pitman) efficiency. We demonstrate this by computing the limiting local power of the test for both mixture alternatives and joint Konijn alternatives. We also use the RJdCov measure to develop a method for independent component analysis (ICA) that is easy to implement and robust to outliers and contamination. Extensive simulations are performed to illustrate the efficacy of the proposed test in comparison to other existing methods. Finally, we apply the proposed test to learn the higher-order dependence structure among different US industries based on stock prices.