Paper title
Cauchy robust principal component analysis with applications to high-dimensional data sets
Paper authors
Paper abstract
Principal component analysis (PCA) is a standard dimensionality reduction technique used in various research and applied fields. From an algorithmic point of view, classical PCA can be formulated in terms of operations on a multivariate Gaussian likelihood. As a consequence of the implied Gaussian formulation, the principal components are not robust to outliers. In this paper, we propose a modified formulation, based on the use of a multivariate Cauchy likelihood instead of the Gaussian likelihood, which has the effect of robustifying the principal components. We present an algorithm to compute these robustified principal components. We additionally derive the relevant influence function of the first component and examine its theoretical properties. Simulation experiments on high-dimensional datasets demonstrate that the estimated principal components based on the Cauchy likelihood outperform or are on par with existing robust PCA techniques.
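As a rough illustration of the likelihood view described in the abstract, the following sketch compares a Gaussian and a Cauchy criterion for the first principal direction on a small two-dimensional toy data set. It rests on assumptions that the abstract does not spell out: the first direction is taken to be the unit vector u that minimises the maximised (over location and scale) log-likelihood of the projections u'x, which for the Gaussian case is equivalent to maximising the projected variance. The Cauchy fit uses a standard EM iteration for a t distribution with one degree of freedom, and the direction is found by a brute-force grid over angles. All function names and the toy data are hypothetical, and this is not the algorithm proposed in the paper, which targets high-dimensional data.

```python
import numpy as np


def gaussian_profile_loglik(y):
    """Gaussian log-likelihood of y, maximised over mean and variance."""
    n = len(y)
    return -0.5 * n * (np.log(2.0 * np.pi * np.var(y)) + 1.0)


def cauchy_profile_loglik(y, n_iter=100):
    """Cauchy log-likelihood of y, approximately maximised over location
    and scale with EM iterations (t distribution, 1 degree of freedom)."""
    y = np.asarray(y, dtype=float)
    mu = np.median(y)
    sigma = max(np.median(np.abs(y - mu)), 1e-12)  # MAD as starting scale
    for _ in range(n_iter):
        w = 2.0 / (1.0 + ((y - mu) / sigma) ** 2)  # E-step weights
        mu = np.sum(w * y) / np.sum(w)             # weighted location
        sigma = np.sqrt(np.mean(w * (y - mu) ** 2))  # weighted scale
    return np.sum(np.log(sigma / np.pi) - np.log(sigma ** 2 + (y - mu) ** 2))


def first_direction(X, objective, n_angles=180):
    """Assumed criterion: pick the unit vector (2-D only, angle grid) whose
    projections have the smallest maximised log-likelihood."""
    thetas = np.linspace(0.0, np.pi, n_angles, endpoint=False)
    dirs = np.column_stack([np.cos(thetas), np.sin(thetas)])
    vals = [objective(X @ u) for u in dirs]
    return dirs[int(np.argmin(vals))]


# Toy data: the bulk varies mostly along the first axis,
# and one gross outlier lies far out along the second axis.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) * np.array([3.0, 0.5])
X[0] = [0.0, 1000.0]

u_gauss = first_direction(X, gaussian_profile_loglik)
u_cauchy = first_direction(X, cauchy_profile_loglik)
print("Gaussian criterion direction:", u_gauss)
print("Cauchy criterion direction:  ", u_cauchy)
```

On this toy example the Gaussian criterion follows the single outlier, while the Cauchy criterion stays close to the axis along which the bulk of the data varies. The angle grid only works in two dimensions; a high-dimensional setting would need a proper optimiser over the unit sphere.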