Paper title
Kernel PCA for multivariate extremes
Paper authors
Paper abstract
We propose kernel PCA as a method for analyzing the dependence structure of multivariate extremes and demonstrate that it can be a powerful tool for clustering and dimension reduction. Our work provides some theoretical insight into the preimages obtained by kernel PCA, demonstrating that under certain conditions they can effectively identify clusters in the data. We build on these new insights to characterize rigorously the performance of kernel PCA based on an extremal sample, i.e., the angular part of random vectors for which the radius exceeds a large threshold. More specifically, we focus on the asymptotic dependence of multivariate extremes characterized by the angular or spectral measure in extreme value theory and provide a careful analysis in the case where the extremes are generated from a linear factor model. We give theoretical guarantees on the performance of kernel PCA preimages of such extremes by leveraging their asymptotic distribution together with Davis-Kahan perturbation bounds. Our theoretical findings are complemented with numerical experiments illustrating the finite sample performance of our methods.
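As a rough illustration of the pipeline the abstract describes, the following sketch builds an extremal sample (the angular parts of observations whose radius exceeds a high threshold) and runs kernel PCA on it. It is not the authors' implementation. The Euclidean norm, the empirical-quantile threshold, the Gaussian kernel with bandwidth `gamma`, the Pareto-distributed factors, and the loading matrix `A` are all assumptions chosen for illustration. The sketch stops at the kernel PCA scores and does not compute preimages.

```python
import numpy as np

def extremal_angles(X, quantile=0.95):
    """Angular parts X / ||X|| of the rows whose radius exceeds an empirical quantile."""
    radii = np.linalg.norm(X, axis=1)          # assumption: Euclidean norm as the radius
    threshold = np.quantile(radii, quantile)   # assumption: threshold set by a quantile
    keep = radii > threshold
    return X[keep] / radii[keep, None]

def kernel_pca(W, n_components=2, gamma=1.0):
    """Kernel PCA with a Gaussian kernel via the centered Gram matrix."""
    sq_dists = np.sum((W[:, None, :] - W[None, :, :]) ** 2, axis=2)
    K = np.exp(-gamma * sq_dists)              # assumption: Gaussian (RBF) kernel
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    Kc = H @ K @ H
    eigvals, eigvecs = np.linalg.eigh(Kc)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Projections of the sample onto the leading kernel principal components.
    scores = eigvecs * np.sqrt(np.maximum(eigvals, 0.0))
    return eigvals, scores

# Toy linear factor model X = A Z with heavy-tailed factors (all values hypothetical).
rng = np.random.default_rng(0)
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.5, 0.5]])                     # hypothetical 3 x 2 loading matrix
Z = rng.pareto(2.0, size=(5000, 2)) + 1.0      # Pareto factors with tail index 2
X = Z @ A.T

W = extremal_angles(X, quantile=0.95)
eigvals, scores = kernel_pca(W, n_components=2, gamma=5.0)
```

In this toy setup the extremal angles should concentrate near the normalized columns of `A`, so plotting the first column of `scores` would be one way to check whether two groups are visible. The threshold level and `gamma` both affect the result and would need tuning on real data.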