Paper Title

Bias-Variance Tradeoffs in Joint Spectral Embeddings

Authors

Benjamin Draves, Daniel L. Sussman

Abstract

Joint spectral embeddings facilitate analysis of multiple network data by simultaneously mapping vertices in each network to points in Euclidean space, where statistical inference is then performed. In this work, we consider one such joint embedding technique, the omnibus embedding of arXiv:1705.09355, which has been successfully used for community detection, anomaly detection, and hypothesis testing tasks. To date, the theoretical properties of this method have only been established under the strong assumption that the networks are conditionally i.i.d. random dot product graphs. Herein, we take a first step in characterizing the theoretical properties of the omnibus embedding in the presence of heterogeneous network data. Under a latent position model, we show that the omnibus embedding implicitly regularizes its latent position estimates, which induces a finite-sample bias-variance tradeoff for latent position estimation. We establish an explicit bias expression, derive a uniform concentration bound on the residual, and prove a central limit theorem characterizing the distributional properties of these estimates. These explicit bias and variance expressions enable us to state sufficient conditions for exact recovery in community detection tasks and to develop a pivotal test statistic for determining whether two graphs share the same set of latent positions, demonstrating that accurate inference is achievable despite the estimator's inconsistency. These results are demonstrated in several experimental settings where statistical procedures utilizing the omnibus embedding are competitive with, and oftentimes preferable to, comparable embedding techniques. These observations accentuate the viability of the omnibus embedding for multiple graph inference beyond the homogeneous network setting.
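
For readers unfamiliar with the omnibus embedding named in the abstract, the following is a minimal sketch of the usual construction, not the authors' code. It assumes m undirected graphs on the same n vertices, given as symmetric adjacency matrices; the function names `omnibus_matrix` and `omnibus_embedding` and the use of NumPy are illustrative choices. Block (i, j) of the omnibus matrix is taken to be the average (A_i + A_j) / 2, and the embedding is taken to be the top-d eigenvectors scaled by the square roots of their eigenvalues.

```python
import numpy as np


def omnibus_matrix(graphs):
    """Build the mn x mn omnibus matrix from m symmetric n x n adjacency matrices.

    Block (i, j) is (A_i + A_j) / 2, so each diagonal block is A_i itself.
    """
    A = np.asarray(graphs, dtype=float)           # shape (m, n, n)
    m, n, _ = A.shape
    blocks = (A[:, None] + A[None, :]) / 2.0      # shape (m, m, n, n)
    # Reorder axes to (i, row, j, col) so the reshape lays blocks out in a grid.
    return blocks.transpose(0, 2, 1, 3).reshape(m * n, m * n)


def omnibus_embedding(graphs, d):
    """Return an (m, n, d) array: one d-dimensional point per vertex per graph."""
    M = omnibus_matrix(graphs)
    m = len(graphs)
    n = M.shape[0] // m
    vals, vecs = np.linalg.eigh(M)                # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:d]              # indices of the d largest eigenvalues
    # Clip guards against tiny negative values caused by floating-point error.
    scale = np.sqrt(np.clip(vals[top], 0.0, None))
    Z = vecs[:, top] * scale                      # shape (mn, d)
    return Z.reshape(m, n, d)                     # Z[g] holds the estimates for graph g
```

Because every off-diagonal block averages two graphs, the estimates for each graph borrow information from all the others. This shared structure is one way to see why the estimates could be pulled toward a common value, in line with the implicit regularization the abstract describes.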
