论文标题

精确恢复和随机ISING模型的急剧阈值

Exact recovery and sharp thresholds of Stochastic Ising Block Model

论文作者

Ye, Min

论文摘要

随机块模型(SBM)是一个随机图模型,其中根据顶点上的基础群集结构生成边缘。另一方面,(铁磁)模型(铁磁模型)根据基础图结构将$ \ pm 1 $标签分配给顶点,即如果在图中连接了两个顶点,则更可能将它们分配给它们相同的标签。在SBM中,一个目标是在ISING模型中从图形结构中恢复基础簇,一个广泛研究的问题是基于I.I.D.恢复基础图结构。样品(顶点的标签)。 在本文中,我们提出了SBM的天然组成和ISING模型,我们称之为随机ISING块模型(SIBM)。在SIBM中,我们以最简单的形式采用SBM,其中$ n $顶点分为两个相等大小的群集,边缘在簇中的概率$ p $独立连接,并且在群集中$ q $。然后,我们将SBM生成的图形$ G $作为Ising模型的基础图,并绘制$ m $ i.i.d.从中样本。目的是从Ising模型生成的样品中准确地恢复SBM中的两个簇,而无需观察图$ g $。作为本文的主要结果,我们在适当选择的制度中,在此精确恢复问题的样本复杂性上建立了一个清晰的阈值$ m^\ ast $,其中$ m^\ ast $可以从SIBM的参数中计算出来。我们表明,当$ m \ ge m^\ ast $时,人们可以从$ o(n)$时间中的$ m $样本中恢复簇,因为$ n $的数量将变为无限。当$ m <m <m^\ ast $时,我们进一步表明,对于SIBM的几乎所有参数选择,任何恢复算法的成功概率接近$ 0 $ as $ n \ to \ to \ infty $。

The stochastic block model (SBM) is a random graph model in which the edges are generated according to the underlying cluster structure on the vertices. The (ferromagnetic) Ising model, on the other hand, assigns $\pm 1$ labels to vertices according to an underlying graph structure in a way that if two vertices are connected in the graph then they are more likely to be assigned the same label. In SBM, one aims to recover the underlying clusters from the graph structure while in Ising model, an extensively-studied problem is to recover the underlying graph structure based on i.i.d. samples (labelings of the vertices). In this paper, we propose a natural composition of SBM and the Ising model, which we call the Stochastic Ising Block Model (SIBM). In SIBM, we take SBM in its simplest form, where $n$ vertices are divided into two equal-sized clusters and the edges are connected independently with probability $p$ within clusters and $q$ across clusters. Then we use the graph $G$ generated by the SBM as the underlying graph of the Ising model and draw $m$ i.i.d. samples from it. The objective is to exactly recover the two clusters in SBM from the samples generated by the Ising model, without observing the graph $G$. As the main result of this paper, we establish a sharp threshold $m^\ast$ on the sample complexity of this exact recovery problem in a properly chosen regime, where $m^\ast$ can be calculated from the parameters of SIBM. We show that when $m\ge m^\ast$, one can recover the clusters from $m$ samples in $O(n)$ time as the number of vertices $n$ goes to infinity. When $m<m^\ast$, we further show that for almost all choices of parameters of SIBM, the success probability of any recovery algorithms approaches $0$ as $n\to\infty$.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源