Paper Title

Minimax Optimal Clustering of Bipartite Graphs with a Generalized Power Method

Authors

Guillaume Braun, Hemant Tyagi

Abstract

Clustering bipartite graphs is a fundamental task in network analysis. In the high-dimensional regime where the number of rows $n_1$ and the number of columns $n_2$ of the associated adjacency matrix are of different order, existing methods derived from those used for symmetric graphs can come with sub-optimal guarantees. Due to the increasing number of applications for bipartite graphs in the high-dimensional regime, it is of fundamental importance to design optimal algorithms for this setting. The recent work of Ndaoud et al. (2022) improves the existing upper bound for the misclustering rate in the special case where the columns (resp. rows) can be partitioned into $L = 2$ (resp. $K = 2$) communities. Unfortunately, their algorithm cannot be extended to the more general setting where $K \neq L \geq 2$. We overcome this limitation by introducing a new algorithm based on the power method. We derive conditions for exact recovery in the general setting where $K \neq L \geq 2$, and show that it recovers the result in Ndaoud et al. (2022). We also derive a minimax lower bound on the misclustering error when $K = L$ under a symmetric version of our model, which matches the corresponding upper bound up to a factor depending on $K$.
