论文标题
三个网络中的连接致密连接子图
Connected-Dense-Connected Subgraphs in Triple Networks
论文作者
论文摘要
找到有意义的社区 - 大规模网络中感兴趣的子网 - 是各种应用程序的问题。大多数现有的社区检测工作都集中在一个网络上。但是,许多现实生活中的应用自然会产生我们所谓的三重网络。三个网络由两个网络组成,以及它们的节点之间的两部分连接网络。在本文中,我们制定并研究了找到连接的密集连接子图(CDC)的问题,该子网络是两分网络中密度最大的子网,并且在每个网络中的终点集集诱导连接的子网络。这些模式代表了基于网络之间的两分性关联的社区。据我们所知,对于单个网络或异构网络,现有算法无法检测到这种模式。我们表明,找到CDC子图是NP-刺的,并开发了新型的启发式方法以获得可行的溶液,其中最快的是具有N节点和M边缘的O(NLOGN+M)。我们还研究了CDC子图的不同变化。我们对各种真实和合成三重网络进行实验,以评估开发方法的有效性和效率。采用这些启发式方法,我们演示了如何确定类似观点和研究兴趣的社区以及影响社区的因素。
Finding meaningful communities - subnetworks of interest within a large scale network - is a problem with a variety of applications. Most existing work towards community detection focuses on a single network. However, many real-life applications naturally yield what we refer to as Triple Networks. Triple Networks are comprised of two networks, and the network of bipartite connections between their nodes. In this paper, we formulate and investigate the problem of finding Connected-Dense-Connected subgraph (CDC), a subnetwork which has the largest density in the bipartite network and whose sets of end points within each network induce connected subnetworks. These patterns represent communities based on the bipartite association between the networks. To our knowledge, such patterns cannot be detected by existing algorithms for a single network or heterogeneous networks. We show that finding CDC subgraphs is NP-hard and develop novel heuristics to obtain feasible solutions, the fastest of which is O(nlogn+m) with n nodes and m edges. We also study different variations of the CDC subgraphs. We perform experiments on a variety of real and synthetic Triple Networks to evaluate the effectiveness and efficiency of the developed methods. Employing these heuristics, we demonstrate how to identify communities of similar opinions and research interests, and factors influencing communities.