Paper Title

Co-Membership-based Generic Anomalous Communities Detection

Authors

Shay Lapid, Dima Kagan, Michael Fire

Abstract

Detecting anomalous communities in networks is an essential research task, as it helps uncover insights into community-structured networks. Most existing methods leverage either the attributes of vertices or the topological structure of communities. In this study, we introduce the Co-Membership-based Generic Anomalous Communities Detection Algorithm (referred to as CMMAC), a novel and generic method that utilizes information about vertices' co-membership in multiple communities. CMMAC is domain-free and almost unaffected by communities' sizes and densities. Specifically, we train a classifier to predict the probability that each vertex in a community is a member of that community. We then rank the communities by the aggregated membership probabilities of each community's vertices. The lowest-ranked communities are considered anomalous. Furthermore, we present an algorithm for generating a community-structured random network that enables the infusion of anomalous communities, to facilitate research in the field. We used it to generate two datasets, composed of thousands of labeled anomaly-infused networks, and published them. We experimented extensively on thousands of simulated and real-world networks infused with artificial anomalies. CMMAC outperformed other existing methods in a range of settings. Additionally, we demonstrated that CMMAC can identify abnormal communities in real-world unlabeled networks in different domains, such as Reddit and Wikipedia.
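The rank-by-aggregated-membership idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract describes a trained classifier, whereas the sketch below swaps in a simple hand-written co-membership heuristic as a stand-in for the classifier's probability. The function name `rank_communities`, the scoring rule, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def rank_communities(communities):
    """Rank communities from most to least suspicious.

    communities: dict mapping a community name to a set of vertex ids.
    Returns a list of (name, score) pairs sorted by ascending score,
    so the first entries are the candidate anomalies.
    """
    # Map every vertex to the set of communities it belongs to.
    memberships = defaultdict(set)
    for name, members in communities.items():
        for vertex in members:
            memberships[vertex].add(name)

    def membership_score(vertex, name):
        # Stand-in for a classifier probability (assumption, not the paper's
        # model): the fraction of the other members of `name` that share at
        # least one *other* community with `vertex`.
        others = communities[name] - {vertex}
        if not others:
            return 0.0
        own = memberships[vertex] - {name}
        shared = sum(1 for u in others if own & (memberships[u] - {name}))
        return shared / len(others)

    scores = {}
    for name, members in communities.items():
        if not members:
            continue  # skip empty communities; nothing to aggregate
        per_vertex = [membership_score(v, name) for v in members]
        # Aggregate per-vertex scores with a plain mean (an assumed choice).
        scores[name] = sum(per_vertex) / len(per_vertex)

    return sorted(scores.items(), key=lambda item: item[1])


# Toy data: A, B, and C overlap heavily; X groups vertices that
# never co-occur in any other community.
example = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3, 5},
    "C": {2, 3, 4, 5},
    "X": {1, 6, 7, 8},
}
ranked = rank_communities(example)
most_anomalous = ranked[0][0]
```

In this toy input, community `X` receives the lowest aggregated score because none of its members share another community with each other, which mirrors the abstract's rule that the lowest-ranked communities are treated as anomalous. A faithful version would replace `membership_score` with probabilities from a classifier trained on co-membership features.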
