论文标题
在汇总功能下寻找顶级影响力的社区
Finding Top-r Influential Communities under Aggregation Functions
论文作者
论文摘要
社区搜索是一个问题,它在满足某些拓扑约束(例如学位约束)的图中寻求凝聚力和连接的子图。现有的大多数作品仅着重于拓扑,而忽略了节点在社区中的影响。为了解决这一缺陷,进一步提出了有影响力的社区搜索,以包括节点的影响。在有影响力的社区搜索问题中,每个节点都有一个重量,即影响价值,以代表其网络影响。社区的影响价值是由汇总功能(例如Max,Min,AVG和总和)对同一社区中节点的影响值产生的。有影响力的社区搜索问题的目的是在满足拓扑限制的同时,找到具有最高影响价值的顶级R社区。现有关于有影响力的社区搜索的研究有几个局限性:(i)它们仅关注简单的聚合功能,例如Min,在许多现实世界中,它们可能没有某些要求,并且(ii)它们对社区的规模没有任何限制,而大多数现实世界中的情况都没有。这激发了我们进行一项新的研究以填补这一空白。我们考虑识别具有/没有大小约束的顶级影响社区的问题,同时使用更复杂的聚合功能,例如SUM或AVG。我们进行了理论分析,证明了问题的硬度,并为我们的TOPR有影响力的社区搜索问题提出了有效有效的启发式解决方案。在实际大图上进行的广泛实验表明,我们所提出的解决方案比基线解决方案要高得多。
Community search is a problem that seeks cohesive and connected subgraphs in a graph that satisfy certain topology constraints, e.g., degree constraints. The majority of existing works focus exclusively on the topology and ignore the nodes' influence in the communities. To tackle this deficiency, influential community search is further proposed to include the node's influence. Each node has a weight, namely influence value, in the influential community search problem to represent its network influence. The influence value of a community is produced by an aggregated function, e.g., max, min, avg, and sum, over the influence values of the nodes in the same community. The objective of the influential community search problem is to locate the top-r communities with the highest influence values while satisfying the topology constraints. Existing studies on influential community search have several limitations: (i) they focus exclusively on simple aggregation functions such as min, which may fall short of certain requirements in many real-world scenarios, and (ii) they impose no limitation on the size of the community, whereas most real-world scenarios do. This motivates us to conduct a new study to fill this gap. We consider the problem of identifying the top-r influential communities with/without size constraints while using more complicated aggregation functions such as sum or avg. We give a theoretical analysis demonstrating the hardness of the problems and propose efficient and effective heuristic solutions for our topr influential community search problems. Extensive experiments on real large graphs demonstrate that our proposed solution is significantly more efficient than baseline solutions.