论文标题

综合社区意识多样化的影响最大化,有效近似

Composite Community-Aware Diversified Influence Maximization with Efficient Approximation

论文作者

Guo, Jianxiong, Ni, Qiufen, Wu, Weili, Du, Ding-Zhu

论文摘要

影响最大化(IM)是移动网络和社交计算中的一个著名话题,旨在找到一小部分用户,以最大程度地利用在线信息级联来传播影响的影响。最近,一些谨慎的研究人员关注信息传播的多样性,尤其是社区意识的多样性,并提出了多元化的IM问题。在许多现实世界中,多样性无处不在,但它们都是基于给定的社区结构的。在社交网络中,我们可以根据不同的指标为同一组用户形成异质的社区结构。因此,如何基于多个社区结构量化多样性是一个有趣的问题。在本文中,我们提出了综合社区意识到的多元化IM(CC-DIM)问题,该问题旨在选择一个种子,以最大程度地利用影响的影响,而复合多样性对所有可能的社区结构。为了解决CC-DIM问题的NP硬度,我们采用了反向影响采样的技术,并设计了一个随机的广义反向到达(G-RR),以估计目标函数。随机G-RR集的组成比用于IM问题的RR集更为复杂,这将导致基于传统采样的近似算法的效率低下。因此,我们进一步提出了一种两阶段的算法,广义历史(G-Hist)。它不仅可以返回$(1-1/e- \ varepsilon)$(至少$(1-δ)$概率),而且还提高了采样效率,并通过显着降低G-RR集的平均大小来减轻搜索的难度。最后,我们针对现有算法评估了实际数据集上的G-Hist。实验结果表明,我们提出的算法的有效性及其优于其他基线算法。

Influence Maximization (IM) is a famous topic in mobile networks and social computing, which aims at finding a small subset of users to maximize the influence spread through online information cascade. Recently, some careful researchers paid attention to diversity of information dissemination, especially community-aware diversity, and formulated the diversified IM problem. The diversity is ubiquitous in a lot of real-world applications, but they are all based on a given community structure. In social networks, we can form heterogeneous community structures for the same group of users according to different metrics. Therefore, how to quantify the diversity based on multiple community structures is an interesting question. In this paper, we propose the Composite Community-Aware Diversified IM (CC-DIM) problem, which aims at selecting a seed set to maximize the influence spread and the composite diversity over all possible community structures under consideration. To address the NP-hardness of CC-DIM problem, we adopt the technique of reverse influence sampling and design a random Generalized Reverse Reachable (G-RR) set to estimate the objective function. The composition of a random G-RR set is much more complex than the RR set used for the IM problem, which will lead to inefficiency of traditional sampling-based approximation algorithms. Because of this, we further propose a two-stage algorithm, Generalized HIST (G-HIST). It can not only return a $(1-1/e-\varepsilon)$ approximate solution with at least $(1-δ)$ probability, but also improve the efficiency of sampling and ease the difficulty of searching by significantly reducing the average size of G-RR sets. Finally, we evaluate our G-HIST on real datasets against existing algorithms. The experimental results show the effectiveness of our proposed algorithm and its superiority over other baseline algorithms.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源