Paper title
Practical Guidelines for Ideology Detection Pipelines and Psychosocial Applications
Paper authors
Paper abstract
Online ideology detection is crucial for downstream tasks, such as countering ideologically motivated violent extremism and modeling opinion dynamics. However, two significant issues arise in practitioners' deployment. Firstly, gold-standard training data is prohibitively labor-intensive to collect and has limited reusability beyond its collection context (i.e., time, location, and platform). Secondly, to circumvent this expense, researchers employ ideological signals (such as shared hashtags). Unfortunately, these signals' annotation requirements and context transferability are largely unknown, and the bias they induce remains unquantified. This study provides guidelines for practitioners requiring real-time detection of left, right, and extreme ideologies in large-scale online settings. We propose a framework for pipeline constructions, describing ideology signals by their associated labor and context transferability. We evaluate many constructions, quantifying the bias associated with signals and describing a pipeline that outperforms state-of-the-art methods ($0.95$ AUC ROC). We showcase the capabilities of our pipeline on five datasets containing more than 1.12 million users. We set out to investigate whether the findings in the psychosocial literature, developed for the offline environment, apply to the online setting. We evaluate at scale several psychosocial hypotheses that delineate ideologies concerning morality, grievance, nationalism, and dichotomous thinking. We find that right-wing ideologies use more vice-moral language, have more grievance-filled language, exhibit increased black-and-white thinking patterns, and have a greater association with national flags. This research empowers practitioners with guidelines for ideology detection and case studies for its application, fostering a safer and better-understood digital landscape.