Paper title

Information flow estimation: a study of news on Twitter

Authors

Tobin South, Bridget Smart, Matthew Roughan, Lewis Mitchell

Abstract

News media has long been an ecosystem of creation, reproduction, and critique, where news outlets report on current events and add commentary to ongoing stories. Understanding the dynamics of news information creation and dispersion is important to accurately ascribe credit to influential work and understand how societal narratives develop. These dynamics can be modelled through a combination of information-theoretic natural language processing and networks; and can be parameterised using large quantities of textual data. However, it is challenging to see "the wood for the trees", i.e., to detect small but important flows of information in a sea of noise. Here we develop new comparative techniques to estimate temporal information flow between pairs of text producers. Using both simulated and real text data we compare the reliability and sensitivity of methods for estimating textual information flow, showing that a metric that normalises by local neighbourhood structure provides a robust estimate of information flow in large networks. We apply this metric to a large corpus of news organisations on Twitter and demonstrate its usefulness in identifying influence within an information ecosystem, finding that average information contribution to the network is not correlated with the number of followers or the number of tweets. This suggests that small local organisations and right-wing organisations which have lower average follower counts still contribute significant information to the ecosystem. Further, the methods are applied to smaller full-text datasets of specific news events across news sites and Russian troll accounts on Twitter. The information flow estimation reveals and quantifies features of how these events develop and the role of groups of trolls in setting disinformation narratives.
