论文标题

Wikipedia上的趋势是什么?捕获Wikipedia版本的趋势和语言偏见

What is Trending on Wikipedia? Capturing Trends and Language Biases Across Wikipedia Editions

论文作者

Miz, Volodymyr, Hanna, Joëlle, Aspert, Nicolas, Ricaud, Benjamin, Vandergheynst, Pierre

论文摘要

在这项工作中,我们建议对Wikipedia读者的浏览行为进行自动评估和比较,可将其应用于Wikipedia的任何语言版本。例如,我们在2018年的最后四个月中专注于英语,法语和俄罗斯语言。拟议的方法有三个步骤。首先,它在所选的一段时间内提取最流行的文章。其次,它执行半监督主题提取,第三,它比较了跨语言的主题。自动化处理与结合了Wikipedia的超链接,页面浏览量统计信息和页面摘要的数据合作。 结果表明,人们对娱乐有着共同的兴趣和好奇心,例如电影,音乐,运动独立于他们的语言。与当地事件或文化特殊性有关的主题中出现差异。互动式可视化显示了每个语言版本中趋势页面的簇https://wiki-insights.epfl.ch/wikitrends

In this work, we propose an automatic evaluation and comparison of the browsing behavior of Wikipedia readers that can be applied to any language editions of Wikipedia. As an example, we focus on English, French, and Russian languages during the last four months of 2018. The proposed method has three steps. Firstly, it extracts the most trending articles over a chosen period of time. Secondly, it performs a semi-supervised topic extraction and thirdly, it compares topics across languages. The automated processing works with the data that combines Wikipedia's graph of hyperlinks, pageview statistics and summaries of the pages. The results show that people share a common interest and curiosity for entertainment, e.g. movies, music, sports independently of their language. Differences appear in topics related to local events or about cultural particularities. Interactive visualizations showing clusters of trending pages in each language edition are available online https://wiki-insights.epfl.ch/wikitrends

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源