Check_square at CheckThat! 2020: Claim Detection in Social Media via Fusion of Transformer and Syntactic Features

Paper Title

Check_square at CheckThat! 2020: Claim Detection in Social Media via Fusion of Transformer and Syntactic Features

Authors

Gullal S. Cheema, Sherzod Hakimov, Ralph Ewerth

Abstract

In this digital age of news consumption, a news reader can react, express, and share opinions with others in a highly interactive and fast manner. As a consequence, fake news has made its way into our daily lives, because large companies and individuals alike have very limited capacity to verify news on the Internet. In this paper, we focus on two problems within the fact-checking ecosystem whose solutions can help automate the fact-checking of claims in an ever-increasing stream of social media content. For the first problem, claim check-worthiness prediction, we explore the fusion of syntactic features and deep transformer Bidirectional Encoder Representations from Transformers (BERT) embeddings to classify the check-worthiness of a tweet, i.e., whether or not it includes a claim. We conduct a detailed feature analysis and present our best-performing models for English and Arabic tweets. For the second problem, claim retrieval, we explore the pre-trained embeddings from a Siamese network transformer model (sentence-transformers) specifically trained for semantic textual similarity, and perform KD-search to retrieve verified claims with respect to a query tweet.

Paper Title