Paper Title

Advances of Transformer-Based Models for News Headline Generation

Authors

Alexey Bukhtiyarov, Ilya Gusev

Abstract

Pretrained language models based on the Transformer architecture are behind recent breakthroughs in many areas of NLP, including sentiment analysis, question answering, and named entity recognition. Headline generation is a special kind of text summarization task. To succeed at it, models need strong natural language understanding that goes beyond the meaning of individual words and sentences, along with the ability to distinguish essential information. In this paper, we fine-tune two pretrained Transformer-based models (mBART and BertSumAbs) for this task and achieve new state-of-the-art results on the RIA and Lenta datasets of Russian news. BertSumAbs increases ROUGE on average by 2.9 and 2.0 points, respectively, over the previous best scores achieved by the Phrase-Based Attentional Transformer and CopyNet.
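
The abstract only names the models, so a minimal sketch may help illustrate what fine-tuning mBART for headline generation can look like in practice. The sketch below uses the Hugging Face transformers library; the checkpoint name, language codes, sequence lengths, and beam settings are illustrative assumptions, not the authors' reported configuration.

from transformers import MBartForConditionalGeneration, MBartTokenizer

# Illustrative multilingual checkpoint; the paper fine-tunes mBART on Russian news.
tokenizer = MBartTokenizer.from_pretrained(
    "facebook/mbart-large-cc25", src_lang="ru_RU", tgt_lang="ru_RU"
)
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-cc25")

# One (article body, reference headline) pair; real training would iterate
# over the RIA / Lenta corpora with an optimizer such as AdamW.
article = "Текст новости..."      # Russian news article body (placeholder)
headline = "Заголовок новости"    # reference headline (placeholder)

# Tokenize source and target; "labels" are the headline token ids.
batch = tokenizer(
    article, text_target=headline,
    truncation=True, max_length=600, return_tensors="pt",
)

# Standard seq2seq cross-entropy loss against the reference headline.
loss = model(**batch).loss
loss.backward()  # an optimizer.step() would follow in a training loop

# After fine-tuning, headlines are decoded with beam search.
generated = model.generate(
    input_ids=batch["input_ids"],
    attention_mask=batch["attention_mask"],
    decoder_start_token_id=tokenizer.lang_code_to_id["ru_RU"],
    num_beams=5, max_length=48,
)
print(tokenizer.decode(generated[0], skip_special_tokens=True))

The generated headline would then be compared against the reference with ROUGE, which is the metric the abstract's 2.9- and 2.0-point gains refer to.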
