Paper Title

Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation

Authors

Jin Xu, Xiaojiang Liu, Jianhao Yan, Deng Cai, Huayang Li, Jian Li

Abstract

While large-scale neural language models, such as GPT2 and BART, have achieved impressive results on various text generation tasks, they tend to get stuck in undesirable sentence-level loops with maximization-based decoding algorithms (e.g., greedy search). This phenomenon is counter-intuitive since there are few consecutive sentence-level repetitions in human corpora (e.g., 0.02% in Wikitext-103). To investigate the underlying reasons for generating consecutive sentence-level repetitions, we study the relationship between the probabilities of the repetitive tokens and their previous repetitions in the context. Through our quantitative experiments, we find that 1) language models have a preference to repeat the previous sentence; 2) sentence-level repetitions have a self-reinforcement effect: the more times a sentence is repeated in the context, the higher the probability of continuing to generate that sentence; 3) sentences with higher initial probabilities usually have a stronger self-reinforcement effect. Motivated by our findings, we propose a simple and effective training method, DITTO (PseuDo-RepetITion PenalizaTiOn), where the model learns to penalize probabilities of sentence-level repetitions from pseudo repetitive data. Although our method is motivated by mitigating repetitions, experiments show that DITTO not only mitigates the repetition issue without sacrificing perplexity, but also achieves better generation quality. Extensive experiments on open-ended text generation (Wikitext-103) and text summarization (CNN/DailyMail) demonstrate the generality and effectiveness of our method.
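To make the abstract's two core ideas concrete, below is a minimal PyTorch sketch: a probe for the self-reinforcement effect (finding 2) and a repetition-penalization loss on pseudo repetitive data in the spirit of DITTO. It assumes a Hugging Face-style causal language model whose forward pass returns .logits; the decay factor lam, the repetition count n_repeats, and the exact loss form are illustrative assumptions, not the paper's published formula.

import torch
import torch.nn.functional as F


def build_pseudo_repetitive(sentence_ids: torch.Tensor, n_repeats: int) -> torch.Tensor:
    # Tile one sentence n_repeats times to form a pseudo repetitive sample.
    return sentence_ids.repeat(n_repeats)


def repetition_probability_curve(model, sentence_ids: torch.Tensor, n_repeats: int = 10):
    # Probe the self-reinforcement effect: average probability the model
    # assigns to the sentence's tokens at each repetition. The paper's
    # finding predicts this curve rises with n for an unmitigated LM.
    seq = build_pseudo_repetitive(sentence_ids, n_repeats).unsqueeze(0)  # (1, n*L)
    logits = model(seq).logits                                            # (1, n*L, V)
    probs = F.softmax(logits[:, :-1], dim=-1)
    token_p = probs.gather(-1, seq[:, 1:].unsqueeze(-1)).squeeze(-1)      # (1, n*L - 1)
    L = sentence_ids.numel()
    return [token_p[:, n * L - 1 : (n + 1) * L - 1].mean().item()
            for n in range(1, n_repeats)]


def ditto_loss(model, sentence_ids: torch.Tensor,
               n_repeats: int = 4, lam: float = 0.5) -> torch.Tensor:
    # Penalize growth of per-token probabilities across repetitions: push
    # each token's probability in repetition n toward lam times its value
    # in repetition n-1, so repetition probability decays instead of
    # self-reinforcing. Requires n_repeats >= 3 so at least one pair of
    # full repetitions can be compared.
    seq = build_pseudo_repetitive(sentence_ids, n_repeats).unsqueeze(0)
    logits = model(seq).logits
    probs = F.softmax(logits[:, :-1], dim=-1)
    token_p = probs.gather(-1, seq[:, 1:].unsqueeze(-1)).squeeze(-1)
    L = sentence_ids.numel()
    losses = []
    for n in range(2, n_repeats):  # compare repetition n against repetition n-1
        cur = token_p[:, n * L - 1 : (n + 1) * L - 1]
        prev = token_p[:, (n - 1) * L - 1 : n * L - 1].detach()  # fixed target
        # Zero loss exactly when cur == lam * prev, i.e. the probability
        # decays geometrically across repetitions as desired.
        losses.append(-torch.log(1.0 - torch.abs(cur - lam * prev) + 1e-9).mean())
    return torch.stack(losses).mean()

In a full training run, a penalization term like this would presumably be mixed with the standard maximum-likelihood loss on real corpus data, which is consistent with the abstract's claim that DITTO mitigates repetition without sacrificing perplexity.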
