Paper Title
A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation
Paper Authors
Paper Abstract
Adversarial training has been shown effective at endowing the learned representations with stronger generalization ability. However, it typically requires expensive computation to determine the direction of the injected perturbations. In this paper, we introduce a set of simple yet effective data augmentation strategies dubbed cutoff, where part of the information within an input sentence is erased to yield its restricted views (during the fine-tuning stage). Notably, this process relies merely on stochastic sampling and thus adds little computational overhead. A Jensen-Shannon Divergence consistency loss is further utilized to incorporate these augmented samples into the training objective in a principled manner. To verify the effectiveness of the proposed strategies, we apply cutoff to both natural language understanding and generation problems. On the GLUE benchmark, it is demonstrated that cutoff, in spite of its simplicity, performs on par with or better than several competitive adversarial-based approaches. We further extend cutoff to machine translation and observe significant gains in BLEU scores (based upon the Transformer Base model). Moreover, cutoff consistently outperforms adversarial training and achieves state-of-the-art results on the IWSLT2014 German-English dataset.
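To make the recipe described in the abstract concrete, below is a minimal PyTorch sketch of a token-level cutoff view paired with a Jensen-Shannon consistency term. The helper names (token_cutoff, js_consistency_loss, training_step) and the choice to realize erasure by masking sampled positions with the pad token are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def token_cutoff(input_ids: torch.Tensor, pad_id: int,
                 cutoff_ratio: float = 0.1) -> torch.Tensor:
    """Erase a random subset of tokens to produce a 'restricted view'.

    The paper erases part of the input sentence; replacing sampled
    positions with the pad token is one simple approximation (an
    assumption in this sketch).
    """
    drop = torch.rand(input_ids.shape, device=input_ids.device) < cutoff_ratio
    return input_ids.masked_fill(drop, pad_id)

def js_consistency_loss(logits_list):
    """Jensen-Shannon divergence among predictions on the augmented views:
    JS(p_1, ..., p_N) = mean_i KL(p_i || M), where M is the mean distribution.
    """
    probs = [F.softmax(logits, dim=-1) for logits in logits_list]
    mean_prob = torch.stack(probs, dim=0).mean(dim=0)
    # F.kl_div(log_q, p) computes KL(p || q), so this is KL(p_i || M).
    return sum(
        F.kl_div(mean_prob.log(), p, reduction="batchmean") for p in probs
    ) / len(probs)

def training_step(model, input_ids, labels, pad_id, num_views=2, alpha=1.0):
    """Hypothetical fine-tuning step: task loss plus the consistency term."""
    views = [input_ids] + [token_cutoff(input_ids, pad_id)
                           for _ in range(num_views)]
    logits = [model(v) for v in views]       # assumes model returns logits
    ce = F.cross_entropy(logits[0], labels)  # supervised loss on clean input
    return ce + alpha * js_consistency_loss(logits)
```

The consistency term penalizes disagreement between predictions on the clean input and its cutoff views, which is how the augmented samples are folded into the training objective; since the views come from plain random sampling, the only extra cost is the additional forward passes.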