Paper Title
Target Conditioning for One-to-Many Generation
Paper Authors
Paper Abstract
Neural Machine Translation (NMT) models often lack diversity in their generated translations, even when paired with a search algorithm such as beam search. A challenge is that the diversity of translations is caused by variability in the target language and cannot be inferred from the source sentence alone. In this paper, we propose to explicitly model this one-to-many mapping by conditioning the decoder of an NMT model on a latent variable that represents the domain of the target sentence. The domain is a discrete variable generated by a target encoder that is jointly trained with the NMT model. The predicted domain of the target sentence is given as input to the decoder during training. At inference, we can generate diverse translations by decoding with different domains. Unlike our strongest baseline (Shen et al., 2019), our method can scale to any number of domains without affecting performance or training time. We assess the quality and diversity of translations generated by our model with several metrics, on three different datasets.
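To make the conditioning scheme concrete, below is a minimal PyTorch sketch of the idea the abstract describes: a target encoder produces a discrete domain variable, which is fed to the decoder during training and swept over at inference to obtain diverse translations. Everything beyond what the abstract states is an assumption: the module names (`TargetDomainEncoder`, `DomainConditionedDecoderInput`) are hypothetical, and the straight-through Gumbel-softmax is one standard way to train a discrete latent, not necessarily the paper's actual mechanism.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TargetDomainEncoder(nn.Module):
    """Maps a target-sentence representation to one of K discrete domains.

    Hypothetical sketch: the abstract only says the domain is a discrete
    variable produced by a jointly trained target encoder. The
    straight-through Gumbel-softmax below is one common way to keep such
    a discrete choice differentiable; the paper's mechanism may differ.
    """

    def __init__(self, hidden_dim: int, num_domains: int):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, num_domains)

    def forward(self, target_repr: torch.Tensor) -> torch.Tensor:
        logits = self.proj(target_repr)  # (batch, K)
        # hard=True gives a one-hot domain on the forward pass while
        # gradients flow through the soft sample on the backward pass.
        return F.gumbel_softmax(logits, tau=1.0, hard=True)


class DomainConditionedDecoderInput(nn.Module):
    """Turns the one-hot domain into an embedding the decoder consumes."""

    def __init__(self, hidden_dim: int, num_domains: int):
        super().__init__()
        self.domain_emb = nn.Linear(num_domains, hidden_dim, bias=False)

    def forward(self, token_emb: torch.Tensor,
                domain_onehot: torch.Tensor) -> torch.Tensor:
        # Add the domain embedding to every decoder input position.
        return token_emb + self.domain_emb(domain_onehot).unsqueeze(1)


# Training: the domain comes from the target sentence itself, via
# TargetDomainEncoder. Inference: sweep over all K one-hot domains to
# decode K candidate translations of the same source, e.g.:
#   for k in range(num_domains):
#       onehot = F.one_hot(torch.tensor([k]), num_domains).float()
#       # ...condition the decoder on `onehot` and decode.
```

This sketch only illustrates where the discrete variable enters the model; the joint training objective and how the discrete domains are kept from collapsing are details the abstract does not specify.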