Paper Title
Better Datastore, Better Translation: Generating Datastores from Pre-Trained Models for Nearest Neural Machine Translation
Paper Authors
Paper Abstract
Nearest Neighbor Machine Translation (kNN-MT) is a simple and effective method of augmenting neural machine translation (NMT) with a token-level nearest neighbor retrieval mechanism. The effectiveness of kNN-MT directly depends on the quality of the retrieved neighbors. However, the original kNN-MT builds its datastore from the representations of the NMT model, which results in poor retrieval accuracy when the NMT model is not good enough, leading to sub-optimal translation performance. In this paper, we propose PRED, a framework that leverages Pre-trained models for Datastores in kNN-MT. The better representations from pre-trained models allow us to build datastores of higher quality. We also design a novel contrastive alignment objective to mitigate the representation gap between the NMT model and pre-trained models, enabling the NMT model to retrieve from these better datastores. We conduct extensive experiments on both bilingual and multilingual translation benchmarks, including WMT17 English $\leftrightarrow$ Chinese, WMT14 English $\leftrightarrow$ German, IWSLT14 German $\leftrightarrow$ English, and the IWSLT14 multilingual datasets. Empirical results demonstrate the effectiveness of PRED.
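For readers unfamiliar with the retrieval mechanism the abstract refers to, below is a minimal, hypothetical sketch (not the authors' code) of one token-level kNN-MT decoding step: the k nearest datastore keys to the current hidden state vote on the next token, and the resulting distribution is interpolated with the NMT model's distribution. The function name, the hyperparameters `k`, `temperature`, and `lam`, and the use of brute-force PyTorch search instead of an approximate index are assumptions for illustration only.

```python
# Hypothetical sketch of token-level kNN-MT interpolation; not the paper's implementation.
import torch

def knn_mt_next_token_probs(query, keys, values, nmt_probs,
                            k=8, temperature=10.0, lam=0.5):
    """One decoding step of kNN-MT.

    query:     (d,)   current decoder hidden state (in PRED, the datastore keys would
                      instead come from a pre-trained model's representations)
    keys:      (N, d) datastore keys collected from the training corpus
    values:    (N,)   target tokens aligned with each key (LongTensor)
    nmt_probs: (V,)   next-token distribution of the base NMT model
    """
    vocab_size = nmt_probs.size(0)
    # Brute-force L2 search; real systems typically use an ANN index such as FAISS.
    dists = torch.cdist(query.unsqueeze(0), keys).squeeze(0)      # (N,)
    knn_dists, knn_idx = torch.topk(dists, k, largest=False)      # k nearest neighbors
    weights = torch.softmax(-knn_dists / temperature, dim=0)      # closer neighbors weigh more
    p_knn = torch.zeros(vocab_size)
    p_knn.scatter_add_(0, values[knn_idx], weights)               # aggregate neighbor votes per token
    # Interpolate the retrieval distribution with the NMT distribution.
    return lam * p_knn + (1.0 - lam) * nmt_probs
```

Because the neighbor weights sum to one, the interpolated output is itself a valid probability distribution over the vocabulary; setting `lam=0` recovers the base NMT model.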
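The abstract does not specify the form of the contrastive alignment objective. As an assumed illustration only (the paper's exact objective may differ), a generic InfoNCE-style loss that pulls each NMT state toward the pre-trained representation of the same position, against in-batch negatives, could look like:

```python
# Assumed InfoNCE-style alignment loss; the paper's actual objective may differ.
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(nmt_states, pt_states, temperature=0.1):
    """Align NMT decoder states with pre-trained representations.

    nmt_states, pt_states: (B, d), paired by position; positives lie on the diagonal
    of the similarity matrix, all other in-batch pairs act as negatives.
    """
    q = F.normalize(nmt_states, dim=-1)
    k = F.normalize(pt_states, dim=-1)
    logits = q @ k.t() / temperature                    # (B, B) cosine similarities
    targets = torch.arange(q.size(0), device=q.device)  # positive index for each row
    return F.cross_entropy(logits, targets)
```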