Paper Title

Atlas: Few-shot Learning with Retrieval Augmented Language Models

Paper Authors

Gautier Izacard, Patrick Lewis, Maria Lomeli, Lucas Hosseini, Fabio Petroni, Timo Schick, Jane Dwivedi-Yu, Armand Joulin, Sebastian Riedel, Edouard Grave

Paper Abstract

Large language models have shown impressive few-shot results on a wide range of tasks. However, when knowledge is key for such results, as is the case for tasks such as question answering and fact checking, massive parameter counts to store knowledge seem to be needed. Retrieval augmented models are known to excel at knowledge intensive tasks without the need for as many parameters, but it is unclear whether they work in few-shot settings. In this work we present Atlas, a carefully designed and pre-trained retrieval augmented language model able to learn knowledge intensive tasks with very few training examples. We perform evaluations on a wide range of tasks, including MMLU, KILT and NaturalQuestions, and study the impact of the content of the document index, showing that it can easily be updated. Notably, Atlas reaches over 42% accuracy on Natural Questions using only 64 examples, outperforming a 540B-parameter model by 3% despite having 50x fewer parameters.
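
To make the retrieve-then-read idea in the abstract concrete, here is a minimal, illustrative Python sketch of a retrieval augmented pipeline with a swappable document index. It is not the Atlas implementation (Atlas pairs a learned dense retriever with a sequence-to-sequence reader); the names RetrieverAugmentedQA, bow_vector, and update_index are hypothetical, and a toy bag-of-words retriever stands in for the learned one.

```python
import math
from collections import Counter

def bow_vector(text):
    """Bag-of-words term frequencies (a stand-in for a learned dense encoder)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class RetrieverAugmentedQA:
    """Toy retrieve-then-read pipeline: retrieve the top-k passages for a
    query, then hand them to a reader together with the question."""

    def __init__(self, documents, k=2):
        self.k = k
        self.index = [(doc, bow_vector(doc)) for doc in documents]

    def update_index(self, documents):
        # The index is plain data: replacing it updates the system's
        # knowledge without touching any model parameters.
        self.index = [(doc, bow_vector(doc)) for doc in documents]

    def retrieve(self, query):
        q = bow_vector(query)
        ranked = sorted(self.index, key=lambda d: cosine(q, d[1]), reverse=True)
        return [doc for doc, _ in ranked[: self.k]]

    def answer(self, query):
        passages = self.retrieve(query)
        # A real reader (e.g. a seq2seq language model) would generate an
        # answer from query + passages; here we just show the prompt it sees.
        return "\n".join([f"question: {query}"] +
                         [f"context: {p}" for p in passages])

docs = ["Paris is the capital of France.",
        "Atlas is a retrieval augmented language model."]
qa = RetrieverAugmentedQA(docs)
print(qa.answer("What is the capital of France?"))
```

The sketch highlights the property the abstract emphasizes: because the index is data rather than parameters, calling update_index with a newer document collection refreshes the system's knowledge without any retraining.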
