Paper Title

Language Models that Seek for Knowledge: Modular Search & Generation for Dialogue and Prompt Completion

Authors

Kurt Shuster, Mojtaba Komeili, Leonard Adolphs, Stephen Roller, Arthur Szlam, Jason Weston

Abstract

Language models (LMs) have recently been shown to generate more factual responses by employing modularity (Zhou et al., 2021) in combination with retrieval (Adolphs et al., 2021). We extend the recent approach of Adolphs et al. (2021) to include internet search as a module. Our SeeKeR (Search engine->Knowledge->Response) method thus applies a single LM to three modular tasks in succession: search, generating knowledge, and generating a final response. We show that, when using SeeKeR as a dialogue model, it outperforms the state-of-the-art model BlenderBot 2 (Chen et al., 2021) on open-domain knowledge-grounded conversations for the same number of parameters, in terms of consistency, knowledge and per-turn engagingness. SeeKeR applied to topical prompt completions as a standard language model outperforms GPT2 (Radford et al., 2019) and GPT3 (Brown et al., 2020) in terms of factuality and topicality, despite GPT3 being a vastly larger model. Our code and models are made publicly available.
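The abstract describes SeeKeR's pipeline as one LM applied to three modular tasks in succession: generating a search query, distilling knowledge from retrieved documents, and generating the final response. A minimal sketch of that control flow is below; the function names, control tokens, and toy stand-ins are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch of a SeeKeR-style modular pipeline: a single LM is called three
# times, with (hypothetical) control tokens selecting the active module.

def seeker_respond(lm, search_engine, context):
    # Module 1: generate a search query from the dialogue context.
    query = lm(context + " __generate-search-query__")
    # Retrieve documents with an external internet search engine.
    docs = search_engine(query)
    # Module 2: distill a relevant knowledge sentence from the documents.
    knowledge = lm(context + " " + " ".join(docs) + " __generate-knowledge__")
    # Module 3: generate the final response grounded in that knowledge.
    response = lm(context + " " + knowledge + " __generate-response__")
    return response

# Toy stand-ins so the sketch runs end to end (not real models).
def toy_lm(prompt):
    if "__generate-search-query__" in prompt:
        return "capital of France"
    if "__generate-knowledge__" in prompt:
        return "Paris is the capital of France."
    return "The capital of France is Paris."

def toy_search(query):
    return ["Paris is the capital and largest city of France."]

print(seeker_respond(toy_lm, toy_search, "What is the capital of France?"))
```

The key design point the abstract emphasizes is that all three modules share a single set of LM parameters, rather than using a separate model per stage.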
