Paper Title
Plug-and-Play Conversational Models
Paper Authors
Paper Abstract
There has been considerable progress made towards conversational models that generate coherent and fluent responses; however, this often involves training large language models on large dialogue datasets, such as Reddit. These large conversational models provide little control over the generated responses, and this control is further limited in the absence of annotated conversational datasets for attribute-specific generation that could be used for fine-tuning the model. In this paper, we first propose and evaluate plug-and-play methods for controllable response generation, which do not require dialogue-specific datasets and do not rely on fine-tuning a large model. While effective, the decoding procedure induces considerable computational overhead, rendering the conversational model unsuitable for interactive usage. To overcome this, we introduce an approach that requires no further computation at decoding time and no fine-tuning of a large language model. We demonstrate, through extensive automatic and human evaluation, a high degree of control over the generated conversational responses with regard to multiple desired attributes, while remaining fluent.
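To make the plug-and-play idea concrete, here is a minimal, self-contained sketch of one simple flavor of controllable decoding: at each step, a frozen language model's token scores are combined with the score of an external attribute classifier, so the attribute model "plugs into" decoding without any fine-tuning. The toy vocabulary, the uniform language model, and the hand-coded sentiment classifier are all illustrative assumptions, not the paper's actual models, and the paper's methods (e.g., gradient-based activation perturbation) are more sophisticated than this weighted-scoring variant.

```python
import math

# Assumption: a tiny toy vocabulary standing in for a real LM's vocabulary.
VOCAB = ["good", "bad", "great", "terrible", "okay"]

def lm_logprobs(prefix):
    # Toy frozen language model: uniform over the vocabulary.
    # A real system would use a large pretrained LM here.
    return {w: math.log(1.0 / len(VOCAB)) for w in VOCAB}

def attribute_logprob(word, attribute):
    # Toy attribute classifier p(attribute | word); hand-coded sentiment
    # scores are an illustrative assumption.
    positive = {"good": 0.9, "great": 0.95, "okay": 0.6,
                "bad": 0.05, "terrible": 0.01}
    p = positive[word] if attribute == "positive" else 1.0 - positive[word]
    return math.log(p)

def plug_and_play_step(prefix, attribute, alpha=1.0):
    # Combine the frozen LM with the plugged-in attribute model:
    #   score(x_t) = log p_LM(x_t | x_<t) + alpha * log p(attribute | x_t)
    # Neither model is fine-tuned; control comes only from the combination.
    scores = {w: lm_logprobs(prefix)[w] + alpha * attribute_logprob(w, attribute)
              for w in VOCAB}
    return max(scores, key=scores.get)

print(plug_and_play_step([], "positive"))  # → "great"
print(plug_and_play_step([], "negative"))  # → "terrible"
```

The hyperparameter `alpha` trades off fluency (the LM term) against attribute strength (the classifier term); the overhead the abstract mentions comes from running such extra attribute computations at every decoding step, which motivates the paper's second, decoding-time-free approach.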