Large Language Models Meet Harry Potter: A Bilingual Dataset for Aligning Dialogue Agents with Characters

Paper Title

Large Language Models Meet Harry Potter: A Bilingual Dataset for Aligning Dialogue Agents with Characters

Authors

Chen, Nuo; Wang, Yan; Jiang, Haiyun; Cai, Deng; Li, Yuhan; Chen, Ziyang; Wang, Longyue; Li, Jia

Abstract

In recent years, dialogue-style Large Language Models (LLMs) such as ChatGPT and GPT4 have demonstrated immense potential in constructing open-domain dialogue agents. However, aligning these agents with specific characters or individuals remains a considerable challenge due to the complexities of character representation and the lack of comprehensive annotations. In this paper, we introduce the Harry Potter Dialogue (HPD) dataset, designed to advance the study of dialogue agents and character alignment. The dataset encompasses all dialogue sessions (in both English and Chinese) from the Harry Potter series and is annotated with vital background information, including dialogue scenes, speakers, character relationships, and attributes. These extensive annotations may empower LLMs to unlock character-driven dialogue capabilities. Furthermore, the dataset can serve as a universal benchmark for evaluating how well an LLM aligns with a specific character. We benchmark LLMs on HPD using both fine-tuning and in-context learning settings. Evaluation results reveal that although there is substantial room for improvement in generating high-quality, character-aligned responses, the proposed dataset is valuable in guiding models toward responses that better align with the character of Harry Potter.
