Paper Title
Grounded Keys-to-Text Generation: Towards Factual Open-Ended Generation
Paper Authors
Abstract
Large pre-trained language models have recently enabled open-ended generation frameworks (e.g., prompt-to-text NLG) to tackle a variety of tasks that go beyond traditional data-to-text generation. While this framework is more general, it is under-specified and often leads to a lack of controllability that restricts its real-world usage. We propose a new grounded keys-to-text generation task: the task is to generate a factual description of an entity given a set of guiding keys and grounding passages. To address this task, we introduce a new dataset, called EntDeGen. Inspired by recent QA-based evaluation measures, we propose an automatic metric, MAFE, for the factual correctness of generated descriptions. Our EntDescriptor model is equipped with strong rankers to fetch helpful passages and generate entity descriptions. Experimental results show a good correlation (60.14) between our proposed metric and human judgments of factuality. Our rankers significantly improve the factual correctness of generated descriptions (15.95% and 34.51% relative gains in recall and precision). Finally, our ablation study highlights the benefit of combining keys and groundings.