Paper Title
Improving Factual Consistency in Summarization with Compression-Based Post-Editing
Paper Authors
Paper Abstract
State-of-the-art summarization models still struggle to be factually consistent with the input text. A model-agnostic way to address this problem is post-editing the generated summaries. However, existing approaches typically fail to remove entity errors if a suitable input entity replacement is not available or may insert erroneous content. In our work, we focus on removing extrinsic entity errors, or entities not in the source, to improve consistency while retaining the summary's essential information and form. We propose to use sentence-compression data to train the post-editing model to take a summary with extrinsic entity errors marked with special tokens and output a compressed, well-formed summary with those errors removed. We show that this model improves factual consistency while maintaining ROUGE, improving entity precision by up to 30% on XSum, and that this model can be applied on top of another post-editor, improving entity precision by up to a total of 38%. We perform an extensive comparison of post-editing approaches that demonstrate trade-offs between factual consistency, informativeness, and grammaticality, and we analyze settings where post-editors show the largest improvements.
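To make the described input/output format concrete, here is a minimal sketch of how a summary with extrinsic entity errors might be marked with special tokens and passed to a sentence-compression-style post-editor. The model checkpoint, special-token strings, and the simple string-matching heuristic for deciding which entities are extrinsic are all illustrative assumptions, not the authors' released code or exact setup.

```python
# Hypothetical sketch: wrap extrinsic entities (entities absent from the source)
# in special tokens and ask a seq2seq post-editor to emit a compressed summary
# with those spans removed. Checkpoint and token names are placeholders.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

ENT_START, ENT_END = "<ent>", "</ent>"

def mark_extrinsic_entities(summary: str, entities: list[str], source: str) -> str:
    """Wrap summary entities that never appear in the source with special tokens."""
    marked = summary
    for ent in entities:
        if ent.lower() not in source.lower():
            marked = marked.replace(ent, f"{ENT_START} {ent} {ENT_END}")
    return marked

# Any seq2seq model fine-tuned on sentence-compression pairs could serve as the
# post-editor; "facebook/bart-base" here is only a placeholder base checkpoint.
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
tokenizer.add_tokens([ENT_START, ENT_END])
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")
model.resize_token_embeddings(len(tokenizer))

source = "The council approved the new housing plan after a lengthy debate."
summary = "Mayor John Smith said the council approved the new housing plan."
entities = ["John Smith"]  # e.g. produced by an off-the-shelf NER tagger

marked_summary = mark_extrinsic_entities(summary, entities, source)
inputs = tokenizer(marked_summary, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

In this reading of the abstract, training pairs would come from sentence-compression data: the long sentence (with spans missing from its compression marked by the special tokens) plays the role of the erroneous summary, and the compressed sentence is the target, so the post-editor learns to delete marked content while keeping the output well formed.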