Paper Title

ClueGraphSum: Let Key Clues Guide the Cross-Lingual Abstractive Summarization

Authors

Jiang, Shuyu; Tu, Dengbiao; Chen, Xingshu; Tang, Rui; Wang, Wenxian; Wang, Haizhou

Abstract

Cross-Lingual Summarization (CLS) is the task of generating a summary in one language for an article written in a different language. Previous studies on CLS mainly use pipeline methods or train an end-to-end model on translated parallel data. However, the quality of the generated cross-lingual summaries still needs improvement, and model performance has never been evaluated on a hand-written CLS dataset. Therefore, we first propose a clue-guided cross-lingual abstractive summarization method to improve the quality of cross-lingual summaries, and then construct a novel hand-written CLS dataset for evaluation. Specifically, we extract keywords, named entities, etc. from the input article as key clues for summarization, and then design a clue-guided algorithm to transform the article into a graph with fewer noisy sentences. A Graph encoder is built to learn sentence semantics and article structure, and a Clue encoder is built to encode and translate the key clues, ensuring that the information in important parts is preserved in the generated summary. These two encoders are connected by one decoder to directly learn cross-lingual semantics. Experimental results show that our method is more robust on longer inputs and substantially outperforms the strong baseline, achieving improvements of 8.55 ROUGE-1 (English-to-Chinese summarization) and 2.13 MoverScore (Chinese-to-English summarization) over the existing SOTA.
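The abstract does not spell out the clue-guided graph construction, so the following is only a toy sketch of the general idea, not the paper's algorithm. It assumes that clues are simply the most frequent non-stopword tokens, that a sentence containing no clue counts as noise, and that two sentences are linked when they share a clue. The names extract_clues and build_clue_graph and the stopword list are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to",
             "is", "are", "for", "with", "it", "was", "by"}


def tokenize(sentence):
    """Lowercase a sentence and return its alphabetic, non-stopword tokens."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOPWORDS]


def extract_clues(sentences, top_k=3):
    """Stand-in for clue extraction: the top_k most frequent tokens."""
    counts = Counter(w for s in sentences for w in tokenize(s))
    return {w for w, _ in counts.most_common(top_k)}


def build_clue_graph(sentences, clues):
    """Keep sentences that mention a clue; link pairs that share one.

    Returns (nodes, edges): nodes maps a sentence index to the clues it
    contains, edges maps an index pair (i, j) to the clues both share.
    """
    nodes = {}
    for i, s in enumerate(sentences):
        hits = clues & set(tokenize(s))
        if hits:  # sentences without any clue are dropped as noise
            nodes[i] = hits
    edges = {}
    for i, j in combinations(sorted(nodes), 2):
        shared = nodes[i] & nodes[j]
        if shared:
            edges[(i, j)] = shared
    return nodes, edges


sentences = [
    "The storm hit the coast on Monday.",
    "Officials said the storm damaged the coast road.",
    "My neighbor likes tea.",
    "Repairs to the coast road begin soon.",
]
clues = extract_clues(sentences)
nodes, edges = build_clue_graph(sentences, clues)
```

In this toy input the third sentence shares no clue with the rest and is left out of the graph, which mirrors the "fewer noisy sentences" goal in spirit. The actual method additionally uses named entities and learned Graph and Clue encoders, which this sketch does not attempt to model.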
