Paper Title

Learning to Summarize Passages: Mining Passage-Summary Pairs from Wikipedia Revision Histories

Authors

Qingyu Zhou, Furu Wei, Ming Zhou

Abstract

In this paper, we propose a method for automatically constructing a passage-to-summary dataset by mining Wikipedia page revision histories. In particular, the method mines main-body passages and introduction sentences that were added to a page simultaneously. The constructed dataset contains more than one hundred thousand passage-summary pairs. A quality analysis indicates that the dataset is promising as a training and validation set for passage summarization. We validate and analyze the performance of various summarization systems on the proposed dataset. The dataset will be available online at https://res.qyzhou.me.
