Experiments in Extractive Summarization: Integer Linear Programming, Term/Sentence Scoring, and Title-driven Models

Paper Title

Experiments in Extractive Summarization: Integer Linear Programming, Term/Sentence Scoring, and Title-driven Models

Authors

Daniel Lee, Rakesh Verma, Avisha Das, Arjun Mukherjee

Abstract

In this paper, we revisit the challenging problem of unsupervised single-document summarization and study the following aspects: integer linear programming (ILP)-based algorithms, parameterized normalization of term and sentence scores, and title-driven approaches to summarization. We describe a new framework, NewsSumm, that includes many existing and new approaches to summarization, including ILP and title-driven approaches. NewsSumm's flexibility allows different algorithms and sentence scoring schemes to be combined seamlessly. Our results combining sentence scoring with ILP and normalization contrast with previous work on this topic, showing the importance of a broader search for optimal parameters. We also show that the new title-driven reduction idea improves performance for both the unsupervised and the supervised approaches considered.
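
To make the combination of sentence scoring, parameterized normalization, and ILP-style selection more concrete, here is a minimal sketch. It is not NewsSumm's formulation, which the abstract does not spell out. It assumes a generic budgeted selection problem (maximize the total sentence score subject to a word budget, with binary selection variables) and a hypothetical normalization that divides each raw score by the sentence length raised to a parameter `alpha`. The function names, the normalization form, and the toy numbers are all illustrative assumptions. The tiny ILP is solved by exhaustive enumeration rather than by a real solver.

```python
from itertools import product


def normalized_score(raw_score, length, alpha):
    """Hypothetical normalization: raw score divided by length**alpha.

    alpha = 0 leaves the score unchanged; alpha = 1 gives a per-word score.
    """
    return raw_score / (length ** alpha)


def select_sentences(raw_scores, lengths, budget, alpha=0.5):
    """Pick sentences maximizing total normalized score within a word budget.

    Solves   max sum_i s_i * x_i   s.t.   sum_i l_i * x_i <= budget,  x_i in {0, 1}
    by enumerating every binary assignment (only viable for a few sentences).
    """
    scores = [normalized_score(s, l, alpha) for s, l in zip(raw_scores, lengths)]
    best_value = 0.0
    best_choice = tuple(0 for _ in scores)
    for choice in product((0, 1), repeat=len(scores)):
        total_length = sum(l * x for l, x in zip(lengths, choice))
        if total_length > budget:
            continue  # violates the length constraint
        value = sum(s * x for s, x in zip(scores, choice))
        if value > best_value:
            best_value, best_choice = value, choice
    return [i for i, x in enumerate(best_choice) if x]


if __name__ == "__main__":
    raw = [10.0, 4.0, 3.0]   # toy raw sentence scores (assumed)
    lens = [20, 10, 10]      # toy sentence lengths in words (assumed)
    for a in (0.0, 1.0):
        print(a, select_sentences(raw, lens, budget=20, alpha=a))
```

On these toy numbers, `alpha = 0` selects the single long sentence (index 0), while `alpha = 1` selects the two short sentences (indices 1 and 2). This illustrates why the choice of normalization parameter can change which summary an ILP-based selector produces.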
