Paper Title

Local Self-Attention over Long Text for Efficient Document Retrieval

Paper Authors

Sebastian Hofstätter, Hamed Zamani, Bhaskar Mitra, Nick Craswell, Allan Hanbury

Paper Abstract

Neural networks, particularly Transformer-based architectures, have achieved significant performance improvements on several retrieval benchmarks. When the items being retrieved are documents, the time and memory cost of employing Transformers over a full sequence of document terms can be prohibitive. A popular strategy involves considering only the first n terms of the document. This can, however, result in a biased system that under-retrieves longer documents. In this work, we propose a local self-attention which considers a moving window over the document terms and for each term attends only to other terms in the same window. This local attention incurs a fraction of the compute and memory cost of attention over the whole document. The windowed approach also leads to more compact packing of padded documents in minibatches, resulting in additional savings. We also employ a learned saturation function and a two-staged pooling strategy to identify relevant regions of the document. The Transformer-Kernel pooling model with these changes can efficiently elicit relevance information from documents with thousands of tokens. We benchmark our proposed modifications on the document ranking task from the TREC 2019 Deep Learning track and observe significant improvements in retrieval quality, as well as increased retrieval of longer documents, at a moderate increase in compute and memory costs.

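To make the core idea concrete, below is a minimal sketch (not the authors' implementation) of local self-attention in PyTorch: each token attends only to tokens in its own local window, so the effective attention cost depends on the window size rather than the full document length. The function names and the non-overlapping window assignment here are assumptions for illustration; the abstract does not specify the exact windowing scheme (e.g., overlap between windows), and the real model restricts computation to each window rather than masking a full score matrix.

```python
import torch

def local_attention_mask(seq_len: int, window_size: int) -> torch.Tensor:
    # Boolean (seq_len, seq_len) mask: token i may attend to token j only if
    # both fall in the same non-overlapping window of `window_size` tokens.
    # (Illustrative assumption; the paper's windowing details may differ.)
    positions = torch.arange(seq_len)
    window_ids = positions // window_size
    return window_ids.unsqueeze(0) == window_ids.unsqueeze(1)

def local_self_attention(x: torch.Tensor, window_size: int) -> torch.Tensor:
    # Single-head scaled dot-product attention restricted to local windows.
    # x: (seq_len, dim). For clarity this builds the full score matrix and
    # masks it; a real implementation computes attention window by window
    # to actually obtain the memory savings described in the abstract.
    seq_len, dim = x.shape
    scores = (x @ x.transpose(0, 1)) / dim ** 0.5
    mask = local_attention_mask(seq_len, window_size)
    scores = scores.masked_fill(~mask, float('-inf'))
    weights = torch.softmax(scores, dim=-1)
    return weights @ x

# Example: a 3000-token document processed with a 300-token local window.
doc = torch.randn(3000, 64)
out = local_self_attention(doc, window_size=300)
print(out.shape)  # torch.Size([3000, 64])
```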