Paper Title

A Contrastive Pre-training Approach to Learn Discriminative Autoencoder for Dense Retrieval

Authors

Xinyu Ma, Ruqing Zhang, Jiafeng Guo, Yixing Fan, Xueqi Cheng

Abstract

Dense retrieval (DR) has shown promising results in information retrieval. In essence, DR requires high-quality text representations to support effective search in the representation space. Recent studies have shown that pre-trained autoencoder-based language models with a weak decoder can provide high-quality text representations, boosting the effectiveness and few-shot ability of DR models. However, even a weak autoregressive decoder has a bypass effect on the encoder. More importantly, the discriminative ability of the learned representations may be limited, since every token is treated as equally important when decoding the input texts. To address these problems, in this paper we propose a contrastive pre-training approach to learn a discriminative autoencoder with a lightweight multi-layer perceptron (MLP) decoder. The basic idea is to generate the word distribution of an input text in a non-autoregressive fashion and to pull the word distributions of two masked versions of the same text close together while pushing them away from those of other texts. We theoretically show that our contrastive strategy suppresses common words and highlights representative words in decoding, leading to discriminative representations. Empirical results show that our method significantly outperforms state-of-the-art autoencoder-based language models and other pre-trained models for dense retrieval.
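
To make the objective in the abstract concrete, below is a minimal PyTorch sketch of the idea, not the authors' implementation: an MLP decoder maps each text's encoder representation to a word distribution, and an in-batch contrastive loss pulls together the distributions of two masked views of the same text while pushing them away from other texts. The class name DiscriminativeAutoencoder, the choice of bert-base-uncased, the cosine similarity between word distributions, and the temperature value are all illustrative assumptions; the paper defines its own distribution similarity and hyperparameters.

```python
# Minimal sketch (assumptions noted above), not the authors' released code.
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoModel

class DiscriminativeAutoencoder(nn.Module):
    def __init__(self, model_name="bert-base-uncased",
                 vocab_size=30522, temperature=0.05):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        # Lightweight MLP decoder: maps the [CLS] text embedding to a
        # word distribution over the vocabulary, non-autoregressively.
        self.decoder = nn.Sequential(
            nn.Linear(hidden, hidden),
            nn.GELU(),
            nn.Linear(hidden, vocab_size),
        )
        self.temperature = temperature  # assumed value, not from the paper

    def word_distribution(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]          # [CLS] text representation
        return F.softmax(self.decoder(cls), dim=-1)  # (batch, vocab)

    def contrastive_loss(self, view_a, view_b):
        """view_a / view_b: dicts with input_ids and attention_mask for two
        independently masked versions of the same batch of texts."""
        p_a = self.word_distribution(**view_a)
        p_b = self.word_distribution(**view_b)
        # Pairwise similarity between word distributions of all texts in the
        # batch; cosine similarity here is an assumption for illustration.
        sim = F.cosine_similarity(
            p_a.unsqueeze(1), p_b.unsqueeze(0), dim=-1
        ) / self.temperature
        # InfoNCE: the two views of the same text (the diagonal) are
        # positives; all other texts in the batch act as negatives.
        labels = torch.arange(sim.size(0), device=sim.device)
        return F.cross_entropy(sim, labels)

# Usage (illustrative): view_a and view_b come from masking the same batch
# of texts twice with different random masks.
# model = DiscriminativeAutoencoder()
# loss = model.contrastive_loss(view_a, view_b)
```

Intuitively, because all texts share roughly the same probability mass on common words, those words cancel out in the in-batch comparison, so the loss is driven by the representative words that distinguish one text from another, which is the suppression/highlighting effect the abstract describes.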
