Paper Title
DPTDR: Deep Prompt Tuning for Dense Passage Retrieval
Paper Authors
Paper Abstract
Deep prompt tuning (DPT) has achieved great success in most natural language processing (NLP) tasks. However, it has not been well investigated in dense retrieval, where fine-tuning (FT) still dominates. When multiple retrieval tasks are deployed with the same backbone model (e.g., RoBERTa), FT-based methods are unfriendly in terms of deployment cost: each new retrieval model requires deploying the backbone model again, with no reuse across tasks. To reduce the deployment cost in such a scenario, this work investigates applying DPT to dense retrieval. The challenge is that directly applying DPT to dense retrieval largely underperforms FT methods. To compensate for the performance drop, we propose two model-agnostic and task-agnostic strategies for DPT-based retrievers, namely retrieval-oriented intermediate pretraining and unified negative mining, as a general approach compatible with any pre-trained language model and retrieval task. The experimental results show that the proposed method (called DPTDR) outperforms previous state-of-the-art models on both MS-MARCO and Natural Questions. We also conduct ablation studies to examine the effectiveness of each strategy in DPTDR. We believe this work benefits the industry, as it saves enormous effort and deployment cost and increases the utility of computing resources. Our code is available at https://github.com/tangzhy/DPTDR.
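To make the setting concrete, below is a minimal, self-contained sketch (not the authors' code, and not tied to their repository) of deep prompt tuning for a dual-encoder dense retriever: a frozen transformer backbone with trainable prompt vectors injected at every layer, so that only the small prompt parameters differ between retrieval tasks sharing one backbone. All names and sizes (e.g., PROMPT_LEN, the toy encoder dimensions, the use of in-batch negatives) are illustrative assumptions.

```python
# Sketch: deep prompt tuning for a dual-encoder retriever (frozen backbone,
# per-layer trainable prompts). Toy sizes, not the paper's configuration.
import torch
import torch.nn as nn

HIDDEN, LAYERS, HEADS, PROMPT_LEN = 256, 4, 4, 8  # illustrative hyperparameters

class DeepPromptEncoder(nn.Module):
    def __init__(self, vocab_size=30522):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, HIDDEN)
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(HIDDEN, HEADS, dim_feedforward=4 * HIDDEN,
                                       batch_first=True)
            for _ in range(LAYERS)
        )
        # One trainable prompt per layer ("deep" prompts); the backbone is frozen.
        self.prompts = nn.Parameter(torch.randn(LAYERS, PROMPT_LEN, HIDDEN) * 0.02)
        for p in list(self.embed.parameters()) + list(self.layers.parameters()):
            p.requires_grad = False  # only the prompts receive gradients

    def forward(self, input_ids):
        h = self.embed(input_ids)                              # (B, T, H)
        for i, layer in enumerate(self.layers):
            prompt = self.prompts[i].expand(h.size(0), -1, -1)
            h = layer(torch.cat([prompt, h], dim=1))           # prepend layer-i prompt
            h = h[:, PROMPT_LEN:]                              # drop prompt positions
        return h[:, 0]                                         # [CLS]-style embedding

# Dual-encoder relevance: dot product between query and passage embeddings.
query_enc, passage_enc = DeepPromptEncoder(), DeepPromptEncoder()
q = query_enc(torch.randint(0, 30522, (2, 16)))
p = passage_enc(torch.randint(0, 30522, (2, 64)))
scores = q @ p.T  # off-diagonal entries serve as in-batch negatives
```

In this setup, serving several retrieval tasks would only require storing and loading the small per-task prompt tensors while the frozen backbone is deployed once, which is the deployment-cost saving the abstract argues for.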