Unbiased Learning to Rank: Online or Offline?

Paper Title

Unbiased Learning to Rank: Online or Offline?

Authors

Qingyao Ai, Tao Yang, Huazheng Wang, Jiaxin Mao

Abstract

How to obtain an unbiased ranking model by learning to rank with biased user feedback is an important research question for information retrieval (IR). Existing work on unbiased learning to rank (ULTR) can be broadly categorized into two groups: studies of unbiased learning algorithms with logged data, namely offline unbiased learning, and studies of unbiased parameter estimation with real-time user interactions, namely online learning to rank. While their definitions of unbiasedness differ, these two types of ULTR algorithms share the same goal: to find the best models that rank documents based on their intrinsic relevance or utility. However, most studies on offline and online unbiased learning to rank have been carried out in parallel, without detailed comparisons of their background theories and empirical performance. In this paper, we formalize the task of unbiased learning to rank and show that existing algorithms for offline unbiased learning and online learning to rank are two sides of the same coin. We evaluate six state-of-the-art ULTR algorithms and find that most of them can be used in both offline settings and online environments with minor or no modifications. Further, we analyze how different offline and online learning paradigms affect the theoretical foundation and empirical effectiveness of each algorithm on both synthetic and real search data. Our findings could provide important insights and guidelines for choosing and deploying ULTR algorithms in practice.
