Paper Title

Learning Open Domain Multi-hop Search Using Reinforcement Learning

Authors

Enrique Noriega-Atala, Mihai Surdeanu, Clayton T. Morrison

Abstract

We propose a method to teach an automated agent to learn how to search for multi-hop paths of relations between entities in an open domain. The method learns a policy for directing existing information retrieval and machine reading resources to focus on relevant regions of a corpus. The approach formulates the learning problem as a Markov decision process with a state representation that encodes the dynamics of the search process and a reward structure that minimizes the number of documents that must be processed while still finding multi-hop paths. We implement the method in an actor-critic reinforcement learning algorithm and evaluate it on a dataset of search problems derived from a subset of English Wikipedia. The algorithm finds a family of policies that succeeds in extracting the desired information while processing fewer documents than several baseline heuristic algorithms.
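
The trade-off in the reward structure (find a multi-hop path while reading as few documents as possible) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `step_reward` and `episode_return`, the per-document cost, the success bonus, and the discount factor are all hypothetical.

```python
def step_reward(docs_processed: int, path_found: bool,
                doc_cost: float = 0.01, success_bonus: float = 1.0) -> float:
    """Toy reward for one search step (hypothetical values, not the paper's).

    Charges a small cost for every document read in this step and adds
    a bonus if the step completed a multi-hop path.
    """
    reward = -doc_cost * docs_processed
    if path_found:
        reward += success_bonus
    return reward


def episode_return(steps, gamma: float = 0.99) -> float:
    """Discounted sum of step rewards.

    `steps` is a sequence of (docs_processed, path_found) pairs,
    one pair per action taken by the agent.
    """
    total = 0.0
    for t, (docs, found) in enumerate(steps):
        total += (gamma ** t) * step_reward(docs, found)
    return total


# Two hypothetical episodes that both find a path on the second step,
# but one reads far more documents along the way.
frugal = [(5, False), (5, True)]
wasteful = [(50, False), (50, True)]
print(episode_return(frugal) > episode_return(wasteful))
```

Under these assumed values, an episode that reaches the path after reading fewer documents earns a higher return, which is the pressure a policy trained on such a reward would feel.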
