Learning Open Domain Multi-hop Search Using Reinforcement Learning

Paper title

Learning Open Domain Multi-hop Search Using Reinforcement Learning

Authors

Enrique Noriega-Atala, Mihai Surdeanu, Clayton T. Morrison

Abstract

We propose a method to teach an automated agent to learn how to search for multi-hop paths of relations between entities in an open domain. The method learns a policy for directing existing information retrieval and machine reading resources to focus on relevant regions of a corpus. The approach formulates the learning problem as a Markov decision process with a state representation that encodes the dynamics of the search process and a reward structure that minimizes the number of documents that must be processed while still finding multi-hop paths. We implement the method in an actor-critic reinforcement learning algorithm and evaluate it on a dataset of search problems derived from a subset of English Wikipedia. The algorithm finds a family of policies that succeeds in extracting the desired information while processing fewer documents compared to several baseline heuristic algorithms.
