Interactive Imitation Learning in State-Space

Paper Title

Interactive Imitation Learning in State-Space

Authors

Snehal Jauhri, Carlos Celemin, Jens Kober

Abstract

Imitation Learning techniques enable programming the behavior of agents through demonstrations rather than manual engineering. However, they are limited by the quality of the available demonstration data. Interactive Imitation Learning techniques can improve the efficacy of learning because a teacher provides feedback while the agent executes its task. In this work, we propose a novel Interactive Learning technique that uses human feedback in state-space to train and improve agent behavior (as opposed to alternative methods that use feedback in action-space). Our method, titled Teaching Imitative Policies in State-space (TIPS), enables providing guidance to the agent in terms of "changing its state", which is often more intuitive for a human demonstrator. Through continuous improvement via corrective feedback, agents trained by non-expert demonstrators using TIPS outperformed both the demonstrator and conventional Imitation Learning agents.
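
To make the idea of feedback in state-space concrete, the following is a minimal sketch and not the TIPS implementation. It assumes a toy 1-D agent, a known forward model (`dynamics`), a fixed correction size (`step`), and a small discrete set of candidate actions; all of these names and values are hypothetical and do not come from the abstract. The teacher says only how the state should change, and the agent picks the action whose predicted next state best matches that request.

```python
def dynamics(state, action):
    """Toy forward model (assumption): the action shifts a 1-D position."""
    return state + action


def desired_state(state, feedback, step=0.5):
    """Turn a state-space correction (-1, 0, or +1) into a target state.

    `step` is a hypothetical correction magnitude chosen for this example.
    """
    return state + step * feedback


def action_for_target(state, target, candidates):
    """Pick the candidate whose predicted next state is closest to the target."""
    return min(candidates, key=lambda a: abs(dynamics(state, a) - target))


candidates = [-1.0, -0.5, 0.0, 0.5, 1.0]  # hypothetical action set
state = 2.0
feedback = -1  # teacher signal: "decrease the position"

target = desired_state(state, feedback)
action = action_for_target(state, target, candidates)
print(target, action)
```

In an interactive setting, pairs of `state` and the selected `action` could then serve as supervised training data for a policy. That step is also an assumption of this sketch.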
