Paper title

Intrinsically motivated option learning: a comparative study of recent methods

Authors

Djordje Božić, Predrag Tadić, Mladen Nikolić

Abstract

Options represent a framework for reasoning across multiple time scales in reinforcement learning (RL). With the recent active interest in the unsupervised learning paradigm in the RL research community, the option framework was adapted to utilize the concept of empowerment, which corresponds to the amount of influence the agent has on the environment and its ability to perceive this influence, and which can be optimized without any supervision provided by the environment's reward structure. Many recent papers modify this concept in various ways, achieving commendable results. Through these various modifications, however, the initial context of empowerment is often lost. In this work, we offer a comparative study of such papers through the lens of the original empowerment principle.
