Paper Title
Contrastive Variational Reinforcement Learning for Complex Observations
Paper Authors
Abstract
Deep reinforcement learning (DRL) has achieved significant success in various robot tasks, such as manipulation and navigation. However, complex visual observations in natural environments remain a major challenge. This paper presents Contrastive Variational Reinforcement Learning (CVRL), a model-based method that tackles complex visual observations in DRL. CVRL learns a contrastive variational model by maximizing the mutual information between latent states and observations discriminatively, through contrastive learning. It avoids modeling the complex observation space unnecessarily, as the commonly used generative observation model often does, and is significantly more robust. CVRL achieves performance comparable to state-of-the-art model-based DRL methods on standard Mujoco tasks, and it significantly outperforms them on Natural Mujoco tasks and a robot box-pushing task with complex observations, e.g., dynamic shadows. The CVRL code is publicly available at https://github.com/Yusufma03/CVRL.
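The abstract's key idea is estimating mutual information between latent states and observations discriminatively rather than reconstructing pixels. A common discriminative estimator for this is an InfoNCE-style contrastive loss. The sketch below is a minimal illustration of that general technique, not CVRL's actual objective; the function name, cosine-similarity scoring, and temperature value are all assumptions for the example.

```python
import numpy as np

def info_nce_loss(states, observations, temperature=0.1):
    """Illustrative InfoNCE-style contrastive loss (not CVRL's exact objective).

    states:       (B, D) latent state embeddings
    observations: (B, D) observation embeddings; row i is the positive
                  pair for states[i], and all other rows serve as negatives.
    """
    # Cosine-similarity logits between every state/observation pair.
    s = states / np.linalg.norm(states, axis=1, keepdims=True)
    o = observations / np.linalg.norm(observations, axis=1, keepdims=True)
    logits = s @ o.T / temperature                       # shape (B, B)
    # Row-wise log-softmax; the diagonal entries are the positive pairs.
    logits -= logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy that pushes each state toward its own observation.
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
z = rng.normal(size=(4, 8))
# Matched pairs (observation close to its state) should score a lower loss
# than mismatched, random observations.
loss_matched = info_nce_loss(z, z + 0.01 * rng.normal(size=(4, 8)))
loss_random = info_nce_loss(z, rng.normal(size=(4, 8)))
print(loss_matched, loss_random)
```

Minimizing this loss is a lower-bound surrogate for the state-observation mutual information: the model only needs to tell the matching observation apart from negatives, so it never has to generate the complex observation itself.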