Paper Title

Visual Transfer for Reinforcement Learning via Wasserstein Domain Confusion

Authors

Josh Roy, George Konidaris

Abstract

We introduce Wasserstein Adversarial Proximal Policy Optimization (WAPPO), a novel algorithm for visual transfer in Reinforcement Learning that explicitly learns to align the distributions of extracted features between a source and target task. WAPPO approximates and minimizes the Wasserstein-1 distance between the distributions of features from source and target domains via a novel Wasserstein Confusion objective. WAPPO outperforms the prior state-of-the-art in visual transfer and successfully transfers policies across Visual Cartpole and two instantiations of 16 OpenAI Procgen environments.
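To make the quantity in the abstract concrete, the sketch below computes an empirical Wasserstein-1 distance between two one-dimensional samples. It is not WAPPO's objective: the abstract states that WAPPO approximates the distance between extracted feature distributions through its Wasserstein Confusion objective, whereas this example relies on the 1-D special case in which sorting both samples gives the optimal matching. The function name `wasserstein1_1d` and the synthetic "features" are assumptions made for illustration only.

```python
import numpy as np


def wasserstein1_1d(source, target):
    """Empirical Wasserstein-1 distance between two equal-size 1-D samples."""
    source = np.sort(np.asarray(source, dtype=float))
    target = np.sort(np.asarray(target, dtype=float))
    if source.ndim != 1 or source.shape != target.shape:
        raise ValueError("expected two 1-D samples of equal length")
    # In 1-D with equal sample sizes, the optimal transport plan pairs the
    # i-th smallest source value with the i-th smallest target value.
    return float(np.mean(np.abs(source - target)))


# Hypothetical scalar "features": the target domain is the source shifted by 2.
rng = np.random.default_rng(0)
source_features = rng.normal(0.0, 1.0, size=1000)
target_features = source_features + 2.0

# Undoing the shift stands in for a feature extractor that aligns the domains.
aligned_features = target_features - 2.0

print(wasserstein1_1d(source_features, target_features))
print(wasserstein1_1d(source_features, aligned_features))
```

The first distance reflects the domain shift, and the second is close to zero once the two feature distributions coincide, which is the kind of alignment the abstract describes as the training goal.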
