Paper Title
DreamingV2: Reinforcement Learning with Discrete World Models without Reconstruction
Paper Authors
Paper Abstract
The present paper proposes a novel reinforcement learning method with world models, DreamingV2, a collaborative extension of DreamerV2 and Dreaming. DreamerV2 is a cutting-edge model-based reinforcement learning method from pixels that uses a discrete world model to represent latent states with categorical variables. Dreaming is also a form of reinforcement learning from pixels; it attempts to avoid the autoencoding process of general world model training by employing a reconstruction-free contrastive learning objective. The proposed DreamingV2 is a novel approach that adopts both the discrete representation of DreamerV2 and the reconstruction-free objective of Dreaming. Compared to DreamerV2 and other recent reconstruction-free model-based methods, DreamingV2 achieves the best scores on five challenging simulated 3D robot arm tasks. We believe DreamingV2 will be a reliable solution for robot learning, since its discrete representation is well suited to describing discontinuous environments, and its reconstruction-free training manages complex visual observations well.
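To make the "reconstruction-free contrastive learning objective" concrete, below is a minimal NumPy sketch of an InfoNCE-style loss: instead of reconstructing pixels, each latent state is trained to identify its own observation embedding among the other embeddings in the batch. The function name, shapes, and scoring (a plain dot product) are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

def info_nce_loss(latents, embeddings):
    """Contrastive (InfoNCE-style) loss sketch.

    latents, embeddings: arrays of shape (batch, dim), where row i of
    `latents` and row i of `embeddings` form the positive pair.
    """
    # Similarity matrix: entry (i, j) scores latent i against embedding j.
    logits = latents @ embeddings.T
    # Row-wise log-softmax (shifted by the row max for numerical stability).
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Negative log-likelihood of the diagonal (positive) entries.
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
loss_matched = info_nce_loss(z, z)                          # aligned pairs
loss_random = info_nce_loss(z, rng.normal(size=(8, 16)))    # mismatched pairs
print(loss_matched < loss_random)
```

Because no decoder is involved, the world model never has to regenerate pixel-level detail; it only needs latents informative enough to discriminate the matching observation, which is why this style of objective copes well with complex visual observations.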