Paper Title
Bootstrapping a DQN Replay Memory with Synthetic Experiences
Paper Authors
Paper Abstract
An important component of many Deep Reinforcement Learning algorithms is the Experience Replay, which serves as a storage mechanism, or memory, for the experiences the agent has made. These experiences are used for training and help the agent find a stable, optimal trajectory through the problem space. The classic Experience Replay, however, uses only the experiences the agent has actually collected, even though the stored samples hold great potential in the form of extractable knowledge about the problem. We present an algorithm that creates synthetic experiences in a nondeterministic discrete environment to assist the learner. The Interpolated Experience Replay is evaluated on the FrozenLake environment, and we show that it can help the agent learn faster and even better than the classic version.
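To make the idea concrete, below is a minimal sketch of how a replay memory could be bootstrapped with synthetic experiences. The class name InterpolatedReplayBuffer, the reward-averaging interpolation rule, and the synthetic_ratio parameter are illustrative assumptions; the abstract does not specify the authors' exact interpolation scheme.

import random
from collections import defaultdict, deque


class InterpolatedReplayBuffer:
    """Sketch of a replay memory that stores real transitions and derives
    synthetic ones from them. The interpolation rule (running reward mean
    per (state, action) pair) is an assumption for illustration only."""

    def __init__(self, capacity=10_000):
        self.real = deque(maxlen=capacity)        # transitions the agent actually made
        self.synthetic = deque(maxlen=capacity)   # interpolated transitions
        self.reward_history = defaultdict(list)   # (state, action) -> rewards seen so far

    def store(self, state, action, reward, next_state, done):
        self.real.append((state, action, reward, next_state, done))
        # In a nondeterministic environment the same (state, action) pair can
        # yield different rewards; average them to build a synthetic experience.
        history = self.reward_history[(state, action)]
        history.append(reward)
        avg_reward = sum(history) / len(history)
        self.synthetic.append((state, action, avg_reward, next_state, done))

    def sample(self, batch_size, synthetic_ratio=0.5):
        # Mix real and synthetic experiences into one training batch.
        n_syn = min(int(batch_size * synthetic_ratio), len(self.synthetic))
        n_real = min(batch_size - n_syn, len(self.real))
        return random.sample(self.real, n_real) + random.sample(self.synthetic, n_syn)

In the slippery FrozenLake setting, the reward and successor state of a given (state, action) pair are stochastic, so averaging over repeated visits is one simple way to extract the knowledge latent in the stored samples; the paper's actual method may differ.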