Paper Title
ReLMoGen: Leveraging Motion Generation in Reinforcement Learning for Mobile Manipulation
Paper Authors
Paper Abstract
Many Reinforcement Learning (RL) approaches use joint control signals (positions, velocities, torques) as action space for continuous control tasks. We propose to lift the action space to a higher level in the form of subgoals for a motion generator (a combination of motion planner and trajectory executor). We argue that, by lifting the action space and by leveraging sampling-based motion planners, we can efficiently use RL to solve complex, long-horizon tasks that could not be solved with existing RL methods in the original action space. We propose ReLMoGen -- a framework that combines a learned policy to predict subgoals and a motion generator to plan and execute the motion needed to reach these subgoals. To validate our method, we apply ReLMoGen to two types of tasks: 1) Interactive Navigation tasks, navigation problems where interactions with the environment are required to reach the destination, and 2) Mobile Manipulation tasks, manipulation tasks that require moving the robot base. These problems are challenging because they are usually long-horizon, hard to explore during training, and comprise alternating phases of navigation and interaction. Our method is benchmarked on a diverse set of seven robotics tasks in photo-realistic simulation environments. In all settings, ReLMoGen outperforms state-of-the-art Reinforcement Learning and Hierarchical Reinforcement Learning baselines. ReLMoGen also shows outstanding transferability between different motion generators at test time, indicating a great potential to transfer to real robots.
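The core idea of the abstract, lifting the action space from joint commands to subgoals that a motion generator (planner plus trajectory executor) turns into low-level motion, can be sketched as a thin environment wrapper. The sketch below is illustrative only: the class and method names (`MotionGenerator`, `SubgoalEnv`, `low_level_step`, and the straight-line "planner") are assumptions for demonstration, not the ReLMoGen implementation, which uses sampling-based motion planners.

```python
class MotionGenerator:
    """Hypothetical motion generator: a planner plus a trajectory executor.
    A real system would run a sampling-based planner (e.g. an RRT variant);
    here we fake a plan as evenly spaced waypoints toward the subgoal."""

    def plan(self, state, subgoal, steps=5):
        # Interpolate from the current state to the subgoal.
        return [tuple(s + (g - s) * (i + 1) / steps
                      for s, g in zip(state, subgoal))
                for i in range(steps)]

    def execute(self, env, waypoints):
        # Run the low-level controller through the planned waypoints,
        # accumulating reward so one subgoal "step" spans many env steps.
        total_reward, done, obs = 0.0, False, env.state
        for wp in waypoints:
            obs, reward, done = env.low_level_step(wp)
            total_reward += reward
            if done:
                break
        return obs, total_reward, done


class SubgoalEnv:
    """Wraps a low-level environment so an RL policy acts in subgoal space."""

    def __init__(self, env, motion_generator):
        self.env = env
        self.mg = motion_generator

    def step(self, subgoal):
        waypoints = self.mg.plan(self.env.state, subgoal)
        return self.mg.execute(self.env, waypoints)


class ToyEnv:
    """Trivial 2-D point environment used only to exercise the wrapper."""

    def __init__(self):
        self.state = (0.0, 0.0)

    def low_level_step(self, target):
        # Teleport-style executor step: move directly to the waypoint.
        self.state = target
        done = self.state == (1.0, 1.0)
        return self.state, (1.0 if done else 0.0), done
```

With this wrapper, one policy action ("reach subgoal (1, 1)") expands into several low-level steps, which is what makes long-horizon tasks easier to explore: the RL agent reasons over a handful of subgoals instead of hundreds of joint commands.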