Paper Title
Control Transformer: Robot Navigation in Unknown Environments through PRM-Guided Return-Conditioned Sequence Modeling
Authors
Abstract
Learning long-horizon tasks such as navigation has presented difficult challenges for successfully applying reinforcement learning to robotics. By contrast, in known environments, sampling-based planning can robustly find collision-free paths without any learning. In this work, we propose Control Transformer, which models return-conditioned sequences from low-level policies guided by a sampling-based Probabilistic Roadmap (PRM) planner. We demonstrate that our framework can solve long-horizon navigation tasks using only local information. We evaluate our approach on partially observed maze navigation with MuJoCo robots, including Ant, Point, and Humanoid. We show that Control Transformer can successfully navigate through mazes and transfer to unknown environments. Additionally, we apply our method to a differential-drive robot (TurtleBot3) and show zero-shot sim2real transfer under noisy observations.
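As a rough illustration of what "return-conditioned" can mean in sequence modeling, the sketch below computes a return-to-go for each step of a trajectory, i.e. the sum of rewards from that step onward. This is the conditioning signal commonly used by return-conditioned sequence models; the abstract does not state the exact definition used by Control Transformer, so the function name `returns_to_go`, the discount parameter `gamma`, and the sample rewards are assumptions made for illustration only.

```python
from typing import List


def returns_to_go(rewards: List[float], gamma: float = 1.0) -> List[float]:
    """Hypothetical helper: return-to-go at step t is the (optionally
    discounted) sum of rewards from step t to the end of the trajectory."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the sum already computed for t + 1.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg


if __name__ == "__main__":
    # Made-up rewards for a five-step trajectory (not data from the paper).
    rewards = [0.0, 0.0, 1.0, 0.0, 2.0]
    print(returns_to_go(rewards))  # [3.0, 3.0, 3.0, 2.0, 2.0]
```

In a return-conditioned model, each of these values would typically be paired with the observation and action at the same step, so that at test time a desired return can be supplied as a prompt for the behavior to generate.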