Deep Whole-Body Control: Learning a Unified Policy for Manipulation and Locomotion

Paper Title

Deep Whole-Body Control: Learning a Unified Policy for Manipulation and Locomotion

Authors

Zipeng Fu, Xuxin Cheng, Deepak Pathak

Abstract

An attached arm can significantly increase the applicability of legged robots to several mobile manipulation tasks that are not possible for their wheeled or tracked counterparts. The standard hierarchical control pipeline for such legged manipulators decouples the controller into separate manipulation and locomotion modules. However, this is ineffective: it requires immense engineering to support coordination between the arm and legs, and errors can propagate across modules, causing non-smooth, unnatural motions. It is also biologically implausible, given evidence for strong motor synergies across limbs. In this work, we propose to learn a unified policy for whole-body control of a legged manipulator using reinforcement learning. We propose Regularized Online Adaptation to bridge the Sim2Real gap for high-DoF control, and Advantage Mixing, which exploits the causal dependency in the action space to overcome local minima when training the whole-body system. We also present a simple design for a low-cost legged manipulator, and find that our unified policy can demonstrate dynamic and agile behaviors across several task setups. Videos are at https://maniploco.github.io
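
The abstract names Advantage Mixing but does not give its formula. The sketch below shows one plausible reading, under stated assumptions: the reward is split into a manipulation part and a locomotion part, arm actions are credited mainly with the manipulation advantage, leg actions mainly with the locomotion advantage, and a coefficient `beta` ramps from 0 to 1 so that both advantages eventually influence both action groups. The function names, the linear schedule, and the `warmup_steps` parameter are hypothetical and chosen for illustration only; this is not the authors' implementation.

```python
def mix_advantages(adv_manip, adv_loco, beta):
    """Return (arm_advantage, leg_advantage) for one sample.

    Assumption: arm actions weight the manipulation advantage fully and the
    locomotion advantage by beta; leg actions do the reverse.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    arm_adv = adv_manip + beta * adv_loco
    leg_adv = beta * adv_manip + adv_loco
    return arm_adv, leg_adv


def beta_schedule(step, warmup_steps):
    """Hypothetical linear ramp of the mixing coefficient from 0 to 1."""
    if warmup_steps <= 0:
        return 1.0
    return min(1.0, max(0.0, step / warmup_steps))


def surrogate_loss(logp_arm, logp_leg, adv_manip, adv_loco, beta):
    """Policy-gradient style surrogate loss for one sample.

    logp_arm and logp_leg are the log-probabilities of the arm and leg
    action components under the current policy.
    """
    arm_adv, leg_adv = mix_advantages(adv_manip, adv_loco, beta)
    return -(logp_arm * arm_adv + logp_leg * leg_adv)
```

With `beta = 0` each action group learns only from its own reward, which is the decoupled case; with `beta = 1` both groups see the sum of the two advantages, which corresponds to a fully unified whole-body objective.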
