Learning Torque Control for Quadrupedal Locomotion

Paper Title

Learning Torque Control for Quadrupedal Locomotion

Authors

Shuxiao Chen, Bike Zhang, Mark W. Mueller, Akshara Rai, Koushil Sreenath

Abstract

Reinforcement learning (RL) has become a promising approach to developing controllers for quadrupedal robots. Conventionally, an RL design for locomotion follows a position-based paradigm, wherein an RL policy outputs target joint positions at a low frequency that are then tracked by a high-frequency proportional-derivative (PD) controller to produce joint torques. In contrast, for the model-based control of quadrupedal locomotion, there has been a paradigm shift from position-based control to torque-based control. In light of the recent advances in model-based control, we explore an alternative to the position-based RL paradigm, by introducing a torque-based RL framework, where an RL policy directly predicts joint torques at a high frequency, thus circumventing the use of a PD controller. The proposed learning torque control framework is validated with extensive experiments, in which a quadruped is capable of traversing various terrain and resisting external disturbances while following user-specified commands. Furthermore, compared to learning position control, learning torque control demonstrates the potential to achieve a higher reward and is more robust to significant external disturbances. To our knowledge, this is the first sim-to-real attempt for end-to-end learning torque control of quadrupedal locomotion.
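The difference between the two paradigms described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the PD gains `kp` and `kd`, the zero target velocity, and the torque limit are all hypothetical values chosen for illustration, and the policy is a stand-in callable rather than a trained network.

```python
from typing import Callable, List, Sequence

Policy = Callable[[Sequence[float]], List[float]]


def pd_torque(q_target: Sequence[float], q: Sequence[float],
              qd: Sequence[float], kp: float, kd: float) -> List[float]:
    """PD law with zero target velocity: tau = kp * (q_target - q) - kd * qd."""
    return [kp * (qt - qi) - kd * qdi for qt, qi, qdi in zip(q_target, q, qd)]


def position_based_torque(policy: Policy, obs: Sequence[float],
                          q: Sequence[float], qd: Sequence[float],
                          kp: float = 20.0, kd: float = 0.5) -> List[float]:
    """Position-based paradigm: the policy outputs target joint positions.

    On a robot the policy would run at a low frequency while this PD law
    runs at a high frequency. The gains here are assumed, not from the paper.
    """
    q_target = policy(obs)
    return pd_torque(q_target, q, qd, kp, kd)


def torque_based_torque(policy: Policy, obs: Sequence[float],
                        torque_limit: float = 30.0) -> List[float]:
    """Torque-based paradigm: the policy outputs joint torques directly.

    No PD controller is involved. Clipping to an assumed actuator limit is
    added here as a safety measure and is not described in the abstract.
    """
    return [max(-torque_limit, min(torque_limit, t)) for t in policy(obs)]
```

In the position-based case the policy's output passes through the PD law before reaching the motors, whereas in the torque-based case the policy's output is the motor command itself, which is why the latter has to run at a high frequency.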
