Paper Title
Learning Torque Control for Quadrupedal Locomotion
Paper Authors
Paper Abstract
Reinforcement learning (RL) has become a promising approach to developing controllers for quadrupedal robots. Conventionally, an RL design for locomotion follows a position-based paradigm, wherein an RL policy outputs target joint positions at a low frequency that are then tracked by a high-frequency proportional-derivative (PD) controller to produce joint torques. In contrast, for the model-based control of quadrupedal locomotion, there has been a paradigm shift from position-based control to torque-based control. In light of the recent advances in model-based control, we explore an alternative to the position-based RL paradigm, by introducing a torque-based RL framework, where an RL policy directly predicts joint torques at a high frequency, thus circumventing the use of a PD controller. The proposed learning torque control framework is validated with extensive experiments, in which a quadruped is capable of traversing various terrain and resisting external disturbances while following user-specified commands. Furthermore, compared to learning position control, learning torque control demonstrates the potential to achieve a higher reward and is more robust to significant external disturbances. To our knowledge, this is the first sim-to-real attempt for end-to-end learning torque control of quadrupedal locomotion.
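The contrast between the two paradigms described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the joint count, PD gains, torque limit, and all function names (`pd_torque`, `position_based_torques`, `torque_based_torque`) are hypothetical, and `policy` stands in for any trained RL policy.

```python
import numpy as np

NUM_JOINTS = 12  # assumption: 3 actuated joints per leg on a quadruped


def pd_torque(q_target, q, qd, kp=20.0, kd=0.5):
    """PD law with zero target velocity: tau = kp * (q_target - q) - kd * qd.

    The gains kp and kd are illustrative values, not taken from the paper.
    """
    return kp * (q_target - q) - kd * qd


def position_based_torques(policy, obs, joint_states, kp=20.0, kd=0.5):
    """Position-based paradigm: one low-frequency policy query yields target
    joint positions, which a PD controller tracks over several
    high-frequency ticks.

    joint_states is an iterable of (q, qd) pairs, one pair per PD tick.
    """
    q_target = policy(obs)
    return [pd_torque(q_target, q, qd, kp, kd) for q, qd in joint_states]


def torque_based_torque(policy, obs, tau_limit=30.0):
    """Torque-based paradigm: the policy is queried at high frequency and its
    output is used directly as joint torque, with no PD controller.

    The clipping to tau_limit is an assumed safety bound.
    """
    return np.clip(policy(obs), -tau_limit, tau_limit)
```

In the position-based sketch the policy only shapes the torque indirectly through the fixed PD gains, whereas in the torque-based sketch the policy output is the actuation command itself, which is the design choice the abstract highlights.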