Designing a Robust Low-Level Agnostic Controller for a Quadrotor with Actor-Critic Reinforcement Learning

Paper title

Designing a Robust Low-Level Agnostic Controller for a Quadrotor with Actor-Critic Reinforcement Learning

Authors

Eduardo, Guilherme Siqueira, Caarls, Wouter

Abstract

Purpose: Real-life applications of quadrotors introduce a number of disturbances and time-varying properties that pose a challenge to flight controllers. We observed that, when a quadrotor is tasked with picking up and dropping a payload, traditional PID and RL-based controllers found in the literature struggle to maintain flight after the vehicle's dynamics change due to interaction with this external object. Methods: In this work, we introduce domain randomization during the training phase of a low-level waypoint guidance controller based on Soft Actor-Critic. The resulting controller is evaluated on the proposed payload pick-up and drop task with added disturbances that emulate real-life operation of the vehicle. Results & Conclusion: We show that, by introducing a certain degree of uncertainty in quadrotor dynamics during training, we can obtain a controller that is capable of performing the proposed task over a wider range of quadrotor parameters. Additionally, the RL-based controller outperforms a traditional positional PID controller with optimized gains in this task, while remaining agnostic to different simulation parameters.
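
The domain randomization mentioned in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the parameter set (`mass_kg`, `arm_length_m`, `thrust_coeff`), the nominal values, and the 30% spread are all hypothetical, chosen only to show the idea of resampling the vehicle's dynamics before each training episode.

```python
import random
from dataclasses import dataclass


@dataclass
class QuadrotorParams:
    # Hypothetical parameter set; the abstract does not list the randomized quantities.
    mass_kg: float
    arm_length_m: float
    thrust_coeff: float


# Hypothetical nominal values, chosen only for illustration.
NOMINAL = QuadrotorParams(mass_kg=1.0, arm_length_m=0.2, thrust_coeff=1.0e-5)


def sample_params(rng: random.Random, spread: float = 0.3) -> QuadrotorParams:
    """Draw each parameter uniformly within +/- spread of its nominal value."""

    def jitter(value: float) -> float:
        return value * rng.uniform(1.0 - spread, 1.0 + spread)

    return QuadrotorParams(
        mass_kg=jitter(NOMINAL.mass_kg),
        arm_length_m=jitter(NOMINAL.arm_length_m),
        thrust_coeff=jitter(NOMINAL.thrust_coeff),
    )


rng = random.Random(0)  # fixed seed so the sketch is reproducible
for episode in range(3):
    params = sample_params(rng)
    # A training loop would rebuild the simulator with `params` here,
    # then collect the episode's transitions for the actor-critic update.
    print(episode, params)
```

Because the policy never sees the same vehicle twice, it cannot rely on one fixed mass or thrust model, which is the intuition behind the abstract's claim that the controller tolerates a wider range of quadrotor parameters.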
