Reinforcement Learning for Shared Autonomy Drone Landings

Paper Title

Reinforcement Learning for Shared Autonomy Drone Landings

Authors

Kal Backman, Dana Kulić, Hoam Chung

Abstract

Novice pilots find it difficult to operate and land unmanned aerial vehicles (UAVs) because of complex UAV dynamics, challenges in depth perception, lack of expertise with the control interface, and additional disturbances from the ground effect. We therefore propose a shared autonomy approach to assist pilots in safely landing a UAV under conditions where depth perception is difficult and safe landing zones are limited. Our approach comprises two modules: a perception module that uses two RGB-D cameras to encode information into a compressed latent representation, and a policy module, trained with the reinforcement learning algorithm TD3, that discerns the pilot's intent and provides control inputs that augment the user's input to land the UAV safely. The policy module is trained in simulation using a population of simulated users. Simulated users are sampled from a parametric model with four parameters, which model a pilot's tendency to conform to the assistant, proficiency, aggressiveness, and speed. We conducted a user study (n = 28) in which human participants were tasked with landing a physical UAV on one of several platforms under challenging viewing conditions. The assistant, trained with only simulated user data, improved the task success rate from 51.4% to 98.2% despite having no a priori knowledge of the human participants' goal or the structure of the environment. With the proposed assistant, participants performed with greater proficiency than the most experienced unassisted participants, regardless of prior piloting experience.
