Driving in Real Life with Inverse Reinforcement Learning

Paper Title

Driving in Real Life with Inverse Reinforcement Learning

Authors

Tung Phan-Minh, Forbes Howington, Ting-Sheng Chu, Sang Uk Lee, Momchil S. Tomov, Nanxiang Li, Caglayan Dicle, Samuel Findler, Francisco Suarez-Ruiz, Robert Beaudoin, Bo Yang, Sammy Omari, Eric M. Wolff

Abstract

In this paper, we introduce the first learning-based planner to drive a car in dense, urban traffic using Inverse Reinforcement Learning (IRL). Our planner, DriveIRL, generates a diverse set of trajectory proposals, filters these trajectories with a lightweight and interpretable safety filter, and then uses a learned model to score each remaining trajectory. The best trajectory is then tracked by the low-level controller of our self-driving vehicle. We train our trajectory scoring model on a 500+ hour real-world dataset of expert driving demonstrations in Las Vegas within the maximum entropy IRL framework. DriveIRL's benefits include: a simple design due to only learning the trajectory scoring function, relatively interpretable features, and strong real-world performance. We validated DriveIRL on the Las Vegas Strip and demonstrated fully autonomous driving in heavy traffic, including scenarios involving cut-ins, abrupt braking by the lead vehicle, and hotel pickup/dropoff zones. Our dataset will be made public to help further research in this area.
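
The propose, filter, score, and select loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `Trajectory` fields, the two features, the gap-based `safety_filter`, its 2.0 m threshold, and the linear `score` are all hypothetical stand-ins chosen for illustration; the abstract does not specify the actual features, filter rules, or model architecture. The `maxent_probabilities` function shows the standard maximum entropy IRL assumption that a trajectory's probability is proportional to the exponential of its score, which is the quantity training would push up for the expert's trajectory.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Trajectory:
    name: str
    features: Sequence[float]  # hypothetical features, e.g. [progress, discomfort]
    min_gap_m: float           # hypothetical: smallest predicted gap to other agents


def safety_filter(trajs: List[Trajectory], min_gap_threshold_m: float = 2.0) -> List[Trajectory]:
    """Keep only proposals whose minimum gap meets an (assumed) threshold."""
    return [t for t in trajs if t.min_gap_m >= min_gap_threshold_m]


def score(traj: Trajectory, weights: Sequence[float]) -> float:
    """Assumed linear scoring function: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, traj.features))


def maxent_probabilities(trajs: List[Trajectory], weights: Sequence[float]) -> List[float]:
    """Maximum entropy model: P(trajectory) is proportional to exp(score)."""
    scores = [score(t, weights) for t in trajs]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def plan(trajs: List[Trajectory], weights: Sequence[float]) -> Optional[Trajectory]:
    """Filter the proposals, then return the highest-scoring survivor (or None)."""
    safe = safety_filter(trajs)
    if not safe:
        return None
    return max(safe, key=lambda t: score(t, weights))


# Toy proposals with made-up feature values.
proposals = [
    Trajectory("keep_lane", [1.0, 0.2], min_gap_m=5.0),
    Trajectory("aggressive_pass", [2.0, 0.9], min_gap_m=0.5),
    Trajectory("slow_down", [0.4, 0.1], min_gap_m=8.0),
]
weights = [1.0, -1.0]  # reward progress, penalize discomfort (illustrative only)
best = plan(proposals, weights)
```

In this toy setup, `aggressive_pass` has the highest raw score but is rejected by the filter before scoring matters, so `best` is the `keep_lane` proposal. Keeping the filter separate from the learned scorer mirrors the design the abstract describes, where only the scoring function is learned.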
