Paper Title
Receding Horizon Planning with Rule Hierarchies for Autonomous Vehicles
Paper Authors
Paper Abstract
Autonomous vehicles must often contend with conflicting planning requirements, e.g., safety and comfort could be at odds with each other if avoiding a collision calls for slamming the brakes. To resolve such conflicts, assigning importance rankings to rules (i.e., imposing a rule hierarchy) has been proposed, which, in turn, induces rankings on trajectories based on the importance of the rules they satisfy. On one hand, imposing rule hierarchies can enhance interpretability, but introduces combinatorial complexity to planning; on the other hand, differentiable reward structures can be leveraged by modern gradient-based optimization tools, but are less interpretable and unintuitive to tune. In this paper, we present an approach to equivalently express rule hierarchies as differentiable reward structures amenable to modern gradient-based optimizers, thereby achieving the best of both worlds. We achieve this by formulating rank-preserving reward functions that are monotonic in the rank of the trajectories induced by the rule hierarchy; i.e., higher-ranked trajectories receive higher reward. Equipped with a rule hierarchy and its corresponding rank-preserving reward function, we develop a two-stage planner that can efficiently resolve conflicting planning requirements. We demonstrate that our approach can generate motion plans at ~7-10 Hz for various challenging road navigation and intersection negotiation scenarios.
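To make the rank-preserving idea concrete, here is a minimal sketch (not the paper's exact formulation) of one standard way to build such a reward: with Boolean rule satisfactions ordered from most to least important, weighting rule i by 2**(n-1-i) yields a scalar reward that is monotone in the lexicographic rank the hierarchy induces. The rule names in the example are hypothetical.

```python
def rank_preserving_reward(rule_satisfactions):
    """Scalar reward monotone in the lexicographic rank of a rule hierarchy.

    rule_satisfactions: list of 0/1 values, most important rule first.
    Weighting rule i by 2**(n-1-i) ensures that satisfying a more important
    rule outscores every trajectory that violates it, regardless of how the
    lower-priority rules come out.
    """
    n = len(rule_satisfactions)
    return sum(s * 2 ** (n - 1 - i) for i, s in enumerate(rule_satisfactions))

# Hypothetical hierarchy: [no_collision, stay_in_lane, comfort]
a = rank_preserving_reward([1, 0, 1])  # collision-free but leaves lane
b = rank_preserving_reward([0, 1, 1])  # in lane and comfortable, but collides
assert a > b  # the top rule dominates all lower-priority rules combined
```

Because the weights are powers of two, no combination of lower-priority satisfactions can overturn a higher-priority violation, which is exactly the monotonicity-in-rank property the abstract describes; the paper's contribution is expressing this structure in a differentiable form suitable for gradient-based optimizers.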