Dynamically Feasible Deep Reinforcement Learning Policy for Robot Navigation in Dense Mobile Crowds

Paper Title

Dynamically Feasible Deep Reinforcement Learning Policy for Robot Navigation in Dense Mobile Crowds

Authors

Utsav Patel, Nithish Kumar, Adarsh Jagan Sathyamoorthy, Dinesh Manocha

Abstract

We present a novel Deep Reinforcement Learning (DRL) based policy to compute dynamically feasible and spatially aware velocities for a robot navigating among mobile obstacles. Our approach combines the benefits of the Dynamic Window Approach (DWA), which satisfies the robot's dynamics constraints, with state-of-the-art DRL-based navigation methods that handle moving obstacles and pedestrians well. Our formulation achieves these goals by embedding the motions of environmental obstacles in a novel low-dimensional observation space. It also uses a novel reward function that positively reinforces velocities that move the robot away from an obstacle's heading direction, leading to a significantly lower number of collisions. We evaluate our method in realistic 3-D simulated environments and on a real differential drive robot in challenging dense indoor scenarios with several walking pedestrians. We compare our method with state-of-the-art collision avoidance methods and observe significant improvements in success rate (up to 33% increase), number of dynamics constraint violations (up to 61% decrease), and smoothness. We also conduct ablation studies to highlight the advantages of our observation space formulation and reward structure.
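
For readers unfamiliar with the Dynamic Window Approach mentioned in the abstract, the following is a minimal sketch of the classic dynamic window for a differential drive robot: the set of (linear, angular) velocities reachable within one control step given acceleration limits. It is not the authors' implementation. The function names, parameter names, and the grid sampling are all assumptions made for illustration, and the abstract does not say how the window is connected to the DRL policy.

```python
def dynamic_window(v, w, dt, v_max, w_max, a_max, alpha_max):
    """Velocity bounds reachable from (v, w) within one step of length dt.

    v, w        : current linear and angular velocity
    v_max, w_max: absolute velocity limits (hypothetical robot parameters)
    a_max       : linear acceleration limit
    alpha_max   : angular acceleration limit
    """
    v_lo = max(0.0, v - a_max * dt)          # no reversing in this sketch
    v_hi = min(v_max, v + a_max * dt)
    w_lo = max(-w_max, w - alpha_max * dt)
    w_hi = min(w_max, w + alpha_max * dt)
    return v_lo, v_hi, w_lo, w_hi


def linspace(lo, hi, n):
    """Evenly spaced values from lo to hi inclusive (n >= 2)."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def sample_feasible_velocities(window, n_v=5, n_w=5):
    """Discretize the window into a grid of candidate (v, w) pairs."""
    v_lo, v_hi, w_lo, w_hi = window
    return [(v, w) for v in linspace(v_lo, v_hi, n_v)
                   for w in linspace(w_lo, w_hi, n_w)]


# Example: robot moving at 0.5 m/s, not turning, 0.1 s control step.
window = dynamic_window(v=0.5, w=0.0, dt=0.1,
                        v_max=1.0, w_max=1.0, a_max=2.0, alpha_max=3.0)
candidates = sample_feasible_velocities(window)
```

Every candidate in the grid respects the acceleration limits by construction, which is the sense in which velocities chosen from such a window are dynamically feasible.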
