Multi-Modal Fusion for Sensorimotor Coordination in Steering Angle Prediction

Paper Title

Multi-Modal Fusion for Sensorimotor Coordination in Steering Angle Prediction

Authors

Farzeen Munir, Shoaib Azam, Byung-Geun Lee, Moongu Jeon

Abstract

Imitation learning, employed to learn sensorimotor coordination for steering angle prediction in an end-to-end fashion, requires expert demonstrations. These expert demonstrations are paired with environmental perception and vehicle control data. The conventional frame-based RGB camera is the most common exteroceptive sensor modality used to acquire environmental perception data, and it has produced promising results when used as a single modality in learning end-to-end lateral control. However, the conventional frame-based RGB camera has limited operability under varying illumination and is affected by motion blur. The event camera provides information complementary to that of the frame-based RGB camera. This work explores the fusion of frame-based RGB and event data for learning end-to-end lateral control by predicting the steering angle. In addition, it examines how fusing the representation from event data with frame-based RGB data helps to predict lateral control robustly for the autonomous vehicle. To this end, we propose DRFuser, a novel convolutional encoder-decoder architecture for learning end-to-end lateral control. The encoder module is branched between the frame-based RGB data and the event data, along with self-attention layers. Moreover, this study contributes our own collected dataset, comprising event, frame-based RGB, and vehicle control data. The efficacy of the proposed method is experimentally evaluated on our collected dataset, the Davis Driving dataset (DDD), and the Carla Eventscape dataset. The experimental results show that the proposed method, DRFuser, outperforms the state of the art in terms of root-mean-square error (RMSE) and mean absolute error (MAE), which are used as evaluation metrics.
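
The two evaluation metrics named in the abstract, RMSE and MAE, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names `rmse` and `mae` and the sample steering angles are hypothetical, chosen only to show how the two errors are computed between predicted and expert steering angles.

```python
import math


def _check(predicted, expert):
    # Both sequences must be non-empty and aligned sample by sample.
    if len(predicted) != len(expert) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")


def rmse(predicted, expert):
    """Root-mean-square error between predicted and expert steering angles."""
    _check(predicted, expert)
    squared = [(p - e) ** 2 for p, e in zip(predicted, expert)]
    return math.sqrt(sum(squared) / len(squared))


def mae(predicted, expert):
    """Mean absolute error between predicted and expert steering angles."""
    _check(predicted, expert)
    absolute = [abs(p - e) for p, e in zip(predicted, expert)]
    return sum(absolute) / len(absolute)


if __name__ == "__main__":
    # Illustrative values only (not taken from the paper or its datasets).
    expert_angles = [0.10, -0.05, 0.30, 0.00]
    predicted_angles = [0.12, -0.02, 0.25, 0.01]
    print("RMSE:", rmse(predicted_angles, expert_angles))
    print("MAE:", mae(predicted_angles, expert_angles))
```

Because RMSE squares each error before averaging, it penalizes occasional large steering mistakes more heavily than MAE, which is one reason the two are often reported together.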
