Paper Title
Robust Environment Perception for Automated Driving: A Unified Learning Pipeline for Visual-Infrared Object Detection
Paper Authors
Paper Abstract
The RGB complementary metal-oxide-semiconductor (CMOS) sensor works within the visible light spectrum. Therefore, it is very sensitive to environmental light conditions. In contrast, a long-wave infrared (LWIR) sensor, operating in the 8-14 micrometer spectral band, functions independently of visible light. In this paper, we exploit both visual and thermal perception units for robust object detection. After careful synchronization and (cross-)labeling of the FLIR [1] dataset, this multi-modal perception data is passed through a convolutional neural network (CNN) to detect three critical objects on the road, namely pedestrians, bicycles, and cars. After evaluating the RGB and infrared (the terms thermal and infrared are often used interchangeably) sensors separately, various network structures are compared to fuse the data effectively at the feature level. Our RGB-thermal (RGBT) fusion network, which takes advantage of a novel entropy-block attention module (EBAM), outperforms the state-of-the-art network [2] by 10%, reaching 82.9% mAP.
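The abstract does not spell out how the entropy-block attention module weights the two modalities. As a rough illustration of the general idea of entropy-guided feature-level fusion (not the paper's actual EBAM), the sketch below computes a Shannon entropy score per channel of each modality's feature map and blends the RGB and thermal channels with entropy-derived weights. All function names, the histogram-based entropy estimate, and the per-channel softmax weighting are assumptions for illustration only.

```python
import numpy as np

def channel_entropy(features, bins=16):
    """Estimate Shannon entropy of each channel in a (C, H, W) feature map
    from a histogram of its activations. This estimator is an assumption,
    not the paper's definition."""
    entropies = []
    for channel in features:
        hist, _ = np.histogram(channel, bins=bins)
        p = hist / hist.sum()
        p = p[p > 0]  # drop empty bins so log is defined
        entropies.append(-(p * np.log(p)).sum())
    return np.array(entropies)

def entropy_weighted_fusion(rgb_feat, thermal_feat):
    """Fuse two aligned (C, H, W) feature maps at the feature level.
    Each channel pair is blended with a weight derived from a softmax
    over the two channels' entropies (higher entropy -> larger weight)."""
    e_rgb = channel_entropy(rgb_feat)
    e_thermal = channel_entropy(thermal_feat)
    # Per-channel two-way softmax between the modalities.
    w_rgb = np.exp(e_rgb) / (np.exp(e_rgb) + np.exp(e_thermal))
    w_rgb = w_rgb[:, None, None]  # broadcast over spatial dims
    return w_rgb * rgb_feat + (1.0 - w_rgb) * thermal_feat

# Toy usage: fuse random 8-channel, 16x16 feature maps.
rng = np.random.default_rng(0)
rgb = rng.random((8, 16, 16)).astype(np.float32)
thermal = rng.random((8, 16, 16)).astype(np.float32)
fused = entropy_weighted_fusion(rgb, thermal)
```

The fused map keeps the input shape, so it can be passed on to a shared detection head; in the paper's pipeline the fusion happens inside the CNN rather than on raw arrays as shown here.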