Paper Title
Voxel Field Fusion for 3D Object Detection
Paper Authors
Paper Abstract
In this work, we present a conceptually simple yet effective framework for cross-modality 3D object detection, named Voxel Field Fusion. The proposed approach aims to maintain cross-modality consistency by representing and fusing augmented image features as rays in the voxel field. To this end, a learnable sampler is first designed to select vital features from the image plane, which are projected onto the voxel grid in a point-to-ray manner; this preserves consistency in the feature representation along with spatial context. In addition, ray-wise fusion is conducted to fuse features with supplemental context in the constructed voxel field. We further develop a mixed augmentor to align feature-variant transformations, which bridges the modality gap in data augmentation. The proposed framework is demonstrated to achieve consistent gains on various benchmarks and outperforms previous fusion-based methods on the KITTI and nuScenes datasets. Code is available at https://github.com/dvlab-research/VFF.
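To make the point-to-ray projection concrete, below is a minimal PyTorch sketch of the general idea: image features sampled at selected pixels are cast along their camera rays and scattered into a LiDAR voxel volume, which is then fused with the LiDAR features. This is an illustration under simplifying assumptions, not the authors' implementation: the function name `image_to_voxel_rays`, the uniform depth hypotheses (standing in for the paper's learnable sampler), and the additive fusion (standing in for ray-wise fusion) are all hypothetical choices made here for brevity.

```python
# Illustrative sketch (not the authors' implementation): project sampled image
# features along camera rays into a voxel grid and fuse them with LiDAR voxel
# features. Names, shapes, and the fusion rule are simplifying assumptions.
import torch


def image_to_voxel_rays(img_feats, pixel_uv, cam_intrinsics, cam_to_lidar,
                        voxel_size, grid_range, grid_shape, num_depth_bins=64):
    """Cast each sampled image feature along its viewing ray and scatter it
    into a voxel feature volume.

    img_feats:      (N, C)  features sampled at selected pixels
    pixel_uv:       (N, 2)  pixel coordinates of those samples
    cam_intrinsics: (3, 3)  camera intrinsic matrix K
    cam_to_lidar:   (4, 4)  extrinsic transform from camera to LiDAR frame
    grid_range:     (xmin, ymin, zmin) of the voxel grid in LiDAR coordinates
    grid_shape:     (X, Y, Z) number of voxels per axis
    """
    N, C = img_feats.shape
    X, Y, Z = grid_shape
    device = img_feats.device

    # Hypothetical depth hypotheses: place each ray's feature at uniform depths
    # (the paper instead learns where features should lie along the ray).
    depths = torch.linspace(1.0, 50.0, num_depth_bins, device=device)  # meters

    # Back-project pixels to camera-frame points at every depth hypothesis.
    uv1 = torch.cat([pixel_uv, torch.ones(N, 1, device=device)], dim=1)   # (N, 3)
    cam_dirs = uv1 @ torch.inverse(cam_intrinsics).T                      # (N, 3)
    cam_pts = cam_dirs[:, None, :] * depths[None, :, None]                # (N, D, 3)

    # Transform ray points into the LiDAR (voxel grid) frame.
    ones = torch.ones(N, num_depth_bins, 1, device=device)
    lidar_pts = (torch.cat([cam_pts, ones], dim=-1) @ cam_to_lidar.T)[..., :3]

    # Convert metric coordinates to voxel indices and keep in-range points.
    origin = torch.tensor(grid_range, device=device)
    vox_idx = ((lidar_pts - origin) / voxel_size).long()                  # (N, D, 3)
    bounds = torch.tensor([X, Y, Z], device=device)
    valid = ((vox_idx >= 0) & (vox_idx < bounds)).all(-1)                 # (N, D)

    # Scatter-add each feature into every voxel its ray passes through.
    volume = torch.zeros(C, X, Y, Z, device=device)
    idx = vox_idx[valid]                                                  # (M, 3)
    feats = img_feats[:, None, :].expand(N, num_depth_bins, C)[valid]     # (M, C)
    flat = idx[:, 0] * Y * Z + idx[:, 1] * Z + idx[:, 2]
    volume.view(C, -1).index_add_(1, flat, feats.T)
    return volume


# Toy usage: fuse the ray volume with a LiDAR voxel volume by addition,
# a stand-in for the paper's ray-wise fusion.
if __name__ == "__main__":
    C, grid_shape = 16, (40, 40, 8)
    lidar_volume = torch.zeros(C, *grid_shape)
    ray_volume = image_to_voxel_rays(
        img_feats=torch.randn(100, C),
        pixel_uv=torch.rand(100, 2) * torch.tensor([1280.0, 720.0]),
        cam_intrinsics=torch.tensor([[720.0, 0.0, 640.0],
                                     [0.0, 720.0, 360.0],
                                     [0.0, 0.0, 1.0]]),
        cam_to_lidar=torch.eye(4),
        voxel_size=1.0,
        grid_range=(-20.0, -20.0, -3.0),
        grid_shape=grid_shape,
    )
    fused = lidar_volume + ray_volume
    print(fused.shape)  # torch.Size([16, 40, 40, 8])
```

In this simplified form, every depth hypothesis along a ray receives the same image feature; the contribution of the learnable sampler and ray-wise fusion described in the abstract is precisely to weight and combine these ray samples rather than spreading them uniformly.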