Paper title
Voxel Set Transformer: A Set-to-Set Approach to 3D Object Detection from Point Clouds
Paper authors
Paper abstract
Transformers have demonstrated promising performance in many 2D vision tasks. However, computing self-attention on large-scale point cloud data is cumbersome, because a point cloud is a long sequence that is unevenly distributed in 3D space. To address this, existing methods usually either compute self-attention locally by grouping the points into clusters of the same size, or perform convolutional self-attention on a discretized representation. The former results in stochastic point dropout, while the latter typically has narrow attention fields. In this paper, we propose a novel voxel-based architecture, the Voxel Set Transformer (VoxSeT), to detect 3D objects from point clouds by means of set-to-set translation. VoxSeT is built upon a voxel-based set attention (VSA) module, which reduces the self-attention in each voxel to two cross-attentions and models features in a hidden space induced by a group of latent codes. With the VSA module, VoxSeT can handle voxelized point clusters of arbitrary size over a wide range, and process them in parallel with linear complexity. VoxSeT combines the high performance of transformers with the efficiency of voxel-based models, and can serve as a good alternative to convolutional and point-based backbones. VoxSeT reports competitive results on the KITTI and Waymo detection benchmarks. The source code can be found at https://github.com/skyhehe123/VoxSeT.
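To make the two-cross-attention idea in the abstract more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a single attention head and omits any learned projections or feed-forward layers a real module would have. The function name `voxel_set_attention` and its arguments are hypothetical. In the first step, a small set of k latent codes attends to the points inside each voxel, producing k hidden features per voxel. In the second step, each point attends back to the hidden features of its own voxel. The cost is proportional to n * k * d for n points, so it grows linearly with the number of points, and voxels may hold different numbers of points.

```python
import numpy as np


def voxel_set_attention(points_feat, voxel_ids, latents):
    """Illustrative sketch only (hypothetical helper, not the released code).

    points_feat: (n, d) per-point features
    voxel_ids:   (n,) integer voxel index of each point, in [0, m)
    latents:     (k, d) latent codes shared by all voxels
    Returns (out, hidden) with shapes (n, d) and (m, k, d).
    """
    n, d = points_feat.shape
    k = latents.shape[0]
    m = int(voxel_ids.max()) + 1
    scale = 1.0 / np.sqrt(d)

    # Cross-attention 1: latent codes (queries) attend to the points of a voxel.
    scores = (points_feat @ latents.T) * scale            # (n, k)
    # Softmax over the points of each voxel, computed per latent code.
    seg_max = np.full((m, k), -np.inf)
    np.maximum.at(seg_max, voxel_ids, scores)             # per-voxel max for stability
    exp = np.exp(scores - seg_max[voxel_ids])             # (n, k)
    seg_sum = np.zeros((m, k))
    np.add.at(seg_sum, voxel_ids, exp)                    # per-voxel normalizer
    attn = exp / seg_sum[voxel_ids]                       # (n, k)
    # hidden[v, j] is the attention-weighted mean of the points in voxel v.
    hidden = np.zeros((m, k, d))
    np.add.at(hidden, voxel_ids, attn[:, :, None] * points_feat[:, None, :])

    # Cross-attention 2: each point (query) attends to its voxel's k hidden features.
    h = hidden[voxel_ids]                                 # (n, k, d)
    scores2 = np.einsum("nd,nkd->nk", points_feat, h) * scale
    scores2 -= scores2.max(axis=1, keepdims=True)
    w = np.exp(scores2)
    w /= w.sum(axis=1, keepdims=True)                     # softmax over the k codes
    out = np.einsum("nk,nkd->nd", w, h)                   # (n, d)
    return out, hidden


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(6, 4))        # 6 points, 4 feature channels
    ids = np.array([0, 0, 0, 1, 1, 2])     # voxels holding 3, 2 and 1 points
    codes = rng.normal(size=(3, 4))        # 3 latent codes
    out, hidden = voxel_set_attention(feats, ids, codes)
    print(out.shape, hidden.shape)
```

Because both softmaxes are normalized within a voxel, points never exchange information across voxel boundaries in this sketch. Any mixing between voxels would have to come from additional layers, which are not shown here.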