Paper Title

PersDet: Monocular 3D Detection in Perspective Bird's-Eye-View

Authors

Hongyu Zhou, Zheng Ge, Weixin Mao, Zeming Li

Abstract

Currently, detecting 3D objects in Bird's-Eye-View (BEV) is superior to other 3D detectors for autonomous driving and robotics. However, transforming image features into BEV necessitates special operators to conduct feature sampling. These operators are not supported on many edge devices, bringing extra obstacles when deploying detectors. To address this problem, we revisit the generation of BEV representation and propose detecting objects in perspective BEV -- a new BEV representation that does not require feature sampling. We demonstrate that perspective BEV features can likewise enjoy the benefits of the BEV paradigm. Moreover, the perspective BEV improves detection performance by addressing issues caused by feature sampling. We propose PersDet for high-performance object detection in perspective BEV space based on this discovery. While implementing a simple and memory-efficient structure, PersDet outperforms existing state-of-the-art monocular methods on the nuScenes benchmark, reaching 34.6% mAP and 40.8% NDS when using ResNet-50 as the backbone.
