Paper Title
Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation
Paper Authors
Abstract
In this paper, we propose a novel system named Disp R-CNN for 3D object detection from stereo images. Many recent works solve this problem by first recovering a point cloud with disparity estimation and then applying a 3D detector. In these approaches, the disparity map is computed for the entire image, which is costly and fails to leverage category-specific priors. In contrast, we design an instance disparity estimation network (iDispNet) that predicts disparity only for pixels on objects of interest and learns a category-specific shape prior for more accurate disparity estimation. To address the scarcity of disparity annotations during training, we propose to use a statistical shape model to generate dense disparity pseudo-ground-truth without the need for LiDAR point clouds, which makes our system more widely applicable. Experiments on the KITTI dataset show that, even when LiDAR ground-truth is not available at training time, Disp R-CNN achieves competitive performance and outperforms previous state-of-the-art methods by 20% in terms of average precision.
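To make the pipeline concrete, below is a minimal sketch (not the authors' implementation) of how an instance disparity map restricted to an object mask can be back-projected into an object point cloud for a downstream 3D detector, using the standard rectified-stereo relation z = f*b/d. All names (instance_disparity_to_points, fx, baseline, cx, cy) and the example intrinsics are assumptions for illustration only.

# Minimal sketch: instance disparity -> object point cloud (assumed, not the paper's code)
import numpy as np

def instance_disparity_to_points(disparity, mask, fx, baseline, cx, cy):
    """Back-project pixels inside an instance mask into 3D camera coordinates.

    disparity: (H, W) predicted disparity in pixels (e.g. from an iDispNet-like network)
    mask:      (H, W) boolean instance mask from a 2D detector/segmenter
    fx, baseline, cx, cy: rectified-stereo intrinsics (assuming fx == fy)
    """
    v, u = np.nonzero(mask & (disparity > 0))   # pixel coordinates on the object only
    d = disparity[v, u]
    z = fx * baseline / d                       # depth from the stereo relation z = f*b/d
    x = (u - cx) * z / fx                       # pinhole back-projection
    y = (v - cy) * z / fx
    return np.stack([x, y, z], axis=1)          # (N, 3) object point cloud

if __name__ == "__main__":
    # Toy example with a constant-disparity square "object" and KITTI-like intrinsics (assumed values)
    H, W = 8, 8
    disp = np.full((H, W), 40.0)
    m = np.zeros((H, W), dtype=bool)
    m[2:6, 2:6] = True
    pts = instance_disparity_to_points(disp, m, fx=721.5, baseline=0.54, cx=4.0, cy=4.0)
    print(pts.shape, pts[:, 2].mean())          # depth ≈ 721.5 * 0.54 / 40 ≈ 9.74 m

Because only pixels inside the instance mask are back-projected, the point cloud fed to the 3D detector stays small compared with a full-image disparity map, which is the efficiency argument made in the abstract.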