Optimizing Anchor-based Detectors for Autonomous Driving Scenes

Paper title

Optimizing Anchor-based Detectors for Autonomous Driving Scenes

Authors

Du, Xianzhi; Hung, Wei-Chih; Lin, Tsung-Yi

Abstract

This paper summarizes model improvements and inference-time optimizations for the popular anchor-based detectors in the scenes of autonomous driving. Based on the high-performing RCNN-RS and RetinaNet-RS detection frameworks designed for common detection scenes, we study a set of framework improvements to adapt the detectors to better detect small objects in crowd scenes. Then, we propose a model scaling strategy by scaling input resolution and model size to achieve a better speed-accuracy trade-off curve. We evaluate our family of models on the real-time 2D detection track of the Waymo Open Dataset (WOD). Within the 70 ms/frame latency constraint on a V100 GPU, our largest Cascade RCNN-RS model achieves 76.9% AP/L1 and 70.1% AP/L2, attaining the new state-of-the-art on WOD real-time 2D detection. Our fastest RetinaNet-RS model achieves 6.3 ms/frame while maintaining a reasonable detection precision at 50.7% AP/L1 and 42.9% AP/L2.
