Generalizable Pedestrian Detection: The Elephant In The Room

Paper title

Generalizable Pedestrian Detection: The Elephant In The Room

Authors

Irtiza Hasan, Shengcai Liao, Jinpeng Li, Saad Ullah Akram, Ling Shao

Abstract

Pedestrian detection is used in many vision-based applications, ranging from video surveillance to autonomous driving. Despite the high performance achieved, it is still largely unknown how well existing detectors generalize to unseen data. This is important because a practical detector should be ready to use in a variety of application scenarios. To this end, we conduct a comprehensive study in this paper, using a general principle of direct cross-dataset evaluation. Through this study, we find that existing state-of-the-art pedestrian detectors, though they perform quite well when trained and tested on the same dataset, generalize poorly in cross-dataset evaluation. We demonstrate that there are two reasons for this trend. Firstly, their designs (e.g. anchor settings) may be biased towards popular benchmarks in the traditional single-dataset training and test pipeline, which largely limits their generalization capability. Secondly, the training source is generally neither dense in pedestrians nor diverse in scenarios. Under direct cross-dataset evaluation, surprisingly, we find that a general-purpose object detector, without pedestrian-tailored adaptation in design, generalizes much better than existing state-of-the-art pedestrian detectors. Furthermore, we illustrate that diverse and dense datasets, collected by crawling the web, serve as an efficient source of pre-training for pedestrian detection. Accordingly, we propose a progressive training pipeline and find that it works well for autonomous-driving oriented pedestrian detection. Consequently, the study conducted in this paper suggests that more emphasis should be put on cross-dataset evaluation in the future design of generalizable pedestrian detectors. Code and models can be accessed at https://github.com/hasanirtiza/Pedestron.
