Paper Title


Large-Scale Pre-training for Person Re-identification with Noisy Labels

Authors

Fu, Dengpan; Chen, Dongdong; Yang, Hao; Bao, Jianmin; Yuan, Lu; Zhang, Lei; Li, Houqiang; Wen, Fang; Chen, Dong

Abstract


This paper aims to address the problem of pre-training for person re-identification (Re-ID) with noisy labels. To set up the pre-training task, we apply a simple online multi-object tracking system to raw videos of an existing unlabeled Re-ID dataset, "LUPerson", and build the noisy-labeled variant called "LUPerson-NL". Since these ID labels, automatically derived from tracklets, inevitably contain noise, we develop a large-scale Pre-training framework utilizing Noisy Labels (PNL), which consists of three learning modules: supervised Re-ID learning, prototype-based contrastive learning, and label-guided contrastive learning. In principle, joint learning of these three modules not only clusters similar examples to one prototype, but also rectifies noisy labels based on the prototype assignment. We demonstrate that learning directly from raw videos is a promising alternative for pre-training, which utilizes spatial and temporal correlations as weak supervision. This simple pre-training task provides a scalable way to learn SOTA Re-ID representations from scratch on "LUPerson-NL" without bells and whistles. For example, by applying the same supervised Re-ID method MGN, our pre-trained model improves the mAP over the unsupervised pre-training counterpart by 5.7%, 2.2%, and 2.3% on CUHK03, DukeMTMC, and MSMT17 respectively. Under the small-scale or few-shot setting, the performance gain is even more significant, suggesting better transferability of the learned representation. Code is available at https://github.com/DengpanFu/LUPerson-NL.
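The two mechanisms the abstract highlights, clustering samples around per-identity prototypes and rectifying noisy labels via the prototype assignment, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' PNL implementation: the function names, the temperature, and the confidence threshold are all hypothetical choices for the sketch.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Normalize feature rows to unit length so dot products are cosine similarities.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def _softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def prototype_contrastive_loss(features, prototypes, labels, temperature=0.1):
    """Cross-entropy over feature-to-prototype similarities:
    pulls each sample toward the prototype of its (possibly noisy) ID label."""
    sims = l2_normalize(features) @ l2_normalize(prototypes).T / temperature  # (N, K)
    log_probs = np.log(_softmax(sims))
    return -log_probs[np.arange(len(labels)), labels].mean()

def rectify_labels(features, prototypes, labels, temperature=0.1, confidence=0.8):
    """Reassign a sample to its nearest prototype when the soft assignment is
    confident; otherwise keep the original (possibly noisy) tracklet label."""
    probs = _softmax(l2_normalize(features) @ l2_normalize(prototypes).T / temperature)
    return np.where(probs.max(axis=1) > confidence, probs.argmax(axis=1), labels)
```

With two orthogonal prototypes, a sample whose feature clearly matches prototype 1 but carries label 0 gets reassigned, while a correctly labeled sample is left alone; the contrastive loss is correspondingly lower for the clean labeling than for the corrupted one.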
