Paper Title
Bootstrapping the Relationship Between Images and Their Clean and Noisy Labels
Paper Authors
Paper Abstract
Many state-of-the-art noisy-label learning methods rely on learning mechanisms that estimate the samples' clean labels during training and discard their original noisy labels. However, this approach prevents the learning of the relationship between images, noisy labels and clean labels, which has been shown to be useful when dealing with instance-dependent label noise problems. Furthermore, methods that do aim to learn this relationship require cleanly annotated subsets of data, as well as distillation or multi-faceted models for training. In this paper, we propose a new training algorithm that relies on a simple model to learn the relationship between clean and noisy labels without the need for a cleanly labelled subset of data. Our algorithm follows a 3-stage process, namely: 1) self-supervised pre-training followed by an early-stopping training of the classifier to confidently predict clean labels for a subset of the training set; 2) use the clean set from stage (1) to bootstrap the relationship between images, noisy labels and clean labels, which we exploit for effective relabelling of the remaining training set using semi-supervised learning; and 3) supervised training of the classifier with all relabelled samples from stage (2). By learning this relationship, we achieve state-of-the-art performance in asymmetric and instance-dependent label noise problems.
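The following is a minimal, self-contained sketch of the three-stage pipeline described in the abstract, run on synthetic data. It is an illustrative assumption of how the stages could fit together, not the authors' implementation: the self-supervised pre-training is omitted, the clean-set selection uses a simple agreement-plus-confidence rule, and the semi-supervised relabelling of stage (2) is replaced by a plain supervised relabelling head trained on the bootstrapped clean subset. All module names, thresholds, and hyperparameters are hypothetical.

```python
# Hedged sketch of the 3-stage noisy-label pipeline from the abstract (assumptions noted above).
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# --- Synthetic noisy-label dataset: 1000 samples, 16 features, 4 classes ---
N, D, C = 1000, 16, 4
x = torch.randn(N, D)
clean_y = (x[:, 0] > 0).long() + 2 * (x[:, 1] > 0).long()   # hidden ground truth
noisy_y = clean_y.clone()
flip = torch.rand(N) < 0.3                                   # ~30% label noise
noisy_y[flip] = torch.randint(0, C, (int(flip.sum()),))

classifier = nn.Sequential(nn.Linear(D, 64), nn.ReLU(), nn.Linear(64, C))

# --- Stage 1: early-stopped warm-up to select a confidently clean subset ---
opt = torch.optim.SGD(classifier.parameters(), lr=0.1)
for _ in range(20):                                          # few epochs = early stopping
    opt.zero_grad()
    F.cross_entropy(classifier(x), noisy_y).backward()
    opt.step()

with torch.no_grad():
    probs = classifier(x).softmax(dim=1)
conf, pred = probs.max(dim=1)
clean_mask = (pred == noisy_y) & (conf > 0.9)                # agreement + high confidence (assumed rule)
print(f"selected clean subset: {int(clean_mask.sum())} samples")

# --- Stage 2: bootstrap a relabelling model p(clean label | image, noisy label) ---
relabeller = nn.Sequential(nn.Linear(D + C, 64), nn.ReLU(), nn.Linear(64, C))
opt = torch.optim.SGD(relabeller.parameters(), lr=0.1)
inp = torch.cat([x, F.one_hot(noisy_y, C).float()], dim=1)   # image features + noisy label
for _ in range(100):
    opt.zero_grad()
    F.cross_entropy(relabeller(inp[clean_mask]), noisy_y[clean_mask]).backward()
    opt.step()

with torch.no_grad():
    relabelled_y = relabeller(inp).argmax(dim=1)             # relabel the whole training set
relabelled_y[clean_mask] = noisy_y[clean_mask]               # keep the trusted labels as-is

# --- Stage 3: supervised training of the classifier on all relabelled samples ---
opt = torch.optim.SGD(classifier.parameters(), lr=0.1)
for _ in range(200):
    opt.zero_grad()
    F.cross_entropy(classifier(x), relabelled_y).backward()
    opt.step()

with torch.no_grad():
    acc = (classifier(x).argmax(dim=1) == clean_y).float().mean()
print(f"accuracy vs. hidden clean labels: {acc:.3f}")
```

The key structural point the sketch tries to mirror is that the stage-(2) relabeller is conditioned on both the image features and the noisy label, which is what allows it to capture the image / noisy-label / clean-label relationship the abstract emphasises; how that relationship is actually modelled and trained (including the semi-supervised component) is specified in the paper itself.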