Paper Title
On the robustness of self-supervised representations for multi-view object classification
Paper Authors
Abstract
It is known that representations from self-supervised pre-training can perform on par with, and often better than, representations from fully-supervised pre-training on various downstream tasks. This has been shown in a host of settings such as generic object classification and detection, semantic segmentation, and image retrieval. However, several issues have recently come to the fore that demonstrate failure modes of self-supervised representations, such as degraded performance on non-ImageNet-like data or on complex scenes. In this paper, we show that self-supervised pre-training based on the instance discrimination objective leads to object representations that are more robust to changes in the viewpoint and perspective of the object. We demonstrate this by comparing modern self-supervised methods against multiple supervised baselines, both by approximating object viewpoint variation through homographies and through real-world tests on several multi-view datasets. We find that self-supervised representations are more robust to changes in object viewpoint and appear to encode more pertinent information about objects, which facilitates recognition from novel views.
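The abstract mentions approximating object viewpoint variation through homographies. The sketch below is a minimal illustration of that general idea, not the paper's actual evaluation protocol: it jitters the corners of an image and applies the induced perspective transform using OpenCV. The function name, the `max_shift` parameter, and the corner-jittering scheme are assumptions made for illustration.

```python
import numpy as np
import cv2  # OpenCV, used here for homography estimation and warping


def random_homography_warp(image, max_shift=0.15, seed=None):
    """Warp an image with a random homography to simulate a viewpoint change.

    The four image corners are jittered by up to `max_shift` of the image
    size, and the perspective transform induced by this corner motion is
    applied to the whole image. (Illustrative only; not the paper's exact
    procedure.)
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]

    # Original corners: top-left, top-right, bottom-right, bottom-left.
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])

    # Randomly perturb each corner to mimic a change of camera perspective.
    jitter = rng.uniform(-max_shift, max_shift, size=(4, 2)) * np.array([w, h])
    dst = (src + jitter).astype(np.float32)

    # Estimate the homography mapping src -> dst and warp the image.
    H = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(image, H, (w, h))


if __name__ == "__main__":
    img = (np.random.rand(224, 224, 3) * 255).astype(np.uint8)  # stand-in image
    warped = random_homography_warp(img, seed=0)
    print(warped.shape)  # (224, 224, 3)
```

In an evaluation of the kind the abstract describes, warps like this could be applied to test images so that a frozen encoder's features for the original and warped views can be compared; the specifics of how the paper measures robustness are not given here.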