Exploiting Mixed Unlabeled Data for Detecting Samples of Seen and Unseen Out-of-Distribution Classes

Paper Title

Exploiting Mixed Unlabeled Data for Detecting Samples of Seen and Unseen Out-of-Distribution Classes

Authors

Sun, Yi-Xuan; Wang, Wei

Abstract

Out-of-Distribution (OOD) detection is essential in real-world applications and has attracted increasing attention in recent years. However, most existing OOD detection methods require a large amount of labeled In-Distribution (ID) data, which incurs a heavy labeling cost. In this paper, we focus on a more realistic scenario in which limited labeled data and abundant unlabeled data are available, and the unlabeled data are a mixture of ID and OOD samples. We propose the Adaptive In-Out-aware Learning (AIOL) method, which uses an appropriate temperature to adaptively select potential ID and OOD samples from the mixed unlabeled data and considers the entropy over them for OOD detection. Moreover, since test data in realistic applications may contain OOD samples whose classes do not appear in the mixed unlabeled data (we call them unseen OOD classes), data augmentation techniques are incorporated into the method to further improve performance. Experiments on various benchmark datasets demonstrate the superiority of our method.
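The abstract mentions two ingredients, a temperature and an entropy over selected samples, without giving the exact selection rule. The following is only an illustrative sketch of how a temperature-scaled softmax and predictive entropy could be combined to partition unlabeled samples. It is not the AIOL procedure itself: the function names, the fixed thresholds `low` and `high`, and the three-way split are all assumptions made for this example.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Temperature-scaled softmax over one sample's logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def split_unlabeled(logits_batch, temperature=2.0, low=0.5, high=1.0):
    """Partition sample indices by predictive entropy.

    Hypothetical rule (an assumption, not taken from the paper):
    entropy <= low  -> potential ID sample
    entropy >= high -> potential OOD sample
    otherwise       -> left undecided
    """
    potential_id, potential_ood, undecided = [], [], []
    for i, logits in enumerate(logits_batch):
        h = entropy(softmax_with_temperature(logits, temperature))
        if h <= low:
            potential_id.append(i)
        elif h >= high:
            potential_ood.append(i)
        else:
            undecided.append(i)
    return potential_id, potential_ood, undecided
```

In this sketch a higher temperature flattens the softmax and raises every sample's entropy, so the temperature and the thresholds have to be chosen together. The abstract describes the selection as adaptive, which fixed thresholds like these do not capture.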
