Paper Title

Uncertainty in Contrastive Learning: On the Predictability of Downstream Performance

Paper Authors

Shervin Ardeshir, Navid Azizan

Paper Abstract

The superior performance of some of today's state-of-the-art deep learning models is to some extent owed to extensive (self-)supervised contrastive pretraining on large-scale datasets. In contrastive learning, the network is presented with pairs of positive (similar) and negative (dissimilar) datapoints and is trained to find an embedding vector for each datapoint, i.e., a representation, which can be further fine-tuned for various downstream tasks. In order to safely deploy these models in critical decision-making systems, it is crucial to equip them with a measure of their uncertainty or reliability. However, due to the pairwise nature of training a contrastive model, and the lack of absolute labels on the output (an abstract embedding vector), adapting conventional uncertainty estimation techniques to such models is non-trivial. In this work, we study whether the uncertainty of such a representation can be quantified for a single datapoint in a meaningful way. In other words, we explore if the downstream performance on a given datapoint is predictable, directly from its pre-trained embedding. We show that this goal can be achieved by directly estimating the distribution of the training data in the embedding space and accounting for the local consistency of the representations. Our experiments show that this notion of uncertainty for an embedding vector often strongly correlates with its downstream accuracy.
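To make the abstract's idea concrete, below is a minimal sketch of one plausible instantiation of "estimating the distribution of the training data in the embedding space and accounting for local consistency": a k-nearest-neighbor density proxy combined with a neighborhood-agreement score over L2-normalized embeddings. This is an illustration under stated assumptions, not the authors' exact estimator; the function name `embedding_uncertainty`, the choice of cosine similarity, the value of `k`, and the way the two terms are averaged are all assumptions introduced here.

```python
# Minimal sketch (not the paper's exact method): score the "uncertainty" of a single
# embedding by (a) how densely the training data populates its neighborhood in
# embedding space and (b) how mutually consistent those neighboring embeddings are.
import numpy as np

def embedding_uncertainty(query_emb, train_embs, k=50):
    """Return an uncertainty score in [0, 1]; higher means less reliable.

    query_emb : (d,) pre-trained embedding of the datapoint to assess.
    train_embs: (n, d) embeddings of the training data.
    """
    # Work with L2-normalized vectors so dot products are cosine similarities.
    q = query_emb / np.linalg.norm(query_emb)
    X = train_embs / np.linalg.norm(train_embs, axis=1, keepdims=True)

    # (a) Density proxy: mean similarity to the k nearest training embeddings.
    sims = X @ q                       # (n,) cosine similarities to the query
    nn_idx = np.argsort(-sims)[:k]     # indices of the k most similar training points
    density = sims[nn_idx].mean()

    # (b) Local consistency: average pairwise similarity among those neighbors.
    nbrs = X[nn_idx]                   # (k, d)
    pairwise = nbrs @ nbrs.T           # (k, k) neighbor-to-neighbor similarities
    consistency = pairwise[~np.eye(k, dtype=bool)].mean()

    # Low uncertainty when the neighborhood is both dense and self-consistent.
    confidence = 0.5 * (density + consistency)
    return float(np.clip(1.0 - confidence, 0.0, 1.0))

if __name__ == "__main__":
    # Usage example with random embeddings: rank datapoints by predicted reliability.
    rng = np.random.default_rng(0)
    train = rng.normal(size=(1000, 128))
    query = rng.normal(size=128)
    print(embedding_uncertainty(query, train, k=50))
```

Under the paper's claim, a score like this would be expected to correlate with downstream accuracy on the corresponding datapoint, which is what makes it useful as a per-sample reliability signal before fine-tuning or deployment.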
