Paper Title


Mix-n-Match: Ensemble and Compositional Methods for Uncertainty Calibration in Deep Learning

Paper Authors

Jize Zhang, Bhavya Kailkhura, T. Yong-Jin Han

Paper Abstract


This paper studies the problem of post-hoc calibration of machine learning classifiers. We introduce the following desiderata for uncertainty calibration: (a) accuracy preservation, (b) data efficiency, and (c) high expressive power. We show that none of the existing methods satisfies all three requirements, and demonstrate how Mix-n-Match calibration strategies (i.e., ensemble and composition) can help achieve remarkably better data efficiency and expressive power while provably maintaining the classification accuracy of the original classifier. Mix-n-Match strategies are generic in the sense that they can be used to improve the performance of any off-the-shelf calibrator. We also reveal potential issues in standard evaluation practices. Popular approaches (e.g., histogram-based expected calibration error (ECE)) may provide misleading results, especially in the small-data regime. Therefore, we propose an alternative data-efficient kernel density-based estimator for reliable evaluation of calibration performance and prove its asymptotic unbiasedness and consistency. Our approaches outperform state-of-the-art solutions on both the calibration and the evaluation tasks in most of the experimental settings. Our code is available at https://github.com/zhang64-llnl/Mix-n-Match-Calibration.
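To illustrate the abstract's evaluation point (that binned, histogram-based ECE can be misleading on small datasets, whereas a kernel density-based estimator smooths the accuracy–confidence relationship instead of relying on an arbitrary bin choice), the following is a minimal sketch contrasting the two on top-label confidences. It is not the paper's estimator: the Gaussian kernel, the fixed bandwidth, the grid integration, and the function names histogram_ece / kde_ece are all assumptions made for illustration; the reference implementation is in the repository linked above.

# Illustrative sketch only (assumed names and hyperparameters), not the paper's estimator.
import numpy as np

def histogram_ece(confidences, correct, n_bins=15):
    # Standard binned ECE: mass-weighted average of |accuracy - confidence| per bin.
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    n = len(confidences)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.sum() == 0:
            continue
        acc = correct[in_bin].mean()
        conf = confidences[in_bin].mean()
        ece += (in_bin.sum() / n) * abs(acc - conf)
    return ece

def kde_ece(confidences, correct, bandwidth=0.02, grid_size=201):
    # KDE-style ECE sketch: smooth the accuracy signal with a Gaussian kernel
    # and integrate |P(correct | conf) - conf| against the confidence density.
    c = np.asarray(confidences, dtype=float)
    y = np.asarray(correct, dtype=float)
    grid = np.linspace(0.0, 1.0, grid_size)
    dx = grid[1] - grid[0]
    # Kernel weights between each grid point and each sample.
    w = np.exp(-0.5 * ((grid[:, None] - c[None, :]) / bandwidth) ** 2)
    density = w.sum(axis=1)                                   # unnormalized p(conf)
    acc = (w * y[None, :]).sum(axis=1) / np.maximum(density, 1e-12)  # smoothed accuracy
    p = density / (density.sum() * dx)                        # normalized density
    return float(np.sum(np.abs(acc - grid) * p) * dx)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    conf = rng.uniform(0.5, 1.0, size=500)
    correct = (rng.uniform(size=500) < conf).astype(float)    # roughly calibrated toy data
    print("histogram ECE:", histogram_ece(conf, correct))
    print("KDE-based ECE:", kde_ece(conf, correct))

On the toy data above both estimates should be small; the practical difference shows up with few samples, where the histogram estimate is sensitive to the number of bins while the kernel estimate varies more smoothly with its bandwidth.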
