Paper Title

UMIX: Improving Importance Weighting for Subpopulation Shift via Uncertainty-Aware Mixup

Authors

Zongbo Han, Zhipeng Liang, Fan Yang, Liu Liu, Lanqing Li, Yatao Bian, Peilin Zhao, Bingzhe Wu, Changqing Zhang, Jianhua Yao

Abstract

Subpopulation shift widely exists in many real-world machine learning applications; it refers to settings where the training and test distributions contain the same subpopulation groups but differ in subpopulation frequencies. Importance reweighting is a standard way to handle the subpopulation shift issue by imposing constant or adaptive sampling weights on each sample in the training dataset. However, some recent studies have recognized that most of these approaches fail to improve performance over empirical risk minimization, especially when applied to over-parameterized neural networks. In this work, we propose a simple yet practical framework, called uncertainty-aware mixup (UMIX), to mitigate the overfitting issue in over-parameterized models by reweighting the "mixed" samples according to the sample uncertainty. UMIX is equipped with a training-trajectory-based uncertainty estimate for each sample to flexibly characterize the subpopulation distribution. We also provide insightful theoretical analysis to verify that UMIX achieves better generalization bounds than prior works. Further, we conduct extensive empirical studies across a wide range of tasks to validate the effectiveness of our method both qualitatively and quantitatively. Code is available at https://github.com/TencentAILabHealthcare/UMIX.
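The abstract describes two ingredients: a per-sample uncertainty estimated from the training trajectory, and a mixup step whose mixed samples are reweighted by that uncertainty. The sketch below is only an illustration of this idea, not the paper's exact formulation: the misclassification-frequency proxy for uncertainty and the linear combination used for the loss weight are assumptions, and all function names are hypothetical.

```python
import numpy as np

def trajectory_uncertainty(correct_history):
    # Hypothetical proxy: the fraction of training checkpoints at which
    # the sample was misclassified. Samples from rare subpopulations tend
    # to be misclassified more often along the trajectory, so they get
    # higher uncertainty (and thus a larger weight below).
    h = np.asarray(correct_history, dtype=float)
    return 1.0 - h.mean()

def uncertainty_aware_mixup(x1, x2, y1, y2, u1, u2, alpha=1.0):
    # Standard mixup interpolation of inputs and labels.
    lam = np.random.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    # Assumed reweighting rule: the mixed sample's loss weight combines
    # the uncertainties of its two constituents, upweighting mixtures
    # that involve high-uncertainty (likely minority-group) samples.
    w = lam * u1 + (1.0 - lam) * u2
    return x, y, w
```

In training, `w` would multiply the per-sample loss of the mixed example, so the optimizer focuses on regions of the input space where the model has been persistently uncertain.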
