Paper Title

MaxMatch: Semi-Supervised Learning with Worst-Case Consistency

Paper Authors

Yangbangyan Jiang, Xiaodan Li, Yuefeng Chen, Yuan He, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, Qingming Huang

Paper Abstract


In recent years, great progress has been made in incorporating unlabeled data to overcome the problem of insufficient supervision via semi-supervised learning (SSL). Most state-of-the-art models are based on the idea of pursuing model predictions on unlabeled data that remain consistent under input noise, which is called consistency regularization. Nonetheless, there is a lack of theoretical insight into the reason behind its success. To bridge the gap between theoretical and practical results, we propose a worst-case consistency regularization technique for SSL in this paper. Specifically, we first present a generalization bound for SSL consisting of empirical loss terms observed on labeled and unlabeled training data separately. Motivated by this bound, we derive an SSL objective that minimizes the largest inconsistency between an original unlabeled sample and its multiple augmented variants. We then provide a simple but effective algorithm to solve the proposed minimax problem, and theoretically prove that it converges to a stationary point. Experiments on five popular benchmark datasets validate the effectiveness of our proposed method.
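The core idea in the abstract, minimizing the largest inconsistency between an unlabeled sample and its augmented views, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, the model is stood in for by fixed class-probability vectors, and KL divergence is an assumed choice of inconsistency measure.

```python
import math

def kl(p, q, eps=1e-12):
    """KL divergence between two discrete probability distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def worst_case_consistency_loss(p_orig, p_augs):
    """Worst-case consistency term: among the K augmented views, take
    the one whose prediction diverges most from the prediction on the
    original unlabeled sample (the max over per-view inconsistencies)."""
    return max(kl(p_orig, p_aug) for p_aug in p_augs)

# Model prediction on the original unlabeled sample and on two
# augmented views (hypothetical values for illustration).
p = [0.7, 0.2, 0.1]
augs = [
    [0.6, 0.3, 0.1],  # mild augmentation: prediction barely changes
    [0.3, 0.5, 0.2],  # strong augmentation: prediction shifts a lot
]
loss = worst_case_consistency_loss(p, augs)
```

In training, this max over views would be added to the supervised loss on labeled data and minimized over model parameters, yielding the minimax problem the abstract refers to: the inner max picks the worst augmentation, the outer min updates the model.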
