Paper Title
FIXED: Frustratingly Easy Domain Generalization with Mixup
Paper Authors
Paper Abstract
Domain generalization (DG) aims to learn a generalizable model from multiple training domains so that it performs well on unseen target domains. A popular strategy is to augment the training data through methods such as Mixup~\cite{zhang2018mixup}. While vanilla Mixup can be applied directly, theoretical and empirical investigations reveal several shortcomings that limit its performance. First, Mixup cannot effectively identify the domain and class information needed for learning invariant representations. Second, Mixup may introduce synthetic noisy data points via random interpolation, which lowers its discrimination capability. Based on this analysis, we propose a simple yet effective enhancement for Mixup-based DG, namely domain-invariant Feature mIXup (FIX), which learns domain-invariant representations for Mixup. To further enhance discrimination, we leverage existing techniques that enlarge the margins among classes, yielding the domain-invariant Feature mIXup with Enhanced Discrimination (FIXED) approach. We provide theoretical insights into the guarantees on its effectiveness. Extensive experiments on seven public datasets across two modalities, including image classification (Digits-DG, PACS, and Office-Home) and time series (DSADS, PAMAP2, UCI-HAR, and USC-HAD), demonstrate that our approach significantly outperforms nine state-of-the-art related methods, beating the best-performing baseline by 6.5\% on average in terms of test accuracy. Code is available at: https://github.com/jindongwang/transferlearning/tree/master/code/deep/fixed.
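For readers unfamiliar with the interpolation step the abstract refers to, below is a minimal PyTorch sketch of generic feature-level Mixup with the standard interpolated loss. It illustrates only the mixing mechanism; the function name `feature_mixup`, the `alpha` value, and the toy encoder output and linear classifier are illustrative assumptions, and the domain-invariant feature learning and margin-enlargement components that distinguish FIXED are not reproduced here (see the linked repository for the authors' implementation).

```python
import torch
import torch.nn.functional as F
from torch.distributions import Beta

def feature_mixup(features, labels, alpha=0.2):
    """Feature-level Mixup (generic sketch): convexly combine each sample's
    features with those of a randomly paired sample from the same batch,
    returning both labels so the loss can be interpolated with the same weight."""
    lam = Beta(alpha, alpha).sample().item()      # mixing weight in (0, 1)
    index = torch.randperm(features.size(0))      # random in-batch pairing
    mixed = lam * features + (1.0 - lam) * features[index]
    return mixed, labels, labels[index], lam

# Minimal usage: mix (hypothetical) encoder outputs, then interpolate the loss.
feats = torch.randn(8, 128)                      # stand-in for encoder(x), batch of 8
ys = torch.randint(0, 5, (8,))                   # 5-class toy labels
mixed, y_a, y_b, lam = feature_mixup(feats, ys)
logits = torch.nn.Linear(128, 5)(mixed)
loss = lam * F.cross_entropy(logits, y_a) + (1 - lam) * F.cross_entropy(logits, y_b)
```

Note that the random in-batch pairing is exactly the source of the two shortcomings the abstract describes: pairs are drawn without regard to domain or class, and the resulting synthetic points can be noisy.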