Paper Title

Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning

Paper Authors

Utku Evci, Vincent Dumoulin, Hugo Larochelle, Michael C. Mozer

Paper Abstract

Transfer-learning methods aim to improve performance in a data-scarce target domain using a model pretrained on a data-rich source domain. A cost-efficient strategy, linear probing, involves freezing the source model and training a new classification head for the target domain. This strategy is outperformed by a more costly but state-of-the-art method -- fine-tuning all parameters of the source model to the target domain -- possibly because fine-tuning allows the model to leverage useful information from intermediate layers that is otherwise discarded by the later pretrained layers. We explore the hypothesis that these intermediate layers might be directly exploited. We propose a method, Head-to-Toe probing (Head2Toe), that selects features from all layers of the source model to train a classification head for the target domain. In evaluations on VTAB-1k, Head2Toe matches the performance obtained with fine-tuning on average while reducing training and storage costs a hundredfold or more, and critically, for out-of-distribution transfer, Head2Toe outperforms fine-tuning.
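
The core idea described in the abstract -- freezing the source model and training a linear head on features drawn from all of its layers, rather than only the last one -- can be sketched in a few lines. Below is a minimal illustration, assuming a torchvision ResNet-18 backbone, 2x2 average pooling of each residual block's feature map, and a plain linear head; these choices are illustrative, not the paper's exact configuration, and the actual Head2Toe additionally selects a sparse subset of the concatenated features (via a group-lasso-based relevance score) before training the final head, which this sketch omits.

```python
# Sketch of Head2Toe-style probing: concatenate pooled features from many
# layers of a frozen backbone, then train only a linear classification head.
# Assumptions (not from the paper's code): torchvision ResNet-18, 2x2 pooling,
# residual blocks as the tapped "layers", and no feature selection step.
import torch
import torch.nn as nn
from torchvision.models import resnet18

backbone = resnet18(weights="IMAGENET1K_V1").eval()
for p in backbone.parameters():
    p.requires_grad_(False)  # freeze the source model

features = []

def hook(module, inputs, output):
    # Pool each intermediate feature map to a small fixed grid and flatten,
    # so layers with different spatial sizes contribute fixed-length vectors.
    pooled = nn.functional.adaptive_avg_pool2d(output, 2)
    features.append(pooled.flatten(start_dim=1))

# Tap every residual block -- a coarse stand-in for "all layers".
for stage in [backbone.layer1, backbone.layer2, backbone.layer3, backbone.layer4]:
    for block in stage:
        block.register_forward_hook(hook)

def embed(x):
    """Return the concatenated head-to-toe feature vector for a batch x."""
    features.clear()
    with torch.no_grad():
        backbone(x)
    return torch.cat(features, dim=1)

num_classes = 10  # hypothetical target task
dim = embed(torch.zeros(1, 3, 224, 224)).shape[1]
head = nn.Linear(dim, num_classes)  # the only trainable parameters
opt = torch.optim.Adam(head.parameters(), lr=1e-3)

def train_step(x, y):
    logits = head(embed(x))
    loss = nn.functional.cross_entropy(logits, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

Because the backbone stays frozen, each training step only backpropagates through the linear head, which is what makes this family of methods far cheaper to train and store than full fine-tuning.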
