Title

A survey on domain adaptation theory: learning bounds and theoretical guarantees

Authors

Ievgen Redko, Emilie Morvant, Amaury Habrard, Marc Sebban, Younès Bennani

Abstract

All famous machine learning algorithms that comprise both supervised and semi-supervised learning work well only under a common assumption: the training and test data follow the same distribution. When the distribution changes, most statistical models must be reconstructed from newly collected data, which for some applications can be costly or impossible to obtain. It has therefore become necessary to develop approaches that reduce the need and the effort of obtaining new labeled samples by exploiting data available in related areas and using them further in similar fields. This has given rise to a new machine learning framework known as transfer learning: a learning setting inspired by the capability of a human being to extrapolate knowledge across tasks in order to learn more efficiently. Despite the large number of different transfer learning scenarios, the main objective of this survey is to provide an overview of the state-of-the-art theoretical results in a specific, and arguably the most popular, sub-field of transfer learning called domain adaptation. In this sub-field, the data distribution is assumed to change between the training and the test data, while the learning task remains the same. We provide a first up-to-date description of existing results related to the domain adaptation problem, covering learning bounds based on different statistical learning frameworks.
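
To make the notion of a learning bound concrete, a minimal illustration follows: it states the classical population-level bound of Ben-David et al. for binary classification, one example of the type of result this survey reviews. The notation (ε_S and ε_T for expected source and target errors, d_{HΔH} for the HΔH-divergence, λ for the joint error of the ideal hypothesis) follows common usage in the domain adaptation literature and is assumed here rather than taken from the abstract itself.

\[
\forall h \in \mathcal{H}: \quad
\varepsilon_T(h) \;\le\; \varepsilon_S(h)
\;+\; \tfrac{1}{2}\, d_{\mathcal{H}\Delta\mathcal{H}}\!\left(\mathcal{D}_S, \mathcal{D}_T\right)
\;+\; \lambda,
\qquad
\lambda \;=\; \min_{h' \in \mathcal{H}} \left[ \varepsilon_S(h') + \varepsilon_T(h') \right],
\]

where \varepsilon_S(h) and \varepsilon_T(h) are the expected errors of hypothesis h under the source and target distributions \mathcal{D}_S and \mathcal{D}_T, d_{\mathcal{H}\Delta\mathcal{H}} measures the divergence between the two marginal distributions as seen by the hypothesis class \mathcal{H}, and \lambda is the error of the best joint hypothesis. In words, the target error is controlled whenever the source error is small, the two domains are close, and a single hypothesis performs well on both.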
