Paper title
DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning
Paper authors
Abstract
Visual similarity plays an important role in many computer vision applications. Deep metric learning (DML) is a powerful framework for learning such similarities, which not only generalize from training data to identically distributed test data but, in particular, also transfer to unknown test classes. However, its prevailing learning paradigm is class-discriminative supervised training, which typically results in representations specialized in separating the training classes. For effective generalization, such an image representation needs to capture a diverse range of data characteristics. To this end, we propose and study multiple complementary learning tasks that target conceptually different data relationships while relying only on the training samples and labels available in a standard DML setting. By optimizing these tasks simultaneously, we learn a single model that aggregates their training signals, resulting in strong generalization and state-of-the-art performance on multiple established DML benchmark datasets.