Paper title
The Missing Link: Finding label relations across datasets
Paper authors
Paper abstract
Computer vision is driven by the many datasets available for training or evaluating novel methods. However, each dataset has its own set of class labels, visual definitions of classes, image distribution, annotation protocol, and so on. In this paper we explore the automatic discovery of visual-semantic relations between labels across datasets. We aim to understand how instances of a certain class in one dataset relate to instances of another class in another dataset. Are they in an identity, parent/child, or overlap relation? Or is there no link between them at all? To find relations between labels across datasets, we propose methods based on language, on vision, and on their combination. We show that we can effectively discover label relations across datasets, as well as their type. We apply our method to four applications: understanding label relations, identifying missing aspects, increasing label specificity, and predicting transfer learning gains. We conclude that label relations cannot be established by looking at the names of classes alone, as they depend strongly on how each of the datasets was constructed.
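To make the four relation types named in the abstract concrete, here is a minimal sketch that reads them as set relations between the instances covered by two labels. This is only an illustrative interpretation, not the method proposed in the paper (which relies on language and vision models). The function name `relation_type` and the toy instance sets are hypothetical.

```python
def relation_type(a: set, b: set) -> str:
    """Classify how the instances of label A relate to those of label B.

    Assumption: each label is represented by the set of instances it covers.
    This is a set-theoretic reading of the relation types, not the paper's method.
    """
    if not a or not b:
        raise ValueError("both labels need at least one instance")
    if not (a & b):
        return "no link"    # the labels share no instances
    if a == b:
        return "identity"   # both labels cover exactly the same instances
    if a < b:
        return "child"      # A is a strict subset of B, so B is A's parent
    if a > b:
        return "parent"     # A is a strict superset of B, so A is B's parent
    return "overlap"        # some shared instances, neither contains the other


# Toy instance sets (hypothetical), one per label.
animal = {"cat_1", "cat_2", "dog_1"}
cat = {"cat_1", "cat_2"}
pet = {"cat_1", "dog_1", "fish_1"}

print(relation_type(cat, animal))  # child
print(relation_type(animal, pet))  # overlap
```

In practice the instance sets of two datasets are disjoint, so a relation of this kind would have to be estimated, for example from model predictions, rather than computed exactly as above.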