Paper Title
Visual Ground Truth Construction as Faceted Classification
Paper Authors
Paper Abstract
Recent work in Machine Learning and Computer Vision has provided evidence of systematic design flaws in the development of major object recognition benchmark datasets. One such example is ImageNet, wherein, for several categories of images, there are incongruences between the objects they represent and the labels used to annotate them. The consequences of this problem are major, in particular considering the large number of machine learning applications, not least those based on Deep Neural Networks, that have been trained on these datasets. In this paper we posit the problem to be the lack of a knowledge representation (KR) methodology providing the foundations for the construction of these ground truth benchmark datasets. Accordingly, we propose a solution articulated in three main steps: (i) deconstructing the object recognition process into four ordered stages grounded in the philosophical theory of teleosemantics; (ii) based on such stratification, proposing a novel four-phased methodology for organizing objects in classification hierarchies according to their visual properties; and (iii) performing such classification according to the faceted classification paradigm. The key novelty of our approach lies in the fact that we construct the classification hierarchies from visual properties exploiting visual genus-differentiae, and not from linguistically grounded properties. The proposed approach is validated by a set of experiments on the ImageNet hierarchy of musical instruments.
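To illustrate the general idea of a genus-differentia hierarchy, the following is a minimal sketch, not the authors' method. It assumes each class is defined by its parent (genus) plus a set of visual properties (differentia), and that an object is classified by descending as long as a child's differentia is fully observed. The names Node and classify, the toy hierarchy, and every property string are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A class defined by its genus (the parent node) plus a differentia."""
    name: str
    differentia: frozenset = frozenset()  # visual properties added at this level
    children: list = field(default_factory=list)

    def add(self, name, *props):
        """Create a child class distinguished from this genus by `props`."""
        child = Node(name, frozenset(props))
        self.children.append(child)
        return child


def classify(node, observed):
    """Return the path of class names reached by descending from `node`
    while some child's differentia is a subset of the observed properties."""
    path = [node.name]
    while True:
        match = next((c for c in node.children if c.differentia <= observed), None)
        if match is None:
            return path
        node = match
        path.append(node.name)


# Hypothetical toy hierarchy; the properties are illustrative assumptions only.
root = Node("musical instrument")
strings = root.add("stringed instrument", "has_strings")
strings.add("guitar", "has_fretted_neck")
strings.add("harp", "has_triangular_frame")
root.add("drum", "has_membrane")

print(classify(root, {"has_strings", "has_fretted_neck"}))
```

In this sketch the most specific class is whatever node the descent stops at; an object with no recognized properties stays at the root.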