Multi-View Active Fine-Grained Recognition

Paper Title

Multi-View Active Fine-Grained Recognition

Authors

Ruoyi Du, Wenqing Yu, Heqing Wang, Dongliang Chang, Ting-En Lin, Yongbin Li, Zhanyu Ma

Abstract

Fine-grained visual classification (FGVC) has been studied for decades, and the body of related work has exposed a key direction: finding discriminative local regions and revealing subtle differences. However, unlike identifying visual content within static images, recognizing objects in the real physical world involves discriminative information that is present not only within seen local regions but also in other, unseen perspectives. In other words, in addition to focusing on the distinguishable part of the whole, efficient and accurate recognition requires inferring the key perspective from a few glances. For example, people may recognize a "Benz AMG GT" with a glance at its front and then know that looking at its exhaust pipe can help tell which year's model it is. In this paper, back to reality, we put forward the problem of active fine-grained recognition (AFGR) and complete this study in three steps: (i) a hierarchical, multi-view, fine-grained vehicle dataset is collected as the testbed; (ii) a simple experiment is designed to verify that different perspectives contribute differently to FGVC and that different categories have different discriminative perspectives; (iii) a policy-gradient-based framework is adopted to achieve efficient recognition with active view selection. Comprehensive experiments demonstrate that the proposed method delivers a better performance-efficiency trade-off than previous FGVC methods and advanced neural networks.
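
To make step (iii) more concrete, the following is a minimal sketch of policy-gradient view selection on a toy problem. It is not the paper's framework: it assumes a tabular softmax policy, a coarse class that is already known from a first glance, and made-up success probabilities (0.9 at a class's key view, 0.3 elsewhere). The names `train_view_policy`, `best_view`, and `key_view` are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_view_policy(key_view, num_views, steps=6000, lr=0.1, seed=0):
    """REINFORCE on a toy contextual bandit.

    key_view[c] is the (assumed) most discriminative view for coarse class c.
    Returns a table of logits theta[c][v] scoring view v for class c.
    """
    rng = random.Random(seed)
    num_classes = len(key_view)
    theta = [[0.0] * num_views for _ in range(num_classes)]
    baseline = 0.0  # running mean reward, used to reduce gradient variance

    for _ in range(steps):
        c = rng.randrange(num_classes)           # coarse class from a first glance
        probs = softmax(theta[c])
        a = rng.choices(range(num_views), weights=probs)[0]  # sample next view

        # Toy reward (an assumption): fine-grained recognition succeeds more
        # often when the sampled view is the key view for this class.
        p_correct = 0.9 if a == key_view[c] else 0.3
        reward = 1.0 if rng.random() < p_correct else 0.0

        # REINFORCE: grad of log pi(a|c) w.r.t. logit v is 1[v == a] - pi(v|c).
        advantage = reward - baseline
        for v in range(num_views):
            grad_log_pi = (1.0 if v == a else 0.0) - probs[v]
            theta[c][v] += lr * advantage * grad_log_pi

        baseline += 0.01 * (reward - baseline)

    return theta


def best_view(theta, c):
    """Greedy view choice for coarse class c under the learned logits."""
    return max(range(len(theta[c])), key=lambda v: theta[c][v])
```

Calling `train_view_policy([2, 0, 3], num_views=4)` and then `best_view(theta, c)` for each class shows which view the policy has learned to prefer. In a real system the table would presumably be replaced by a network conditioned on the image features observed so far, and the reward would come from an actual classifier.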
