Paper Title

SEMICON: A Learning-to-hash Solution for Large-scale Fine-grained Image Retrieval

Authors

Yang Shen, Xuhao Sun, Xiu-Shen Wei, Qing-Yuan Jiang, Jian Yang

Abstract

In this paper, we propose Suppression-Enhancing Mask based attention and Interactive Channel transformatiON (SEMICON) to learn binary hash codes for dealing with large-scale fine-grained image retrieval tasks. In SEMICON, we first develop a suppression-enhancing mask (SEM) based attention to dynamically localize discriminative image regions. More importantly, different from existing attention mechanisms that simply erase previous discriminative regions, our SEM is developed to restrain such regions and then discover other complementary regions by considering the relation between activated regions in a stage-by-stage fashion. In each stage, an interactive channel transformation (ICON) module is then designed to exploit correlations across channels of the attended activation tensors. Since channels generally correspond to parts of fine-grained objects, part correlations can also be modeled accordingly, which further improves fine-grained retrieval accuracy. Moreover, to be computationally economical, ICON is realized by an efficient two-step process. Finally, the hash learning of our SEMICON consists of both global- and local-level branches to better represent fine-grained objects and then generate binary hash codes explicitly corresponding to multiple levels. Experiments on five benchmark fine-grained datasets demonstrate the superiority of SEMICON over competing methods.
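To make the pipeline described in the abstract more concrete, below is a minimal, illustrative PyTorch-style sketch. It is not the authors' implementation: the toy backbone, the exact suppression rule in `SEMAttention`, the two-step `ICON` transformation, the number of stages, and the code lengths are all simplified assumptions, intended only to show how stage-by-stage suppression-based attention, channel interaction, and global/local hash branches could fit together.

```python
# Illustrative sketch only; all module designs here are assumptions, not the paper's code.
import torch
import torch.nn as nn


class SEMAttention(nn.Module):
    """Suppression-enhancing mask (simplified): instead of erasing the previously
    attended region, scale it down so later stages can find complementary regions."""
    def __init__(self, suppress: float = 0.3):
        super().__init__()
        self.suppress = suppress

    def forward(self, feat, prev_mask=None):
        # feat: (B, C, H, W); activation map taken as the channel-wise mean.
        act = feat.mean(dim=1, keepdim=True)                            # (B, 1, H, W)
        if prev_mask is not None:
            # Restrain (not erase) previously activated regions.
            act = act * (1.0 - (1.0 - self.suppress) * prev_mask)
        mask = torch.sigmoid(act - act.mean(dim=(2, 3), keepdim=True))  # (B, 1, H, W)
        return feat * mask, mask


class ICON(nn.Module):
    """Interactive channel transformation as a cheap two-step process (assumption):
    (1) channel interaction via a C x C correlation matrix, (2) point-wise re-weighting."""
    def __init__(self, channels: int):
        super().__init__()
        self.reweight = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, feat):
        b, c, h, w = feat.shape
        x = feat.flatten(2)                                             # (B, C, HW)
        corr = torch.softmax(x @ x.transpose(1, 2) / (h * w), dim=-1)   # (B, C, C)
        x = (corr @ x).view(b, c, h, w)                                 # step 1: interaction
        return self.reweight(x) + feat                                  # step 2: re-weighting


class SEMICONSketch(nn.Module):
    """Global branch plus several SEM/ICON local stages; each branch emits a short
    relaxed sub-code and the sub-codes are concatenated into the final hash code."""
    def __init__(self, channels: int = 64, stages: int = 3, bits_per_branch: int = 16):
        super().__init__()
        # Toy convolutional stem standing in for a real backbone (assumption).
        self.stem = nn.Sequential(
            nn.Conv2d(3, channels, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.sem = SEMAttention()
        self.icons = nn.ModuleList([ICON(channels) for _ in range(stages)])
        self.global_head = nn.Linear(channels, bits_per_branch)
        self.local_heads = nn.ModuleList(
            [nn.Linear(channels, bits_per_branch) for _ in range(stages)])

    def forward(self, images):
        feat = self.stem(images)                                        # (B, C, H, W)
        codes = [torch.tanh(self.global_head(feat.mean(dim=(2, 3))))]   # global-level code
        mask = None
        for icon, head in zip(self.icons, self.local_heads):
            attended, mask = self.sem(feat, mask)                       # suppress, don't erase
            refined = icon(attended)                                    # channel interaction
            codes.append(torch.tanh(head(refined.mean(dim=(2, 3)))))    # local-level code
        # tanh relaxation during training; take the sign at retrieval time.
        return torch.cat(codes, dim=1)


if __name__ == "__main__":
    model = SEMICONSketch()
    relaxed = model(torch.randn(2, 3, 224, 224))
    binary = torch.sign(relaxed)        # final binary hash codes for retrieval
    print(relaxed.shape, binary.shape)  # torch.Size([2, 64]) for both
```

As is common in learning-to-hash methods, the sketch keeps codes continuous via `tanh` during training and binarizes them with `sign` at retrieval time; the actual hashing objective and quantization strategy used by SEMICON are described in the paper itself.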
