Multi-Label Logo Recognition and Retrieval based on Weighted Fusion of Neural Features

Paper Title

Multi-Label Logo Recognition and Retrieval based on Weighted Fusion of Neural Features

Authors

Marisa Bernabeu, Antonio Javier Gallego, Antonio Pertusa

Abstract

Classifying logo images is a challenging task as they contain elements such as text or shapes that can represent anything from known objects to abstract shapes. While the current state of the art for logo classification addresses the problem as a multi-class task focusing on a single characteristic, logos can have several simultaneous labels, such as different colors. This work proposes a method that allows visually similar logos to be classified and searched from a set of data according to their shape, color, commercial sector, semantics, general characteristics, or a combination of features selected by the user. Unlike previous approaches, the proposal employs a series of multi-label deep neural networks specialized in specific attributes and combines the obtained features to perform the similarity search. To delve into the classification system, different existing logo topologies are compared and some of their problems are analyzed, such as the incomplete labeling that trademark registration databases usually contain. The proposal is evaluated considering 76,000 logos (7 times more than previous approaches) from the European Union Trademarks dataset, which is organized hierarchically using the Vienna ontology. Overall, experimentation attains reliable quantitative and qualitative results, reducing the normalized average rank error of the state-of-the-art from 0.040 to 0.018 for the Trademark Image Retrieval task. Finally, given that the semantics of logos can often be subjective, graphic design students and professionals were surveyed. Results show that the proposed methodology provides better labeling than a human expert operator, improving the label ranking average precision from 0.53 to 0.68.
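
The abstract does not spell out how the attribute-specific features are fused, so the following is only a minimal sketch of one plausible reading: each attribute network yields an embedding, the embeddings are L2-normalized, scaled by a user-chosen weight, concatenated, and compared by Euclidean distance. The attribute names (`shape`, `color`), the toy vectors, and all function names are hypothetical. The `normalized_average_rank` helper follows a formulation commonly used in image retrieval (0 is a perfect ranking, about 0.5 is random) and is assumed, not confirmed, to match the metric reported above.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(features, weights):
    """Concatenate per-attribute embeddings, each normalized and weighted.

    Attributes are visited in sorted order so that every fused vector
    has the same layout.
    """
    fused = []
    for name in sorted(weights):
        fused.extend(weights[name] * x for x in l2_normalize(features[name]))
    return fused


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(query, gallery, weights):
    """Return gallery ids ordered from most to least similar to the query."""
    q = fuse(query, weights)
    scored = [(euclidean(q, fuse(feats, weights)), logo_id)
              for logo_id, feats in gallery.items()]
    return [logo_id for _, logo_id in sorted(scored)]


def normalized_average_rank(ranking, relevant):
    """Assumed definition: (sum of relevant ranks - best possible sum) / (N * N_rel)."""
    n, n_rel = len(ranking), len(relevant)
    rank_sum = sum(rank for rank, logo_id in enumerate(ranking, start=1)
                   if logo_id in relevant)
    return (rank_sum - n_rel * (n_rel + 1) / 2) / (n * n_rel)


# Hypothetical toy embeddings standing in for the outputs of two
# attribute-specific networks.
gallery = {
    "a": {"shape": [1.0, 0.0], "color": [1.0, 0.0]},
    "b": {"shape": [1.0, 0.0], "color": [0.0, 1.0]},
    "c": {"shape": [0.0, 1.0], "color": [1.0, 0.0]},
}
query = {"shape": [1.0, 0.0], "color": [0.0, 1.0]}

by_shape = search(query, gallery, {"shape": 1.0, "color": 0.0})
by_both = search(query, gallery, {"shape": 1.0, "color": 1.0})
print(by_shape, by_both, normalized_average_rank(by_both, {"b"}))
```

Setting a weight to zero drops that attribute from the comparison, which mirrors the idea of letting the user choose which characteristics drive the search. A real system would use learned embeddings and an indexed nearest-neighbor search rather than this linear scan.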
