Fuse and Attend: Generalized Embedding Learning for Art and Sketches

Paper title

Fuse and Attend: Generalized Embedding Learning for Art and Sketches

Author

Dutta, Ujjal Kr

Abstract

While deep Embedding Learning approaches have witnessed widespread success in multiple computer vision tasks, the state-of-the-art methods for representing natural images do not necessarily perform well on images from other domains, such as paintings, cartoons, and sketches. This is because of the large shift in data distribution across these domains, as compared to natural images. Domains like sketches often contain only sparse informative pixels. However, recognizing objects in such domains is crucial, given the multiple relevant applications that leverage such data, for instance, sketch-to-image retrieval. Thus, achieving an Embedding Learning model that performs well across multiple domains is not only challenging, but also plays a pivotal role in computer vision. To this end, in this paper, we propose a novel Embedding Learning approach with the goal of generalizing across different domains. During training, given a query image from a domain, we employ gated fusion and attention to generate a positive example, which carries a broad notion of the semantics of the query object category (from across multiple domains). By virtue of Contrastive Learning, we pull the embeddings of the query and the positive together, in order to learn a representation which is robust across domains. At the same time, to teach the model to be discriminative against examples from different semantic categories (across domains), we also maintain a pool of negative embeddings (from different categories). We show the prowess of our method using the DomainBed framework, on the popular PACS (Photo, Art painting, Cartoon, and Sketch) dataset.
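
The training step described in the abstract (attend over same-category examples from several domains, fuse the result with the query through a gate, then contrast against a pool of negatives) can be illustrated with a small sketch. The abstract does not specify the architecture or the loss, so everything below is an assumption made for illustration: the function names `attend`, `gated_fuse`, and `contrastive_loss` are hypothetical, the gate is reduced to a single scalar sigmoid, the attention is plain dot-product softmax, and the loss is an InfoNCE-style objective that may differ from the one used in the paper.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def l2_normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, candidates, temperature=1.0):
    """Assumed form: dot-product attention over same-category embeddings."""
    weights = softmax([dot(query, c) / temperature for c in candidates])
    return [sum(w * c[i] for w, c in zip(weights, candidates))
            for i in range(len(query))]

def gated_fuse(query, attended, gate_weights, gate_bias=0.0):
    """Assumed form: a scalar gate g in (0, 1) mixes query and attended summary."""
    z = dot(gate_weights, query + attended) + gate_bias  # list concatenation
    g = 1.0 / (1.0 + math.exp(-z))
    return [g * q + (1.0 - g) * a for q, a in zip(query, attended)]

def contrastive_loss(query, positive, negatives, temperature=0.1):
    """InfoNCE-style loss (an assumption): small when the query is close to
    the positive and far from every embedding in the negative pool."""
    q = l2_normalize(query)
    logits = [dot(q, l2_normalize(positive)) / temperature]
    logits += [dot(q, l2_normalize(n)) / temperature for n in negatives]
    return -math.log(softmax(logits)[0])

# Toy 3-d embeddings (made-up values, not data from the paper).
query = [1.0, 0.0, 0.0]                          # e.g. a sketch embedding
same_class = [[0.9, 0.1, 0.0], [0.8, 0.0, 0.2]]  # same category, other domains
negatives = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # pool from other categories
gate_weights = [0.0] * 6                         # untrained gate, so g = 0.5

positive = gated_fuse(query, attend(query, same_class), gate_weights)
loss = contrastive_loss(query, positive, negatives)
mismatched_loss = contrastive_loss(query, negatives[0], [positive])
```

In this toy setup `loss` is close to zero because the fused positive points in nearly the same direction as the query, while `mismatched_loss`, which treats a different-category embedding as the positive, is much larger. In a real model the gate weights and the embedding network would be learned by minimizing this loss.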
