Identifying Ambiguous Similarity Conditions via Semantic Matching

Paper Title

Identifying Ambiguous Similarity Conditions via Semantic Matching

Authors

Han-Jia Ye, Yi Shi, De-Chuan Zhan

Abstract

Rich semantics inside an image make its relationship with other images ambiguous: two images can be similar under one condition but dissimilar under another. Given triplets such as "aircraft" is more similar to "bird" than to "train", Weakly Supervised Conditional Similarity Learning (WS-CSL) learns multiple embeddings to match semantic conditions without explicit condition labels such as "can fly". However, the similarity relationship in a triplet is uncertain unless a condition is provided. For example, the previous comparison becomes invalid once the condition label changes to "is vehicle". To this end, we introduce a novel evaluation criterion that predicts the comparison's correctness after assigning the learned embeddings to their optimal conditions, which measures how well WS-CSL covers the latent semantics compared with a supervised model. Furthermore, we propose the Distance Induced Semantic COndition VERification Network (DiscoverNet), which characterizes the instance-instance and triplet-condition relations in a "decompose-and-fuse" manner. To make the learned embeddings cover all semantics, DiscoverNet uses a set module or an additional regularizer over the correspondence between a triplet and a condition. DiscoverNet achieves state-of-the-art performance on benchmarks such as UT-Zappos-50k and Celeb-A under different criteria.
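
The aircraft/bird/train example in the abstract can be made concrete with a small sketch. This is only an illustration under stated assumptions, not the DiscoverNet method: the two-dimensional feature vectors, the binary condition masks, and the helper names `masked` and `triplet_holds` are all hypothetical. It shows that the same triplet is valid under one condition and invalid under another.

```python
import math

# Hypothetical toy features, one dimension per semantic: (can_fly, is_vehicle).
features = {
    "aircraft": (1.0, 1.0),
    "bird": (1.0, 0.0),
    "train": (0.0, 1.0),
}

# Hypothetical condition masks; each keeps a single semantic dimension.
conditions = {
    "can fly": (1.0, 0.0),
    "is vehicle": (0.0, 1.0),
}


def masked(vec, mask):
    """Project a feature vector onto the dimensions selected by the mask."""
    return tuple(v * m for v, m in zip(vec, mask))


def triplet_holds(anchor, positive, negative, mask):
    """True if the anchor is strictly closer to the positive than to the negative."""
    a = masked(features[anchor], mask)
    p = masked(features[positive], mask)
    n = masked(features[negative], mask)
    return math.dist(a, p) < math.dist(a, n)


# Evaluate the same triplet under every condition.
results = {
    name: triplet_holds("aircraft", "bird", "train", mask)
    for name, mask in conditions.items()
}

for name, ok in results.items():
    print(f"{name}: aircraft closer to bird than to train -> {ok}")
```

In the weakly supervised setting the abstract describes, the condition label is not given, so a model has to infer which condition makes a triplet valid. The sketch assumes the masks are known only to keep the example short.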
