Paper Title
Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval
Paper Authors
Paper Abstract
Remote sensing (RS) cross-modal text-image retrieval has attracted extensive attention for its flexible input and efficient querying. However, traditional methods ignore the multi-scale and redundant targets characteristic of RS images, which degrades retrieval accuracy. To address multi-scale scarcity and target redundancy in the RS multimodal retrieval task, we propose a novel asymmetric multimodal feature matching network (AMFMN). Our model adapts to multi-scale feature inputs, favors multi-source retrieval methods, and can dynamically filter redundant features. AMFMN employs a multi-scale visual self-attention (MVSA) module to extract the salient features of an RS image and uses the visual features to guide the text representation. Furthermore, to alleviate the ambiguity of positive samples caused by the strong intraclass similarity in RS images, we propose a triplet loss function with a dynamic variable margin based on the prior similarity of sample pairs. Finally, unlike traditional RS image-text datasets, which have coarse text and higher intraclass similarity, we construct a fine-grained and more challenging Remote sensing Image-Text Match dataset (RSITMD), which supports RS image retrieval through keywords and sentences, separately and jointly. Experiments on four RS text-image datasets demonstrate that the proposed model achieves state-of-the-art performance in the cross-modal RS text-image retrieval task.
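The abstract names a triplet loss with a dynamic variable margin based on the prior similarity of sample pairs, but does not give its formula. The following is only a minimal sketch of the general idea, under stated assumptions: the function name `dynamic_margin_triplet_loss`, the `base_margin` parameter, and the linear rule `margin = base_margin * (1 - prior_sim)` are all hypothetical and are not taken from the paper. The sketch shows how a margin could shrink when a "negative" sample is already known to be similar to the positive one, so that near-duplicate samples are pushed apart less aggressively.

```python
def dynamic_margin_triplet_loss(sim_pos, sim_neg, prior_sim, base_margin=0.2):
    """Hinge-style triplet loss with a margin that depends on prior similarity.

    sim_pos:     model similarity between the anchor and its matching sample
    sim_neg:     model similarity between the anchor and a non-matching sample
    prior_sim:   prior similarity in [0, 1] between the matching and the
                 non-matching sample (assumed to be supplied by the caller)
    base_margin: largest margin, used when prior_sim is 0 (assumed value)
    """
    if not 0.0 <= prior_sim <= 1.0:
        raise ValueError("prior_sim must lie in [0, 1]")
    # Assumed linear rule (not from the paper): the more similar the pair is
    # known to be in advance, the smaller the margin that is enforced.
    margin = base_margin * (1.0 - prior_sim)
    # Standard hinge: penalize only when the negative scores within the margin.
    return max(0.0, margin + sim_neg - sim_pos)


if __name__ == "__main__":
    # Same model scores, different prior similarity between the two samples.
    for prior in (0.0, 0.5, 1.0):
        print(prior, dynamic_margin_triplet_loss(0.8, 0.7, prior))
```

With a fixed margin, every non-matching sample is treated as equally wrong; in this sketch, a non-matching sample with high prior similarity contributes little or no loss, which is one plausible way to reduce the positive-sample ambiguity the abstract describes.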