Paper Title

Learning Local Features with Context Aggregation for Visual Localization

Paper Authors

Siyu Hong, Kunhong Li, Yongcong Zhang, Zhiheng Fu, Mengyi Liu, Yulan Guo

Paper Abstract

Keypoint detection and description are fundamental to many vision applications. Most existing methods use detect-then-describe or detect-and-describe strategies to learn local features without considering their context information. Consequently, it is challenging for these methods to learn robust local features. In this paper, we focus on the fusion of low-level texture information and high-level semantic context information to improve the discriminativeness of local features. Specifically, we first estimate a score map to represent the distribution of potential keypoints according to the quality of descriptors of all pixels. Then, we extract and aggregate multi-scale high-level semantic features under the guidance of the score map. Finally, the low-level local features and high-level semantic features are fused and refined using a residual module. Experiments on the challenging local feature benchmark dataset demonstrate that our method achieves state-of-the-art performance in the local feature challenge of the visual localization benchmark.
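The pipeline in the abstract (score map from descriptor quality → score-guided multi-scale aggregation → residual fusion) can be sketched at a high level. The snippet below is a minimal NumPy illustration, not the authors' implementation: the descriptor-quality proxy (distinctiveness vs. the 4-neighborhood), the window sizes, and the unprojected residual sum are all assumptions made for the sake of a runnable example.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """L2-normalize along the given axis."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def score_map_from_descriptors(desc):
    """Estimate a keypoint score map from dense descriptors (H, W, C).

    Hypothetical quality proxy: a pixel whose descriptor differs most from
    its 4-neighborhood is assumed to be a more distinctive keypoint.
    """
    d = l2_normalize(desc)
    sims = []
    for dy, dx in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
        shifted = np.roll(d, shift=(dy, dx), axis=(0, 1))
        sims.append((d * shifted).sum(-1))  # cosine similarity to neighbor
    score = 1.0 - np.max(np.stack(sims), axis=0)
    # Normalize to [0, 1] so the scores can act as attention weights.
    return (score - score.min()) / (np.ptp(score) + 1e-8)

def aggregate_context(feat, score, scales=(1, 2, 4)):
    """Score-weighted average pooling of (H, W, C) features at several
    window sizes, broadcast back to full resolution and averaged."""
    H, W, _ = feat.shape
    agg = np.zeros_like(feat)
    for s in scales:
        pooled = np.zeros_like(feat)
        for y in range(0, H, s):
            for x in range(0, W, s):
                fy, fx = slice(y, y + s), slice(x, x + s)
                w = score[fy, fx][..., None]
                pooled[fy, fx] = (feat[fy, fx] * w).sum((0, 1)) / (w.sum() + 1e-8)
        agg += pooled
    return agg / len(scales)

def residual_fuse(low, context):
    """Residual fusion of low-level and context features (projection omitted)."""
    return l2_normalize(low + context)
```

In the paper this would be a learned network (the score map, the multi-scale aggregation, and the residual module are all trained end to end); the sketch only mirrors the data flow between the three stages.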
