Paper Title

Multi-modal Semantic SLAM for Complex Dynamic Environments

Authors

Wang, Han; Ko, Jing Ying; Xie, Lihua

Abstract

Simultaneous Localization and Mapping (SLAM) is one of the most essential techniques in many real-world robotic applications. Most SLAM algorithms assume a static environment, but this assumption does not hold in most applications. Recent work on semantic SLAM aims to understand the objects in an environment and distinguish dynamic information from the scene context by performing image-based segmentation. However, the segmentation results are often imperfect or incomplete, which can subsequently reduce the quality of mapping and the accuracy of localization. In this paper, we present a robust multi-modal semantic framework to solve the SLAM problem in complex and highly dynamic environments. We propose to learn a more powerful object feature representation and deploy the mechanism of looking and thinking twice in the backbone network, which leads to better recognition results for our baseline instance segmentation model. Moreover, geometric-only clustering and visual semantic information are combined to reduce the effect of segmentation errors caused by small-scale objects, occlusion and motion blur. Thorough experiments have been conducted to evaluate the performance of the proposed method. The results show that our method can precisely identify dynamic objects under recognition imperfection and motion blur. Moreover, the proposed SLAM framework is able to efficiently build a static dense map at a processing rate of more than 10 Hz, which can be implemented in many practical applications. Both the training data and the proposed method are open-sourced at https://github.com/wh200720041/MMS_SLAM.
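
The abstract mentions combining geometric-only clustering with visual semantic information so that an imperfect segmentation mask does not leave parts of a moving object in the map. The sketch below is one simple way such a fusion could look, not the authors' implementation: it assumes each point already carries a geometric cluster label and a per-point "dynamic" flag projected from a segmentation mask. The function name `refine_dynamic_mask` and the threshold `min_fraction` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def refine_dynamic_mask(cluster_ids, semantic_dynamic, min_fraction=0.3):
    """Propagate per-point semantic 'dynamic' flags to whole geometric clusters.

    cluster_ids: list of int, geometric cluster label per point (-1 = unclustered).
    semantic_dynamic: list of bool, True where the segmentation marks the point dynamic.
    min_fraction: hypothetical threshold (an assumption, not from the paper); a
        cluster counts as dynamic when at least this fraction of its points is flagged.
    Returns a list of bool with the refined per-point dynamic flag.
    """
    if len(cluster_ids) != len(semantic_dynamic):
        raise ValueError("inputs must have the same length")

    # Count total and semantically flagged points per cluster.
    total = defaultdict(int)
    flagged = defaultdict(int)
    for cid, dyn in zip(cluster_ids, semantic_dynamic):
        if cid < 0:
            continue
        total[cid] += 1
        if dyn:
            flagged[cid] += 1

    dynamic_clusters = {
        cid for cid in total if flagged[cid] / total[cid] >= min_fraction
    }

    refined = []
    for cid, dyn in zip(cluster_ids, semantic_dynamic):
        if cid < 0:
            # Unclustered points keep whatever the segmentation said.
            refined.append(bool(dyn))
        else:
            # Clustered points follow the cluster-level decision, which fills
            # holes in the mask and drops isolated false positives.
            refined.append(cid in dynamic_clusters)
    return refined


if __name__ == "__main__":
    clusters = [0, 0, 0, 0, 1, 1, 1, -1]
    semantic = [True, True, False, False, False, False, False, True]
    print(refine_dynamic_mask(clusters, semantic))
```

In this toy input, half of cluster 0 is flagged by the segmentation, so the whole cluster is treated as dynamic, while cluster 1 stays static. A static map would then be built only from points whose refined flag is false.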
