Zero-shot Active Visual Search (ZAVIS): Intelligent Object Search for Robotic Assistants

Paper title

Zero-shot Active Visual Search (ZAVIS): Intelligent Object Search for Robotic Assistants

Authors

Park, Jeongeun; Yoon, Taerim; Hong, Jejoon; Yu, Youngjae; Pan, Matthew; Choi, Sungjoon

Abstract

In this paper, we focus on the problem of efficiently locating a target object described with free-form language using a mobile robot equipped with vision sensors (e.g., an RGBD camera). Conventional active visual search predefines a set of objects to search for, rendering these techniques restrictive in practice. To provide added flexibility in active visual searching, we propose a system where a user can enter target commands using free-form language; we call this system Active Visual Search in the Wild (AVSW). AVSW detects and plans to search for a target object inputted by a user through a semantic grid map represented by static landmarks (e.g., desk or bed). For efficient planning of object search patterns, AVSW considers commonsense knowledge-based co-occurrence and predictive uncertainty while deciding which landmarks to visit first. We validate the proposed method with respect to SR (success rate) and SPL (success weighted by path length) in both simulated and real-world environments. The proposed method outperforms previous methods in terms of SPL in simulated scenarios with an average gap of 0.283. We further demonstrate AVSW with a Pioneer-3AT robot in real-world studies.
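The abstract reports results in SR and SPL without spelling out either metric. As a rough illustration, the sketch below computes both using the definition of SPL commonly used in embodied-navigation evaluation: each successful episode contributes the ratio of the shortest-path length to the longer of the shortest and actual path lengths, and failed episodes contribute zero. Whether the paper uses exactly this variant is an assumption, and the function names `success_rate` and `spl` are hypothetical, chosen only for this example.

```python
from typing import Sequence


def success_rate(successes: Sequence[bool]) -> float:
    """Fraction of episodes in which the target object was found."""
    if not successes:
        raise ValueError("at least one episode is required")
    return sum(successes) / len(successes)


def spl(successes: Sequence[bool],
        shortest_lengths: Sequence[float],
        path_lengths: Sequence[float]) -> float:
    """Success weighted by Path Length, averaged over all episodes.

    Assumed definition (not taken from the abstract):
        SPL = (1/N) * sum_i S_i * l_i / max(p_i, l_i)
    where S_i is 1 on success, l_i is the shortest-path length and
    p_i is the length of the path the robot actually travelled.
    """
    if not (len(successes) == len(shortest_lengths) == len(path_lengths)):
        raise ValueError("all inputs must have the same length")
    if not successes:
        raise ValueError("at least one episode is required")

    total = 0.0
    for ok, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        if not ok:
            continue  # failed episodes contribute zero
        denom = max(taken, shortest)
        # If both lengths are zero the robot started at the target;
        # count that as a perfectly efficient episode.
        total += shortest / denom if denom > 0 else 1.0
    return total / len(successes)


# Three made-up episodes: one detour, one optimal path, one failure.
found = [True, True, False]
shortest = [2.0, 4.0, 3.0]
travelled = [4.0, 4.0, 10.0]

print(success_rate(found))            # 2 of 3 episodes succeeded
print(spl(found, shortest, travelled))  # (0.5 + 1.0 + 0.0) / 3 = 0.5
```

Under this definition SPL can never exceed SR, so a gap in SPL between two methods reflects differences in both how often the target is found and how directly the robot reaches it.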
