Tactile-Sensitive NewtonianVAE for High-Accuracy Industrial Connector Insertion

Paper title

Tactile-Sensitive NewtonianVAE for High-Accuracy Industrial Connector Insertion

Authors

Ryo Okumura, Nobuki Nishio, Tadahiro Taniguchi

Abstract

An industrial connector insertion task requires submillimeter positioning of the plug and compensation for its grasp pose. Highly accurate estimation of the relative pose between the plug and the socket is therefore fundamental to the task. World models are a promising technology for visuomotor control because they obtain an appropriate state representation by jointly optimizing feature extraction and a latent dynamics model. Recent studies show that the NewtonianVAE, a type of world model, acquires a latent space equivalent to a mapping from images to physical coordinates, so proportional control can be performed in that latent space. However, applying NewtonianVAE to high-accuracy industrial tasks in physical environments remains an open problem. Moreover, the existing framework does not consider grasp pose compensation in the obtained latent space. In this work, we propose a tactile-sensitive NewtonianVAE and apply it to USB connector insertion with grasp pose variation in a physical environment. We adopt a GelSight-type tactile sensor and estimate the insertion position compensated by the grasp pose of the plug. Our method trains the latent space end to end and requires no additional engineering or annotation. Simple proportional control is available in the obtained latent space. We also show that the original NewtonianVAE fails in some situations and demonstrate that domain knowledge induction improves model accuracy. This domain knowledge can be obtained easily from the robot specification and grasp pose error measurement. Our proposed method achieved a 100% success rate and 0.3 mm positioning accuracy in the USB connector insertion task in the physical environment. It outperformed a SOTA CNN-based two-stage goal pose regression with grasp pose compensation using coordinate transformation.
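The abstract's claim that simple proportional control works in the learned latent space can be illustrated with a toy sketch. It assumes a latent state that already behaves like a physical coordinate and integrates the command directly. The names `proportional_control` and `simulate`, the gain, and the time step are hypothetical choices for illustration, not the authors' implementation, and no encoder or tactile input is modeled.

```python
def proportional_control(z_current, z_goal, gain=0.5):
    """Command proportional to the latent-space error (hypothetical helper)."""
    return [gain * (g - z) for z, g in zip(z_current, z_goal)]


def simulate(z_start, z_goal, gain=0.5, dt=1.0, steps=50):
    """Toy rollout. Assumption: the latent state integrates the command
    directly, i.e. z <- z + dt * u, as a position-like coordinate would."""
    z = list(z_start)
    trajectory = [list(z)]
    for _ in range(steps):
        u = proportional_control(z, z_goal, gain)
        z = [zi + dt * ui for zi, ui in zip(z, u)]
        trajectory.append(list(z))
    return trajectory


# Drive a 2-D latent state toward a goal placed at the origin.
trajectory = simulate([5.0, -3.0], [0.0, 0.0])
final_error = max(abs(v) for v in trajectory[-1])
```

With `gain * dt` between 0 and 1 the error shrinks geometrically at every step, which is why a plain proportional law suffices once the latent space is aligned with physical coordinates.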
