Learning Closed-loop Dough Manipulation Using a Differentiable Reset Module

Paper Title

Learning Closed-loop Dough Manipulation Using a Differentiable Reset Module

Authors

Carl Qi, Xingyu Lin, David Held

Abstract

Deformable object manipulation has many applications such as cooking and laundry folding in our daily lives. Manipulating elastoplastic objects such as dough is particularly challenging because dough lacks a compact state representation and requires contact-rich interactions. We consider the task of flattening a piece of dough into a specific shape from RGB-D images. While the task is seemingly intuitive for humans, there exist local optima for common approaches such as naive trajectory optimization. We propose a novel trajectory optimizer that optimizes through a differentiable "reset" module, transforming a single-stage, fixed-initialization trajectory into a multistage, multi-initialization trajectory where all stages are optimized jointly. We then train a closed-loop policy on the demonstrations generated by our trajectory optimizer. Our policy receives partial point clouds as input, allowing ease of transfer from simulation to the real world. We show that our policy can perform real-world dough manipulation, flattening a ball of dough into a target shape.
