Paper Title
Cross-Modal 3D Shape Generation and Manipulation
Paper Authors
Paper Abstract
Creating and editing the shape and color of 3D objects requires tremendous human effort and expertise. Compared to direct manipulation in 3D interfaces, 2D interactions such as sketches and scribbles are usually much more natural and intuitive for users. In this paper, we propose a generic multi-modal generative model that couples 2D modalities and implicit 3D representations through shared latent spaces. With the proposed model, versatile 3D generation and manipulation are enabled by simply propagating edits from a specific 2D controlling modality through the latent spaces: for example, editing the 3D shape by drawing a sketch, re-colorizing the 3D surface by painting color scribbles on the 2D rendering, or generating 3D shapes of a certain category given one or a few reference images. Unlike prior works, our model does not require re-training or fine-tuning per editing task, and it is also conceptually simple, easy to implement, robust to input domain shifts, and flexible to diverse reconstructions from partial 2D inputs. We evaluate our framework on two representative 2D modalities, grayscale line sketches and rendered color images, and demonstrate that our method enables various shape manipulation and generation tasks with these 2D modalities.
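The core idea of the abstract, editing a 3D shape by re-encoding an edited 2D input through a shared latent space, can be sketched as follows. This is a minimal illustrative toy, not the paper's implementation: the per-modality "encoders" are random linear projections, and the "implicit decoder" is a single sigmoid unit standing in for a conditional occupancy MLP; all names and dimensions (`W_sketch`, `W_color`, `LATENT_DIM`, etc.) are hypothetical.

```python
import numpy as np

LATENT_DIM = 8
rng = np.random.default_rng(0)

# Hypothetical per-modality encoders mapping a flattened 16x16 2D input
# (a sketch or a rendered color image) into the shared latent space.
W_sketch = rng.normal(size=(LATENT_DIM, 256)) / 16.0
W_color = rng.normal(size=(LATENT_DIM, 256)) / 16.0

# Hypothetical implicit-decoder weights: occupancy is a function of the
# latent code concatenated with a 3D query point.
W_dec = rng.normal(size=(LATENT_DIM + 3,))

def encode(image_2d, W):
    """Project a flattened 2D input into the shared latent code."""
    return W @ image_2d.ravel()

def occupancy(latent, xyz):
    """Toy implicit 3D representation: sigmoid of a linear function of
    [latent, point], standing in for a conditional occupancy/SDF network."""
    feat = np.concatenate([latent, xyz])
    return 1.0 / (1.0 + np.exp(-feat @ W_dec))

# Editing workflow described in the abstract: the user redraws the sketch,
# we re-encode it, and the implicit 3D shape updates automatically because
# both the 2D modality and the 3D decoder share the same latent code.
sketch = rng.random((16, 16))
edited = sketch.copy()
edited[:8] = 0.0  # user erases the top half of the sketch

z_before = encode(sketch, W_sketch)
z_after = encode(edited, W_sketch)

query_point = np.array([0.1, -0.2, 0.3])
occ_before = occupancy(z_before, query_point)
occ_after = occupancy(z_after, query_point)
```

Note that no per-task re-training appears anywhere in this loop: the same frozen encoders and decoder serve sketch editing, and the color-image encoder `W_color` would feed the same latent space for re-colorization or image-conditioned generation.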