Paper Title
DM-NeRF: 3D Scene Geometry Decomposition and Manipulation from 2D Images
Paper Authors
Paper Abstract
In this paper, we study the problem of 3D scene geometry decomposition and manipulation from 2D views. Leveraging recent implicit neural representation techniques, particularly neural radiance fields, we introduce an object field component that learns unique codes for all individual objects in 3D space from 2D supervision alone. The key to this component is a series of carefully designed loss functions that enable every 3D point, especially those in non-occupied space, to be effectively optimized even without 3D labels. In addition, we introduce an inverse query algorithm to freely manipulate any specified 3D object shape in the learned scene representation. Notably, our manipulation algorithm explicitly tackles key issues such as object collisions and visual occlusions. Our method, called DM-NeRF, is among the first to simultaneously reconstruct, decompose, manipulate, and render complex 3D scenes in a single pipeline. Extensive experiments on three datasets show that our method accurately decomposes all 3D objects from 2D views, allowing any object of interest to be freely manipulated in 3D space through operations such as translation, rotation, size adjustment, and deformation.