Paper Title

Vid2CAD: CAD Model Alignment using Multi-View Constraints from Videos

Authors

Kevis-Kokitsi Maninis, Stefan Popov, Matthias Nießner, Vittorio Ferrari

Abstract

We address the task of aligning CAD models to a video sequence of a complex scene containing multiple objects. Our method can process arbitrary videos and fully automatically recover the 9 DoF pose for each object appearing in it, thus aligning them in a common 3D coordinate frame. The core idea of our method is to integrate neural network predictions from individual frames with a temporally global, multi-view constraint optimization formulation. This integration process resolves the scale and depth ambiguities in the per-frame predictions, and generally improves the estimate of all pose parameters. By leveraging multi-view constraints, our method also resolves occlusions and handles objects that are out of view in individual frames, thus reconstructing all objects into a single globally consistent CAD representation of the scene. In comparison to the state-of-the-art single-frame method Mask2CAD that we build on, we achieve substantial improvements on the Scan2CAD dataset (from 11.6% to 30.7% class average accuracy).
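
To illustrate why multiple views can resolve the depth ambiguity of a single-frame prediction, the following minimal sketch triangulates one object center from its 2D projections in several frames with known camera matrices. This is not the paper's optimization, which jointly estimates full 9 DoF poses; it is plain linear (DLT) triangulation, swapped in as the simplest instance of a multi-view constraint. The intrinsics `K`, the helper `camera`, the camera positions, and the point `X_true` are all hypothetical values chosen for the example.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X to pixel coordinates with a 3x4 camera matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate(Ps, uvs):
    """Linear (DLT) triangulation of one 3D point from two or more views."""
    rows = []
    for P, (u, v) in zip(Ps, uvs):
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    Xh = Vt[-1]
    return Xh[:3] / Xh[3]

# Hypothetical pinhole intrinsics (assumption, not from the paper).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

def camera(tx):
    """Camera with identity rotation whose center sits at (tx, 0, 0)."""
    Rt = np.hstack([np.eye(3), np.array([[-tx], [0.0], [0.0]])])
    return K @ Rt

Ps = [camera(tx) for tx in (0.0, 0.5, 1.0)]   # three frames of a moving camera
X_true = np.array([0.3, -0.2, 4.0])           # hypothetical object center
uvs = [project(P, X_true) for P in Ps]        # stand-in for per-frame 2D predictions

X_est = triangulate(Ps, uvs)
print(X_est)
```

A single view constrains the center only to a ray through the camera, so its depth is unknown; adding views with different camera centers makes the linear system well determined, which is the intuition behind combining per-frame predictions across the whole video.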
