Paper Title

GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis

Authors

Katja Schwarz, Yiyi Liao, Michael Niemeyer, Andreas Geiger

Abstract

While 2D generative adversarial networks have enabled high-resolution image synthesis, they largely lack an understanding of the 3D world and the image formation process. Thus, they do not provide precise control over camera viewpoint or object pose. To address this problem, several recent approaches leverage intermediate voxel-based representations in combination with differentiable rendering. However, existing methods either produce low image resolution or fall short in disentangling camera and scene properties, e.g., the object identity may vary with the viewpoint. In this paper, we propose a generative model for radiance fields which have recently proven successful for novel view synthesis of a single scene. In contrast to voxel-based representations, radiance fields are not confined to a coarse discretization of the 3D space, yet allow for disentangling camera and scene properties while degrading gracefully in the presence of reconstruction ambiguity. By introducing a multi-scale patch-based discriminator, we demonstrate synthesis of high-resolution images while training our model from unposed 2D images alone. We systematically analyze our approach on several challenging synthetic and real-world datasets. Our experiments reveal that radiance fields are a powerful representation for generative image synthesis, leading to 3D consistent models that render with high fidelity.
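To make the two components named in the abstract concrete, below is a minimal sketch (not the authors' released code) of a GRAF-style generator: a radiance-field MLP conditioned on a shape code (density branch) and an appearance code (color branch), plus the sparse K x K patch sampling that feeds a multi-scale patch-based discriminator. Network widths, positional-encoding frequencies, and the patch-scale range are illustrative assumptions.

```python
# Sketch of a conditional radiance field g(x, d, z_shape, z_app) -> (sigma, rgb)
# and random-scale patch coordinate sampling, assuming PyTorch.
import math

import torch
import torch.nn as nn


def positional_encoding(x, num_freqs):
    """Map coordinates to [sin(2^j * pi * x), cos(2^j * pi * x)] features."""
    feats = []
    for j in range(num_freqs):
        feats.append(torch.sin((2.0 ** j) * math.pi * x))
        feats.append(torch.cos((2.0 ** j) * math.pi * x))
    return torch.cat(feats, dim=-1)


class ConditionalRadianceField(nn.Module):
    """MLP whose density depends on a shape code and whose color additionally
    depends on the viewing direction and an appearance code (GRAF-style)."""

    def __init__(self, z_dim=128, hidden=256, x_freqs=10, d_freqs=4):
        super().__init__()
        self.x_freqs, self.d_freqs = x_freqs, d_freqs
        x_feat, d_feat = 3 * 2 * x_freqs, 3 * 2 * d_freqs
        self.trunk = nn.Sequential(
            nn.Linear(x_feat + z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.sigma_head = nn.Linear(hidden, 1)  # volume density
        self.color_head = nn.Sequential(        # view/appearance-dependent RGB
            nn.Linear(hidden + d_feat + z_dim, hidden // 2), nn.ReLU(),
            nn.Linear(hidden // 2, 3), nn.Sigmoid(),
        )

    def forward(self, x, d, z_shape, z_app):
        h = self.trunk(torch.cat([positional_encoding(x, self.x_freqs), z_shape], dim=-1))
        sigma = torch.relu(self.sigma_head(h))
        rgb = self.color_head(
            torch.cat([h, positional_encoding(d, self.d_freqs), z_app], dim=-1))
        return sigma, rgb


def sample_patch_coords(img_size, patch_size=32, scale_range=(0.25, 1.0)):
    """Pixel coordinates of a K x K patch at a random scale and position, so
    only K*K rays need to be volume-rendered per generated image."""
    s = torch.empty(1).uniform_(*scale_range).item()
    extent = s * (img_size - 1)
    u = torch.rand(1).item() * (img_size - 1 - extent)
    v = torch.rand(1).item() * (img_size - 1 - extent)
    lin = torch.linspace(0.0, extent, patch_size)
    vv, uu = torch.meshgrid(v + lin, u + lin, indexing="ij")
    return torch.stack([uu, vv], dim=-1)  # (K, K, 2) pixel coordinates


if __name__ == "__main__":
    g = ConditionalRadianceField()
    n = 8                                           # toy number of ray samples
    x = torch.rand(n, 3) * 2 - 1                    # points in [-1, 1]^3
    d = nn.functional.normalize(torch.randn(n, 3), dim=-1)
    z_shape = torch.randn(1, 128).expand(n, -1)
    z_app = torch.randn(1, 128).expand(n, -1)
    sigma, rgb = g(x, d, z_shape, z_app)
    coords = sample_patch_coords(img_size=128)
    print(sigma.shape, rgb.shape, coords.shape)
```

In this reading of the abstract, disentanglement comes from splitting the latent code into shape and appearance parts, and high-resolution training stays tractable because the discriminator only ever sees rendered patches drawn at random scales rather than full images.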
