Title

NVAE: A Deep Hierarchical Variational Autoencoder

Authors

Arash Vahdat, Jan Kautz

Abstract

Normalizing flows, autoregressive models, variational autoencoders (VAEs), and deep energy-based models are among competing likelihood-based frameworks for deep generative learning. Among them, VAEs have the advantage of fast and tractable sampling and easy-to-access encoding networks. However, they are currently outperformed by other models such as normalizing flows and autoregressive models. While the majority of the research in VAEs is focused on the statistical challenges, we explore the orthogonal direction of carefully designing neural architectures for hierarchical VAEs. We propose Nouveau VAE (NVAE), a deep hierarchical VAE built for image generation using depth-wise separable convolutions and batch normalization. NVAE is equipped with a residual parameterization of Normal distributions and its training is stabilized by spectral regularization. We show that NVAE achieves state-of-the-art results among non-autoregressive likelihood-based models on the MNIST, CIFAR-10, CelebA 64, and CelebA HQ datasets and it provides a strong baseline on FFHQ. For example, on CIFAR-10, NVAE pushes the state-of-the-art from 2.98 to 2.91 bits per dimension, and it produces high-quality images on CelebA HQ. To the best of our knowledge, NVAE is the first successful VAE applied to natural images as large as 256$\times$256 pixels. The source code is available at https://github.com/NVlabs/NVAE.
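Two items in the abstract can be made concrete with a short sketch. The "residual parameterization of Normal distributions" defines the approximate posterior relative to the prior, q = N(mu_p + delta_mu, sigma_p * delta_sigma), for which the Gaussian KL takes a simple closed form; and "bits per dimension" is the standard unit conversion of a negative log-likelihood. This is a minimal illustrative sketch (function names are my own, not from the paper's code), assuming per-dimension scalar parameters:

```python
import math

def residual_normal_kl(delta_mu: float, delta_sigma: float, sigma_p: float) -> float:
    """KL(q || p) per dimension when the posterior is parameterized
    relative to the prior: p = N(mu_p, sigma_p) and
    q = N(mu_p + delta_mu, sigma_p * delta_sigma).
    Follows from the standard closed-form Gaussian KL; note that mu_p
    cancels, so the KL depends only on the residual terms and sigma_p."""
    return 0.5 * ((delta_mu / sigma_p) ** 2 + delta_sigma ** 2 - 1.0) \
        - math.log(delta_sigma)

def bits_per_dim(nll_nats: float, num_dims: int) -> float:
    """Convert a per-image negative log-likelihood in nats to bits per
    dimension. For CIFAR-10, num_dims = 3 * 32 * 32 = 3072."""
    return nll_nats / (num_dims * math.log(2.0))

# When the posterior matches the prior (delta_mu = 0, delta_sigma = 1),
# the KL is zero, which is the appeal of the residual form: the encoder
# only needs to predict a small correction to the prior.
print(residual_normal_kl(0.0, 1.0, 2.0))
```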
