Markup-to-Image Diffusion Models with Scheduled Sampling

Paper Title

Markup-to-Image Diffusion Models with Scheduled Sampling

Authors

Yuntian Deng, Noriyuki Kojima, Alexander M. Rush

Abstract

Building on recent advances in image generation, we present a fully data-driven approach to rendering markup into images. The approach is based on diffusion models, which parameterize the distribution of data using a sequence of denoising operations on top of a Gaussian noise distribution. We view the diffusion denoising process as a sequential decision making process, and show that it exhibits compounding errors similar to exposure bias issues in imitation learning problems. To mitigate these issues, we adapt the scheduled sampling algorithm to diffusion training. We conduct experiments on four markup datasets: mathematical formulas (LaTeX), table layouts (HTML), sheet music (LilyPond), and molecular images (SMILES). These experiments each verify the effectiveness of the diffusion process and the use of scheduled sampling to fix generation issues. These results also show that the markup-to-image task presents a useful controlled compositional setting for diagnosing and analyzing generative image models.
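The abstract does not spell out how scheduled sampling is applied to diffusion training, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a DDPM-style noise-prediction model and a linear noise schedule, neither of which is stated above. All names (`make_schedule`, `q_sample`, `p_sample`, `scheduled_sampling_input`, `eps_model`, the rollout length `m`) are hypothetical. The idea illustrated: instead of always training on a noised input drawn directly from the forward process, the input at step t is sometimes produced by rolling the model itself back from step t+m, so training sees the kind of inputs the model encounters at generation time.

```python
import numpy as np


def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumption; the abstract does not specify one)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Forward process: draw x_t ~ q(x_t | x_0) in closed form."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise


def p_sample(eps_model, x_t, t, betas, alphas, alpha_bars, rng):
    """One reverse step from level t to level t-1 using the predicted noise."""
    eps_hat = eps_model(x_t, t)
    mean = (x_t - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
    if t == 0:
        return mean  # no noise is added on the final step
    return mean + np.sqrt(betas[t]) * rng.standard_normal(x_t.shape)


def scheduled_sampling_input(eps_model, x0, t, m, schedule, rng):
    """Build a training input at level t by rolling the model back m steps.

    With m == 0 this reduces to ordinary diffusion training (x_t drawn from q).
    In a real framework the rollout would be run without gradient tracking.
    """
    betas, alphas, alpha_bars = schedule
    start = min(t + m, len(betas) - 1)  # clamp at the last noise level
    x = q_sample(x0, start, alpha_bars, rng)
    for s in range(start, t, -1):  # s = start, ..., t + 1
        x = p_sample(eps_model, x, s, betas, alphas, alpha_bars, rng)
    # Noise that would map x0 to this (model-generated) x at level t;
    # used here as the regression target for the noise predictor.
    eps_target = (x - np.sqrt(alpha_bars[t]) * x0) / np.sqrt(1.0 - alpha_bars[t])
    return x, eps_target
```

A training loop under these assumptions would call `scheduled_sampling_input` for each batch, then minimize the squared error between `eps_model(x, t)` and `eps_target`. The model passed in can be any callable taking `(x, t)` and returning an array of the same shape as `x`.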
