Unsupervised Scene Sketch to Photo Synthesis

Paper title

Unsupervised Scene Sketch to Photo Synthesis

Authors

Wang, Jiayun; Jeon, Sangryul; Yu, Stella X.; Zhang, Xi; Arora, Himanshu; Lou, Yu

Abstract

Sketches are an intuitive and powerful form of visual expression because they are quickly executed freehand drawings. We present a method for synthesizing realistic photos from scene sketches. Without the need for sketch and photo pairs, our framework learns directly from readily available large-scale photo datasets in an unsupervised manner. To this end, we introduce a standardization module that provides pseudo sketch-photo pairs during training by converting photos and sketches to a standardized domain, i.e. the edge map. The reduced domain gap between sketch and photo also allows us to disentangle them into two components: holistic scene structures and low-level visual styles such as color and texture. Taking advantage of this, we synthesize a photo-realistic image by combining the structure of a sketch and the visual style of a reference photo. Extensive experimental results on perceptual similarity metrics and human perceptual studies show that the proposed method can generate realistic photos with high fidelity from scene sketches and outperforms state-of-the-art photo synthesis baselines. We also demonstrate that our framework facilitates controllable manipulation of photo synthesis by editing strokes of the corresponding sketches, delivering more fine-grained details than previous approaches that rely on region-level editing.
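
The idea of building pseudo sketch-photo pairs from photos alone can be illustrated with a small sketch. The abstract does not specify how the standardization module is implemented, so the code below substitutes a fixed gradient-magnitude threshold as a stand-in edge detector; the function names `edge_map` and `make_pseudo_pair` and the `threshold` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np


def edge_map(gray: np.ndarray, threshold: float = 0.2) -> np.ndarray:
    """Return a binary edge map for a grayscale image with values in [0, 1].

    Assumption: a plain finite-difference gradient stands in for the
    paper's standardization module, whose details are not given here.
    """
    # np.gradient returns one array per axis: rows (y) first, then columns (x).
    gy, gx = np.gradient(gray.astype(np.float64))
    magnitude = np.hypot(gx, gy)
    peak = magnitude.max()
    if peak > 0:
        # Normalize so the threshold is relative to the strongest edge.
        magnitude = magnitude / peak
    return (magnitude > threshold).astype(np.uint8)


def make_pseudo_pair(photo_rgb: np.ndarray):
    """Pair a photo with its own edge map, so no hand-drawn sketch is needed."""
    gray = photo_rgb.astype(np.float64).mean(axis=2)
    if gray.max() > 1.0:
        # Treat the input as 8-bit and rescale to [0, 1].
        gray = gray / 255.0
    return edge_map(gray), photo_rgb


# Toy photo: left half black, right half white.
photo = np.zeros((8, 8, 3), dtype=np.uint8)
photo[:, 4:, :] = 255
edges, target = make_pseudo_pair(photo)
```

At training time a real sketch would be mapped into the same edge-map domain, which is what lets a model trained on such pseudo pairs accept sketches as input; that step is omitted here.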
