Semantic-shape Adaptive Feature Modulation for Semantic Image Synthesis

Paper Title

Semantic-shape Adaptive Feature Modulation for Semantic Image Synthesis

Authors

Zhengyao Lv, Xiaoming Li, Zhenxing Niu, Bing Cao, Wangmeng Zuo

Abstract

Although recent years have witnessed substantial progress in semantic image synthesis, synthesizing photo-realistic images with rich details remains challenging. Most previous methods focus on exploiting the given semantic map, which captures only an object-level layout of the image. A fine-grained, part-level semantic layout would benefit the generation of object details, and it can be roughly inferred from an object's shape. To exploit part-level layouts, we propose a Shape-aware Position Descriptor (SPD) that describes each pixel's positional feature, with the object shape explicitly encoded into the SPD feature. Furthermore, we propose a Semantic-shape Adaptive Feature Modulation (SAFM) block that combines the given semantic map with our positional features to produce adaptively modulated features. Extensive experiments demonstrate that the proposed SPD and SAFM significantly improve the generation of objects with rich details. Moreover, our method performs favorably against state-of-the-art (SOTA) methods in both quantitative and qualitative evaluations. The source code and model are available at https://github.com/cszy98/SAFM.
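
The abstract does not say how the SPD is computed, so the following is only an illustrative sketch and not the authors' formulation. It shows one simple way a per-pixel position feature can depend on the shape of the region a pixel belongs to: each pixel gets its offset from the region centroid, scaled by the region's bounding-box size. The function name `position_descriptor`, the centroid and bounding-box normalization, and the choice to treat every label value as a single region are all assumptions made for this example.

```python
from collections import defaultdict


def position_descriptor(label_map):
    """Hypothetical shape-dependent position feature (not the paper's SPD).

    For every pixel, return (dy, dx): the offset from the centroid of the
    pixel's region, divided by the region's bounding-box height and width.
    Assumption: all pixels sharing a label form one region.
    """
    # Group pixel coordinates by label.
    pixels = defaultdict(list)
    for y, row in enumerate(label_map):
        for x, label in enumerate(row):
            pixels[label].append((y, x))

    # Per-region centroid and bounding-box extent.
    stats = {}
    for label, pts in pixels.items():
        ys = [p[0] for p in pts]
        xs = [p[1] for p in pts]
        cy = sum(ys) / len(ys)
        cx = sum(xs) / len(xs)
        height = max(ys) - min(ys) + 1
        width = max(xs) - min(xs) + 1
        stats[label] = (cy, cx, height, width)

    # Normalized offset for each pixel.
    descriptor = []
    for y, row in enumerate(label_map):
        out_row = []
        for x, label in enumerate(row):
            cy, cx, height, width = stats[label]
            out_row.append(((y - cy) / height, (x - cx) / width))
        descriptor.append(out_row)
    return descriptor


# Toy 2x3 semantic map with two regions (labels 0 and 1).
toy_map = [
    [0, 0, 1],
    [0, 0, 1],
]
desc = position_descriptor(toy_map)
```

Because the offsets are scaled per region, two pixels with the same label but different positions inside the object receive different features, which is the kind of within-object information an object-level semantic map alone does not provide.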
