Paper Title
TIC: Text-Guided Image Colorization
Paper Authors
Paper Abstract
Image colorization is a well-known problem in computer vision. However, due to the ill-posed nature of the task, image colorization is inherently challenging. Although researchers have made several attempts to automate the colorization pipeline, these processes often produce unrealistic results due to a lack of conditioning. In this work, we attempt to integrate textual descriptions as an auxiliary condition, along with the grayscale image to be colorized, to improve the fidelity of the colorization process. To the best of our knowledge, this is one of the first attempts to incorporate textual conditioning into the colorization pipeline. To this end, we propose a novel deep network that takes two inputs (the grayscale image and the corresponding encoded text description) and tries to predict the relevant color gamut. As the textual descriptions contain color information about the objects present in the scene, the text encoding helps to improve the overall quality of the predicted colors. We have evaluated our proposed model using different metrics and found that it outperforms state-of-the-art colorization algorithms both qualitatively and quantitatively.
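The abstract describes a dual-input design: a grayscale image and an encoded caption are fused so that color words in the text can guide the predicted chrominance. The following is a minimal PyTorch sketch of that general idea, not the architecture proposed in the paper; the layer sizes, the broadcast-and-concatenate fusion, the `text_dim` embedding width, and the name `TextGuidedColorizer` are all illustrative assumptions.

```python
import torch
import torch.nn as nn


class TextGuidedColorizer(nn.Module):
    """Sketch of a dual-input colorizer: a grayscale (L-channel) image and a
    pre-encoded text description are fused to predict the two chrominance
    (ab) channels. All hyperparameters here are assumptions for illustration."""

    def __init__(self, text_dim=512, feat_ch=64):
        super().__init__()
        # Convolutional encoder for the single-channel grayscale input.
        self.image_encoder = nn.Sequential(
            nn.Conv2d(1, feat_ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch * 2, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Project the sentence embedding to the image feature width so the
        # two modalities can be concatenated channel-wise.
        self.text_proj = nn.Linear(text_dim, feat_ch * 2)
        # Decoder upsamples the fused features back to full resolution and
        # predicts the two ab channels, squashed to [-1, 1].
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(feat_ch * 4, feat_ch, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(feat_ch, 2, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, gray, text_emb):
        # gray: (B, 1, H, W) luminance; text_emb: (B, text_dim) encoded caption.
        feat = self.image_encoder(gray)            # (B, 2*feat_ch, H/4, W/4)
        txt = self.text_proj(text_emb)             # (B, 2*feat_ch)
        txt = txt[:, :, None, None].expand(-1, -1, feat.shape[2], feat.shape[3])
        fused = torch.cat([feat, txt], dim=1)      # broadcast-and-concatenate fusion
        return self.decoder(fused)                 # (B, 2, H, W) predicted ab channels


if __name__ == "__main__":
    model = TextGuidedColorizer()
    ab = model(torch.randn(4, 1, 128, 128), torch.randn(4, 512))
    print(ab.shape)  # torch.Size([4, 2, 128, 128])
```

Broadcasting the projected text features over every spatial location is one simple way to let caption-level color cues influence each pixel's prediction; other fusion strategies (e.g., attention-based conditioning) are equally plausible and are not implied by the abstract.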