Paper Title
Cross-receptive Focused Inference Network for Lightweight Image Super-Resolution
Paper Authors
Paper Abstract
Recently, Transformer-based methods have shown impressive performance in single image super-resolution (SISR) tasks due to their ability to extract global features. However, the capacity of Transformers to incorporate contextual information and extract features dynamically is often neglected. To address this issue, we propose a lightweight Cross-receptive Focused Inference Network (CFIN) that consists of a cascade of CT Blocks mixing CNN and Transformer components. Specifically, within the CT block, we first propose a CNN-based Cross-Scale Information Aggregation Module (CIAM) that enables the model to focus on potentially helpful information, improving the efficiency of the subsequent Transformer stage. We then design a novel Cross-receptive Field Guided Transformer (CFGT) that selects the contextual information required for reconstruction via a modulated convolutional kernel, which adapts to the current semantic information and exploits the information interaction among different self-attention mechanisms. Extensive experiments show that our proposed CFIN effectively reconstructs images using contextual information and strikes a good balance between computational cost and model performance as an efficient model. The source code will be available at https://github.com/IVIPLab/CFIN.
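The "modulated convolutional kernel" mentioned in the abstract can be illustrated with a minimal toy sketch (this is an assumption-laden illustration, not the authors' implementation): a fixed base kernel is re-weighted elementwise by a modulation vector derived from the current input, so the effective filter becomes content-dependent. All names, shapes, and the 1-D setting here are hypothetical simplifications.

```python
import numpy as np

def modulated_conv1d(x, base_kernel, modulation):
    """Valid 1-D convolution whose kernel is scaled elementwise by a
    modulation vector (standing in for learned semantic guidance)."""
    kernel = base_kernel * modulation          # content-dependent kernel
    n, k = len(x), len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(n - k + 1)])

x = np.arange(6, dtype=float)                  # toy 1-D "feature map"
base = np.ones(3) / 3.0                        # base smoothing kernel
mod = np.array([0.5, 1.0, 0.5])                # modulation from "semantics"
y = modulated_conv1d(x, base, mod)
print(y)                                       # 4 valid output positions
```

In the paper's setting the modulation would itself be predicted from the feature map (e.g. by a small network), so the same base weights act differently on different image content; the toy above hard-codes the modulation only to keep the sketch self-contained.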