Towards Robust 2D Convolution for Reliable Visual Recognition

Paper title

Towards Robust 2D Convolution for Reliable Visual Recognition

Authors

Li, Lida, Li, Shuai, Wang, Kun, Feng, Xiangchu, Zhang, Lei

Abstract

2D convolution (Conv2d), which extracts features from the input image, is one of the key modules of a convolutional neural network (CNN). However, Conv2d is vulnerable to image corruptions and adversarial samples. Whether a more robust alternative to Conv2d can be designed for more reliable feature extraction is an important yet rarely investigated problem. In this paper, inspired by the recently developed learnable sparse transform, which learns to convert CNN features into a compact and sparse latent space, we design a novel building block, denoted RConv-MK, to strengthen the robustness of the extracted convolutional features. Our method uses a set of learnable kernels of different sizes to extract features at different frequencies, and employs a normalized soft thresholding operator to adaptively remove noise and trivial features at different corruption levels. Extensive experiments on clean images, corrupted images, and adversarial samples validate the effectiveness of the proposed robust module for reliable visual recognition. The source code is enclosed in the submission.
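
The abstract does not define the normalized soft thresholding operator, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes that "normalized" means standardizing a feature vector to zero mean and unit variance before shrinking it, and the names `soft_threshold`, `normalized_soft_threshold`, `tau`, and `eps` are hypothetical. The multi-kernel convolution part of RConv-MK is not shown.

```python
import math


def soft_threshold(x, tau):
    """Shrink x toward zero by tau; any |x| <= tau becomes 0."""
    return math.copysign(max(abs(x) - tau, 0.0), x)


def normalized_soft_threshold(features, tau, eps=1e-5):
    """Standardize the features, then soft-threshold each value.

    Assumption: normalization is plain standardization (zero mean,
    unit variance); the paper may use a different scheme.
    """
    n = len(features)
    mean = sum(features) / n
    var = sum((f - mean) ** 2 for f in features) / n
    std = math.sqrt(var + eps)  # eps avoids division by zero
    return [soft_threshold((f - mean) / std, tau) for f in features]


if __name__ == "__main__":
    # Three small responses and one strong response: after standardization,
    # only the strong one exceeds the threshold and survives.
    print(normalized_soft_threshold([0.0, 0.0, 0.0, 10.0], tau=1.0))
```

Because the threshold acts on standardized values, the same `tau` removes a comparable share of weak responses regardless of the overall scale of the input, which is one plausible reading of how such an operator could adapt to different corruption levels.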
