Privacy-Preserving Image Classification Using Vision Transformer

Paper title

Privacy-Preserving Image Classification Using Vision Transformer

Authors

Zheng Qi, AprilPyone MaungMaung, Yuma Kinoshita, Hitoshi Kiya

Abstract

In this paper, we propose a privacy-preserving image classification method that is based on the combined use of encrypted images and the vision transformer (ViT). The proposed method allows us not only to apply images without visual information to ViT models for both training and testing but to also maintain a high classification accuracy. ViT utilizes patch embedding and position embedding for image patches, so this architecture is shown to reduce the influence of block-wise image transformation. In an experiment, the proposed method for privacy-preserving image classification is demonstrated to outperform state-of-the-art methods in terms of classification accuracy and robustness against various attacks.
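
The abstract does not say which block-wise transformation is used, so the following is only an illustrative sketch of one common choice, keyed pixel shuffling inside fixed-size blocks. The function names, the integer key, and the block size are all hypothetical. If the block size is chosen to match the ViT patch size (an assumption here), every patch receives the same fixed permutation of its flattened pixels, which a linear patch embedding could in principle absorb by permuting its input weights. That gives some intuition for why patch-based models may be less affected by such transforms.

```python
import random


def make_permutation(key, block):
    """Derive a fixed permutation of the block*block pixel positions from a key."""
    perm = list(range(block * block))
    random.Random(key).shuffle(perm)  # seeded, so the same key gives the same permutation
    return perm


def blockwise_shuffle(image, block, key, inverse=False):
    """Shuffle pixels inside every non-overlapping block x block tile.

    `image` is a list of rows (grayscale values). One shared, key-derived
    permutation is applied to each tile; `inverse=True` undoes it.
    """
    height, width = len(image), len(image[0])
    if height % block or width % block:
        raise ValueError("image sides must be multiples of the block size")
    perm = make_permutation(key, block)
    out = [row[:] for row in image]
    for top in range(0, height, block):
        for left in range(0, width, block):
            # Flatten the tile in row-major order.
            tile = [image[top + r][left + c]
                    for r in range(block) for c in range(block)]
            shuffled = [None] * len(tile)
            for dst, src in enumerate(perm):
                if inverse:
                    shuffled[src] = tile[dst]
                else:
                    shuffled[dst] = tile[src]
            # Write the permuted tile back to the same location.
            for i, value in enumerate(shuffled):
                out[top + i // block][left + i % block] = value
    return out


# Toy 8x8 "image" with 4x4 blocks (a real ViT would typically use 16x16 patches).
image = [[r * 8 + c for c in range(8)] for r in range(8)]
encrypted = blockwise_shuffle(image, block=4, key=1234)
restored = blockwise_shuffle(encrypted, block=4, key=1234, inverse=True)
```

Pixels never leave their block, so the block grid that a patch embedding operates on is preserved, and anyone holding the key can invert the shuffle exactly.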
