Paper Title

Quantum Vision Transformers

Paper Authors

El Amine Cherrat, Iordanis Kerenidis, Natansh Mathur, Jonas Landman, Martin Strahm, Yun Yvonna Li

Paper Abstract

In this work, quantum transformers are designed and analysed in detail by extending the state-of-the-art classical transformer neural network architectures known to be very performant in natural language processing and image analysis. Building on previous work that uses parametrised quantum circuits for data loading and orthogonal neural layers, we introduce three types of quantum transformers for training and inference, including a quantum transformer based on compound matrices, which guarantees a theoretical advantage of the quantum attention mechanism over its classical counterpart in terms of both asymptotic run time and the number of model parameters. These quantum architectures can be built using shallow quantum circuits and produce qualitatively different classification models. The three proposed quantum attention layers span a spectrum between closely following the classical transformers and exhibiting more quantum characteristics. As building blocks of the quantum transformer, we propose a novel method for loading a matrix as quantum states, as well as two new trainable quantum orthogonal layers adaptable to different levels of connectivity and quality of quantum computers. We performed extensive simulations of the quantum transformers on standard medical image datasets, which showed competitive, and at times better, performance compared to classical benchmarks, including the best-in-class classical vision transformers. The quantum transformers we trained on these small-scale datasets require fewer parameters than standard classical benchmarks. Finally, we implemented our quantum transformers on superconducting quantum computers and obtained encouraging results for experiments with up to six qubits.
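
For background, the trainable quantum orthogonal layers referenced in the abstract are, in this line of work, built from two-qubit RBS gates whose action on unary (one-hot) amplitude encodings reduces to 2x2 Givens rotations. The sketch below is a minimal classical simulation of that unary-subspace action, not the paper's implementation: the brick-wall gate layout and all function and variable names here are illustrative assumptions.

```python
import numpy as np

def rbs_unary(n, i, theta):
    """Givens rotation on coordinates (i, i+1): the action of one two-qubit
    RBS(theta) gate restricted to the unary (one-hot) subspace."""
    g = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    g[i, i], g[i, i + 1] = c, s
    g[i + 1, i], g[i + 1, i + 1] = -s, c
    return g

def orthogonal_layer(thetas, n):
    """Compose n*(n-1)/2 Givens rotations (one trainable angle per RBS gate,
    arranged brick-wall style) into an exactly orthogonal n x n matrix."""
    assert len(thetas) == n * (n - 1) // 2
    w = np.eye(n)
    k = 0
    for layer in range(n):
        for i in range(layer % 2, n - 1, 2):  # disjoint wire pairs per layer
            w = rbs_unary(n, i, thetas[k]) @ w
            k += 1
    return w

# Usage: the layer is norm-preserving, like the unitary circuit it simulates.
n = 4
rng = np.random.default_rng(0)
thetas = rng.uniform(0.0, 2.0 * np.pi, size=n * (n - 1) // 2)
W = orthogonal_layer(thetas, n)
assert np.allclose(W @ W.T, np.eye(n))  # exact orthogonality by construction
x = rng.normal(size=n)
x /= np.linalg.norm(x)                  # amplitudes of a unary data loading
y = W @ x                               # output amplitudes; norm is preserved
```

On hardware, the same angles would parametrise the RBS circuit itself; the matrix product above merely reproduces its effect on the unary subspace, which is why such a layer remains exactly orthogonal for any setting of the trainable angles.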
