Paper Title

MPCFormer: fast, performant and private Transformer inference with MPC

Paper Authors

Dacheng Li, Rulin Shao, Hongyi Wang, Han Guo, Eric P. Xing, Hao Zhang

Paper Abstract

Enabling private inference is crucial for many cloud inference services that are based on Transformer models. However, existing private inference solutions can increase the inference latency by more than 60x or significantly compromise the inference quality. In this paper, we design the framework MPCFormer as a practical solution, using Secure Multi-Party Computation (MPC) and Knowledge Distillation (KD). Through extensive evaluations, we show that MPCFormer significantly speeds up Transformer inference in MPC settings while achieving ML performance similar to the input model. On the IMDb dataset, it achieves performance similar to BERT-Base while being 5.3x faster. On the GLUE benchmark, it achieves 97% of BERT-Base's performance with a 2.2x speedup. MPCFormer remains effective with differently trained Transformer weights such as RoBERTa-Base and with larger models including BERT-Large. Code is available at https://github.com/MccRee177/MPCFormer.
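
The abstract names Knowledge Distillation (KD) as one of the two ingredients of MPCFormer (the other being MPC-friendly model changes). As a rough illustration only, the sketch below shows a generic soft-label distillation step in PyTorch; the function names, temperature value, and training loop here are assumptions made for illustration, and MPCFormer's actual layer-wise distillation procedure is described in the paper and repository rather than reproduced here.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label KD loss: KL divergence between the temperature-scaled
    teacher and student output distributions (Hinton-style distillation)."""
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_soft_student = F.log_softmax(student_logits / t, dim=-1)
    # Multiply by t^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * (t * t)

def distill_step(student, teacher, inputs, optimizer, temperature=2.0):
    """One optimization step. `teacher` is the original (accurate) model and
    `student` is the MPC-friendly model; both are assumed to map `inputs`
    to classification logits."""
    with torch.no_grad():
        teacher_logits = teacher(inputs)
    student_logits = student(inputs)
    loss = distillation_loss(student_logits, teacher_logits, temperature)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this picture, the teacher would be the original Transformer (e.g., a fine-tuned BERT-Base) and the student the MPC-friendly approximation whose inference is later run under MPC.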
