Paper Title

Compressing Transformer-Based Semantic Parsing Models using Compositional Code Embeddings

Paper Authors

Prafull Prakash, Saurabh Kumar Shashidhar, Wenlong Zhao, Subendhu Rongali, Haidar Khan, Michael Kayser

Paper Abstract

The current state-of-the-art task-oriented semantic parsing models use BERT or RoBERTa as pretrained encoders; these models have huge memory footprints. This poses a challenge to their deployment for voice assistants such as Amazon Alexa and Google Assistant on edge devices with limited memory budgets. We propose to learn compositional code embeddings to greatly reduce the sizes of BERT-base and RoBERTa-base. We also apply the technique to DistilBERT, ALBERT-base, and ALBERT-large, three already compressed BERT variants which attain similar state-of-the-art performances on semantic parsing with much smaller model sizes. We observe 95.15% ~ 98.46% embedding compression rates and 20.47% ~ 34.22% encoder compression rates, while preserving greater than 97.5% semantic parsing performances. We provide the recipe for training and analyze the trade-off between code embedding sizes and downstream performances.

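To make the core idea concrete, below is a minimal sketch of a compositional code embedding layer in the spirit of the technique the paper applies: each token is assigned M discrete codes, each code selects one of K basis vectors from its own codebook, and the token embedding is the sum of the M selected vectors. A dense V x D embedding table is thus replaced by a compact V x M integer code matrix plus M codebooks of shape K x D. The class name, the hyperparameters (M = 8, K = 32), and the random code assignment here are illustrative assumptions, not the authors' implementation; in practice the codes are learned (e.g., by reconstructing pretrained embeddings) and then frozen.

```python
# Illustrative sketch of a compositional code embedding layer
# (an assumption-laden re-implementation of the general technique, not the authors' code).
import torch
import torch.nn as nn

class CompositionalCodeEmbedding(nn.Module):
    """Each token gets M discrete codes; its embedding is the sum of the
    M selected codebook vectors, replacing a V x D table with a V x M
    integer matrix plus M codebooks of shape K x D."""

    def __init__(self, vocab_size: int, dim: int,
                 num_codebooks: int = 8, codebook_size: int = 32):
        super().__init__()
        # Placeholder random codes; in practice these are learned
        # (e.g., with a Gumbel-softmax autoencoder that reconstructs
        # the pretrained embeddings) and then frozen.
        self.register_buffer(
            "codes",
            torch.randint(0, codebook_size, (vocab_size, num_codebooks)),
        )
        # M codebooks, each holding K basis vectors of size D.
        self.codebooks = nn.Parameter(
            0.02 * torch.randn(num_codebooks, codebook_size, dim)
        )

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        codes = self.codes[token_ids]                      # (..., M)
        m_idx = torch.arange(codes.shape[-1], device=codes.device)
        vectors = self.codebooks[m_idx, codes]             # (..., M, D)
        return vectors.sum(dim=-2)                         # (..., D)

# Example: BERT-base-sized vocabulary and hidden size.
emb = CompositionalCodeEmbedding(vocab_size=30522, dim=768)
out = emb(torch.tensor([[101, 2023, 102]]))
print(out.shape)  # torch.Size([1, 3, 768])
```

For a BERT-base-sized vocabulary of roughly 30k tokens and hidden size 768, the dense table holds about 23M floats, while the sketch above stores only about 8 x 32 x 768 ≈ 0.2M codebook floats plus a small integer code matrix, which illustrates how embedding compression rates above 95% are attainable.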