Paper Title

Egeria: Efficient DNN Training with Knowledge-Guided Layer Freezing

Paper Authors

Yiding Wang, Decang Sun, Kai Chen, Fan Lai, Mosharaf Chowdhury

Paper Abstract

Training deep neural networks (DNNs) is time-consuming. While most existing solutions try to overlap/schedule computation and communication for efficient training, this paper goes one step further by skipping computing and communication through DNN layer freezing. Our key insight is that the training progress of internal DNN layers differs significantly, and front layers often become well-trained much earlier than deep layers. To explore this, we first introduce the notion of training plasticity to quantify the training progress of internal DNN layers. Then we design Egeria, a knowledge-guided DNN training system that employs semantic knowledge from a reference model to accurately evaluate individual layers' training plasticity and safely freeze the converged ones, saving their corresponding backward computation and communication. Our reference model is generated on the fly using quantization techniques and runs forward operations asynchronously on available CPUs to minimize the overhead. In addition, Egeria caches the intermediate outputs of the frozen layers with prefetching to further skip the forward computation. Our implementation and testbed experiments with popular vision and language models show that Egeria achieves 19%-43% training speedup w.r.t. the state-of-the-art without sacrificing accuracy.
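As a rough illustration of the core mechanism described in the abstract, the sketch below shows front-layer freezing in plain PyTorch: parameters of a converged prefix are excluded from gradient computation, so their backward computation (and, in distributed training, their gradient synchronization) is skipped. This is a minimal sketch of the general technique, not Egeria's implementation; the toy model, the number of frozen modules, and the `freeze_prefix` helper are illustrative assumptions, and in Egeria the freezing decision comes from its reference-model-based plasticity evaluation rather than a hand-set rule.

```python
# Minimal sketch of front-layer freezing in PyTorch (illustrative; not Egeria's code).
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),   # front layers: tend to converge earlier
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 10),               # deep layers: keep training longer
)

def freeze_prefix(model: nn.Sequential, num_frozen: int) -> None:
    """Freeze the first `num_frozen` modules so their backward computation
    (and gradient communication in distributed settings) is skipped."""
    for layer in list(model.children())[:num_frozen]:
        for p in layer.parameters():
            p.requires_grad_(False)
        layer.eval()  # also fix dropout/normalization behavior in the frozen prefix

# Suppose the plasticity evaluation judged the first Linear+ReLU block converged.
freeze_prefix(model, num_frozen=2)

# Only still-plastic parameters are handed to the optimizer.
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.1)

x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()   # autograd stops at the frozen prefix, saving backward compute
optimizer.step()
```

Note that this sketch only skips the backward pass; the paper additionally caches (and prefetches) the frozen layers' intermediate outputs so that even their forward computation can be skipped.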
