Paper Title

Depth Pruning with Auxiliary Networks for TinyML

Authors

Josen Daniel De Leon and Rowel Atienza

Abstract


Pruning is a neural network optimization technique that sacrifices accuracy in exchange for lower computational requirements. Pruning is useful when working with the extremely constrained environments of tinyML. Unfortunately, special hardware requirements and limited study of its effectiveness on already compact models prevent its wider adoption. Depth pruning is a form of pruning that requires no specialized hardware but suffers from a large accuracy falloff. To improve this, we propose a modification that utilizes a highly efficient auxiliary network as an effective interpreter of intermediate feature maps. Our results show a parameter reduction of 93% on the MLPerfTiny Visual Wake Words (VWW) task and 28% on the Keyword Spotting (KWS) task, at an accuracy cost of 0.65% and 1.06%, respectively. When evaluated on a Cortex-M0 microcontroller, our proposed method reduces the VWW model size by 4.7x and latency by 1.6x while counterintuitively gaining 1% accuracy. KWS model size on Cortex-M0 was also reduced by 1.2x and latency by 1.2x at the cost of 2.21% accuracy.
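The core idea in the abstract — truncating a backbone at an intermediate depth and attaching a small auxiliary network that interprets the intermediate feature maps — can be sketched in a few lines. The sketch below is purely illustrative: the layer structure, the `keep` depth, and the `aux_head` stand-in are assumptions for demonstration, not the authors' actual architecture.

```python
# Illustrative sketch of depth pruning with an auxiliary head.
# Layers are stand-ins that record which blocks ran; in a real model
# these would be convolutional blocks and a lightweight classifier.

def make_layer(label):
    # Stand-in for one backbone block: appends its label to the trace.
    def layer(x):
        return x + [label]
    return layer

# A hypothetical 8-block backbone.
backbone = [make_layer(f"block{i}") for i in range(8)]

def depth_prune(layers, keep):
    """Depth pruning: keep only the first `keep` layers of the backbone."""
    return layers[:keep]

def forward(layers, x):
    for layer in layers:
        x = layer(x)
    return x

def aux_head(feature_trace):
    # Stand-in for the efficient auxiliary network that interprets the
    # intermediate feature map produced by the truncated backbone.
    return f"prediction from {len(feature_trace)} blocks"

pruned = depth_prune(backbone, keep=3)   # drop the 5 deepest blocks
features = forward(pruned, [])           # intermediate feature map
print(aux_head(features))                # prints "prediction from 3 blocks"
```

The design trade-off the paper targets is visible even in this toy: cutting depth shrinks the model and its latency, and the auxiliary head is what recovers accuracy that a bare truncation would lose.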
