Paper Title
C-NMT: A Collaborative Inference Framework for Neural Machine Translation
Authors
Abstract
Collaborative Inference (CI) optimizes the latency and energy consumption of deep learning inference through the inter-operation of edge and cloud devices. Although beneficial for other tasks, CI has never been applied to the sequence-to-sequence mapping problem at the heart of Neural Machine Translation (NMT). In this work, we address the specific issues of collaborative NMT, such as estimating the latency required to generate the (unknown) output sequence, and show how existing CI methods can be adapted to these applications. Our experiments show that CI can reduce the latency of NMT by up to 44% compared to a non-collaborative approach.