Paper Title


Edge-PRUNE: Flexible Distributed Deep Learning Inference

Authors

Jani Boutellier, Bo Tan, Jari Nurmi

Abstract


Collaborative deep learning inference between low-resource endpoint devices and edge servers has received significant research interest in the last few years. Such computation partitioning can help reduce endpoint device energy consumption and improve latency, but, equally importantly, it also contributes to preserving the privacy of sensitive data. This paper describes Edge-PRUNE, a flexible yet lightweight computation framework for distributing machine learning inference between edge servers and one or more client devices. Compared to previous approaches, Edge-PRUNE is based on a formal dataflow computing model and is agnostic towards machine learning training frameworks, while at the same time offering wide support for leveraging deep learning accelerators such as embedded GPUs. The experimental section of the paper demonstrates the use and performance of Edge-PRUNE with image classification and object tracking applications on two heterogeneous endpoint devices and an edge server, over wireless and physical connections. Endpoint device inference time for SSD-Mobilenet based object tracking, for example, is accelerated 5.8x by collaborative inference.
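The computation partitioning described in the abstract can be illustrated with a minimal sketch: early layers run on the endpoint device, and only the intermediate activation (not the raw input) is shipped to the edge server, which runs the remaining layers. This is a generic illustration of split inference, not the Edge-PRUNE API; all names and shapes below are hypothetical.

```python
import numpy as np

# Hypothetical split of a tiny two-layer network: the "head" runs on the
# endpoint device, the "tail" on the edge server. Weights and shapes are
# illustrative only and unrelated to the models used in the paper.
rng = np.random.default_rng(0)
W_head = rng.standard_normal((8, 4))  # device-side layer weights
W_tail = rng.standard_normal((4, 3))  # server-side layer weights

def head_on_device(x):
    # Early layers execute locally; only this intermediate activation
    # leaves the device, which is the privacy-preserving aspect noted
    # in the abstract.
    return np.maximum(x @ W_head, 0.0)  # ReLU

def tail_on_server(z):
    # Remaining layers execute on the (better-resourced) edge server.
    return z @ W_tail

x = rng.standard_normal((1, 8))  # raw sensor input stays on the device
z = head_on_device(x)            # intermediate tensor sent over the link
y = tail_on_server(z)            # final inference result returned
print(y.shape)
```

In a real deployment the intermediate tensor `z` would be serialized and sent over the wireless or physical link between the devices; the partition point trades device compute against link bandwidth and latency.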
