Paper Title
Benchmark of DNN Model Search at Deployment Time
Paper Authors
Paper Abstract
Deep learning has become the most popular direction in machine learning and artificial intelligence. However, preparing training data and training models are often time-consuming and become the bottleneck of the end-to-end machine learning lifecycle. Reusing an existing model to run inference on a dataset can avoid the cost of retraining. However, when there are multiple candidate models, it is challenging to discover the right model for reuse. Although a number of model sharing platforms exist, such as ModelDB, TensorFlow Hub, PyTorch Hub, and DLHub, most of these systems require model uploaders to manually specify the details of each model and model downloaders to screen keyword search results to select a model. What is missing is a highly productive model search tool that selects models for deployment without any manual inspection or labeled data from the target domain. This paper proposes multiple model search strategies, including various similarity-based approaches and non-similarity-based approaches. We design, implement, and evaluate these approaches on multiple model inference scenarios, including activity recognition, image recognition, text classification, natural language processing, and entity matching. The experimental evaluation shows that our proposed asymmetric similarity-based measurement, adaptivity, outperforms symmetric similarity-based measurements and non-similarity-based measurements on most of the workloads.
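The abstract does not spell out how the similarity-based search strategies are computed. As a rough illustration only, the sketch below ranks candidate models for an unlabeled target dataset by comparing feature statistics, which is one simple form of symmetric similarity-based selection; it is not the paper's asymmetric adaptivity measure. All names (domain_stats, rank_models, candidate_models, extract_features) are hypothetical.

import numpy as np

def domain_stats(features):
    # Summarize a dataset by the per-dimension mean and standard deviation
    # of its extracted feature vectors (one row per example).
    return features.mean(axis=0), features.std(axis=0)

def symmetric_distance(target_stats, source_stats):
    # Symmetric similarity: plain L2 distance between the two sets of
    # feature statistics; smaller means the domains look more alike.
    (mu_t, sigma_t), (mu_s, sigma_s) = target_stats, source_stats
    return float(np.linalg.norm(mu_t - mu_s) + np.linalg.norm(sigma_t - sigma_s))

def rank_models(target_features, candidates):
    # candidates maps a model name to the cached statistics of the data the
    # model was trained on; rank models by closeness to the target data.
    target_stats = domain_stats(target_features)
    scores = {name: symmetric_distance(target_stats, stats)
              for name, stats in candidates.items()}
    return sorted(scores, key=scores.get)  # best candidate first

# Hypothetical usage: pick the top-ranked model without labeling target data.
# best_model = rank_models(extract_features(target_dataset), candidate_models)[0]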