Paper Title
Towards an Efficient ML System: Unveiling a Trade-off between Task Accuracy and Engineering Efficiency in a Large-scale Car Sharing Platform
Authors
Abstract
Following the strong performance of supervised deep neural networks, conventional procedures for developing ML systems are \textit{task-centric}: they aim to maximize task accuracy. However, we found that this \textit{task-centric} ML system lacks engineering efficiency when ML practitioners solve multiple tasks in their domain. To resolve this problem, we propose an \textit{efficiency-centric} ML system that concatenates the numerous datasets, classifiers, out-of-distribution detectors, and prediction tables existing in the practitioners' domain into a single ML pipeline. Across various image recognition tasks in a real-world car-sharing platform, our study illustrates how we established the proposed system and the lessons learned from this journey. First, the proposed ML system achieves the highest engineering efficiency while maintaining competitive task accuracy. Moreover, compared to the \textit{task-centric} paradigm, we discovered that the \textit{efficiency-centric} ML system yields satisfactory prediction results on multi-labelable samples, which frequently occur in the real world. We attribute these benefits to the representation power gained by learning a broader label space from the concatenated dataset. Last but not least, our study elaborates on how we deployed this \textit{efficiency-centric} ML system in a real-world live cloud environment. Based on the proposed analogies, we expect that ML practitioners can use our study to improve engineering efficiency in their domain.