Paper Title

Smart Resource Management for Data Streaming using an Online Bin-packing Strategy

Authors

Oliver Stein, Ben Blamey, Johan Karlsson, Alan Sabirsh, Ola Spjuth, Andreas Hellander, Salman Toor

Abstract

Data stream processing frameworks provide reliable and efficient mechanisms for executing complex workflows over large datasets. A common challenge for most currently available streaming frameworks is the efficient utilization of resources. Most frameworks use static or semi-static resource settings that work well for established use cases but yield only marginal improvements in unseen scenarios. Another pressing issue is the efficient processing of large individual objects, such as the images and matrices typical of scientific datasets. HarmonicIO has proven to be a good solution for streams of relatively large individual objects, as demonstrated in a benchmark comparison with the Spark and Kafka streaming frameworks. Here we present an extension of the HarmonicIO framework, based on an online bin-packing algorithm, that allows for efficient utilization of resources. Based on a real-world use case from large-scale microscopy pipelines, we compare the results of the new system with Spark's auto-scaling mechanism.
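
To make the idea of online bin-packing for resource allocation concrete, the following is a minimal sketch, not the HarmonicIO implementation. The abstract does not say which heuristic the extension uses, so the sketch assumes First-Fit, one of the standard online heuristics. The names `Worker` and `first_fit`, and the modelling of each workload as a single normalized resource demand in (0, 1], are hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Worker:
    """A bin: one worker node with normalized capacity 1.0 (hypothetical model)."""
    capacity: float = 1.0
    loads: List[float] = field(default_factory=list)

    @property
    def used(self) -> float:
        return sum(self.loads)

    def fits(self, demand: float) -> bool:
        # Small tolerance guards against floating-point rounding.
        return self.used + demand <= self.capacity + 1e-9


def first_fit(demands: List[float]) -> List[Worker]:
    """Place each demand, in arrival order, on the first worker with room.

    This is the First-Fit online heuristic (an assumption; the source does
    not name a specific heuristic). A new worker is opened only when no
    existing worker can host the incoming demand.
    """
    workers: List[Worker] = []
    for demand in demands:
        if not 0 < demand <= 1.0:
            raise ValueError(f"demand must be in (0, 1], got {demand}")
        for worker in workers:
            if worker.fits(demand):
                worker.loads.append(demand)
                break
        else:
            # No existing worker had room: scale out by one worker.
            workers.append(Worker(loads=[demand]))
    return workers


if __name__ == "__main__":
    for i, w in enumerate(first_fit([0.5, 0.7, 0.5, 0.2])):
        print(f"worker {i}: loads={w.loads}")
```

In this sketch, each item is placed as it arrives and is never moved afterwards, which is what makes the strategy online. For the arrival sequence `[0.5, 0.7, 0.5, 0.2]`, the first worker receives `[0.5, 0.5]` and the second receives `[0.7, 0.2]`, so two workers suffice.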
