Paper Title
LifeStream: A High-Performance Stream Processing Engine for Periodic Streams
Paper Authors
Paper Abstract
Hospitals around the world collect massive amounts of physiological data from their patients every day. Recently, research interest has grown in subjecting this data to statistical analysis to gain more insights and improve medical diagnoses. Such analyses require complex computations on large volumes of data, demanding efficient data processing systems. This paper shows that currently available data processing solutions either fail to meet the performance requirements or lack simple and flexible programming interfaces. To address this problem, we propose \emph{LifeStream}, a high-performance stream processing engine for physiological data. LifeStream hits the sweet spot between ease of programming and performance: it provides rich temporal query language support and employs optimizations that exploit the periodic nature of physiological data. We demonstrate that LifeStream achieves end-to-end performance up to $7.5\times$ higher than state-of-the-art streaming engines and $3.2\times$ higher than hand-optimized numerical libraries on real-world datasets and workloads.