Paper Title

Exploring Erasure Coding Techniques for High Availability of Intermediate Data

Authors

Zhe Zhang, Brian Bockelman, Derek Weitzel, David Swanson

Abstract

Scientific computing workflows generate enormous distributed data that is short-lived, yet critical for job completion time. This class of data is called intermediate data. A common way to achieve high data availability is to replicate data. However, an increasing scale of intermediate data generated in modern scientific applications demands new storage techniques to improve storage efficiency. Erasure Codes, as an alternative, can use less storage space while maintaining similar data availability. In this paper, we adopt erasure codes for storing intermediate data and compare its performance with replication. We also use the metric of Mean-Time-To-Data-Loss (MTTDL) to estimate the lifetime of intermediate data. We propose an algorithm to proactively relocate data redundancy from vulnerable machines to reliable ones to improve data availability with some extra network overhead. Furthermore, we propose an algorithm to assign redundancy units of data physically close to each other on the network to reduce the network bandwidth for reconstructing data when it is being accessed.
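
The storage-versus-availability trade-off mentioned in the abstract can be illustrated with a small calculation. The sketch below is not the paper's model: it assumes every node is independently available with the same probability `p`, and that the erasure code is an ideal (n, k) code in which any k of the n fragments suffice to rebuild the data. The function names and the example parameters (3 replicas, a (10, 6) code, p = 0.9) are hypothetical choices for illustration only.

```python
from math import comb


def replication_availability(p: float, copies: int) -> float:
    """Probability that at least one of `copies` full replicas is reachable,
    assuming each node is independently available with probability p."""
    return 1.0 - (1.0 - p) ** copies


def erasure_availability(p: float, n: int, k: int) -> float:
    """Probability that at least k of n fragments are reachable, assuming an
    ideal (n, k) code and independent per-node availability p."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


def replication_overhead(copies: int) -> float:
    """Bytes stored per byte of original data under replication."""
    return float(copies)


def erasure_overhead(n: int, k: int) -> float:
    """Bytes stored per byte of original data under an (n, k) code."""
    return n / k


if __name__ == "__main__":
    p = 0.9  # illustrative per-node availability, not a measured value
    print("3x replication:", replication_availability(p, 3), replication_overhead(3))
    print("(10, 6) code:  ", erasure_availability(p, 10, 6), erasure_overhead(10, 6))
```

Under these assumptions, the (10, 6) code stores about 1.67 bytes per byte of data versus 3 for triple replication, while both configurations remain available more than 99% of the time. Real deployments have correlated failures and repair traffic, which this sketch ignores.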
