Paper Title
The Case for Accelerating BFT Protocols Using In-Network Ordering
Authors
Abstract
Mission-critical systems deployed in data centers today are facing more sophisticated failures. Byzantine fault-tolerant (BFT) protocols are capable of masking these types of failures, but are rarely deployed due to their performance cost and complexity. In this work, we propose a new approach to designing high-performance BFT protocols in data centers. By re-examining the ordering responsibility between the network and the BFT protocol, we advocate a new abstraction offered by the data center network infrastructure. Concretely, we design a new authenticated ordered multicast primitive (AOM) that provides transferable authentication and non-equivocation guarantees. Feasibility of the design is demonstrated by two hardware implementations of AOM -- one using HMAC and the other using public key cryptography for authentication -- on new-generation programmable switches. We then co-design a new BFT protocol, Matrix, that leverages the guarantees of AOM to eliminate cross-replica coordination and authentication in the common case. Evaluation results show that Matrix outperforms state-of-the-art protocols on both latency and throughput metrics by a wide margin, demonstrating the benefit of our new network ordering abstraction for BFT systems.