Paper Title

Direct Telemetry Access

Authors

Jonatan Langlet, Ran Ben Basat, Gabriele Oliaro, Michael Mitzenmacher, Minlan Yu, Gianni Antichi

Abstract

Fine-grained network telemetry is becoming a modern datacenter standard and is the basis of essential applications such as congestion control, load balancing, and advanced troubleshooting. As network size increases and telemetry gets more fine-grained, there is tremendous growth in the amount of data that must be reported from switches to collectors to enable a network-wide view. As a consequence, it is becoming progressively harder to scale data collection systems. We introduce Direct Telemetry Access (DTA), a solution optimized for aggregating and moving hundreds of millions of reports per second from switches into queryable data structures in collectors' memory. DTA is lightweight and greatly reduces overheads at collectors. DTA is built on top of RDMA, and we propose novel and expressive reporting primitives to allow easy integration with existing state-of-the-art telemetry mechanisms such as INT or Marple. We show that DTA significantly improves telemetry collection rates. For example, when used with INT, it can collect and aggregate over 400M reports per second with a single server, improving over the Atomic MultiLog by up to $16\times$.
