论文标题

贝叶斯网络的高效且可扩展的结构学习:算法和应用

Efficient and Scalable Structure Learning for Bayesian Networks: Algorithms and Applications

论文作者

Zhu, Rong, Pfadler, Andreas, Wu, Ziniu, Han, Yuxing, Yang, Xiaoke, Ye, Feng, Qian, Zhenping, Zhou, Jingren, Cui, Bin

论文摘要

贝叶斯网络(BN)的结构学习是广泛研究的重要问题。它在阿里巴巴组的各种应用中扮演着核心角色。但是,现有的结构学习算法由于其效率低和可扩展性差而受到现实世界应用的限制。为了解决这个问题,我们提出了一种新的结构学习算法,该结构最少可以满足我们的业务需求,因为它同时达到了高准确性,效率和可扩展性。最少的核心思想是将结构学习成一个连续约束优化问题,并具有一个新型的可区分约束函数,可测量所得图的无精性。与现有工作不同,我们的约束函数构建在图表的光谱半径上,可以在几乎线性时间W.R.T.中进行评估。图节点大小。基于它,至少可以通过低存储开销有效地实现。根据我们的基准评估,最少的1到2个数量级的速度比艺术方法的状态更快,并且能够在BNS上扩展多达数十万个变量。在我们的生产环境中,至少要部署,并为20多个应用程序服务,每天执行数千个执行。我们描述了阿里巴巴的票务预订服务中的具体情况,在该服务中,最少可以应用近乎实时的自动异常检测和根本错误原因原因分析系统。我们还表明,至少可以释放在新领域应用BN结构学习的可能性,例如大型基因表达数据分析和可解释的建议系统。

Structure Learning for Bayesian network (BN) is an important problem with extensive research. It plays central roles in a wide variety of applications in Alibaba Group. However, existing structure learning algorithms suffer from considerable limitations in real world applications due to their low efficiency and poor scalability. To resolve this, we propose a new structure learning algorithm LEAST, which comprehensively fulfills our business requirements as it attains high accuracy, efficiency and scalability at the same time. The core idea of LEAST is to formulate the structure learning into a continuous constrained optimization problem, with a novel differentiable constraint function measuring the acyclicity of the resulting graph. Unlike with existing work, our constraint function is built on the spectral radius of the graph and could be evaluated in near linear time w.r.t. the graph node size. Based on it, LEAST can be efficiently implemented with low storage overhead. According to our benchmark evaluation, LEAST runs 1 to 2 orders of magnitude faster than state of the art method with comparable accuracy, and it is able to scale on BNs with up to hundreds of thousands of variables. In our production environment, LEAST is deployed and serves for more than 20 applications with thousands of executions per day. We describe a concrete scenario in a ticket booking service in Alibaba, where LEAST is applied to build a near real-time automatic anomaly detection and root error cause analysis system. We also show that LEAST unlocks the possibility of applying BN structure learning in new areas, such as large-scale gene expression data analysis and explainable recommendation system.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源