Paper Title
DAGs with No Fears: A Closer Look at Continuous Optimization for Learning Bayesian Networks
Paper Authors
Paper Abstract
This paper re-examines a continuous optimization framework dubbed NOTEARS for learning Bayesian networks. We first generalize existing algebraic characterizations of acyclicity to a class of matrix polynomials. Next, focusing on a one-parameter-per-edge setting, it is shown that the Karush-Kuhn-Tucker (KKT) optimality conditions for the NOTEARS formulation cannot be satisfied except in a trivial case, which explains a behavior of the associated algorithm. We then derive the KKT conditions for an equivalent reformulation, show that they are indeed necessary, and relate them to explicit constraints that certain edges be absent from the graph. If the score function is convex, these KKT conditions are also sufficient for local minimality despite the non-convexity of the constraint. Informed by the KKT conditions, a local search post-processing algorithm is proposed and shown to substantially and universally improve the structural Hamming distance of all tested algorithms, typically by a factor of 2 or more. Some combinations with local search are both more accurate and more efficient than the original NOTEARS.
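As a companion to the abstract's mention of algebraic acyclicity characterizations, the following is a minimal illustrative sketch (not code from the paper) of two well-known measures of this kind: the NOTEARS trace-exponential constraint tr(exp(W∘W)) − d and a matrix-polynomial alternative tr((I + A/d)^d) − d, an instance of the polynomial class the abstract alludes to. The Python/NumPy/SciPy implementation and the example matrices are assumptions for illustration only.

```python
# Illustrative sketch of algebraic acyclicity measures for a weighted
# adjacency matrix W of a directed graph on d nodes (not the paper's code).
import numpy as np
from scipy.linalg import expm

def h_exponential(W: np.ndarray) -> float:
    """NOTEARS-style measure: tr(exp(W∘W)) - d.
    Equals 0 iff the graph of nonzero entries of W is acyclic."""
    d = W.shape[0]
    A = W * W                      # elementwise square gives a nonnegative surrogate
    return float(np.trace(expm(A)) - d)

def h_polynomial(W: np.ndarray) -> float:
    """Polynomial variant: tr((I + A/d)^d) - d, one member of the class of
    matrix-polynomial characterizations of acyclicity."""
    d = W.shape[0]
    A = W * W
    M = np.eye(d) + A / d
    return float(np.trace(np.linalg.matrix_power(M, d)) - d)

if __name__ == "__main__":
    # Acyclic example: 1 -> 2 -> 3 (hypothetical weights)
    W_dag = np.array([[0.0, 0.8, 0.0],
                      [0.0, 0.0, 0.5],
                      [0.0, 0.0, 0.0]])
    # Cyclic example: 1 -> 2 -> 3 -> 1
    W_cyc = np.array([[0.0, 0.8, 0.0],
                      [0.0, 0.0, 0.5],
                      [0.3, 0.0, 0.0]])
    print(h_exponential(W_dag), h_polynomial(W_dag))   # both ~ 0
    print(h_exponential(W_cyc), h_polynomial(W_cyc))   # both > 0
```

Both quantities vanish exactly when the graph has no directed cycles, since every term tr(A^k) counts weighted cycles of length k; continuous optimization frameworks such as NOTEARS enforce acyclicity by driving such a measure to zero.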