Title
Certifying Fairness of Probabilistic Circuits
Authors
Abstract
With the increased use of machine learning systems for decision making, questions about the fairness properties of such systems start to take center stage. Most existing work on algorithmic fairness assumes complete observation of features at prediction time, as is the case for popular notions like statistical parity and equal opportunity. However, this is not sufficient for models that can make predictions with partial observations, as we could miss patterns of bias and incorrectly certify a model to be fair. To address this, a recently introduced notion of fairness asks whether the model exhibits any discrimination pattern, in which an individual, characterized by (partial) feature observations, receives vastly different decisions merely by disclosing one or more sensitive attributes such as gender or race. By explicitly accounting for partial observations, this provides a much more fine-grained notion of fairness. In this paper, we propose an algorithm to search for discrimination patterns in a general class of probabilistic models, namely probabilistic circuits. Previously, such algorithms were limited to naive Bayes classifiers, which make strong independence assumptions; by contrast, probabilistic circuits provide a unifying framework for a wide range of tractable probabilistic models and can even be compiled from certain classes of Bayesian networks and probabilistic programs, making our method much more broadly applicable. Furthermore, for an unfair model, it may be useful to quickly find discrimination patterns and distill them for better interpretability. As such, we also propose a sampling-based approach to more efficiently mine discrimination patterns, and introduce new classes of patterns such as minimal, maximal, and Pareto optimal patterns that can effectively summarize exponentially many discrimination patterns.
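As a rough illustration of the discrimination-pattern idea described above, the sketch below compares an individual's decision probability with only a partial feature observation against the probability after additionally disclosing a sensitive attribute. The joint distribution, variable names, and divergence threshold are invented for this toy example; the paper itself works with probabilistic circuits, not explicit probability tables.

```python
# Toy illustration (hypothetical numbers): a "discrimination pattern"
# occurs when disclosing a sensitive attribute sharply changes the
# decision probability for an otherwise identical partial observation.

# Joint distribution P(S, X, D) over binary variables:
#   S = sensitive attribute, X = non-sensitive feature, D = decision.
P = {
    (0, 0, 0): 0.10, (0, 0, 1): 0.15,
    (0, 1, 0): 0.05, (0, 1, 1): 0.20,
    (1, 0, 0): 0.20, (1, 0, 1): 0.05,
    (1, 1, 0): 0.15, (1, 1, 1): 0.10,
}

def prob(**evidence):
    """Marginal probability of a partial assignment, e.g. prob(x=0, d=1)."""
    return sum(p for (s, x, d), p in P.items()
               if all({'s': s, 'x': x, 'd': d}[k] == v
                      for k, v in evidence.items()))

def cond(d, **evidence):
    """Conditional probability P(D = d | evidence)."""
    return prob(d=d, **evidence) / prob(**evidence)

delta = 0.1  # divergence threshold for flagging a pattern (made up)

# Decision probability with only the feature observed ...
p_partial = cond(1, x=0)
# ... versus after additionally disclosing the sensitive attribute.
p_disclosed = cond(1, x=0, s=1)

# The pair (x=0, s=1) forms a discrimination pattern at threshold delta
# if the two probabilities diverge by more than delta.
print(p_partial, p_disclosed, abs(p_partial - p_disclosed) > delta)
```

A search algorithm like the one the abstract describes would, conceptually, examine such gaps across all partial observations rather than a single hand-picked one, which is why naive enumeration is exponential and tractable models matter.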