Title
Why Fair Labels Can Yield Unfair Predictions: Graphical Conditions for Introduced Unfairness
Authors
Abstract
In addition to reproducing discriminatory relationships in the training data, machine learning systems can also introduce or amplify discriminatory effects. We refer to this as introduced unfairness, and investigate the conditions under which it may arise. To this end, we propose introduced total variation as a measure of introduced unfairness, and establish graphical conditions under which it may be incentivised to occur. These criteria imply that adding the sensitive attribute as a feature removes the incentive for introduced variation under well-behaved loss functions. Additionally, taking a causal perspective, introduced path-specific effects shed light on the issue of when specific paths should be considered fair.
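To make the measure concrete, here is a minimal sketch of how "introduced" disparity might be quantified, assuming total variation here refers to the standard total variation distance between group-conditional distributions. The distributions and the subtraction-based comparison below are illustrative assumptions, not the paper's exact definitions.

```python
def total_variation(p, q):
    """Total variation distance between two discrete distributions:
    TV(p, q) = 0.5 * sum_y |p(y) - q(y)|."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

# Hypothetical group-conditional distributions over a binary outcome,
# for sensitive attribute A in {0, 1}.
label_dist_a0 = [0.5, 0.5]  # P(Y | A=0): labels carry no disparity
label_dist_a1 = [0.5, 0.5]  # P(Y | A=1)

pred_dist_a0 = [0.7, 0.3]   # P(Yhat | A=0): a trained model's predictions
pred_dist_a1 = [0.3, 0.7]   # P(Yhat | A=1)

tv_labels = total_variation(label_dist_a0, label_dist_a1)  # 0.0
tv_preds = total_variation(pred_dist_a0, pred_dist_a1)     # 0.4

# Disparity in predictions beyond that present in the (fair) labels:
introduced = tv_preds - tv_labels
print(tv_labels, tv_preds, introduced)  # 0.0 0.4 0.4
```

The example mirrors the abstract's point: even when the labels show no group disparity (TV of 0), a model's predictions can exhibit one, and that gap is the introduced unfairness.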