Paper Title

Quantifying fairness and discrimination in predictive models

Author

Charpentier, Arthur

Abstract

The analysis of discrimination has long interested economists and lawyers. In recent years, the literature in computer science and machine learning has become interested in the subject, offering an interesting re-reading of the topic. These questions are the consequence of numerous criticisms of algorithms used to translate texts or to identify people in images. With the arrival of massive data and the use of increasingly opaque algorithms, it is not surprising to obtain discriminatory algorithms, because it has become easy to have a proxy for a sensitive variable by enriching the data indefinitely. According to Kranzberg (1986), "technology is neither good nor bad, nor is it neutral", and therefore, "machine learning won't give you anything like gender neutrality 'for free' that you didn't explicitly ask for", as claimed by Kearns et al. (2019). In this article, we come back to the general context of predictive models for classification. We present the main concepts of fairness, called group fairness, based on independence between the sensitive variable and the prediction, possibly conditioned on this or that information. We then go further by presenting the concepts of individual fairness. Finally, we see how to correct a potential discrimination, in order to guarantee that a model is more ethical.
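To make the group-fairness notions mentioned in the abstract concrete, here is a minimal sketch (not taken from the paper) of how one might measure them for a binary classifier: demographic parity, i.e. independence between the sensitive variable s and the prediction y_hat, and its conditional variant, equalized odds, where independence is required conditionally on the true label y. The function names and the synthetic data are illustrative assumptions.

```python
import numpy as np

def demographic_parity_gap(y_hat, s):
    """Absolute difference in positive prediction rates between the two groups."""
    y_hat, s = np.asarray(y_hat), np.asarray(s)
    return abs(y_hat[s == 0].mean() - y_hat[s == 1].mean())

def equalized_odds_gap(y_hat, y, s):
    """Largest gap in group-wise positive prediction rates, conditional on the true label y."""
    y_hat, y, s = np.asarray(y_hat), np.asarray(y), np.asarray(s)
    gaps = []
    for label in (0, 1):  # condition on the true outcome
        mask = y == label
        gaps.append(abs(y_hat[mask & (s == 0)].mean() - y_hat[mask & (s == 1)].mean()))
    return max(gaps)

# Toy usage with synthetic predictions and a binary sensitive attribute
rng = np.random.default_rng(0)
s = rng.integers(0, 2, size=1000)                        # sensitive variable
y = rng.integers(0, 2, size=1000)                        # true label
y_hat = (rng.random(1000) < 0.4 + 0.2 * s).astype(int)   # deliberately biased classifier
print(demographic_parity_gap(y_hat, s), equalized_odds_gap(y_hat, y, s))
```

A gap close to zero indicates (approximate) independence; the correction methods discussed in the paper aim to reduce such gaps while preserving predictive performance.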
