Investigating Bias with a Synthetic Data Generator: Empirical Evidence and Philosophical Interpretation

Paper Title

Investigating Bias with a Synthetic Data Generator: Empirical Evidence and Philosophical Interpretation

Authors

Alessandro Castelnovo, Riccardo Crupi, Nicole Inverardi, Daniele Regoli, Andrea Cosentini

Abstract

Machine learning applications are becoming increasingly pervasive in our society. Since these decision-making systems rely on data-driven learning, the risk is that they will systematically spread the bias embedded in data. In this paper, we propose to analyze biases by introducing a framework for generating synthetic data with specific types of bias and their combinations. We delve into the nature of these biases, discussing their relationship to moral and justice frameworks. Finally, we use our proposed synthetic data generator to perform experiments on different scenarios, with various bias combinations. We thus analyze the impact of biases on performance and fairness metrics in both non-mitigated and mitigated machine learning models.
