A novel approach for Fair Principal Component Analysis based on eigendecomposition

Title

A novel approach for Fair Principal Component Analysis based on eigendecomposition

Authors

Guilherme Dean Pelegrina; Leonardo Tomazeli Duarte

Abstract

Principal component analysis (PCA), a ubiquitous dimensionality reduction technique in signal processing, searches for a projection matrix that minimizes the mean squared error between the reduced dataset and the original one. Since classical PCA is not tailored to address concerns related to fairness, its application to actual problems may lead to disparity in the reconstruction errors of different groups (e.g., men and women, whites and blacks, etc.), with potentially harmful consequences such as the introduction of bias towards sensitive groups. Although several fair versions of PCA have been proposed recently, there still remains a fundamental gap in the search for algorithms that are simple enough to be deployed in real systems. To address this, we propose a novel PCA algorithm which tackles fairness issues by means of a simple strategy comprising a one-dimensional search which exploits the closed-form solution of PCA. As attested by numerical experiments, the proposal can significantly improve fairness with a very small loss in the overall reconstruction error and without resorting to complex optimization schemes. Moreover, our findings are consistent in several real situations as well as in scenarios with both unbalanced and balanced datasets.
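To make the idea of a one-dimensional search on top of the closed-form PCA solution concrete, here is a minimal sketch in Python with NumPy. It is not the authors' algorithm: the abstract does not specify what the search variable is, so the sketch assumes a single weight `w` in [0, 1] that mixes the covariance matrices of two groups, and it picks the weight whose top-`k` eigenvectors give the smallest gap between the two groups' mean squared reconstruction errors. The function names, the grid search, and the gap criterion are all hypothetical choices made for illustration.

```python
import numpy as np


def top_k_eigvecs(cov, k):
    """Closed-form PCA step: top-k eigenvectors of a symmetric matrix."""
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues come back ascending
    return eigvecs[:, ::-1][:, :k]          # reverse to descending, keep k


def recon_error(X, U):
    """Mean squared reconstruction error of the rows of X projected onto span(U)."""
    residual = X - X @ U @ U.T
    return float(np.mean(np.sum(residual ** 2, axis=1)))


def fair_pca_1d_search(X_a, X_b, k, n_grid=101):
    """Grid search over one weight w in [0, 1] (assumed scheme, not the paper's).

    Returns (gap, w, U): the smallest absolute difference between the two
    groups' reconstruction errors, the weight achieving it, and the
    corresponding d x k projection matrix.
    """
    # Center both groups with the overall mean of the pooled data.
    mean = np.vstack([X_a, X_b]).mean(axis=0)
    X_a, X_b = X_a - mean, X_b - mean
    C_a = X_a.T @ X_a / len(X_a)
    C_b = X_b.T @ X_b / len(X_b)

    best = None
    for w in np.linspace(0.0, 1.0, n_grid):
        # Each candidate needs only one eigendecomposition.
        U = top_k_eigvecs(w * C_a + (1.0 - w) * C_b, k)
        gap = abs(recon_error(X_a, U) - recon_error(X_b, U))
        if best is None or gap < best[0]:
            best = (gap, float(w), U)
    return best


# Synthetic, unbalanced two-group data (made up for this illustration).
rng = np.random.default_rng(0)
X_a = rng.normal(size=(300, 5)) * np.array([3.0, 1.0, 1.0, 1.0, 1.0])
X_b = rng.normal(size=(60, 5)) * np.array([1.0, 3.0, 1.0, 1.0, 1.0])
gap, w, U = fair_pca_1d_search(X_a, X_b, k=1)
```

Because every candidate weight costs a single eigendecomposition, the search stays cheap and avoids a general-purpose optimizer, which is in the spirit of the simplicity the abstract emphasizes. A practical variant would presumably also monitor the overall reconstruction error so that the fairness gain is not bought at too high a price.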
