Paper Title

Right for the Right Latent Factors: Debiasing Generative Models via Disentanglement

Paper Authors

Shao, Xiaoting, Stelzner, Karl, Kersting, Kristian

Paper Abstract

A key assumption of most statistical machine learning methods is that they have access to independent samples from the distribution of data they encounter at test time. As such, these methods often perform poorly in the face of biased data, which breaks this assumption. In particular, machine learning models have been shown to exhibit Clever-Hans-like behaviour, meaning that spurious correlations in the training set are inadvertently learnt. A number of works have been proposed to revise deep classifiers to learn the right correlations. However, generative models have been overlooked so far. We observe that generative models are also prone to Clever-Hans-like behaviour. To counteract this issue, we propose to debias generative models by disentangling their internal representations, which is achieved via human feedback. Our experiments show that this is effective at removing bias even when human feedback covers only a small fraction of the desired distribution. In addition, we achieve strong disentanglement results in a quantitative comparison with recent methods.
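The abstract describes combining a generative model's usual training objective with a supervised term that aligns a few latent dimensions to human-provided factor labels on a small labelled subset. The sketch below is a minimal numerical illustration of that idea using a beta-VAE-style objective, not the authors' implementation: the function names, the `beta` and `gamma` weights, and the mean-squared alignment term are all assumptions made for illustration.

```python
import numpy as np


def vae_losses(x, x_recon, mu, logvar):
    """Standard VAE terms: reconstruction error plus the KL divergence
    of the approximate posterior N(mu, exp(logvar)) from N(0, I)."""
    recon = np.mean((x - x_recon) ** 2)
    kl = -0.5 * np.mean(1.0 + logvar - mu ** 2 - np.exp(logvar))
    return recon, kl


def feedback_loss(mu, factor_labels, mask):
    """Illustrative human-feedback term (an assumption, not the paper's
    exact loss): for the few samples where a human has provided
    ground-truth factor values (mask == True), penalise the distance
    between the latent means and those values. Unlabelled samples
    contribute nothing, so feedback can cover only a small fraction
    of the data."""
    if not mask.any():
        return 0.0
    return np.mean((mu[mask] - factor_labels[mask]) ** 2)


def total_loss(x, x_recon, mu, logvar, factor_labels, mask,
               beta=4.0, gamma=10.0):
    """Combined objective: reconstruction + beta * KL (encouraging
    disentanglement, as in beta-VAE) + gamma * feedback alignment."""
    recon, kl = vae_losses(x, x_recon, mu, logvar)
    return recon + beta * kl + gamma * feedback_loss(mu, factor_labels, mask)
```

A quick check of the behaviour: latents that agree with the human-labelled factors incur a strictly lower loss than misaligned ones, which is the pressure that steers the representation toward the intended factors.

```python
x = np.zeros((8, 3))
x_recon = np.zeros((8, 3))
logvar = np.zeros((8, 2))
labels = np.zeros((8, 2))
mask = np.array([True, True] + [False] * 6)  # feedback on 2 of 8 samples

aligned = total_loss(x, x_recon, np.zeros((8, 2)), logvar, labels, mask)
misaligned = total_loss(x, x_recon, np.ones((8, 2)), logvar, labels, mask)
assert aligned < misaligned
```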
