Paper Title
Density Fixing: Simple yet Effective Regularization Method based on the Class Prior
Paper Authors
Paper Abstract
Machine learning models suffer from overfitting caused by a lack of labeled data. To tackle this problem, we propose a framework of regularization methods, called density-fixing, that can be applied to both supervised and semi-supervised learning. Our proposed regularization method improves generalization performance by forcing the model to approximate the class prior distribution, i.e., the frequency of occurrence of each class. This regularization term is naturally derived from the formulation of maximum likelihood estimation and is theoretically justified. We further provide several theoretical analyses of the proposed method, including its asymptotic behavior. Our experimental results on multiple benchmark datasets support our argument, and we suggest that this simple yet effective regularization method is useful in real-world machine learning problems.
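One way to read "forcing the model to approximate the class prior distribution" is as a penalty on the gap between the batch-averaged predicted class distribution and the known prior p(y). The sketch below illustrates that idea with a KL-divergence penalty added to cross-entropy; the function name, the KL direction, and the weight `lam` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def density_fixing_loss(probs, labels, prior, lam=1.0):
    """Cross-entropy plus a penalty pulling the batch-averaged
    predicted class distribution toward the class prior.

    This is a hypothetical sketch of a density-fixing-style
    regularizer, not the authors' exact objective.

    probs:  (N, C) predicted class probabilities (rows sum to 1)
    labels: (N,) integer class labels
    prior:  (C,) class prior p(y), e.g. empirical label frequencies
    lam:    regularization strength (assumed hyperparameter)
    """
    n = probs.shape[0]
    eps = 1e-12  # numerical guard against log(0)
    # Standard cross-entropy on the labeled examples.
    ce = -np.mean(np.log(probs[np.arange(n), labels] + eps))
    # The model's implied marginal p(y) on this batch.
    marginal = probs.mean(axis=0)
    # KL(prior || marginal): zero when the model matches the prior.
    kl = np.sum(prior * np.log((prior + eps) / (marginal + eps)))
    return ce + lam * kl
```

Since the KL term is non-negative, setting `lam=0` recovers plain cross-entropy, and the penalty vanishes exactly when the model's marginal prediction matches the prior.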