Paper Title
Majorization-minimization for Sparse Nonnegative Matrix Factorization with the $β$-divergence
Paper Authors
Paper Abstract
This article introduces new multiplicative updates for nonnegative matrix factorization with the $β$-divergence and sparse regularization of one of the two factors (say, the activation matrix). It is well known that the norm of the other factor (the dictionary matrix) needs to be controlled in order to avoid an ill-posed formulation. Standard practice consists in constraining the columns of the dictionary to have unit norm, which leads to a nontrivial optimization problem. Our approach leverages a reparametrization of the original problem into the optimization of an equivalent scale-invariant objective function. From there, we derive block-descent majorization-minimization algorithms that result in simple multiplicative updates for either $\ell_{1}$-regularization or the more "aggressive" log-regularization. In contrast with other state-of-the-art methods, our algorithms are universal in the sense that they can be applied to any $β$-divergence (i.e., any value of $β$) and that they come with convergence guarantees. We report numerical comparisons with existing heuristic and Lagrangian methods using various datasets: face images, an audio spectrogram, hyperspectral data, and song play counts. We show that our methods obtain solutions of similar quality at convergence (similar objective values) but with significantly reduced CPU times.
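For reference, the heuristic baseline that the abstract contrasts against can be sketched concretely: plain multiplicative updates for $β$-NMF where the $\ell_{1}$ weight $\lambda$ is simply added to the denominator of the activation update, followed by renormalizing the dictionary columns to unit norm (the "standard practice" mentioned above). The following minimal NumPy sketch shows that heuristic, not the paper's scale-invariant algorithm; the function name, the $\epsilon$ smoothing, and the default parameters are illustrative assumptions.

```python
import numpy as np

def beta_nmf_l1_heuristic(V, K, beta=1.0, lam=0.1, n_iter=200, seed=None):
    """Heuristic multiplicative updates for beta-NMF with an l1 penalty on
    the activations H and post-hoc unit-norm renormalization of the
    dictionary W. Illustrative baseline sketch, NOT the paper's algorithm.
    """
    rng = np.random.default_rng(seed)
    F, N = V.shape
    W = rng.random((F, K)) + 1e-3   # nonnegative dictionary (F x K)
    H = rng.random((K, N)) + 1e-3   # nonnegative activations (K x N)
    eps = 1e-12                     # guards against division by zero

    for _ in range(n_iter):
        Vh = W @ H
        # Multiplicative update for H: the l1 weight lam enters the
        # denominator (the common heuristic for sparse activations).
        H *= (W.T @ (Vh ** (beta - 2) * V)) / (
            W.T @ Vh ** (beta - 1) + lam + eps)

        Vh = W @ H
        # Multiplicative update for W (no penalty on the dictionary).
        W *= ((Vh ** (beta - 2) * V) @ H.T) / (Vh ** (beta - 1) @ H.T + eps)

        # Renormalize columns of W to unit l2 norm and rescale H to keep
        # the product WH unchanged.
        scale = np.linalg.norm(W, axis=0) + eps
        W /= scale
        H *= scale[:, None]

    return W, H
```

Note that the post-hoc renormalization step is precisely what deprives this baseline of a monotonicity guarantee on the regularized objective, which is the gap the paper's reparametrized majorization-minimization updates are designed to close.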