Paper Title


Meta-learning based Alternating Minimization Algorithm for Non-convex Optimization

Paper Authors

Jingyuan Xia, Shengxi Li, Jun-Jie Huang, Imad Jaimoukha, Deniz Gunduz

Abstract


In this paper, we propose a novel solution for non-convex problems of multiple variables, especially for those typically solved by an alternating minimization (AM) strategy, which splits the original optimization problem into a set of sub-problems corresponding to each variable and then iteratively optimizes each sub-problem using a fixed updating rule. However, due to the intrinsic non-convexity of the original optimization problem, the optimization can be trapped in a spurious local minimum even when each sub-problem is solved optimally at each iteration. Meanwhile, learning-based approaches, such as deep unfolding algorithms, are severely limited by the lack of labelled data and restricted explainability. To tackle these issues, we propose a meta-learning based alternating minimization (MLAM) method, which aims to minimize a part of the global losses over iterations instead of minimizing each sub-problem separately, and which learns an adaptive strategy to replace the handcrafted counterpart, resulting in superior performance. Meanwhile, the proposed MLAM still follows the original algorithmic principle, which contributes to better interpretability. We evaluate the proposed method on two representative problems, namely a bi-linear inverse problem, matrix completion, and a non-linear problem, Gaussian mixture models. The experimental results validate that our proposed approach outperforms AM-based methods in standard settings, and is able to achieve effective optimization in challenging cases where other competing methods typically fail.
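To make the AM baseline concrete, below is a minimal sketch of the classical alternating-minimization scheme the abstract describes, applied to matrix completion. This is not the paper's MLAM method or its code; the function name and hyperparameters are illustrative. It shows the handcrafted fixed update rule (alternating regularized least squares on the observed entries) that MLAM replaces with a learned, adaptive strategy.

```python
import numpy as np

def am_matrix_completion(M, mask, rank, n_iters=50, reg=1e-3, seed=0):
    """Classical AM baseline for matrix completion (illustrative, not MLAM).

    The non-convex problem min_{U,V} ||mask * (M - U V^T)||_F^2 is split
    into two sub-problems: fix V and solve for U, then fix U and solve
    for V. Each sub-problem is a regularized least-squares problem that
    is solved optimally at every iteration with a fixed update rule.
    """
    rng = np.random.default_rng(seed)
    m, n = M.shape
    U = rng.standard_normal((m, rank))
    V = rng.standard_normal((n, rank))
    I = reg * np.eye(rank)  # small ridge term keeps the solves well-posed
    for _ in range(n_iters):
        # Sub-problem 1: update each row of U from the observed entries.
        for i in range(m):
            obs = mask[i].astype(bool)
            Vo = V[obs]
            U[i] = np.linalg.solve(Vo.T @ Vo + I, Vo.T @ M[i, obs])
        # Sub-problem 2: symmetric update for each row of V.
        for j in range(n):
            obs = mask[:, j].astype(bool)
            Uo = U[obs]
            V[j] = np.linalg.solve(Uo.T @ Uo + I, Uo.T @ M[obs, j])
    return U @ V.T
```

Even though each sub-problem is solved exactly, this scheme can stall in spurious local minima when initialization or the sampling pattern is unfavourable, which is the failure mode MLAM targets by minimizing a part of the global loss over iterations instead.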
