Paper Title

Training Invertible Linear Layers through Rank-One Perturbations

Authors

Andreas Krämer, Jonas Köhler, Frank Noé

Abstract

Many types of neural network layers rely on matrix properties such as invertibility or orthogonality. Retaining such properties during optimization with gradient-based stochastic optimizers is a challenging task, which is usually addressed by either reparameterization of the affected parameters or by directly optimizing on the manifold. This work presents a novel approach for training invertible linear layers. In lieu of directly optimizing the network parameters, we train rank-one perturbations and add them to the actual weight matrices infrequently. This P$^{4}$Inv update allows keeping track of inverses and determinants without ever explicitly computing them. We show how such invertible blocks improve the mixing and thus the mode separation of the resulting normalizing flows. Furthermore, we outline how the P$^4$ concept can be utilized to retain properties other than invertibility.
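The bookkeeping the abstract claims (tracking inverses and determinants without ever explicitly computing them) is possible because a rank-one perturbation W + uv^T admits closed-form updates: the Sherman-Morrison formula for the inverse and the matrix determinant lemma for the determinant. The following is a minimal NumPy sketch of that idea, not the paper's implementation; the function name, the singularity tolerance, and the identity initialization are illustrative assumptions.

```python
import numpy as np

def rank_one_update(W, W_inv, log_abs_det, u, v):
    """Apply W <- W + u v^T while updating the tracked inverse and
    log|det| in O(d^2), i.e. without refactorizing the matrix."""
    W_inv_u = W_inv @ u
    # denom = 1 + v^T W^{-1} u; it must stay away from zero,
    # otherwise the perturbed matrix would be (near-)singular.
    denom = 1.0 + v @ W_inv_u
    if abs(denom) < 1e-8:
        raise ValueError("rank-one update would break invertibility")
    W_new = W + np.outer(u, v)
    # Sherman-Morrison: (W + u v^T)^{-1} = W^{-1} - W^{-1} u v^T W^{-1} / denom
    W_inv_new = W_inv - np.outer(W_inv_u, v @ W_inv) / denom
    # Matrix determinant lemma: det(W + u v^T) = denom * det(W)
    log_abs_det_new = log_abs_det + np.log(abs(denom))
    return W_new, W_inv_new, log_abs_det_new

# Toy usage: start from the identity, whose inverse and log|det| are trivial.
d = 4
rng = np.random.default_rng(0)
W, W_inv, log_abs_det = np.eye(d), np.eye(d), 0.0
u, v = rng.normal(size=d), rng.normal(size=d)
W, W_inv, log_abs_det = rank_one_update(W, W_inv, log_abs_det, u, v)
assert np.allclose(W @ W_inv, np.eye(d))
assert np.isclose(log_abs_det, np.log(abs(np.linalg.det(W))))
```

The O(d^2) cost of both updates is what makes it feasible to keep the inverse and log-determinant exact across many perturbation steps, whereas recomputing either from scratch would cost O(d^3) per step.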
