Paper title
TAOTF: A Two-stage Approximately Orthogonal Training Framework in Deep Neural Networks
Paper authors
Abstract
Orthogonality constraints, both hard and soft, have been used to normalize the weight matrices of Deep Neural Network (DNN) models, especially Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), in order to reduce parameter redundancy and improve training stability. However, the robustness of these constrained models to noisy data is not always satisfactory. In this work, we propose a novel two-stage approximately orthogonal training framework (TAOTF) that addresses this problem in noisy-data scenarios by finding a trade-off between the orthogonal solution space and the main task solution space. In the first stage, we propose a novel algorithm called polar decomposition-based orthogonal initialization (PDOI) to find a good initialization for the orthogonal optimization. In the second stage, unlike existing methods, we apply soft orthogonal constraints to all layers of the DNN model. We evaluate the proposed model-agnostic framework on both natural image and medical image datasets; the results show that our method achieves stable performance that is superior to existing methods.
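The abstract does not spell out PDOI or the exact form of the soft constraint, so the following is only a minimal sketch of the two underlying ideas, not the authors' implementation. It assumes the textbook polar decomposition computed through an SVD (W = U S V^T, orthogonal factor Q = U V^T) as a stand-in for the stage-one initialization, and the common Frobenius-norm penalty ||W^T W - I||_F^2 as a stand-in for the stage-two soft constraint. The function names `polar_orthogonal_factor` and `soft_orthogonality_penalty` are hypothetical.

```python
import numpy as np


def polar_orthogonal_factor(w):
    """Return the (semi-)orthogonal polar factor of a 2-D weight matrix.

    Assumption: the plain SVD-based polar decomposition, W = U S V^T,
    with Q = U V^T. This may differ from the PDOI algorithm in the paper.
    """
    u, _, vt = np.linalg.svd(w, full_matrices=False)
    return u @ vt


def soft_orthogonality_penalty(w):
    """Frobenius-norm penalty measuring how far w is from orthogonal.

    Uses W^T W for tall or square matrices and W W^T for wide ones, so the
    Gram matrix is always the smaller of the two and can equal the identity.
    Assumption: this is one common soft constraint, not necessarily the
    one used in the paper.
    """
    rows, cols = w.shape
    gram = w.T @ w if rows >= cols else w @ w.T
    return float(np.sum((gram - np.eye(gram.shape[0])) ** 2))


# Stage-one idea: replace a random weight matrix by its polar factor.
rng = np.random.default_rng(0)
w = rng.standard_normal((6, 4))
q = polar_orthogonal_factor(w)

# Stage-two idea: the penalty that would be added to the task loss.
penalty_before = soft_orthogonality_penalty(w)
penalty_after = soft_orthogonality_penalty(q)
```

In a training loop, a penalty of this kind would typically be summed over the layers' weight matrices (convolution kernels reshaped to 2-D) and added to the task loss with a weighting coefficient. How that coefficient trades off the orthogonal and task solution spaces is the subject of the paper and is not reproduced here.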