Title

Exploiting Explainable Metrics for Augmented SGD

Authors

Hosseini, Mahdi S., Tuli, Mathieu, Plataniotis, Konstantinos N.

Abstract

Explaining the generalization characteristics of deep learning is an emerging topic in advanced machine learning. There are several unanswered questions about how learning under stochastic optimization really works and why certain strategies are better than others. In this paper, we address the following question: \textit{can we probe intermediate layers of a deep neural network to identify and quantify the learning quality of each layer?} With this question in mind, we propose new explainability metrics that measure the redundant information in a network's layers using a low-rank factorization framework and quantify a complexity measure that is highly correlated with the generalization performance of a given optimizer, network, and dataset. We subsequently exploit these metrics to augment the Stochastic Gradient Descent (SGD) optimizer by adaptively adjusting the learning rate in each layer to improve generalization performance. Our augmented SGD -- dubbed RMSGD -- introduces minimal computational overhead compared to SOTA methods and outperforms them by exhibiting strong generalization characteristics across applications, architectures, and datasets.
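To make the idea concrete, here is a minimal sketch of how a low-rank factorization of a layer's weights could yield a redundancy metric that then scales that layer's learning rate. This is an illustrative assumption, not the paper's exact metric or update rule: `effective_rank_ratio` and `per_layer_lr` are hypothetical names, and the specific energy threshold and scaling are placeholders.

```python
import numpy as np

def effective_rank_ratio(weight, energy=0.99):
    """Fraction of singular values needed to capture `energy` of the
    spectral energy of a layer's weight matrix. A value near 1 means the
    layer uses its full capacity; a small value signals redundancy.
    (Illustrative proxy, not the paper's exact metric.)"""
    s = np.linalg.svd(weight, compute_uv=False)
    cum = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(cum, energy)) + 1  # first index reaching the energy level
    return k / len(s)

def per_layer_lr(base_lr, weight, lo=0.5, hi=2.0):
    """Scale the base learning rate by the layer's redundancy proxy:
    layers closer to full rank receive larger steps, clipped to
    [lo * base_lr, hi * base_lr]. (Hypothetical scaling rule.)"""
    ratio = effective_rank_ratio(weight)
    return float(np.clip(base_lr * 2.0 * ratio, lo * base_lr, hi * base_lr))
```

Under this sketch, a nearly rank-deficient layer (highly redundant) is pushed toward the lower learning-rate bound, while a full-rank layer gets a larger step, mirroring the abstract's idea of per-layer adaptive learning rates driven by a low-rank complexity measure.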
