Paper Title
Deep frequency principle towards understanding why deeper learning is faster
Paper Authors
Paper Abstract
Understanding the effect of depth in deep learning is a critical problem. In this work, we utilize Fourier analysis to empirically provide a promising mechanism for understanding why deeper feedforward networks learn faster. To this end, we separate a deep neural network, trained by standard stochastic gradient descent, into two parts during analysis: a pre-condition component and a learning component, where the output of the pre-condition component is the input of the learning component. We use a filtering method to characterize the frequency distribution of a high-dimensional function. Based on experiments with deep networks and real datasets, we propose a deep frequency principle: during training, the effective target function for a deeper hidden layer is biased towards lower frequencies. Therefore, the learning component effectively learns a lower-frequency function if the pre-condition component has more layers. Combined with the well-studied frequency principle, i.e., that deep neural networks learn lower-frequency functions faster, the deep frequency principle provides a reasonable explanation of why deeper learning is faster. We believe these empirical studies will be valuable for future theoretical studies of the effect of depth in deep learning.
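The filtering method mentioned in the abstract can be sketched in a few lines. The following Python sketch is illustrative, not the paper's implementation: it assumes a Gaussian-kernel smoother as the low-pass filter and summarizes a sampled high-dimensional function's frequency content with a simple low-frequency energy ratio. The function names, the kernel width `delta`, and the toy targets are all assumptions for illustration.

```python
import numpy as np

def gaussian_low_pass(xs, ys, delta):
    """Low-pass filter samples of a high-dimensional function by
    Gaussian-kernel (Nadaraya-Watson) smoothing.
    xs: (n, d) sample inputs; ys: (n,) or (n, c) sample outputs."""
    # Pairwise squared distances between sample points.
    d2 = np.sum((xs[:, None, :] - xs[None, :, :]) ** 2, axis=-1)
    w = np.exp(-d2 / (2.0 * delta ** 2))   # Gaussian weights, width delta
    w /= w.sum(axis=1, keepdims=True)      # normalize weights per point
    return w @ ys                          # smoothed (low-frequency) part

def low_frequency_ratio(xs, ys, delta):
    """Fraction of the sampled function's energy kept by the low-pass
    filter; values near 1 mean low frequencies dominate."""
    y_low = gaussian_low_pass(xs, ys, delta)
    y_high = ys - y_low
    return np.sum(y_low ** 2) / (np.sum(y_low ** 2) + np.sum(y_high ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    xs = rng.uniform(-1.0, 1.0, size=(200, 2))
    y_smooth = np.sin(np.pi * xs[:, 0])       # low-frequency toy target
    y_rough = np.sin(10 * np.pi * xs[:, 0])   # high-frequency toy target
    # The ratio should be markedly higher for the smooth target.
    print(low_frequency_ratio(xs, y_smooth, delta=0.2))
    print(low_frequency_ratio(xs, y_rough, delta=0.2))
```

Under the deep frequency principle, one plausible use of such a ratio is to track it for the effective target function of each hidden layer, i.e., the layer's activations on the training set paired with the labels; the abstract's claim is that this function biases towards lower frequencies for deeper layers as training proceeds.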