Paper Title
Approximation results for Gradient Descent trained Shallow Neural Networks in $1d$
Paper Authors
Paper Abstract
Two aspects of neural networks that have been extensively studied in the recent literature are their function approximation properties and their training by gradient descent methods. The approximation problem seeks accurate approximations with a minimal number of weights. In most of the current literature, these weights are fully or partially hand-crafted, showing the capabilities of neural networks but not necessarily their practical performance. In contrast, optimization theory for neural networks heavily relies on an abundance of weights in over-parametrized regimes. This paper balances these two demands and provides an approximation result for shallow networks in $1d$ with non-convex weight optimization by gradient descent. We consider finite-width networks and infinite sample limits, which is the typical setup in approximation theory. Technically, this problem is not over-parametrized; however, some form of redundancy reappears as a loss in approximation rate compared to the best possible rates.
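To make the setup concrete, below is a minimal illustrative sketch, not the paper's construction or proof technique: a finite-width shallow network in $1d$ trained by plain gradient descent on an empirical squared loss. The ReLU activation, squared loss, target function, width, sample size, and step size are all assumptions chosen for illustration; the paper's analysis concerns an infinite-sample (population) limit and its own specific setting, which this finite sample only approximates.

```python
# Illustrative sketch (assumed setting): width-n shallow ReLU network in 1d,
#   f(x) = sum_k a_k * relu(w_k * x + b_k),
# trained by gradient descent on the empirical squared loss.
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def network(x, a, w, b):
    # x: (m,) samples; a, w, b: (n,) parameters of the shallow network
    pre = np.outer(x, w) + b          # (m, n) pre-activations w_k x + b_k
    return relu(pre) @ a              # (m,) network outputs

def squared_loss(x, y, a, w, b):
    r = network(x, a, w, b) - y
    return 0.5 * np.mean(r ** 2)

def gradients(x, y, a, w, b):
    # Gradients of the mean squared loss w.r.t. all (non-convex) parameters.
    m = x.shape[0]
    pre = np.outer(x, w) + b
    act = relu(pre)
    r = act @ a - y                        # residuals, shape (m,)
    grad_a = act.T @ r / m
    dact = (pre > 0).astype(float) * a     # (m, n) chain rule through ReLU
    grad_w = (dact * x[:, None]).T @ r / m
    grad_b = dact.T @ r / m
    return grad_a, grad_w, grad_b

# Hypothetical target and finite training sample on [0, 1]; the paper's
# infinite-sample limit is only approximated by this empirical loss.
target = lambda x: np.sin(2 * np.pi * x)
x_train = rng.uniform(0.0, 1.0, size=512)
y_train = target(x_train)

n = 32                                  # finite network width
a = rng.normal(scale=1.0 / np.sqrt(n), size=n)
w = rng.normal(size=n)
b = rng.normal(size=n)

lr = 0.1                                # step size (illustrative choice)
for step in range(5000):
    ga, gw, gb = gradients(x_train, y_train, a, w, b)
    a -= lr * ga
    w -= lr * gw
    b -= lr * gb

print("final training loss:", squared_loss(x_train, y_train, a, w, b))
```

This sketch only shows the kind of non-convex, finite-width training the abstract refers to; it does not reproduce the approximation rates established in the paper.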