Paper title

Universal characteristics of deep neural network loss surfaces from random matrix theory

Authors

Baskerville, Nicholas P.; Keating, Jonathan P.; Mezzadri, Francesco; Najnudel, Joseph; Granziol, Diego

Abstract

This paper considers several aspects of random matrix universality in deep neural networks. Motivated by recent experimental work, we use universal properties of random matrices related to local statistics to derive practical implications for deep neural networks based on a realistic model of their Hessians. In particular we derive universal aspects of outliers in the spectra of deep neural networks and demonstrate the important role of random matrix local laws in popular pre-conditioning gradient descent algorithms. We also present insights into deep neural network loss surfaces from quite general arguments based on tools from statistical physics and random matrix theory.
