Paper Title
Nonlocal Kernel Network (NKN): a Stable and Resolution-Independent Deep Neural Network
Paper Authors
Paper Abstract
Neural operators have recently become popular tools for designing solution maps between function spaces in the form of neural networks. Unlike classical scientific machine learning approaches, which learn the parameters of a known partial differential equation (PDE) for a single instance of the input parameters at a fixed resolution, neural operators approximate the solution map of a family of PDEs. Despite their success, the use of neural operators has so far been restricted to relatively shallow neural networks and confined to learning hidden governing laws. In this work, we propose a novel nonlocal neural operator, which we refer to as the nonlocal kernel network (NKN), that is resolution independent, characterized by deep neural networks, and capable of handling a variety of tasks such as learning governing equations and classifying images. Our NKN stems from the interpretation of the neural network as a discrete nonlocal diffusion-reaction equation that, in the limit of infinite layers, is equivalent to a parabolic nonlocal equation whose stability is analyzed via nonlocal vector calculus. The resemblance to integral forms of neural operators allows NKNs to capture long-range dependencies in the feature space, while the continuous treatment of node-to-node interactions makes NKNs resolution independent. The resemblance to neural ODEs, reinterpreted in a nonlocal sense, together with the stable network dynamics between layers, allows NKN's optimal parameters to generalize from shallow to deep networks, which enables the use of shallow-to-deep initialization techniques. Our tests show that NKNs outperform baseline methods in both learning governing equations and image classification tasks and generalize well to different resolutions and depths.
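The abstract's interpretation of a network layer as one step of a nonlocal diffusion-reaction equation can be written schematically. The update below is a minimal sketch consistent with that description, not the paper's exact formulation: the kernel \(K\), reaction coefficient \(\beta\), bias \(c\), activation \(\sigma\), and step size \(\tau\) are illustrative placeholders.

\[
h(x, t_{k+1}) \;=\; h(x, t_k) \;+\; \tau\, \sigma\!\left( \int_{\Omega} K(x, y; w)\,\bigl[h(y, t_k) - h(x, t_k)\bigr]\, dy \;+\; \beta(x)\, h(x, t_k) \;+\; c(x) \right)
\]

Letting the number of layers grow while \(\tau \to 0\), such updates formally approach a parabolic nonlocal equation \(\partial_t h(x,t) = \sigma\!\left( \int_{\Omega} K(x, y; w)[h(y,t) - h(x,t)]\, dy + \beta(x) h(x,t) + c(x) \right)\), which is the continuum limit referred to in the abstract.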
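As a rough illustration of why such a kernel-integral layer is resolution independent, the NumPy sketch below evaluates one layer update with the same (untrained) kernel on grids of two different sizes, approximating the integral with quadrature weights tied to the discretization. All names (nonlocal_layer, kernel_net, tau, beta, bias) are hypothetical; the update follows the schematic equation above rather than the paper's exact architecture.

```python
# Minimal sketch of one nonlocal kernel-integral layer (assumed form, not the
# paper's exact NKN parameterization).
import numpy as np

def nonlocal_layer(h, x, kernel_net, tau, beta, bias):
    """One forward-Euler step of a nonlocal diffusion-reaction update.

    h          : (n, d) feature values at the n grid points
    x          : (n, 1) coordinates of the grid points
    kernel_net : callable mapping point pairs (n, n, 2) -> kernel values (n, n)
    """
    n = x.shape[0]
    w = np.full(n, 1.0 / n)                       # crude uniform quadrature weights
    pairs = np.concatenate(
        [np.repeat(x, n, axis=0), np.tile(x, (n, 1))], axis=1
    ).reshape(n, n, 2)                            # pairs[i, j] = (x_i, x_j)
    K = kernel_net(pairs)                         # (n, n) learned kernel values
    diff = h[None, :, :] - h[:, None, :]          # h(y_j) - h(x_i), shape (n, n, d)
    integral = np.einsum('ij,ijd,j->id', K, diff, w)   # quadrature of the integral term
    return h + tau * np.tanh(integral + beta * h + bias)

# Usage: the same kernel parameters are applied at two resolutions.
rng = np.random.default_rng(0)
A = rng.normal(size=(2,))                         # toy kernel parameters
kernel_net = lambda p: np.tanh(p @ A)             # shared across resolutions
for n in (32, 64):
    x = np.linspace(0.0, 1.0, n).reshape(-1, 1)
    h = np.sin(2 * np.pi * x)                     # (n, 1) input feature
    out = nonlocal_layer(h, x, kernel_net, tau=0.1, beta=-0.5, bias=0.0)
    print(n, out.shape)
```

Because the layer index plays the role of a time step of size tau, parameters trained with a few large steps can, in principle, seed a deeper network that takes more, smaller steps over the same time horizon; this is the intuition behind the shallow-to-deep initialization mentioned in the abstract.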