Paper Title

Generalization Bounds for Data-Driven Numerical Linear Algebra

Authors

Peter Bartlett, Piotr Indyk, Tal Wagner

Abstract

Data-driven algorithms can adapt their internal structure or parameters to inputs from unknown application-specific distributions, by learning from a training sample of inputs. Several recent works have applied this approach to problems in numerical linear algebra, obtaining significant empirical gains in performance. However, no theoretical explanation for their success was known. In this work we prove generalization bounds for those algorithms, within the PAC-learning framework for data-driven algorithm selection proposed by Gupta and Roughgarden (SICOMP 2017). Our main results are closely matching upper and lower bounds on the fat shattering dimension of the learning-based low rank approximation algorithm of Indyk et al. (NeurIPS 2019). Our techniques are general, and provide generalization bounds for many other recently proposed data-driven algorithms in numerical linear algebra, covering both sketching-based and multigrid-based methods. This considerably broadens the class of data-driven algorithms for which a PAC-learning analysis is available.
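The sketching-based low-rank approximation the abstract refers to can be illustrated with a minimal sketch-and-solve scheme in NumPy. The dimensions, the synthetic low-rank-plus-noise test matrix, and the Gaussian sketch below are illustrative assumptions, not the paper's setup; in the learned variant studied by Indyk et al., the entries of the sketching matrix `S` would be trained on a sample of inputs rather than drawn at random.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k, m = 100, 50, 5, 15  # matrix size, target rank, sketch size (illustrative)

# Synthetic input: rank-k signal plus small noise.
A = rng.standard_normal((n, k)) @ rng.standard_normal((k, d)) \
    + 0.01 * rng.standard_normal((n, d))

# Random Gaussian sketch; a learned algorithm would train these entries instead.
S = rng.standard_normal((m, n)) / np.sqrt(m)

# Sketch-and-solve: best rank-k approximation of A within the row space of S @ A.
_, _, Vt = np.linalg.svd(S @ A, full_matrices=False)
P = Vt.T @ Vt                       # projection onto rowspace(S @ A)
U, s, Wt = np.linalg.svd(A @ P, full_matrices=False)
A_k = U[:, :k] * s[:k] @ Wt[:k]     # rank-k truncation of the projected matrix

# Baseline: the optimal rank-k approximation (truncated SVD of A itself).
U0, s0, V0t = np.linalg.svd(A, full_matrices=False)
A_opt = U0[:, :k] * s0[:k] @ V0t[:k]

err_sketch = np.linalg.norm(A - A_k)
err_opt = np.linalg.norm(A - A_opt)
```

The learned setting asks how well a sketch trained on sample inputs generalizes to fresh inputs from the same distribution; the paper's fat-shattering-dimension bounds quantify exactly that for the map from `S` to the approximation error.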
