论文标题
通用Lipschitz的快速内核方法通过$ p $ -sparsified草图
Fast Kernel Methods for Generic Lipschitz Losses via $p$-Sparsified Sketches
论文作者
论文摘要
内核方法是学习算法,这些算法享有坚实的理论基础,同时遭受了重要的计算限制。素描包括在缩小尺寸的子空间中寻找解决方案,是一种缓解这些计算负担的经过深入研究的方法。但是,统计准确的草图(例如高斯素描)通常包含很少的空条目,因此它们在内核方法中的应用及其非Sparse gram矩阵在实践中的应用程序仍然很慢。在本文中,我们表明,由于有效的\ emph {分解技巧},稀疏的高斯(和Rademacher)草图仍然产生理论上的valid近似值,同时可以节省重要的时间和空间。为了支持我们的方法,我们为单个和多个输出内核问题提供了多余的风险界限,并带有通用Lipschitz损失,从而为广泛的应用提供了新的保证,从稳健的回归到多个分数回归。我们的理论结果与实验相辅相成,表明我们方法比SOTA草图方法的经验优势。
Kernel methods are learning algorithms that enjoy solid theoretical foundations while suffering from important computational limitations. Sketching, which consists in looking for solutions among a subspace of reduced dimension, is a well studied approach to alleviate these computational burdens. However, statistically-accurate sketches, such as the Gaussian one, usually contain few null entries, such that their application to kernel methods and their non-sparse Gram matrices remains slow in practice. In this paper, we show that sparsified Gaussian (and Rademacher) sketches still produce theoretically-valid approximations while allowing for important time and space savings thanks to an efficient \emph{decomposition trick}. To support our method, we derive excess risk bounds for both single and multiple output kernel problems, with generic Lipschitz losses, hereby providing new guarantees for a wide range of applications, from robust regression to multiple quantile regression. Our theoretical results are complemented with experiments showing the empirical superiority of our approach over SOTA sketching methods.