Paper Title
Generalization Bounds for High-dimensional M-estimation under Sparsity Constraint
Paper Authors
Paper Abstract
The $\ell_0$-constrained empirical risk minimization ($\ell_0$-ERM) is a promising tool for high-dimensional statistical estimation. Existing analyses of the $\ell_0$-ERM estimator focus mostly on parameter estimation and support recovery consistency. From the perspective of statistical learning, another fundamental question is how well the $\ell_0$-ERM estimator performs on unseen samples. The answer to this question is important for understanding the learnability of such a non-convex (and NP-hard) M-estimator, yet it remains relatively underexplored. In this paper, we investigate this problem and develop a generalization theory for $\ell_0$-ERM. We establish, in both white-box and black-box statistical regimes, a set of generalization gap and excess risk bounds for $\ell_0$-ERM to characterize its sparse prediction and optimization capability. Our theory reveals three main findings: 1) tighter generalization bounds can be attained by $\ell_0$-ERM than by $\ell_2$-ERM if the risk function is (with high probability) restricted strongly convex; 2) tighter uniform generalization bounds can be established for $\ell_0$-ERM than for conventional dense ERM; and 3) sparsity-level-invariant bounds can be established by imposing additional strong-signal conditions that ensure the stability of $\ell_0$-ERM. In light of these results, we further provide generalization guarantees for the Iterative Hard Thresholding (IHT) algorithm, one of the most popular greedy pursuit methods for approximately solving $\ell_0$-ERM. Numerical evidence is provided to confirm our theoretical predictions when applied to sparsity-constrained linear regression and logistic regression models.
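For context, the $\ell_0$-ERM problem referenced throughout the abstract takes the following standard form (the notation here is generic and assumed, not taken from the paper): given samples $z_1, \dots, z_n$ and a loss function $\ell$,
$$
\hat{w} \in \arg\min_{\|w\|_0 \le k} \; \hat{R}_n(w) := \frac{1}{n} \sum_{i=1}^{n} \ell(w; z_i),
$$
and the IHT algorithm mentioned above approximates its solution by iterating $w^{t+1} = H_k\big(w^t - \eta \nabla \hat{R}_n(w^t)\big)$, where $H_k$ is the hard-thresholding operator that zeroes all but the $k$ largest-magnitude coordinates.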
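Below is a minimal runnable sketch of IHT for sparsity-constrained least-squares regression, one of the two models used in the paper's experiments. The function names, step-size choice, and iteration count are illustrative assumptions, not details from the paper.

```python
import numpy as np

def hard_threshold(w, k):
    """Keep the k largest-magnitude entries of w and zero out the rest (assumes k >= 1)."""
    out = w.copy()
    out[np.argsort(np.abs(w))[:-k]] = 0.0  # zero all but the top-k coordinates
    return out

def iht_least_squares(X, y, k, step=None, n_iter=300):
    """Iterative Hard Thresholding for the k-sparse least-squares problem
    min_w (1/2n) ||y - X w||^2  subject to  ||w||_0 <= k."""
    n, d = X.shape
    if step is None:
        # Step size 1/L, where L = ||X||_2^2 / n is the smoothness constant of the loss.
        step = n / np.linalg.norm(X, 2) ** 2
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n            # gradient of the squared loss
        w = hard_threshold(w - step * grad, k)  # gradient step, then project onto the l0 ball
    return w

# Toy usage: recover a 10-sparse signal in 1000 dimensions from 200 noisy samples.
rng = np.random.default_rng(0)
n, d, k = 200, 1000, 10
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:k] = 1.0
y = X @ w_true + 0.1 * rng.standard_normal(n)
w_hat = iht_least_squares(X, y, k)
print("support recovered:", set(np.flatnonzero(w_hat)) == set(range(k)))
```

The hard-thresholding projection is what makes the problem non-convex (and NP-hard in general); the paper's generalization guarantees for IHT concern exactly this kind of approximate solver rather than an exact $\ell_0$-ERM minimizer.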