Paper Title
Generalization Bounds for High-dimensional M-estimation under Sparsity Constraint
Paper Authors
Paper Abstract
The $\ell_0$-constrained empirical risk minimization ($\ell_0$-ERM) is a promising tool for high-dimensional statistical estimation. Existing analyses of the $\ell_0$-ERM estimator focus mostly on parameter estimation and support recovery consistency. From the perspective of statistical learning, another fundamental question is how well the $\ell_0$-ERM estimator performs on unseen samples. The answer to this question is important for understanding the learnability of such a non-convex (and NP-hard) M-estimator, yet it remains relatively underexplored. In this paper, we investigate this problem and develop a generalization theory for $\ell_0$-ERM. We establish, in both white-box and black-box statistical regimes, a set of generalization gap and excess risk bounds for $\ell_0$-ERM to characterize its sparse prediction and optimization capability. Our theory reveals three main findings: 1) tighter generalization bounds can be attained by $\ell_0$-ERM than by $\ell_2$-ERM if the risk function is (with high probability) restricted strongly convex; 2) tighter uniform generalization bounds can be established for $\ell_0$-ERM than for conventional dense ERM; and 3) sparsity-level-invariant bounds can be established by imposing additional strong-signal conditions that ensure the stability of $\ell_0$-ERM. In light of these results, we further provide generalization guarantees for the Iterative Hard Thresholding (IHT) algorithm, one of the most popular greedy pursuit methods for approximately solving $\ell_0$-ERM. Numerical evidence is provided to confirm our theoretical predictions when applied to sparsity-constrained linear regression and logistic regression models.
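For context, the $\ell_0$-ERM problem referenced throughout the abstract takes the following standard form (the notation here is generic and assumed, not taken from the paper): given samples $z_1, \dots, z_n$ and a loss function $\ell$,
$$
\hat{w} \in \arg\min_{\|w\|_0 \le k} \; \hat{R}_n(w) := \frac{1}{n} \sum_{i=1}^{n} \ell(w; z_i),
$$
and the IHT algorithm mentioned above approximates its solution by iterating $w^{t+1} = H_k\big(w^t - \eta \nabla \hat{R}_n(w^t)\big)$, where $H_k$ is the hard-thresholding operator that zeroes all but the $k$ largest-magnitude coordinates.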
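Below is a minimal runnable sketch of IHT for sparsity-constrained least-squares regression, one of the two models used in the paper's experiments. The function names, step-size choice, and iteration count are illustrative assumptions, not details from the paper.

```python
import numpy as np

def hard_threshold(w, k):
    """Keep the k largest-magnitude entries of w and zero out the rest (assumes k >= 1)."""
    out = w.copy()
    out[np.argsort(np.abs(w))[:-k]] = 0.0  # zero all but the top-k coordinates
    return out

def iht_least_squares(X, y, k, step=None, n_iter=300):
    """Iterative Hard Thresholding for the k-sparse least-squares problem
    min_w (1/2n) ||y - X w||^2  subject to  ||w||_0 <= k."""
    n, d = X.shape
    if step is None:
        # Step size 1/L, where L = ||X||_2^2 / n is the smoothness constant of the loss.
        step = n / np.linalg.norm(X, 2) ** 2
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n            # gradient of the squared loss
        w = hard_threshold(w - step * grad, k)  # gradient step, then project onto the l0 ball
    return w

# Toy usage: recover a 10-sparse signal in 1000 dimensions from 200 noisy samples.
rng = np.random.default_rng(0)
n, d, k = 200, 1000, 10
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:k] = 1.0
y = X @ w_true + 0.1 * rng.standard_normal(n)
w_hat = iht_least_squares(X, y, k)
print("support recovered:", set(np.flatnonzero(w_hat)) == set(range(k)))
```

The hard-thresholding projection is what makes the problem non-convex (and NP-hard in general); the paper's generalization guarantees for IHT concern exactly this kind of approximate solver rather than an exact $\ell_0$-ERM minimizer.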