Paper Title
A Theoretical Study of The Effects of Adversarial Attacks on Sparse Regression
Authors
Abstract
This paper analyzes $\ell_1$-regularized linear regression in the challenging setting where only adversarially corrupted data are available for training. We use the primal-dual witness paradigm to provide provable guarantees that the support of the estimated regression parameter vector matches that of the true parameter vector. Our theoretical analysis shows the counter-intuitive result that an adversary can influence sample complexity by corrupting the irrelevant features, i.e., those that correspond to zero coefficients of the regression parameter vector and therefore do not affect the dependent variable. Because any adversarially robust algorithm has its limitations, our analysis identifies the regimes in which the learning algorithm or the adversary dominates the other. This helps us analyze these fundamental limits and address the critical scientific question of which parameters (such as mutual incoherence, the maximum and minimum eigenvalues of the covariance matrix, and the budget of adversarial perturbation) determine whether the LASSO algorithm succeeds with high or low probability. In addition, the derived sample complexity is logarithmic in the size of the regression parameter vector, and our theoretical claims are validated by empirical analysis on synthetic and real-world datasets.
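The claim that corrupting irrelevant features can disrupt support recovery can be illustrated with a small simulation. The sketch below is not the paper's attack model or experimental setup. It assumes NumPy is available, solves the LASSO with a plain proximal-gradient (ISTA) loop, and uses a hypothetical adversary that adds a scaled copy of the response to a single irrelevant training feature. The problem sizes, the regularization weight `lam`, the corrupted column index, and the perturbation scale `0.5` are all illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=3000):
    """Minimize (1/(2n)) * ||y - X w||^2 + lam * ||w||_1 with ISTA."""
    n, p = X.shape
    # Step size 1/L, where L is the largest eigenvalue of X^T X / n.
    step = n / np.linalg.norm(X, 2) ** 2
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = soft_threshold(w - step * grad, step * lam)
    return w

def support(w, tol=1e-8):
    """Indices of coefficients whose magnitude exceeds tol."""
    return set(np.flatnonzero(np.abs(w) > tol).tolist())

# Synthetic sparse regression problem (all sizes are illustrative assumptions).
rng = np.random.default_rng(0)
n, p = 400, 30
w_true = np.zeros(p)
w_true[:3] = [2.0, -2.0, 2.0]            # only features 0, 1, 2 are relevant
X = rng.standard_normal((n, p))
y = X @ w_true + 0.1 * rng.standard_normal(n)
true_support = support(w_true)

lam = 0.2                                 # assumed regularization weight
w_clean = lasso_ista(X, y, lam)

# Hypothetical adversary (an assumption, not the paper's threat model):
# perturb one irrelevant training feature so that it correlates with y.
# The response y itself is left unchanged.
corrupted_feature = 10
X_adv = X.copy()
X_adv[:, corrupted_feature] += 0.5 * y
w_adv = lasso_ista(X_adv, y, lam)

print("true support:     ", sorted(true_support))
print("clean estimate:   ", sorted(support(w_clean)))
print("attacked estimate:", sorted(support(w_adv)))
```

In this toy setting the adversary never touches the relevant features or the response, yet the corrupted irrelevant feature enters the estimated support. The perturbation here is deliberately large; under a bounded perturbation budget the question of how many samples are needed to recover the support is what the paper's analysis addresses.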