Paper title
Ensemble Methods for Robust Support Vector Machines using Integer Programming
Paper authors
Abstract
In this work, we study binary classification problems in which the training data is subject to uncertainty, i.e., the precise data points are not known. In robust machine learning, the aim is to develop models that are robust against small perturbations of the training data. We study robust support vector machines (SVMs) and extend the classical approach with an ensemble method that iteratively solves a non-robust SVM on different perturbations of the dataset, where the perturbations are derived from an adversarial problem. To classify an unknown data point, we then take a majority vote over all computed SVM solutions. We study three variants of the adversarial problem: the exact problem, a relaxed variant, and an efficient heuristic variant. While the exact and relaxed variants can be modeled using integer programming formulations, the heuristic variant can be implemented by a simple and efficient algorithm. All derived methods are tested on random and realistic datasets, and the results indicate that the derived ensemble methods behave much more stably than the classical robust SVM model when the protection level is changed.
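The loop described in the abstract (train a non-robust SVM, perturb the data adversarially, retrain, then take a majority vote) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the adversarial problem, so the sketch assumes a simple heuristic: each training point is shifted by a fixed `radius` toward the current decision boundary. The linear SVM is trained with plain subgradient descent on the hinge loss. All function names and parameters (`train_linear_svm`, `adversarial_shift`, `train_ensemble`, `predict`, `radius`, `rounds`) are hypothetical and chosen for illustration only.

```python
import math


def dot(u, v):
    """Inner product of two equally long sequences."""
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(points, labels, epochs=300, lr=0.05, lam=0.01):
    """Non-robust linear SVM: full-batch subgradient descent on the
    L2-regularized hinge loss. Returns the pair (w, b)."""
    dim, n = len(points[0]), len(points)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularizer
        grad_b = 0.0
        for x, y in zip(points, labels):
            if y * (dot(w, x) + b) < 1:  # point violates the margin
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def adversarial_shift(points, labels, w, radius):
    """Assumed heuristic adversary (not taken from the paper): move every
    point by `radius` against its own label along the direction of w."""
    norm = math.sqrt(dot(w, w))
    if norm == 0.0:
        return [list(x) for x in points]
    return [[xj - y * radius * wj / norm for xj, wj in zip(x, w)]
            for x, y in zip(points, labels)]


def train_ensemble(points, labels, radius, rounds=5):
    """Alternate between solving a non-robust SVM and perturbing the
    original data against the most recent solution."""
    models = []
    current = [list(x) for x in points]
    for _ in range(rounds):
        w, b = train_linear_svm(current, labels)
        models.append((w, b))
        current = adversarial_shift(points, labels, w, radius)
    return models


def predict(models, x):
    """Majority vote over all stored SVM solutions (labels are +1 / -1)."""
    votes = sum(1 if dot(w, x) + b >= 0 else -1 for w, b in models)
    return 1 if votes >= 0 else -1


# Toy usage: two well-separated clusters with labels +1 and -1.
toy_points = [[2, 2], [3, 2], [2, 3], [3, 3],
              [-2, -2], [-3, -2], [-2, -3], [-3, -3]]
toy_labels = [1, 1, 1, 1, -1, -1, -1, -1]
toy_models = train_ensemble(toy_points, toy_labels, radius=0.5, rounds=5)
```

Using an odd number of rounds avoids ties in the vote. In this sketch, `radius` plays the role of a protection level. The exact and relaxed adversarial problems mentioned in the abstract would replace `adversarial_shift` with an integer programming model, which this sketch does not attempt.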