Paper title
Uncovering Coresets for Classification With Multi-Objective Evolutionary Algorithms
Paper authors
Abstract
A coreset is a subset of the training set such that a machine learning algorithm trained on it performs similarly to one trained on the whole original data. Coreset discovery is an active and open line of research, as it can improve training speed and may help humans understand the results. Building on previous work, a novel approach is presented: candidate coresets are iteratively optimized by adding and removing samples. As there is an obvious trade-off between limiting training size and the quality of the results, a multi-objective evolutionary algorithm is used to simultaneously minimize the number of points in the set and the classification error. Experimental results on non-trivial benchmarks show that the proposed approach delivers coresets that allow a classifier to obtain lower error and better generalization to unseen data than state-of-the-art coreset discovery techniques.
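The two-objective search described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the 1-nearest-neighbour classifier, the simple archive-based evolutionary loop (standing in for whichever multi-objective algorithm the authors use), the synthetic two-cluster data, and all function names (`nn_error`, `dominates`, `mutate`, `pareto_coresets`). The sketch only shows the general idea: candidates are sets of sample indices, mutation adds or removes one sample, and the archive keeps the non-dominated trade-offs between coreset size and classification error.

```python
import random

def nn_error(coreset, X, y):
    """Error rate on the full set of a 1-NN classifier that stores only the coreset (assumed classifier)."""
    wrong = 0
    for xi, yi in zip(X, y):
        nearest = min(coreset, key=lambda j: sum((a - b) ** 2 for a, b in zip(X[j], xi)))
        wrong += y[nearest] != yi
    return wrong / len(X)

def dominates(f, g):
    """True if objective vector f is no worse than g everywhere and strictly better somewhere."""
    return all(a <= b for a, b in zip(f, g)) and any(a < b for a, b in zip(f, g))

def mutate(coreset, n, rng):
    """Add or remove one sample index, keeping the candidate non-empty."""
    child = set(coreset)
    if len(child) > 1 and rng.random() < 0.5:
        child.remove(rng.choice(sorted(child)))
    else:
        child.add(rng.randrange(n))
    return frozenset(child)

def pareto_coresets(X, y, generations=200, seed=0):
    """Evolve an archive of mutually non-dominated (size, error) candidates."""
    rng = random.Random(seed)
    n = len(X)
    evaluate = lambda c: (len(c), nn_error(c, X, y))
    start = frozenset(rng.sample(range(n), max(1, n // 2)))
    archive = {start: evaluate(start)}
    for _ in range(generations):
        parent = rng.choice(sorted(archive, key=sorted))
        child = mutate(parent, n, rng)
        if child in archive:
            continue
        fc = evaluate(child)
        # Reject children that are dominated or duplicate an existing trade-off.
        if any(dominates(f, fc) or f == fc for f in archive.values()):
            continue
        # Drop archive members that the new child dominates, then store it.
        archive = {c: f for c, f in archive.items() if not dominates(fc, f)}
        archive[child] = fc
    return archive

# Synthetic two-class data (illustrative only, not one of the paper's benchmarks).
data_rng = random.Random(1)
X = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(30)]
X += [(data_rng.gauss(4, 1), data_rng.gauss(4, 1)) for _ in range(30)]
y = [0] * 30 + [1] * 30

front = pareto_coresets(X, y)
for size, err in sorted(front.values()):
    print(size, round(err, 3))
```

Each printed line is one trade-off on the approximate Pareto front: a coreset size and the error of the classifier built from that coreset. In this sketch the error is measured on the same data the coresets are drawn from; assessing generalization to unseen data, as the abstract reports, would require a separate held-out set.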