Algorithmic Stability and Generalization of an Unsupervised Feature Selection Algorithm

Paper Title

Algorithmic Stability and Generalization of an Unsupervised Feature Selection Algorithm

Authors

Wu, Xinxing; Cheng, Qiang

Abstract

Feature selection, as a vital dimension reduction technique, reduces data dimension by identifying an essential subset of input features, which can facilitate interpretable insights into learning and inference processes. Algorithmic stability is a key characteristic of an algorithm regarding its sensitivity to perturbations of input samples. In this paper, we propose an innovative unsupervised feature selection algorithm attaining this stability with provable guarantees. The architecture of our algorithm consists of a feature scorer and a feature selector. The scorer trains a neural network (NN) to globally score all the features, and the selector adopts a dependent sub-NN to locally evaluate the representation abilities for selecting features. Further, we present algorithmic stability analysis and show that our algorithm has a performance guarantee via a generalization error bound. Extensive experimental results on real-world datasets demonstrate superior generalization performance of our proposed algorithm to strong baseline methods. Also, the properties revealed by our theoretical analysis and the stability of our algorithm-selected features are empirically confirmed.

Paper Title