Paper title
Clustering-Based Approaches for Symbolic Knowledge Extraction
Paper authors
Abstract
Opaque machine learning models are increasingly used across a wide range of application areas. From a human perspective these models act as black boxes (BB), and in critical applications they cannot be fully trusted unless there is a method to extract symbolic, human-readable knowledge from them. In this paper we analyse a recurrent design adopted by symbolic knowledge extractors for BB regressors: the creation of rules associated with hypercubic regions of the input space. We argue that this kind of partitioning may lead to suboptimal solutions when the data set at hand is high-dimensional or does not satisfy symmetric constraints. We then propose a (deep) clustering-based approach, to be performed before symbolic knowledge extraction, to achieve better performance with data sets of any kind.
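To make the idea of clustering before rule extraction concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes plain k-means as a stand-in for the (deep) clustering step, a toy `black_box` function in place of an opaque regressor, and one constant-output rule per cluster bounded by that cluster's hypercube. All names and data here are hypothetical.

```python
from statistics import mean


def kmeans(points, centers, iterations=20):
    """Plain Lloyd's k-means; returns a cluster index for each point."""
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(len(centers)),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its assigned points.
        for c in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(mean(dim) for dim in zip(*members))
    return labels


def extract_rules(points, predictions, labels):
    """One rule per cluster: bounding hypercube -> mean black-box prediction."""
    rules = []
    for c in sorted(set(labels)):
        members = [p for p, l in zip(points, labels) if l == c]
        outputs = [y for y, l in zip(predictions, labels) if l == c]
        bounds = [(min(dim), max(dim)) for dim in zip(*members)]
        rules.append((bounds, mean(outputs)))
    return rules


def black_box(p):
    # Hypothetical stand-in for an opaque regressor.
    return 10.0 if p[0] + p[1] > 5 else 1.0


# Toy data: two well-separated groups in a 2-D input space (assumed for illustration).
points = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0), (5.0, 5.0), (6.0, 5.5), (5.5, 6.0)]
predictions = [black_box(p) for p in points]
labels = kmeans(points, centers=[points[0], points[3]])
rules = extract_rules(points, predictions, labels)

for bounds, value in rules:
    condition = " and ".join(
        f"{lo} <= x{i} <= {hi}" for i, (lo, hi) in enumerate(bounds)
    )
    print(f"if {condition} then y = {value}")
```

In this sketch the hypercubes are fitted around regions found by clustering rather than obtained by splitting the input space on a fixed grid, so their placement follows the data. Whether this improves on a given extractor depends on the data set and the clustering method used.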