Paper Title

Parameter Averaging for Feature Ranking

Authors

Talip Ucar, Ehsan Hajiramezanali

Abstract

Neural networks are known to be sensitive to initialisation. Methods that rely on neural networks for feature ranking are not robust, since their rankings can vary when the model is initialised and trained with different random seeds. In this work, we introduce a novel method based on parameter averaging to estimate accurate and robust feature importance in the tabular data setting, referred to as XTab. We first initialise and train multiple instances of a shallow network (referred to as local masks) with "different random seeds" for a downstream task. We then obtain a global mask model by "averaging the parameters" of the local masks. We show that although parameter averaging might result in a global model with higher loss, it still leads to the discovery of the ground-truth feature importance more consistently than an individual model does. We conduct extensive experiments on a variety of synthetic and real-world data, demonstrating that XTab can be used to obtain global feature importance that is not sensitive to sub-optimal model initialisation.
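
The procedure described in the abstract (train several local models with different seeds, then average their parameters) can be illustrated with a minimal sketch. This is not the authors' XTab implementation: the single-layer logistic model standing in for the shallow network, the synthetic data, the helper names (`make_data`, `train_local`, `importance`), and the use of normalised absolute weights as an importance score are all assumptions made purely for illustration.

```python
import numpy as np

def make_data(n=2000, d=10, seed=0):
    """Synthetic binary task (assumption): only features 0, 1, 2 carry signal."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    w_true = np.zeros(d)
    w_true[:3] = [3.0, -2.0, 1.5]
    y = (X @ w_true + rng.normal(scale=0.5, size=n) > 0).astype(float)
    return X, y

def train_local(X, y, seed, steps=500, lr=0.1, batch=64):
    """Train one 'local' model; the seed controls init and minibatch sampling."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = rng.normal(scale=1.0, size=d)  # seed-dependent initialisation
    b = 0.0
    for _ in range(steps):
        idx = rng.integers(0, n, size=batch)
        xb, yb = X[idx], y[idx]
        p = 1.0 / (1.0 + np.exp(-(xb @ w + b)))  # sigmoid prediction
        grad = p - yb                            # gradient of log loss w.r.t. logits
        w -= lr * (xb.T @ grad) / batch
        b -= lr * grad.mean()
    return w, b

def importance(w):
    """Hypothetical importance score: normalised absolute weights."""
    a = np.abs(w)
    return a / a.sum()

X, y = make_data()

# Step 1: train several local models with different random seeds.
local_models = [train_local(X, y, seed) for seed in range(10)]

# Step 2: average the parameters of the local models to form a global model.
w_global = np.mean([w for w, _ in local_models], axis=0)
b_global = float(np.mean([b for _, b in local_models]))

# Step 3: read feature importance off the averaged parameters.
global_importance = importance(w_global)
ranking = np.argsort(-global_importance)
print("global ranking:", ranking.tolist())
```

Because this stand-in model is linear, averaging its weights is well defined. How the actual method builds its local masks and derives importance from the global mask is not specified in the abstract, so those details should be taken from the paper itself.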
