Paper Title

Hopular: Modern Hopfield Networks for Tabular Data

Authors

Bernhard Schäfl, Lukas Gruber, Angela Bitto-Nemling, Sepp Hochreiter

Abstract

While Deep Learning excels on structured data as encountered in vision and natural language processing, it has failed to meet expectations on tabular data. For tabular data, Support Vector Machines (SVMs), Random Forests, and Gradient Boosting are the best-performing techniques, with Gradient Boosting in the lead. Recently, we saw a surge of Deep Learning methods tailored to tabular data, but they still underperform compared to Gradient Boosting on small-sized datasets. We suggest "Hopular", a novel Deep Learning architecture for medium- and small-sized datasets, where each layer is equipped with continuous modern Hopfield networks. The modern Hopfield networks use stored data to identify feature-feature, feature-target, and sample-sample dependencies. Hopular's novelty is that every layer can directly access the original input as well as the whole training set via stored data in the Hopfield networks. Therefore, Hopular can step-wise update its current model and the resulting prediction at every layer, like standard iterative learning algorithms. In experiments on small-sized tabular datasets with fewer than 1,000 samples, Hopular surpasses Gradient Boosting, Random Forests, SVMs, and in particular several Deep Learning methods. In experiments on medium-sized tabular data with about 10,000 samples, Hopular outperforms XGBoost, CatBoost, LightGBM, and a state-of-the-art Deep Learning method designed for tabular data. Thus, Hopular is a strong alternative to these methods on tabular data.
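
To make the idea of retrieving from "stored data" more concrete, the following is a minimal sketch of the retrieval step commonly associated with continuous modern Hopfield networks: a query is replaced by a softmax-weighted average of the stored patterns. It is not Hopular's architecture or the authors' code. The function names, the inverse temperature `beta`, and the toy patterns are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hopfield_retrieve(stored, query, beta=1.0, steps=1):
    """Illustrative retrieval from a continuous modern Hopfield memory.

    stored: list of stored patterns (each a list of floats of equal length)
    query:  state pattern used to probe the memory
    beta:   inverse temperature (assumed value; larger means sharper retrieval)
    steps:  number of update iterations
    """
    state = list(query)
    for _ in range(steps):
        # Similarity of the current state to every stored pattern.
        scores = [
            beta * sum(p_i * s_i for p_i, s_i in zip(pattern, state))
            for pattern in stored
        ]
        weights = softmax(scores)
        # New state: softmax-weighted average of the stored patterns.
        state = [
            sum(w * pattern[j] for w, pattern in zip(weights, stored))
            for j in range(len(state))
        ]
    return state


# Toy memory of three patterns (hypothetical data, standing in for stored samples).
stored = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query = [0.9, 0.1, 0.0]  # noisy version of the first pattern
retrieved = hopfield_retrieve(stored, query, beta=8.0)
```

With a fairly large `beta`, the retrieved vector lies close to the single stored pattern most similar to the query; with a small `beta`, it blends several stored patterns. In the setting the abstract describes, the stored patterns would come from the training set or from the features of the original input, but how Hopular embeds and combines them is not specified here.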
