Paper Title


Towards Better Out-of-Distribution Generalization of Neural Algorithmic Reasoning Tasks

Authors

Sadegh Mahdavi, Kevin Swersky, Thomas Kipf, Milad Hashemi, Christos Thrampoulidis, Renjie Liao

Abstract

In this paper, we study the OOD generalization of neural algorithmic reasoning tasks, where the goal is to learn an algorithm (e.g., sorting, breadth-first search, and depth-first search) from input-output pairs using deep neural networks. First, we argue that OOD generalization in this setting is significantly different from common OOD settings. For example, some phenomena observed in OOD generalization of image classification, such as \emph{accuracy on the line}, are not observed here, and techniques such as data augmentation do not help, as the assumptions underlying many augmentation techniques are often violated. Second, we analyze the main challenges (e.g., input distribution shift, non-representative data generation, and uninformative validation metrics) of the current leading benchmark, i.e., CLRS \citep{deepmind2021clrs}, which contains 30 algorithmic reasoning tasks. We propose several solutions, including a simple-yet-effective fix to the input distribution shift and improved data generation. Finally, we propose an attention-based 2WL-graph neural network (GNN) processor which complements message-passing GNNs, so their combination outperforms the state-of-the-art model by a 3% margin averaged over all algorithms. Our code is available at: \url{https://github.com/smahdavi4/clrs}.
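To make the setting concrete, below is a minimal, hypothetical sketch of how input-output pairs for an algorithmic reasoning task (here, sorting) can be generated, with the input distribution shift realized as a change in input length between training and evaluation. The function name `make_sorting_pair` and the specific lengths are illustrative assumptions, not the paper's actual data pipeline or the CLRS benchmark's API.

```python
# Hypothetical sketch: sampling input-output pairs for a sorting task.
# OOD evaluation in this setting typically means testing on longer inputs
# than those seen during training (an input length distribution shift).
import random

def make_sorting_pair(length, low=0, high=99):
    """Sample one (input, output) pair: a random list and its sorted version."""
    x = [random.randint(low, high) for _ in range(length)]
    return x, sorted(x)

# In-distribution training data uses short inputs ...
train_pairs = [make_sorting_pair(length=16) for _ in range(1000)]
# ... while OOD evaluation uses longer inputs (the distribution shift).
test_pairs = [make_sorting_pair(length=64) for _ in range(100)]
```

A model trained only on `train_pairs` must generalize the sorting *algorithm*, not just the input distribution, to succeed on `test_pairs`.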
