Title
Learning Neural-Symbolic Descriptive Planning Models via Cube-Space Priors: The Voyage Home (to STRIPS)
Authors
Abstract
We achieve a new milestone in the difficult task of enabling agents to learn about their environment autonomously. Our neuro-symbolic architecture is trained end-to-end to produce a succinct and effective discrete state transition model from images alone. Our target representation (the Planning Domain Definition Language) is already in a form that off-the-shelf solvers can consume, and opens the door to the rich array of modern heuristic search capabilities. We demonstrate how the sophisticated innate prior we place on the learning process significantly reduces the complexity of the learned representation, and reveals a connection to the graph-theoretic notion of "cube-like graphs", pointing toward a deeper understanding of the ideal properties of learned symbolic representations. We show that powerful domain-independent heuristics allow our system to solve visual 15-Puzzle instances that are beyond the reach of blind search, without resorting to Reinforcement Learning approaches, which require a large amount of training with domain-dependent reward information.
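To make the target representation concrete, below is a minimal, hypothetical sketch of the kind of STRIPS-style transition model (propositional preconditions plus add/delete effects) that a PDDL solver consumes; the predicate and action names are illustrative only and are not taken from the paper's learned domains.

from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    """A STRIPS-style action: applicable when preconditions hold; applying it
    deletes and adds propositions. Names are illustrative, not learned ones."""
    name: str
    preconditions: frozenset  # propositions that must hold in the current state
    add_effects: frozenset    # propositions made true by the action
    del_effects: frozenset    # propositions made false by the action

def applicable(state: frozenset, action: Action) -> bool:
    return action.preconditions <= state

def apply(state: frozenset, action: Action) -> frozenset:
    assert applicable(state, action)
    return (state - action.del_effects) | action.add_effects

# Toy example: slide a tile into the adjacent blank cell of a sliding-tile puzzle.
slide = Action(
    name="slide-t1-right",
    preconditions=frozenset({"at(t1,c1)", "blank(c2)", "adjacent(c1,c2)"}),
    add_effects=frozenset({"at(t1,c2)", "blank(c1)"}),
    del_effects=frozenset({"at(t1,c1)", "blank(c2)"}),
)

state = frozenset({"at(t1,c1)", "blank(c2)", "adjacent(c1,c2)"})
print(apply(state, slide))  # -> {"adjacent(c1,c2)", "at(t1,c2)", "blank(c1)"}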