Paper Title


Solving the vehicle routing problem with deep reinforcement learning

Paper Authors

Simone Foa, Corrado Coppola, Giorgio Grani, Laura Palagi

Paper Abstract


Recently, the application of Reinforcement Learning (RL) methodologies to NP-Hard combinatorial optimization problems has become a popular topic. This is essentially due to the nature of traditional combinatorial algorithms, which are often based on a trial-and-error process; RL aims at automating this process. In this regard, this paper focuses on the application of RL to the Vehicle Routing Problem (VRP), a famous combinatorial problem that belongs to the class of NP-Hard problems. In this work, the problem is first modeled as a Markov Decision Process (MDP), and then the PPO method (which belongs to the Actor-Critic class of RL methods) is applied. In a second phase, the neural architecture behind the Actor and the Critic is established: an architecture based on Convolutional Neural Networks is adopted for both the Actor and the Critic. This choice makes it possible to effectively address problems of different sizes. Experiments performed on a wide range of instances show that the algorithm has good generalization capabilities and can reach good solutions in a short time. Comparisons between the proposed algorithm and the state-of-the-art solver OR-TOOLS show that the latter still outperforms the RL algorithm. However, there are future research directions that aim to improve the current performance of the proposed algorithm.
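To make the MDP formulation mentioned in the abstract concrete, here is a minimal, self-contained sketch of a capacitated VRP framed as an MDP. The class name (`VRPEnv`), the exact state, action, and reward definitions, and the greedy rollout policy are all illustrative assumptions for this sketch, not the paper's actual design; the paper's Actor is a learned CNN-based policy, which the greedy heuristic here merely stands in for.

```python
import math

# Sketch of a capacitated-VRP environment framed as a Markov Decision
# Process (MDP). Assumed (not the paper's exact) formulation:
#   state  = (current node, remaining vehicle load, set of unserved customers)
#   action = next node to visit (a feasible customer, or the depot to refill)
#   reward = negative travel distance, so maximizing return minimizes tour length
class VRPEnv:
    def __init__(self, coords, demands, capacity):
        self.coords = coords      # coords[0] is the depot
        self.demands = demands    # demands[0] == 0 for the depot
        self.capacity = capacity
        self.reset()

    def reset(self):
        self.position = 0
        self.load = self.capacity
        self.unserved = set(range(1, len(self.coords)))
        return (self.position, self.load, frozenset(self.unserved))

    def _dist(self, a, b):
        (x1, y1), (x2, y2) = self.coords[a], self.coords[b]
        return math.hypot(x1 - x2, y1 - y2)

    def legal_actions(self):
        # Customers whose demand fits the remaining load; the depot is
        # reachable whenever the vehicle is not already there.
        acts = [c for c in self.unserved if self.demands[c] <= self.load]
        if self.position != 0:
            acts.append(0)
        return acts

    def step(self, action):
        reward = -self._dist(self.position, action)  # negative travel cost
        self.position = action
        if action == 0:
            self.load = self.capacity                # refill at the depot
        else:
            self.load -= self.demands[action]
            self.unserved.discard(action)
        done = not self.unserved and self.position == 0
        return (self.position, self.load, frozenset(self.unserved)), reward, done

# A greedy nearest-feasible-customer policy, used here only as a stand-in
# for the learned Actor network trained with PPO in the paper.
def greedy_rollout(env):
    env.reset()
    total, done = 0.0, False
    while not done:
        acts = env.legal_actions()
        customers = [a for a in acts if a != 0]
        pool = customers or acts  # prefer serving a customer over refilling
        action = min(pool, key=lambda a: env._dist(env.position, a))
        _, reward, done = env.step(action)
        total += reward
    return total
```

In an RL setting, the greedy policy above would be replaced by sampling actions from the Actor's output distribution over `legal_actions()`, with the accumulated negative distance serving as the return that PPO maximizes.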
