Multi-Vehicle Routing Problems with Soft Time Windows: A Multi-Agent Reinforcement Learning Approach

Paper Title

Multi-Vehicle Routing Problems with Soft Time Windows: A Multi-Agent Reinforcement Learning Approach

Authors

Zhang, Ke; Li, Meng; Zhang, Zhengchao; Lin, Xi; He, Fang

Abstract

The multi-vehicle routing problem with soft time windows (MVRPSTW) is an indispensable component of urban logistics distribution systems. Over the past decade, numerous methods for MVRPSTW have been proposed, but most are based on heuristic rules that require a large amount of computation time. With the current rapid increase in logistics demand, traditional methods face a trade-off between computational efficiency and solution quality. To solve the problem efficiently, we propose a novel reinforcement learning algorithm called the Multi-Agent Attention Model, which can solve routing problems almost instantly by benefiting from lengthy offline training. Specifically, the vehicle routing problem is regarded as a vehicle tour generation process, and an encoder-decoder framework with attention layers is proposed to generate the tours of multiple vehicles iteratively. Furthermore, a multi-agent reinforcement learning method with an unsupervised auxiliary network is developed for model training. Evaluated on four synthetic networks of different scales, the proposed method consistently outperforms Google OR-Tools and traditional methods while requiring little computation time. In addition, we validate the robustness of the well-trained model by varying the number of customers and the capacities of the vehicles.
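To make the "soft time window" idea concrete, the following is a minimal sketch of how the cost of one vehicle's tour might be scored when late or early arrivals are penalized rather than forbidden. It is not the paper's cost function: the linear penalty form, the rates `early_rate` and `late_rate`, the zero service time, the no-waiting rule, and the names `Customer`, `soft_window_penalty`, and `route_cost` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Customer:
    x: float
    y: float
    earliest: float  # start of the preferred service window
    latest: float    # end of the preferred service window


def soft_window_penalty(arrival, earliest, latest, early_rate=0.5, late_rate=2.0):
    """Linear penalty for arriving outside [earliest, latest]; zero inside it.

    The penalty rates are illustrative assumptions, not values from the paper.
    """
    early = max(0.0, earliest - arrival)
    late = max(0.0, arrival - latest)
    return early_rate * early + late_rate * late


def route_cost(depot, customers, speed=1.0):
    """Travel distance plus soft time-window penalties for one vehicle's tour.

    Assumes the vehicle leaves the depot at time 0, never waits, and that
    service takes no time.
    """
    x, y = depot
    clock = 0.0
    distance = 0.0
    penalty = 0.0
    for c in customers:
        leg = hypot(c.x - x, c.y - y)
        distance += leg
        clock += leg / speed
        penalty += soft_window_penalty(clock, c.earliest, c.latest)
        x, y = c.x, c.y
    distance += hypot(depot[0] - x, depot[1] - y)  # return to the depot
    return distance + penalty
```

Under these assumptions, a violated window only raises the tour's cost, so every visiting order remains feasible. A learned policy such as the one described in the abstract could be trained against a scalar cost of this kind, though the exact objective used by the authors is not stated here.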

Paper Title