Paper Title
Vehicle management in a modular production context using Deep Q-Learning
Paper Authors
Paper Abstract
We investigate the feasibility of deploying Deep-Q-based deep reinforcement learning (DRL) agents on job-shop scheduling problems in the context of modular production facilities, using discrete-event simulations as the environment. These environments consist of a source and a sink for the parts to be processed, as well as one or more workstations. The agents are trained to schedule automated guided vehicles (AGVs) that transport parts between these stations in an optimal fashion. Starting from a very simple setup, we increase the complexity of the environment and compare the agents' performance with well-established heuristic approaches, such as first-in-first-out (FIFO) based agents, cost tables, and a nearest-neighbor approach. We furthermore seek particular configurations of the environments in which the heuristic approaches struggle, to investigate to what degree the Deep-Q agents are affected by the same challenges. We find that the Deep-Q-based agents show performance comparable to the heuristic baselines. Moreover, our findings suggest that the DRL agents exhibit increased robustness to noise compared with the conventional approaches. Overall, we conclude that DRL agents constitute a valuable approach to this type of scheduling problem.
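To make the baseline dispatch rules concrete, below is a minimal, self-contained sketch of the kind of comparison the abstract describes: a single AGV serving transport jobs between a source, two workstations, and a sink, dispatched either first-in-first-out or by a nearest-neighbor rule. This is an illustration under simplified assumptions (a one-dimensional layout, unit speed, one vehicle, no contention); the station names and the `make_jobs`, `run`, and `pick` interfaces are hypothetical and not taken from the paper.

```python
# Illustrative sketch (not the authors' code): comparing a FIFO and a
# nearest-neighbor dispatch rule for one AGV on a 1-D line of stations.
# All station names, positions, and interfaces are hypothetical.
import random

STATIONS = {"source": 0.0, "ws1": 3.0, "ws2": 7.0, "sink": 10.0}  # positions

def make_jobs(n, seed=0):
    """Generate random transport jobs: move a part from one station to another."""
    rng = random.Random(seed)
    names = list(STATIONS)
    return [tuple(rng.sample(names, 2)) for _ in range(n)]

def run(jobs, pick):
    """Serve all jobs with one AGV; `pick` selects the index of the next job."""
    pos, t = STATIONS["source"], 0.0
    pending = list(jobs)
    while pending:
        src, dst = pending.pop(pick(pos, pending))
        # Drive to the pickup station, then to the drop-off (unit speed).
        t += abs(STATIONS[src] - pos) + abs(STATIONS[dst] - STATIONS[src])
        pos = STATIONS[dst]
    return t  # total time to complete the transport schedule

def fifo(pos, pending):
    """First-in-first-out: always serve the oldest pending job."""
    return 0

def nearest(pos, pending):
    """Nearest-neighbor: serve the job whose pickup station is closest."""
    return min(range(len(pending)),
               key=lambda i: abs(STATIONS[pending[i][0]] - pos))

jobs = make_jobs(20)
print("FIFO makespan:            ", run(jobs, fifo))
print("Nearest-neighbor makespan:", run(jobs, nearest))
```

In the paper's setting, the trained Deep-Q agent would play the role of `pick`, selecting the next transport action as the argmax over learned Q-values for the current state rather than via a hand-coded rule.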