论文标题
加强旅行推销员问题的Lin-Kernighan-Helsgaun算法
Reinforced Lin-Kernighan-Helsgaun Algorithms for the Traveling Salesman Problems
论文作者
论文摘要
TSP是许多实用变体的经典NP-HARD组合优化问题。 LKH是TSP的最先进的本地搜索算法之一。 LKH-3是LKH的强大扩展,可以解决许多TSP变体。 LKH和LKH-3都将一个候选人与每个城市相关联,以提高效率,并有两种不同的方法,分别是$α$量和popmusic,以决定候选人集。在这项工作中,我们首先提出了一种可变策略增强LKH(VSR-LKH)算法,该算法将三种强化学习方法(Q-Learning,SARSA,Monte Carlo)与LKH一起用于TSP。我们进一步提出了一种称为VSR-LKH-3的新算法,该算法将可变策略增强学习方法与LKH-3结合在一起,用于典型的TSP变体,包括带有时间窗口(TSPTW)和彩色TSP(CTSP)的TSP。所提出的算法取代了LKH和LKH-3中的不灵活的遍历操作,并让算法通过增强学习在每个搜索步骤中学习在每个搜索步骤中做出选择。 LKH和LKH-3都具有$α$量或popmusic,我们的方法都可以显着改善。对236个广泛使用的TSP基准的广泛实验,最多85,900个城市证明了VSR-LKH的出色表现。 VSR-LKH-3还显着优于TSPTW和CTSP的最新启发式方法。
TSP is a classical NP-hard combinatorial optimization problem with many practical variants. LKH is one of the state-of-the-art local search algorithms for the TSP. LKH-3 is a powerful extension of LKH that can solve many TSP variants. Both LKH and LKH-3 associate a candidate set to each city to improve the efficiency, and have two different methods, $α$-measure and POPMUSIC, to decide the candidate sets. In this work, we first propose a Variable Strategy Reinforced LKH (VSR-LKH) algorithm, which incorporates three reinforcement learning methods (Q-learning, Sarsa, Monte Carlo) with LKH, for the TSP. We further propose a new algorithm called VSR-LKH-3 that combines the variable strategy reinforcement learning method with LKH-3 for typical TSP variants, including the TSP with time windows (TSPTW) and Colored TSP (CTSP). The proposed algorithms replace the inflexible traversal operations in LKH and LKH-3 and let the algorithms learn to make a choice at each search step by reinforcement learning. Both LKH and LKH-3, with either $α$-measure or POPMUSIC, can be significantly improved by our methods. Extensive experiments on 236 widely-used TSP benchmarks with up to 85,900 cities demonstrate the excellent performance of VSR-LKH. VSR-LKH-3 also significantly outperforms the state-of-the-art heuristics for TSPTW and CTSP.