Paper Title


Adaptive Combination of a Genetic Algorithm and Novelty Search for Deep Neuroevolution

Authors

Eyal Segal, Moshe Sipper

Abstract


Evolutionary Computation (EC) has been shown to be able to quickly train Deep Artificial Neural Networks (DNNs) to solve Reinforcement Learning (RL) problems. While a Genetic Algorithm (GA) is well-suited for exploiting reward functions that are neither deceptive nor sparse, it struggles when the reward function is either of those. To that end, Novelty Search (NS) has been shown to be able to outperform gradient-following optimizers in some cases, while under-performing in others. We propose a new algorithm: Explore-Exploit $γ$-Adaptive Learner ($E^2γAL$, or EyAL). By preserving a dynamically-sized niche of novelty-seeking agents, the algorithm manages to maintain population diversity, exploiting the reward signal when possible and exploring otherwise. The algorithm combines both the exploitation power of a GA and the exploration power of NS, while maintaining their simplicity and elegance. Our experiments show that EyAL outperforms NS in most scenarios, while being on par with a GA -- and in some scenarios it can outperform both. EyAL also allows the substitution of the exploiting component (GA) and the exploring component (NS) with other algorithms, e.g., Evolution Strategy and Surprise Search, thus opening the door for future research.
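The core idea described above, keeping a dynamically-sized niche of novelty-seeking agents alongside fitness-driven ones, can be illustrated with a minimal sketch. Note this is an assumption-based toy illustration, not the authors' implementation: the function names, the toy one-dimensional genome, the nearest-neighbor novelty measure, and the specific rule for adapting the niche fraction $γ$ (shrink it when the best fitness improves, grow it when it stagnates) are all hypothetical choices made here for clarity.

```python
import random

def eyal_sketch(evaluate, behavior, pop_size=20, generations=30):
    """Toy explore-exploit adaptive split (hypothetical sketch):
    a fraction gamma of parents is chosen by novelty (exploration),
    the rest by fitness (exploitation); gamma grows when the best
    fitness stagnates and shrinks when it improves."""
    pop = [random.uniform(-5.0, 5.0) for _ in range(pop_size)]
    archive = []                      # behaviors seen so far, for novelty
    gamma, best = 0.5, float("-inf")

    def novelty(b, k=5):
        # Novelty = mean distance to the k nearest archived behaviors.
        dists = sorted(abs(b - a) for a in archive)
        return sum(dists[:k]) / k

    for _ in range(generations):
        fits = [evaluate(x) for x in pop]
        behs = [behavior(x) for x in pop]
        archive.extend(behs)
        novs = [novelty(b) for b in behs]

        # Adapt gamma: improvement -> exploit more; stagnation -> explore more.
        gen_best = max(fits)
        gamma = max(0.1, gamma * 0.9) if gen_best > best else min(0.9, gamma * 1.1)
        best = max(best, gen_best)

        # Split parent selection: top novelty-ranked plus top fitness-ranked.
        n_novel = int(gamma * pop_size)
        by_nov = [x for _, x in sorted(zip(novs, pop), reverse=True)]
        by_fit = [x for _, x in sorted(zip(fits, pop), reverse=True)]
        parents = by_nov[:n_novel] + by_fit[:pop_size - n_novel]
        pop = [p + random.gauss(0, 0.3) for p in parents]  # Gaussian mutation
    return best

# Example: maximize -(x - 2)^2 with the genome itself as the behavior.
best = eyal_sketch(lambda x: -(x - 2) ** 2, lambda x: x)
```

As the abstract notes, the two halves of the parent-selection step are modular: the fitness-ranked half could be replaced by another exploiting optimizer (e.g., an Evolution Strategy) and the novelty-ranked half by another exploring one (e.g., Surprise Search).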
