Training Diverse High-Dimensional Controllers by Scaling Covariance Matrix Adaptation MAP-Annealing

Paper Title

Training Diverse High-Dimensional Controllers by Scaling Covariance Matrix Adaptation MAP-Annealing

Authors

Bryon Tjanaka, Matthew C. Fontaine, David H. Lee, Aniruddha Kalkar, Stefanos Nikolaidis

Abstract

Pre-training a diverse set of neural network controllers in simulation has enabled robots to adapt online to damage in robot locomotion tasks. However, finding diverse, high-performing controllers requires expensive network training and extensive tuning of a large number of hyperparameters. On the other hand, Covariance Matrix Adaptation MAP-Annealing (CMA-MAE), an evolution strategies (ES)-based quality diversity algorithm, does not have these limitations and has achieved state-of-the-art performance on standard QD benchmarks. However, CMA-MAE cannot scale to modern neural network controllers due to its quadratic complexity. We leverage efficient approximation methods in ES to propose three new CMA-MAE variants that scale to high dimensions. Our experiments show that the variants outperform ES-based baselines in benchmark robotic locomotion tasks, while being comparable with or exceeding state-of-the-art deep reinforcement learning-based quality diversity algorithms.
