Paper Title

TLeague: A Framework for Competitive Self-Play based Distributed Multi-Agent Reinforcement Learning

Authors

Peng Sun, Jiechao Xiong, Lei Han, Xinghai Sun, Shuxing Li, Jiawei Xu, Meng Fang, Zhengyou Zhang

Abstract

Competitive Self-Play (CSP) based Multi-Agent Reinforcement Learning (MARL) has shown phenomenal breakthroughs recently. Strong AIs have been achieved for several benchmarks, including Dota 2, Glory of Kings, Quake III, and StarCraft II, to name a few. Despite this success, MARL training is extremely data-hungry, typically requiring billions (if not trillions) of frames to be seen from the environment during training in order to learn a high-performance agent. This poses non-trivial difficulties for researchers and engineers and prevents the application of MARL to a broader range of real-world problems. To address this issue, in this manuscript we describe a framework, referred to as TLeague, that aims at large-scale training and implements several mainstream CSP-MARL algorithms. Training can be deployed on either a single machine or a cluster of hybrid machines (CPUs and GPUs), where standard Kubernetes is supported in a cloud-native manner. TLeague achieves high throughput and a reasonable scale-up when performing distributed training. Thanks to its modular design, it is also easy to extend for solving other multi-agent problems or for implementing and verifying MARL algorithms. We present experiments on StarCraft II, ViZDoom, and Pommerman to show the efficiency and effectiveness of TLeague. The code is open-sourced and available at https://github.com/tencent-ailab/tleague_projpage
