The Dynamics of Q-learning in Population Games: a Physics-Inspired Continuity Equation Model

Paper Title

The Dynamics of Q-learning in Population Games: a Physics-Inspired Continuity Equation Model

Authors

Hu, Shuyue; Leung, Chin-Wing; Leung, Ho-fung; Soh, Harold

Abstract

Although learning has found wide application in multi-agent systems (MASs), its effects on the temporal evolution of a system are far from understood. This paper focuses on the dynamics of Q-learning in large-scale multi-agent systems modeled as population games. We revisit the replicator equation model for Q-learning dynamics and observe that this model is inappropriate for our concerned setting. Motivated by this, we develop a new formal model, which bears a formal connection with the continuity equation in physics. We show that our model always accurately describes the Q-learning dynamics in population games across different initial settings of MASs and game configurations. We also show that our model can be applied to different exploration mechanisms, describe the mean dynamics, and be extended to Q-learning in 2-player and n-player games. Last but not least, we show that our model can provide insights into algorithm parameters and facilitate parameter tuning.

