Paper Title

Achieving Correlated Equilibrium by Studying Opponent's Behavior Through Policy-Based Deep Reinforcement Learning

Authors

Tsai, Kuo Chun; Han, Zhu

Abstract

Game theory is a profound study of distributed decision-making behavior and has been extensively developed by many scholars. However, many existing works rely on strict assumptions, such as knowledge of the opponent's private behavior, which may not be practical. In this work, we focus on two Nobel Prize-winning concepts: the Nash equilibrium and the correlated equilibrium. Specifically, we successfully reach a correlated equilibrium outside the convex hull of the Nash equilibria with our proposed deep reinforcement learning algorithm. Given the correlated equilibrium probability distribution, we also propose a mathematical model that inverts the computation of this distribution to estimate the opponent's payoff vector. With those payoffs, deep reinforcement learning learns why and how the rational opponent plays, instead of merely learning the regions for corresponding strategies and actions. Through simulations, we show that our proposed method achieves the optimal correlated equilibrium, lying outside the convex hull of the Nash equilibria, with limited interaction among players.
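To make the central notion concrete, the sketch below checks the obedience (incentive) constraints of a correlated equilibrium on the classic game of Chicken. This is an illustrative example only, not the game or the algorithm from the paper: the game matrix, the distribution `p`, and the helper `is_correlated_equilibrium` are all assumptions for illustration. In Chicken, the mediator's distribution placing probability 1/3 on each of (Dare, Chicken), (Chicken, Dare), and (Chicken, Chicken) yields each player an expected payoff of 5, which lies outside the convex hull of the Nash equilibrium payoffs — the phenomenon the abstract refers to.

```python
import numpy as np

# Payoff matrices for the game of Chicken (illustrative; not from the paper).
# Rows/columns: index 0 = Dare, index 1 = Chicken.
U1 = np.array([[0, 7],
               [2, 6]])  # row player's payoffs
U2 = U1.T                # symmetric game: column player's payoffs

# Candidate correlated equilibrium: a mediator recommends (D,C), (C,D),
# or (C,C) with probability 1/3 each, and never recommends (D,D).
p = np.array([[0.0, 1 / 3],
              [1 / 3, 1 / 3]])

def is_correlated_equilibrium(p, U1, U2, tol=1e-9):
    """Check that no player gains by deviating from any recommended action."""
    n, m = p.shape
    # Row player: for each recommended row a and each deviation b, the
    # conditional gain of obeying must be non-negative.
    for a in range(n):
        for b in range(n):
            if np.dot(p[a], U1[a] - U1[b]) < -tol:
                return False
    # Column player: the symmetric check over columns.
    for a in range(m):
        for b in range(m):
            if np.dot(p[:, a], U2[:, a] - U2[:, b]) < -tol:
                return False
    return True

expected_payoffs = (float(np.sum(p * U1)), float(np.sum(p * U2)))
print(is_correlated_equilibrium(p, U1, U2), expected_payoffs)
# True (5.0, 5.0) — beats the symmetric mixed Nash payoff of 14/3 per player.
```

The same constraints, read in reverse with `p` known and the payoffs unknown, give the flavor of the paper's inverse model: the equilibrium distribution constrains the opponent's payoff vector.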
