Learning to Reason in Round-based Games: Multi-task Sequence Generation for Purchasing Decision Making in First-person Shooters

Paper Title

Learning to Reason in Round-based Games: Multi-task Sequence Generation for Purchasing Decision Making in First-person Shooters

Authors

Yilei Zeng, Deren Lei, Beichen Li, Gangrong Jiang, Emilio Ferrara, Michael Zyda

Abstract

Sequential reasoning is a complex human ability. While extensive previous research has focused on game AI within a single continuous game, round-based decision making that extends across a sequence of games remains less explored. Counter-Strike: Global Offensive (CS:GO), a round-based game with abundant expert demonstrations, provides an excellent environment for multi-player round-based sequential reasoning. In this work, we propose a Sequence Reasoner with a Round Attribute Encoder and a Multi-Task Decoder to interpret the strategies behind round-based purchasing decisions. We adopt few-shot learning to sample multiple rounds in a match and modify the model-agnostic meta-learning algorithm Reptile for the meta-learning loop. We formulate each round as a multi-task sequence generation problem. Our state representations combine an action encoder, a team encoder, player features, a round attribute encoder, and economy encoders to help our agent learn to reason under this specific multi-player round-based scenario. A complete ablation study and a comparison with the greedy approach confirm the effectiveness of our model. Our research will open doors for interpretable AI for understanding episodic and long-term purchasing strategies beyond the gaming community.
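For readers unfamiliar with Reptile, the following is a minimal sketch of the plain Reptile meta-learning loop, not the modified variant used in the paper, whose details the abstract does not give. It assumes a toy setting in which each "task" is a small linear-regression dataset standing in for the rounds sampled from one match. The names `sgd_steps`, `reptile`, and `toy_tasks`, and all hyperparameter values, are hypothetical and chosen only for illustration.

```python
import random


def sgd_steps(theta, task_data, lr=0.02, steps=5):
    """Inner loop: a few SGD passes on one task's squared error, starting from theta."""
    w, b = theta
    for _ in range(steps):
        for x, y in task_data:
            err = (w * x + b) - y
            w -= lr * err * x  # gradient of 0.5 * err**2 with respect to w
            b -= lr * err      # gradient of 0.5 * err**2 with respect to b
    return (w, b)


def reptile(tasks, meta_lr=0.1, meta_iters=200, seed=0):
    """Outer loop: move the shared initialization toward each task-adapted solution."""
    rng = random.Random(seed)
    theta = (0.0, 0.0)
    for _ in range(meta_iters):
        task = rng.choice(tasks)      # sample one task (assumed analogue: one match)
        phi = sgd_steps(theta, task)  # adapt to that task from the current initialization
        # Reptile update: theta <- theta + meta_lr * (phi - theta)
        theta = tuple(t + meta_lr * (p - t) for t, p in zip(theta, phi))
    return theta


# Toy tasks (assumption): lines y = a * x + 1 that share an intercept but differ in slope.
toy_tasks = [
    [(x, a * x + 1.0) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
    for a in (1.0, 2.0, 3.0)
]

if __name__ == "__main__":
    w, b = reptile(toy_tasks)
    print(f"meta-learned initialization: w={w:.3f}, b={b:.3f}")
```

In this toy setting the learned initialization settles near the shared intercept and at a slope between the task slopes, so a few inner-loop steps are enough to adapt to any single task. That is the few-shot behavior the meta-learning loop is meant to provide.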
