Paper Title
FireCommander: An Interactive, Probabilistic Multi-agent Environment for Heterogeneous Robot Teams
Paper Authors
Paper Abstract
The purpose of this tutorial is to help individuals use the \underline{FireCommander} game environment for research applications. FireCommander is an interactive, probabilistic joint perception-action reconnaissance environment in which a composite team of agents (e.g., robots) cooperates to fight dynamic, propagating firespots (i.e., targets). In the FireCommander game, a team of agents must be tasked to optimally deal with a wildfire situation in an environment with propagating fire areas and facilities such as houses, hospitals, and power stations. The team of agents can accomplish their mission by first sensing (e.g., estimating fire states), then communicating the sensed fire information among each other, and finally taking action to put the firespots out based on the sensed information (e.g., dropping water on estimated fire locations). The FireCommander environment can be useful for research topics spanning a wide range of applications, from Reinforcement Learning (RL) and Learning from Demonstration (LfD) to Coordination, Psychology, Human-Robot Interaction (HRI), and Teaming. There are four important facets of the FireCommander environment that, together, create a non-trivial game: (1) Complex Objectives: a multi-objective stochastic environment; (2) Probabilistic Environment: agents' actions result in probabilistic performance; (3) Hidden Targets: a partially observable environment; and (4) Uni-task Robots: perception-only and action-only agents. The FireCommander environment is first of its kind in including perception-only and action-only agents for coordination. It is a general, multi-purpose game that can be useful in a variety of combinatorial optimization problems and stochastic games, such as applications of Reinforcement Learning (RL), Learning from Demonstration (LfD), and Inverse RL (iRL).
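To make the interaction pattern described above concrete, the following is a minimal, self-contained Python sketch of the kind of loop the abstract outlines: a perception-only agent senses nearby firespots under partial observability, an action-only agent attempts to extinguish a reported spot with probabilistic success, and the fire propagates stochastically each step. All class names, methods, and parameters in this sketch are illustrative assumptions for exposition and do not reflect the actual FireCommander API.

```python
import random

GRID = 10          # size of a hypothetical square grid world
SPREAD_PROB = 0.2  # chance a firespot ignites a neighboring cell each step

class PerceptionAgent:
    """Perception-only agent: can sense nearby firespots but cannot act on them."""
    def __init__(self, x, y, sense_range=2):
        self.x, self.y, self.sense_range = x, y, sense_range

    def sense(self, firespots):
        # Partial observability: only firespots within sensing range are seen.
        return {f for f in firespots
                if abs(f[0] - self.x) <= self.sense_range
                and abs(f[1] - self.y) <= self.sense_range}

class ActionAgent:
    """Action-only agent: can try to extinguish a reported firespot but cannot sense."""
    def act(self, reported):
        if reported:
            target = reported.pop()
            # Probabilistic performance: the water drop may fail.
            return target if random.random() < 0.8 else None
        return None

def propagate(firespots):
    """Stochastic dynamics: each firespot may ignite one of its four neighbors."""
    new = set(firespots)
    for (x, y) in firespots:
        if random.random() < SPREAD_PROB:
            dx, dy = random.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
            nx, ny = x + dx, y + dy
            if 0 <= nx < GRID and 0 <= ny < GRID:
                new.add((nx, ny))
    return new

def run_episode(steps=20):
    firespots = {(5, 5)}
    sensor, actor = PerceptionAgent(4, 4), ActionAgent()
    for t in range(steps):
        observed = sensor.sense(firespots)   # perception step
        hit = actor.act(set(observed))       # communicated observation -> action step
        if hit in firespots:
            firespots.remove(hit)
        firespots = propagate(firespots)     # fire propagation step
        print(f"t={t}: {len(firespots)} active firespots")

if __name__ == "__main__":
    run_episode()
```

Running the sketch prints the number of active firespots over time; the separation of `PerceptionAgent.sense` and `ActionAgent.act` mirrors the uni-task robot facet, where no single agent can both observe and extinguish the fire.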