Paper Title

Assisting Unknown Teammates in Unknown Tasks: Ad Hoc Teamwork under Partial Observability

Authors

João G. Ribeiro, Cassandro Martinho, Alberto Sardinha, Francisco S. Melo

Abstract

In this paper, we present a novel Bayesian online prediction algorithm for ad hoc teamwork under partial observability (ATPO), which enables on-the-fly collaboration with unknown teammates performing an unknown task, without requiring a pre-coordination protocol. Unlike previous works that assume a fully observable environment state, ATPO accommodates partial observability, using the agent's observations to identify which task the teammates are performing. Our approach assumes neither that the teammates' actions are visible nor that an environment reward signal is available. We evaluate ATPO in three domains: two modified versions of the Pursuit domain with partial observability, and the Overcooked domain. Our results show that ATPO is effective and robust in identifying the teammates' task from a large library of possible tasks, efficient at solving it in near-optimal time, and scalable in adapting to increasingly large problem sizes.
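The abstract does not spell out the update rule, so the following is only a generic illustration of Bayesian online prediction over a task library, not the authors' ATPO implementation. It assumes a hypothetical three-task library and made-up per-task observation likelihoods; in a real system those likelihoods would come from a model of each task.

```python
from typing import Dict


def update_task_belief(
    belief: Dict[str, float], likelihoods: Dict[str, float]
) -> Dict[str, float]:
    """One Bayes step: posterior(task) is proportional to
    likelihood(observation | task) * prior(task)."""
    unnormalized = {task: belief[task] * likelihoods[task] for task in belief}
    total = sum(unnormalized.values())
    if total == 0.0:
        # The observation is impossible under every task: keep the prior
        # instead of dividing by zero.
        return dict(belief)
    return {task: weight / total for task, weight in unnormalized.items()}


# Hypothetical task library with a uniform prior (names are illustrative).
belief = {"task_a": 1 / 3, "task_b": 1 / 3, "task_c": 1 / 3}

# Made-up likelihoods of two successive observations under each task.
observation_likelihoods = [
    {"task_a": 0.7, "task_b": 0.2, "task_c": 0.1},
    {"task_a": 0.6, "task_b": 0.3, "task_c": 0.1},
]

for likelihoods in observation_likelihoods:
    belief = update_task_belief(belief, likelihoods)

# The agent would then act according to the task it currently finds most likely.
most_likely_task = max(belief, key=belief.get)
print(most_likely_task, belief)
```

Note that the update uses only the agent's own observations: it needs neither the teammates' actions nor a reward signal, which matches the assumptions stated in the abstract.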
