Paper title
Query The Agent: Improving sample efficiency through epistemic uncertainty estimation
Paper authors
Abstract
Curricula for goal-conditioned reinforcement learning agents typically rely on poor estimates of the agent's epistemic uncertainty or fail to consider the agent's epistemic uncertainty altogether, resulting in poor sample efficiency. We propose a novel algorithm, Query The Agent (QTA), which significantly improves sample efficiency by estimating the agent's epistemic uncertainty throughout the state space and setting goals in highly uncertain areas. Encouraging the agent to collect data in highly uncertain states allows it to improve its estimate of the value function rapidly. QTA uses a novel technique for estimating epistemic uncertainty, Predictive Uncertainty Networks (PUN), which allows QTA to assess the agent's uncertainty in all previously observed states. We demonstrate that QTA offers decisive sample efficiency improvements over preexisting methods.
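The goal-setting idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how PUN computes uncertainty, so this sketch swaps in a different, commonly used proxy: the standard deviation of value predictions across a small ensemble. The function names, the toy states, and the prediction values are all hypothetical and are not taken from the paper.

```python
import numpy as np


def ensemble_uncertainty(value_predictions):
    """Stand-in for epistemic uncertainty (NOT the paper's PUN).

    value_predictions has shape (n_members, n_states); the result is the
    per-state standard deviation across ensemble members.
    """
    return np.std(value_predictions, axis=0)


def select_goal(observed_states, uncertainty):
    """Return the previously observed state with the highest uncertainty."""
    return observed_states[int(np.argmax(uncertainty))]


# Hypothetical toy data: 4 previously observed 2-D states, 3 ensemble members.
observed_states = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
value_predictions = np.array([
    [0.50, 0.40, 0.10, 0.90],
    [0.50, 0.45, 0.80, 0.85],
    [0.50, 0.35, 0.30, 0.95],
])

uncertainty = ensemble_uncertainty(value_predictions)
goal = select_goal(observed_states, uncertainty)
print("per-state uncertainty:", uncertainty)
print("selected goal:", goal)
```

In this toy setup the ensemble members disagree most about the third state, so that state is proposed as the next goal. A greedy argmax is only one possible selection rule; sampling goals in proportion to their uncertainty would be an equally plausible choice under the same assumptions.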