Active Predictive Coding: Brain-Inspired Reinforcement Learning for Sparse Reward Robotic Control Problems

Paper Title

Active Predictive Coding: Brain-Inspired Reinforcement Learning for Sparse Reward Robotic Control Problems

Authors

Alexander Ororbia, Ankur Mali

Abstract

In this article, we propose a backpropagation-free approach to robotic control through the neuro-cognitive computational framework of neural generative coding (NGC), designing an agent built completely from powerful predictive coding/processing circuits that facilitate dynamic, online learning from sparse rewards and embody the principles of planning-as-inference. Concretely, we craft an adaptive agent system, which we call active predictive coding (ActPC), that balances an internally generated epistemic signal (meant to encourage intelligent exploration) with an internally generated instrumental signal (meant to encourage goal-seeking behavior). The agent ultimately learns to control various simulated robotic systems, as well as a complex robotic arm in a realistic robotics simulator, the Surreal Robotics Suite, on the block lifting and can pick-and-place problems. Notably, our experimental results demonstrate that the proposed ActPC agent performs well in the face of sparse (extrinsic) reward signals and is competitive with or outperforms several powerful backprop-based RL approaches.
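As a rough illustration of what "balancing" an epistemic signal against an instrumental one could look like, the sketch below combines the two as a weighted sum. This is an assumption made for illustration, not the authors' formulation: the abstract does not say how the signals are computed or combined. The function names, the squared-error exploration bonus, and the weights `w_epistemic` and `w_instrumental` are all hypothetical.

```python
import numpy as np


def epistemic_signal(predicted_next_obs, actual_next_obs):
    """Hypothetical exploration bonus: squared error of a forward prediction.

    Assumption: a larger prediction error marks a more surprising (and so more
    informative) transition. The abstract does not specify this form.
    """
    diff = np.asarray(actual_next_obs, dtype=float) - np.asarray(
        predicted_next_obs, dtype=float
    )
    return float(np.sum(diff ** 2))


def combined_signal(epistemic, instrumental, w_epistemic=1.0, w_instrumental=1.0):
    """Weighted sum of the two signals; the weights are illustrative only."""
    return w_epistemic * epistemic + w_instrumental * instrumental


# A step with no extrinsic reward still yields a learning signal
# whenever the prediction was wrong.
bonus = epistemic_signal([0.0, 0.0], [0.5, -0.5])
total = combined_signal(bonus, instrumental=0.0)
```

Under these assumptions, the combined signal stays nonzero on steps where the sparse extrinsic reward is zero, which is one way an agent could keep learning between rewards.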
