Uncertainty-based Modulation for Lifelong Learning

Paper title

Uncertainty-based Modulation for Lifelong Learning

Authors

Andrew Brna, Ryan Brown, Patrick Connolly, Stephen Simons, Renee Shimizu, Mario Aguilar-Simon

Abstract

The creation of machine learning algorithms for intelligent agents capable of continuous, lifelong learning is a critical objective for algorithms being deployed on real-life systems in dynamic environments. Here we present an algorithm inspired by neuromodulatory mechanisms in the human brain that integrates and expands upon Stephen Grossberg's ground-breaking Adaptive Resonance Theory proposals. Specifically, it builds on the concept of uncertainty, and employs a series of neuromodulatory mechanisms to enable continuous learning, including self-supervised and one-shot learning. Algorithm components were evaluated in a series of benchmark experiments that demonstrate stable learning without catastrophic forgetting. We also demonstrate the critical role of developing these systems in a closed-loop manner where the environment and the agent's behaviors constrain and guide the learning process. To this end, we integrated the algorithm into an embodied simulated drone agent. The experiments show that the algorithm is capable of continuous learning of new tasks and under changed conditions with high classification accuracy (greater than 94 percent) in a virtual environment, without catastrophic forgetting. The algorithm accepts high dimensional inputs from any state-of-the-art detection and feature extraction algorithms, making it a flexible addition to existing systems. We also describe future development efforts focused on imbuing the algorithm with mechanisms to seek out new knowledge as well as employ a broader range of neuromodulatory processes.
