Paper Title


Deep Q-network using reservoir computing with multi-layered readout

Authors

Matsuki, Toshitaka

Abstract


Recurrent neural network (RNN) based reinforcement learning (RL) is used for learning context-dependent tasks and has also attracted attention as a method with remarkable learning performance in recent research. However, RNN-based RL has some issues: the learning procedures tend to be computationally expensive, and training with backpropagation through time (BPTT) is unstable because of the vanishing/exploding gradient problem. An approach with replay memory introducing reservoir computing has been proposed, which trains an agent without BPTT and avoids these issues. The basic idea of this approach is that observations from the environment are input to the reservoir network, and both the observation and the reservoir output are stored in the memory. This paper shows that the performance of this method improves by using a multi-layered neural network for the readout layer, which conventionally consists of a single linear layer. The experimental results show that using a multi-layered readout improves the learning performance on four classical control tasks that require time-series processing.
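The basic idea described in the abstract can be sketched as follows: a fixed random reservoir (echo state network) turns the observation stream into a temporal feature vector, a small multi-layer perceptron readout maps the observation plus reservoir state to Q-values, and the replay memory stores the reservoir output alongside each observation so that minibatch training needs no backpropagation through time. This is a minimal NumPy sketch, not the authors' implementation; all dimensions, the leak rate, the spectral radius of 0.9, and the two-layer readout sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, RES_DIM, N_ACTIONS = 4, 100, 2  # assumed sizes for illustration

# Fixed random reservoir weights; only the readout would be trained.
W_in = rng.uniform(-0.1, 0.1, (RES_DIM, OBS_DIM))
W_res = rng.normal(0.0, 1.0, (RES_DIM, RES_DIM))
# Rescale to spectral radius 0.9 for the echo state property.
W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))

def reservoir_step(state, obs, leak=0.5):
    """Leaky-integrator reservoir update driven by the observation."""
    pre = W_in @ obs + W_res @ state
    return (1.0 - leak) * state + leak * np.tanh(pre)

# Multi-layered readout: an MLP from [obs; reservoir state] to Q-values
# (the conventional readout would be a single linear layer instead).
W1 = rng.normal(0.0, 0.1, (64, OBS_DIM + RES_DIM))
W2 = rng.normal(0.0, 0.1, (N_ACTIONS, 64))

def q_values(obs, res_state):
    h = np.maximum(0.0, W1 @ np.concatenate([obs, res_state]))  # ReLU hidden layer
    return W2 @ h

# Replay memory stores the observation together with the reservoir output,
# so transitions can later be sampled i.i.d. for DQN-style updates without BPTT.
replay = []
state = np.zeros(RES_DIM)
for t in range(5):
    obs = rng.normal(size=OBS_DIM)  # stand-in for an environment observation
    state = reservoir_step(state, obs)
    action = int(np.argmax(q_values(obs, state)))
    replay.append((obs.copy(), state.copy(), action))
```

Because the reservoir weights are never trained, each stored `(obs, reservoir_state)` pair already encodes the history needed for the Q-value estimate, which is what lets the readout be trained like an ordinary feed-forward network.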
