Distributed Online System Identification for LTI Systems Using Reverse Experience Replay

Paper Title

Distributed Online System Identification for LTI Systems Using Reverse Experience Replay

Authors

Chang, Ting-Jui; Shahrampour, Shahin

Abstract

Identification of linear time-invariant (LTI) systems plays an important role in control and reinforcement learning. Both asymptotic and finite-time offline system identification are well studied in the literature. For online system identification, the idea of stochastic gradient descent with reverse experience replay (SGD-RER) was recently proposed: the data sequence is stored in several buffers, and the stochastic gradient descent (SGD) updates are performed backward within each buffer to break the time dependency between data points. Inspired by this work, we study distributed online system identification of LTI systems over a multi-agent network. We consider agents as identical LTI systems, and the network goal is to jointly estimate the system parameters by leveraging the communication between agents. We propose DSGD-RER, a distributed variant of the SGD-RER algorithm, and theoretically characterize the improvement of the estimation error with respect to the network size. Our numerical experiments confirm the reduction of estimation error as the network size grows.
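
The buffer-and-reverse idea described in the abstract can be illustrated with a small toy sketch. It is not the authors' DSGD-RER implementation: the abstract does not give the update rule, the communication scheme, or any hyperparameters, so everything below is an assumption made for illustration. In particular, the sketch assumes an autonomous system x_{t+1} = A x_t + w_t with Gaussian noise, a least-squares loss, a fixed gap of discarded samples between buffers, and plain averaging over all agents after each buffer (a complete communication graph). The function names `simulate`, `rer_buffer_pass`, and `dsgd_rer_sketch`, and the values of `buf_size`, `gap`, and `step`, are hypothetical.

```python
import numpy as np


def simulate(A, T, rng, noise_std=1.0):
    """Roll out x_{t+1} = A x_t + w_t for T steps, starting from x_0 = 0."""
    d = A.shape[0]
    X = np.zeros((T + 1, d))
    for t in range(T):
        X[t + 1] = A @ X[t] + noise_std * rng.standard_normal(d)
    return X


def rer_buffer_pass(A_hat, X, start, buf_size, step):
    """One reverse pass over a buffer: SGD steps from the newest pair to the oldest."""
    for t in range(start + buf_size - 1, start - 1, -1):
        x, x_next = X[t], X[t + 1]
        # Gradient of ||A_hat x - x_next||^2 with respect to A_hat.
        grad = 2.0 * np.outer(A_hat @ x - x_next, x)
        A_hat = A_hat - step * grad
    return A_hat


def dsgd_rer_sketch(A, n_agents, n_buffers, buf_size=20, gap=5, step=0.01, seed=0):
    """Toy distributed variant: local reverse passes, then network-wide averaging.

    Assumption: every agent observes its own trajectory of the same system and
    all agents average their estimates after each buffer (complete graph).
    """
    rng = np.random.default_rng(seed)
    d = A.shape[0]
    stride = buf_size + gap  # samples in the gap between buffers are skipped
    T = n_buffers * stride
    trajectories = [simulate(A, T, rng) for _ in range(n_agents)]
    estimates = [np.zeros((d, d)) for _ in range(n_agents)]
    for b in range(n_buffers):
        start = b * stride
        # Local step: each agent replays its current buffer in reverse order.
        estimates = [
            rer_buffer_pass(E, X, start, buf_size, step)
            for E, X in zip(estimates, trajectories)
        ]
        # Communication step (assumed): replace every estimate by the average.
        avg = sum(estimates) / n_agents
        estimates = [avg.copy() for _ in range(n_agents)]
    return estimates[0]


if __name__ == "__main__":
    A_true = np.array([[0.5, 0.2], [0.0, 0.4]])  # stable toy system
    for m in (1, 4, 16):
        A_est = dsgd_rer_sketch(A_true, n_agents=m, n_buffers=100)
        print(m, np.linalg.norm(A_est - A_true))
```

Running the script prints the Frobenius-norm estimation error for a few network sizes. In this toy setup, averaging over more agents would typically be expected to lower the error, but the numbers are illustrative only and say nothing about the rates or experiments reported in the paper.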
