Paper Title

Deep Reinforcement Learning with Vector Quantized Encoding

Paper Authors

Liang Zhang, Justin Lieffers, Adarsh Pyarelal

Paper Abstract

Human decision-making often involves combining similar states into categories and reasoning at the level of the categories rather than the actual states. Guided by this intuition, we propose a novel method for clustering state features in deep reinforcement learning (RL) methods to improve their interpretability. Specifically, we propose a plug-and-play framework termed vector quantized reinforcement learning (VQ-RL) that extends classic RL pipelines with an auxiliary classification task based on vector quantized (VQ) encoding and aligns with policy training. The VQ encoding method categorizes features with similar semantics into clusters and yields tighter clusters with better separation than classic deep RL methods, thus enabling neural models to better learn the similarities and differences between states. Furthermore, we introduce two regularization methods that help increase the separation between clusters and avoid the risks associated with VQ training. In simulations, we demonstrate that VQ-RL improves interpretability, and we investigate its impact on the robustness and generalization of deep RL.
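
For readers unfamiliar with VQ encoding, the sketch below shows how state features from a policy encoder might be quantized against a learned codebook, in the standard VQ-VAE style. This is a minimal illustrative example, not the authors' implementation: the class name, hyperparameters, and loss weighting are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorQuantizer(nn.Module):
    """Quantizes feature vectors against a learned codebook.

    Generic VQ layer sketch; num_codes, code_dim, and beta are
    illustrative values, not taken from the VQ-RL paper.
    """
    def __init__(self, num_codes: int = 64, code_dim: int = 128, beta: float = 0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)
        self.beta = beta  # commitment-loss weight

    def forward(self, z: torch.Tensor):
        # z: (batch, code_dim) state features from the policy encoder.
        # Assign each feature vector to its nearest codebook entry.
        distances = torch.cdist(z, self.codebook.weight)  # (batch, num_codes)
        indices = distances.argmin(dim=1)                 # cluster assignment
        z_q = self.codebook(indices)                      # quantized features

        # Codebook loss pulls codes toward the features; the commitment
        # loss pulls features toward their assigned codes.
        vq_loss = F.mse_loss(z_q, z.detach()) + self.beta * F.mse_loss(z, z_q.detach())

        # Straight-through estimator: the forward pass uses z_q, while
        # gradients flow back to the encoder as if quantization were identity.
        z_q = z + (z_q - z).detach()
        return z_q, indices, vq_loss
```

In a setup like this, the returned cluster indices could serve as labels for an auxiliary classification task of the kind the abstract describes, while the straight-through estimator keeps the encoder trainable end-to-end alongside the policy loss.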
