Paper Title

Locally Private Distributed Reinforcement Learning

Authors

Hajime Ono, Tsubasa Takahashi

Abstract

We study locally differentially private algorithms for reinforcement learning, with the goal of obtaining a robust policy that performs well across distributed private environments. Our algorithm protects the information in local agents' models from exploitation by adversarial reverse engineering. Since a local policy is strongly affected by its individual environment, an agent's output may unintentionally reveal private information. In our proposed algorithm, local agents update their models in their own environments and report noisy gradients designed to satisfy local differential privacy (LDP), which gives a rigorous local privacy guarantee. Using the set of reported noisy gradients, a central aggregator updates its model and delivers it to the local agents. In our empirical evaluation, we demonstrate that our method performs well under LDP. To the best of our knowledge, this is the first work to realize distributed reinforcement learning under LDP. This work enables us to obtain a robust agent that performs well across distributed private environments.
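The reporting scheme described in the abstract (local agents submit privatized gradients; a central aggregator averages them into a model update) can be sketched as follows. This is an illustrative sketch only, not the paper's exact mechanism: the function names, the gradient-clipping step, and the use of the Gaussian mechanism with noise scale `sigma = C * sqrt(2 ln(1.25/delta)) / epsilon` are assumptions standing in for whatever LDP mechanism the authors actually use.

```python
import numpy as np

def report_noisy_gradient(grad, clip_norm=1.0, epsilon=1.0, delta=1e-5, rng=None):
    """Hypothetical local-agent step: clip the gradient to bound its
    sensitivity, then add Gaussian noise calibrated so that the report
    satisfies (epsilon, delta)-LDP. Sketch only; the paper's mechanism
    and parameters may differ."""
    rng = rng if rng is not None else np.random.default_rng()
    norm = np.linalg.norm(grad)
    # Clip so the l2 norm of the report (before noise) is at most clip_norm.
    clipped = grad * min(1.0, clip_norm / max(norm, 1e-12))
    # Standard Gaussian-mechanism noise scale for l2 sensitivity clip_norm.
    sigma = clip_norm * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return clipped + rng.normal(0.0, sigma, size=grad.shape)

def aggregate(noisy_grads, model, lr=0.1):
    """Central aggregator: average the reported noisy gradients and take
    one gradient-descent step on the shared model."""
    return model - lr * np.mean(noisy_grads, axis=0)
```

The noise each agent adds is independent of the others, so averaging many reports shrinks its effect on the aggregated update while the per-agent LDP guarantee is unchanged.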
