Paper Title


A Globally Convergent Evolutionary Strategy for Stochastic Constrained Optimization with Applications to Reinforcement Learning

Authors

Youssef Diouane, Aurelien Lucchi, Vihang Patil

Abstract


Evolutionary strategies have recently been shown to achieve competing levels of performance for complex optimization problems in reinforcement learning. In such problems, one often needs to optimize an objective function subject to a set of constraints, including for instance constraints on the entropy of a policy or to restrict the possible set of actions or states accessible to an agent. Convergence guarantees for evolutionary strategies to optimize stochastic constrained problems are however lacking in the literature. In this work, we address this problem by designing a novel optimization algorithm with a sufficient decrease mechanism that ensures convergence and that is based only on estimates of the functions. We demonstrate the applicability of this algorithm on two types of experiments: i) a control task for maximizing rewards and ii) maximizing rewards subject to a non-relaxable set of constraints.
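To make the abstract's key idea concrete, here is a minimal toy sketch of an evolutionary strategy whose step acceptance uses a sufficient decrease test driven only by (possibly noisy) function estimates. This is an illustration under assumptions, not the paper's actual algorithm: the sampling scheme, the forcing term `c * sigma**2`, the step-size update factors, and the idea of folding non-relaxable constraints into the objective via an extreme-barrier penalty (returning `inf` for infeasible points) are all simplifications chosen for readability.

```python
import numpy as np

def es_sufficient_decrease(f_est, x0, sigma0=1.0, pop=8, c=1e-4,
                           max_iter=200, seed=0):
    """Toy evolutionary strategy with a sufficient-decrease acceptance test.

    f_est : noisy estimate of the objective (smaller is better). Constraints
            are assumed already folded in, e.g. an extreme-barrier penalty
            that returns +inf at infeasible points.
    This is an illustrative sketch, not the algorithm from the paper.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    sigma = sigma0
    fx = f_est(x)
    for _ in range(max_iter):
        # Sample a population of candidates around the current point.
        cands = x + sigma * rng.standard_normal((pop, x.size))
        vals = np.array([f_est(cand) for cand in cands])
        i = int(np.argmin(vals))
        # Accept only if the improvement beats a forcing term ~ sigma^2;
        # otherwise shrink the step size. This "sufficient decrease"
        # condition is what rules out accepting spurious noisy gains.
        if vals[i] < fx - c * sigma**2:
            x, fx = cands[i], vals[i]
            sigma *= 1.5  # successful iteration: expand the step size
        else:
            sigma *= 0.5  # unsuccessful iteration: contract
    return x, fx
```

For example, minimizing a simple quadratic `f(x) = ||x||^2` from a distant start converges to near the origin, with the step size adapting automatically as successes and failures alternate.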
