随机马鞍点问题的概括范围

论文标题

随机马鞍点问题的概括范围

Generalization Bounds for Stochastic Saddle Point Problems

论文作者

Zhang, Junyu, Hong, Mingyi, Wang, Mengdi, Zhang, Shuzhong

论文摘要

本文研究了随机鞍点（SSP）问题的经验鞍点（ESP）解决方案的概括界限。对于具有Lipschitz连续且强烈凸出的凹面目标函数的SSP，我们建立了一个$ \ Mathcal {O}（1/n）$通过使用统一稳定性参数限制的概括。我们还提供了多种假设下的概括界限，包括没有强凸度和没有界面域的情况。我们在两个示例中说明了我们的结果：马尔可夫决策过程中的批处理政策学习，以及对随机游戏的NASH均衡估计的混合策略。在每个示例中，我们都表明，正则化的ESP解决方案具有近乎最佳的样本复杂性。据我们所知，这是ESP概括理论的第一组结果。

This paper studies the generalization bounds for the empirical saddle point (ESP) solution to stochastic saddle point (SSP) problems. For SSP with Lipschitz continuous and strongly convex-strongly concave objective functions, we establish an $\mathcal{O}(1/n)$ generalization bound by using a uniform stability argument. We also provide generalization bounds under a variety of assumptions, including the cases without strong convexity and without bounded domains. We illustrate our results in two examples: batch policy learning in Markov decision process, and mixed strategy Nash equilibrium estimation for stochastic games. In each of these examples, we show that a regularized ESP solution enjoys a near-optimal sample complexity. To the best of our knowledge, this is the first set of results on the generalization theory of ESP.

下载PDF全文

下载文献需遵守相关版权规定

论文标题