Title
Empirical Risk Minimization with Relative Entropy Regularization
Authors
Abstract
The empirical risk minimization (ERM) problem with relative entropy regularization (ERM-RER) is investigated under the assumption that the reference measure is a $\sigma$-finite measure, and not necessarily a probability measure. Under this assumption, which leads to a generalization of the ERM-RER problem allowing a larger degree of flexibility for incorporating prior knowledge, numerous relevant properties are stated. Among these properties, the solution to this problem, if it exists, is shown to be a unique probability measure, mutually absolutely continuous with the reference measure. Such a solution exhibits a probably-approximately-correct guarantee for the ERM problem independently of whether the latter possesses a solution. For a fixed dataset and under a specific condition, the empirical risk is shown to be a sub-Gaussian random variable when the models are sampled from the solution to the ERM-RER problem. The generalization capabilities of the solution to the ERM-RER problem (the Gibbs algorithm) are studied via the sensitivity of the expected empirical risk to deviations from such a solution towards alternative probability measures. Finally, an interesting connection between sensitivity, generalization error, and lautum information is established.
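For concreteness, the ERM-RER problem and its Gibbs solution can be sketched in the standard formulation from the literature; the symbols below ($\mathsf{L}$ for the empirical risk of a model $\theta$, $Q$ for the $\sigma$-finite reference measure, and $\lambda > 0$ for the regularization factor) are illustrative notation, not necessarily the paper's own:

```latex
% ERM-RER: minimize the expected empirical risk plus the relative
% entropy D(P || Q) with respect to the reference measure Q,
% scaled by a regularization factor lambda > 0 (notation assumed).
\begin{equation}
  P^{\star} \in \arg\min_{P}
  \int \mathsf{L}(\theta)\,\mathrm{d}P(\theta)
  + \lambda\, D(P \,\|\, Q).
\end{equation}
% When the solution exists, it is the Gibbs probability measure,
% mutually absolutely continuous with Q, whose Radon-Nikodym
% derivative with respect to Q is
\begin{equation}
  \frac{\mathrm{d}P^{\star}}{\mathrm{d}Q}(\theta)
  = \frac{\exp\!\left(-\tfrac{1}{\lambda}\,\mathsf{L}(\theta)\right)}
         {\int \exp\!\left(-\tfrac{1}{\lambda}\,\mathsf{L}(\nu)\right)
          \mathrm{d}Q(\nu)}.
\end{equation}
```

Because $Q$ is only required to be $\sigma$-finite rather than a probability measure, the normalizing integral in the denominator need not be finite in general; its finiteness is the kind of condition under which the existence and sub-Gaussianity results described in the abstract are stated.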