Mathematical Theory of Bayesian Statistics for Unknown Information Source

Paper Title

Mathematical Theory of Bayesian Statistics for Unknown Information Source

Author

Watanabe, Sumio

Abstract

In statistical inference, uncertainty is unknown and all models are wrong. That is to say, a person who constructs a statistical model and a prior distribution is simultaneously aware that both are fictional candidates. To study such cases, statistical measures such as cross validation, information criteria, and marginal likelihood have been constructed; however, their mathematical properties have not yet been completely clarified when statistical models are under- or over-parametrized. We introduce the place of the mathematical theory of Bayesian statistics for unknown uncertainty, which clarifies general properties of cross validation, information criteria, and marginal likelihood, even if the unknown data-generating process is unrealizable by the model or the posterior distribution cannot be approximated by any normal distribution. Hence it gives a helpful standpoint for a person who cannot believe in any specific model and prior. This paper consists of three parts. The first is a new result, whereas the second and third are well-known previous results with new experiments. We show that there exists a more precise estimator of the generalization loss than leave-one-out cross validation, that there exists a more accurate approximation of marginal likelihood than BIC, and that the optimal hyperparameters for generalization loss and marginal likelihood are different.
