Paper Title

Asymptotic Normality for Plug-in Estimators of Generalized Shannon's Entropy

Authors

Jialin Zhang, Jingyi Shi

Abstract

Shannon's entropy is one of the building blocks of information theory and an essential aspect of Machine Learning methods (e.g., Random Forests). Yet, it is only finitely defined for distributions with fast decaying tails on a countable alphabet. The unboundedness of Shannon's entropy over the general class of all distributions on an alphabet prevents its potential utility from being fully realized. To fill the void in the foundation of information theory, Zhang (2020) proposed generalized Shannon's entropy, which is finitely defined everywhere. The plug-in estimator, adopted in almost all entropy-based ML method packages, is one of the most popular approaches to estimating Shannon's entropy. The asymptotic distribution for Shannon's entropy's plug-in estimator was well studied in the existing literature. This paper studies the asymptotic properties for the plug-in estimator of generalized Shannon's entropy on countable alphabets. The developed asymptotic properties require no assumptions on the original distribution. The proposed asymptotic properties allow interval estimation and statistical tests with generalized Shannon's entropy.
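As a concrete illustration of the plug-in principle the abstract refers to, the sketch below estimates classical Shannon's entropy by substituting empirical frequencies into the entropy formula, H-hat = -Σ p̂ log p̂. This is a minimal sketch of the standard plug-in estimator only; the generalized entropy of Zhang (2020) studied in this paper has its own definition, and the function name here is illustrative.

```python
from collections import Counter
import math

def plugin_shannon_entropy(sample):
    """Plug-in (maximum likelihood) estimate of Shannon's entropy in nats.

    Computes -sum(p_hat * log(p_hat)) over observed symbols, where p_hat
    is the empirical frequency of each symbol in the sample.
    """
    n = len(sample)
    counts = Counter(sample)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

# A balanced two-symbol sample: the estimate equals log(2) ≈ 0.6931 nats.
print(plugin_shannon_entropy(["H", "T", "H", "T"]))
```

Note that the plug-in estimator is biased downward in finite samples; the asymptotic normality results discussed in the paper are what justify building confidence intervals and tests on top of such point estimates.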
