Paper Title

A Non-parametric Approach to Inference about the Tail of a Continuous or a Discrete Distribution

Authors

Zhang, Jialin; Zhang, Zhiyi

Abstract

This article introduces a non-parametric, information-theoretic approach to inference about the tail of a continuous or a discrete distribution. The approach builds on a new concept, the tail profile (a set of information-theoretic quantities developed from results on domains of attraction on countable alphabets), and theoretical evidence supports identifying specific discrete distributional tail types through a sequence of plots. The approach discerns tail types by benchmarking against the exponential family and three thicker-than-exponential families: near-exponential, sub-exponential, and power-law (Zipf, Pareto). For thicker-than-exponential tails, the approach also provides point and interval estimates for some of the underlying distribution parameters. Although the method is primarily designed to streamline the selection of discrete parametric models for detailed statistical analysis, a supporting theorem extends its use to continuous data: binning continuous data with a common width preserves the tail decay rate under certain conditions. Simulations are presented to demonstrate the method's performance across various scenarios.
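The binning claim in the abstract can be illustrated with a small sketch. It is not the authors' method and does not compute the tail profile; it only checks a textbook special case, assuming exponentially distributed data: if X is Exponential with rate `rate`, then the bin index floor(X / width) has tail probability exp(-rate * width * k), which is a geometric (exponential-type) tail. The function names, the seed, and the parameter values below are illustrative assumptions, not taken from the paper.

```python
import math
import random
from collections import Counter


def bin_with_common_width(samples, width):
    """Map each non-negative sample to the index of its equal-width bin."""
    return [int(x // width) for x in samples]


def empirical_tail(counts, n, k):
    """Empirical estimate of P(K >= k) from bin-index counts."""
    return sum(c for idx, c in counts.items() if idx >= k) / n


def geometric_tail(rate, width, k):
    """Exact P(floor(X / width) >= k) when X ~ Exponential(rate)."""
    return math.exp(-rate * width * k)


# Illustrative settings (assumptions, not values from the paper).
rng = random.Random(0)
rate, width, n = 1.0, 0.5, 100_000

samples = [rng.expovariate(rate) for _ in range(n)]
counts = Counter(bin_with_common_width(samples, width))

# Compare the empirical tail of the binned data with the geometric tail.
for k in range(0, 9, 2):
    print(k, round(empirical_tail(counts, n, k), 4),
          round(geometric_tail(rate, width, k), 4))
```

In this special case the binned data keep an exponentially decaying tail, with the decay rate per bin scaled by the bin width. The general conditions under which the decay rate is preserved are those of the paper's supporting theorem and are not reproduced here.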
