Paper Title

Estimating the Effective Support Size in Constant Query Complexity

Authors

Shyam Narayanan, Jakub Tětek

Abstract

Estimating the support size of a distribution is a well-studied problem in statistics. Motivated by the fact that this problem is highly non-robust (small perturbations of the distribution can drastically change the support size) and thus hard to estimate, Goldreich [ECCC 2019] studied the query complexity of estimating the $ε$-\emph{effective support size} $\text{Ess}_ε$ of a distribution $P$, defined as the smallest support size of a distribution that is $ε$-close to $P$ in total variation distance. In his paper, he gives an algorithm in the dual access setting (where we may both receive random samples and query the sampling probability $p(x)$ for any $x$) for a bicriteria approximation, returning an answer in $[\text{Ess}_{(1+β)ε}, (1+γ)\text{Ess}_ε]$ for some values $β, γ > 0$. However, his algorithm has either query complexity that is super-constant in the support size or a super-constant approximation ratio $1+γ = ω(1)$. He then asked whether this is necessary, or whether a constant-factor approximation is achievable with a number of queries independent of the support size. We answer his question by showing that query complexity independent of the support size $n$ is possible not only for $γ > 0$ but also for $γ = 0$; that is, the bicriteria relaxation is not necessary. Specifically, we give an algorithm with query complexity $O(\frac{1}{β^3 ε^3})$: for any $0 < ε, β < 1$, it outputs a number $\tilde{n} \in [\text{Ess}_{(1+β)ε}, \text{Ess}_ε]$. We also show that the approximate version with approximation ratio $1+γ$ can be solved with complexity $O\left(\frac{1}{β^2 ε} + \frac{1}{βεγ^2}\right)$. Our algorithm is very simple, and has $4$ short lines of pseudocode.
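To make the definition concrete, the sketch below computes $\text{Ess}_ε$ by brute force for a fully known distribution. It is not the paper's algorithm, which only uses samples and probability queries; it relies on the observation that the closest distribution with support size $k$ keeps the $k$ heaviest elements, so $\text{Ess}_ε$ is the smallest $k$ whose top-$k$ mass is at least $1-ε$. The function name `effective_support_size` and the example distribution are illustrative assumptions.

```python
def effective_support_size(probs, eps):
    """Brute-force Ess_eps for a fully known distribution (illustration only).

    Returns the smallest k such that the k largest probabilities sum to
    at least 1 - eps. Assumes 0 < eps < 1 and that probs sums to 1.
    """
    total = 0.0
    for k, p in enumerate(sorted(probs, reverse=True), start=1):
        total += p
        # Small tolerance guards against floating-point rounding.
        if total >= 1 - eps - 1e-12:
            return k
    return len(probs)


# Hypothetical example: three heavy elements plus 100 very light ones.
probs = [0.5, 0.3, 0.1] + [0.001] * 100
eps, beta = 0.15, 0.5

lower = effective_support_size(probs, (1 + beta) * eps)  # Ess_{(1+beta) eps}
upper = effective_support_size(probs, eps)               # Ess_eps
```

Here the support size is 103, yet `upper` is 3 and `lower` is 2, so any output in $[2, 3]$ would satisfy the guarantee $\tilde{n} \in [\text{Ess}_{(1+β)ε}, \text{Ess}_ε]$ for these parameters. This also illustrates why the plain support size is non-robust: the 100 light elements carry only 0.1 of the mass.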
