论文标题

内核密度估计数据结构的动态维护:从实践到理论

Dynamic Maintenance of Kernel Density Estimation Data Structure: From Practice to Theory

论文作者

Liang, Jiehao, Song, Zhao, Xu, Zhaozhuo, Yin, Junze, Zhuo, Danyang

论文摘要

内核密度估计(KDE)在机器学习中脱颖而出。该问题按以下方式定义:给定一个内核函数$ f(x,y)$和一组点$ \ {x_1,x_2,x_2,\ cdots,x_n \} \ subset \ subset \ subbb {r}^d $ f(x_i,y)$对于任何查询点$ y \ in \ mathbb {r}^d $。最近,将数据结构用于有效KDE的趋势越来越大。但是,提出的KDE数据结构集中在静态设置上。 KDE数据结构在动态变化的数据分布上的鲁棒性没有解决。在这项工作中,我们专注于具有对对抗性查询的KDE数据结构的动态维护。特别是,我们提供了KDE数据结构的理论框架。在我们的框架中,KDE数据结构仅需要次级空间。此外,我们的数据结构支持sublerear时间中数据集的动态更新。此外,我们可以在均匀时间内使用潜在的对手进行自适应查询。

Kernel density estimation (KDE) stands out as a challenging task in machine learning. The problem is defined in the following way: given a kernel function $f(x,y)$ and a set of points $\{x_1, x_2, \cdots, x_n \} \subset \mathbb{R}^d$, we would like to compute $\frac{1}{n}\sum_{i=1}^{n} f(x_i,y)$ for any query point $y \in \mathbb{R}^d$. Recently, there has been a growing trend of using data structures for efficient KDE. However, the proposed KDE data structures focus on static settings. The robustness of KDE data structures over dynamic changing data distributions is not addressed. In this work, we focus on the dynamic maintenance of KDE data structures with robustness to adversarial queries. Especially, we provide a theoretical framework of KDE data structures. In our framework, the KDE data structures only require subquadratic spaces. Moreover, our data structure supports the dynamic update of the dataset in sublinear time. Furthermore, we can perform adaptive queries with the potential adversary in sublinear time.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源