关于学习提取的搜索树的力量

论文标题

关于学习提取的搜索树的力量

On the Power of Learning-Augmented Search Trees

论文作者

Chen, Jingbang, Cao, Xinyuan, Stepin, Alicia, Chen, Li

论文摘要

我们通过精心设计的优先级的Treaps研究了学习调格的二进制搜索树（BSTS）。结果是一个简单的搜索树，其中每个项目的深度$ x $由其预测的权重$ w_x $确定。具体而言，每个项目$ x $均分配为$ - \ lfloor \ log \ log \ log \ log（1/w_x）\ rfloor + u（0，1）$的复合优先级，其中$ u（0，1）$是统一的随机变量。通过选择$ w_x $作为$ x $的相对频率，由此产生的搜索树实现了静态最优性。这种方法通过将其扩展到任意输入分布来概括了最新的学习增强的BST [Lin-Luo-Woodruff ICML '22]，该方法仅适用于Zipfian分布。此外，我们证明我们的方法可以推广到B-Treap方法[Golovin ICALP '09]。我们的搜索树还能够通过在线自组织来利用访问序列中的地方，从而实现工作集属性。此外，它们对预测错误和支持动态操作（例如插入，删除和预测更新）也很强。我们通过一项实证研究对分析进行补充，表明我们的方法表现优于先前的工作和经典数据结构。

We study learning-augmented binary search trees (BSTs) via Treaps with carefully designed priorities. The result is a simple search tree in which the depth of each item $x$ is determined by its predicted weight $w_x$. Specifically, each item $x$ is assigned a composite priority of $-\lfloor\log\log(1/w_x)\rfloor + U(0, 1)$ where $U(0, 1)$ is the uniform random variable. By choosing $w_x$ as the relative frequency of $x$, the resulting search trees achieve static optimality. This approach generalizes the recent learning-augmented BSTs [Lin-Luo-Woodruff ICML '22], which only work for Zipfian distributions, by extending them to arbitrary input distributions. Furthermore, we demonstrate that our method can be generalized to a B-Tree data structure using the B-Treap approach [Golovin ICALP '09]. Our search trees are also capable of leveraging localities in the access sequence through online self-reorganization, thereby achieving the working-set property. Additionally, they are robust to prediction errors and support dynamic operations, such as insertions, deletions, and prediction updates. We complement our analysis with an empirical study, demonstrating that our method outperforms prior work and classic data structures.

下载PDF全文

下载文献需遵守相关版权规定

论文标题