Paper Title
Variable Skipping for Autoregressive Range Density Estimation
Paper Authors
Paper Abstract
Deep autoregressive models compute point likelihood estimates of individual data points. However, many applications (e.g., database cardinality estimation) require estimating range densities, a capability that is under-explored in the current neural density estimation literature. In these applications, fast and accurate range density estimates over high-dimensional data directly impact user-perceived performance. In this paper, we explore a technique, variable skipping, for accelerating range density estimation over deep autoregressive models. This technique exploits the sparse structure of range density queries to avoid sampling unnecessary variables during approximate inference. We show that variable skipping provides 10-100$\times$ efficiency improvements when targeting challenging high-quantile error metrics, enables complex applications such as text pattern matching, and can be realized via a simple data augmentation procedure without changing the usual maximum likelihood objective.
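The idea of skipping an unconstrained variable during sampling-based range density estimation can be illustrated with a toy sketch. This is not the paper's implementation: the abstract does not give the details, so everything here is an assumption made for illustration. The model is a hand-built three-column tabular autoregressive factorization, the query constrains columns 1 and 3 and leaves column 2 as a wildcard, and the table `p3_masked` (computed by exact marginalization) merely stands in for what a network trained on inputs with randomly masked columns might be expected to approximate. All names (`p1`, `p2`, `p3`, `p3_masked`, `estimate_sampling_all`, `estimate_skipping`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 4  # number of distinct values per column (toy assumption)

# Toy autoregressive model over three discrete columns, stored as tables.
p1 = rng.dirichlet(np.ones(K))               # p(x1), shape (K,)
p2 = rng.dirichlet(np.ones(K), size=K)       # p(x2 | x1), shape (K, K)
p3 = rng.dirichlet(np.ones(K), size=(K, K))  # p(x3 | x1, x2), shape (K, K, K)

# Stand-in for a model queried with x2 replaced by a MASK input:
# p(x3 | x1, x2 = MASK) = sum over x2 of p(x2 | x1) * p(x3 | x1, x2).
p3_masked = np.einsum("ij,ijk->ik", p2, p3)  # shape (K, K)


def exact_range_density(r1, r3):
    """Exact P(x1 in r1, x3 in r3) by enumerating the full joint."""
    joint = p1[:, None, None] * p2[:, :, None] * p3
    return joint[np.ix_(r1, np.arange(K), r3)].sum()


def estimate_sampling_all(r1, r3, n):
    """Monte Carlo estimate that samples every column, wildcard included."""
    total = 0.0
    w1 = p1[r1].sum()  # probability mass of the allowed x1 values
    for _ in range(n):
        x1 = rng.choice(r1, p=p1[r1] / w1)  # sample x1 inside its range
        x2 = rng.choice(K, p=p2[x1])        # wildcard column is still sampled
        total += w1 * p3[x1, x2, r3].sum()
    return total / n


def estimate_skipping(r1, r3, n):
    """Monte Carlo estimate that skips the wildcard column x2."""
    total = 0.0
    w1 = p1[r1].sum()
    for _ in range(n):
        x1 = rng.choice(r1, p=p1[r1] / w1)
        total += w1 * p3_masked[x1, r3].sum()  # no sample drawn for x2
    return total / n


r1, r3 = [0, 1], [2, 3]  # query: x1 in {0, 1}, x2 unconstrained, x3 in {2, 3}
truth = exact_range_density(r1, r3)
est_all = estimate_sampling_all(r1, r3, 2000)
est_skip = estimate_skipping(r1, r3, 2000)
```

In this toy setting both estimators target the same quantity, but the skipping variant draws no sample for the wildcard column, so that source of sampling noise disappears. A learned model would only approximate the marginalized table, so the exactness seen here should not be expected in practice.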