Predictive Querying for Autoregressive Neural Sequence Models

Paper title

Predictive Querying for Autoregressive Neural Sequence Models

Authors

Alex Boyd, Sam Showalter, Stephan Mandt, Padhraic Smyth

Abstract

In reasoning about sequential events it is natural to pose probabilistic queries such as "when will event A occur next" or "what is the probability of A occurring before B", with applications in areas such as user modeling, medicine, and finance. However, with machine learning shifting towards neural autoregressive models such as RNNs and transformers, probabilistic querying has been largely restricted to simple cases such as next-event prediction. This is in part due to the fact that future querying involves marginalization over large path spaces, which is not straightforward to do efficiently in such models. In this paper we introduce a general typology for predictive queries in neural autoregressive sequence models and show that such queries can be systematically represented by sets of elementary building blocks. We leverage this typology to develop new query estimation methods based on beam search, importance sampling, and hybrids. Across four large-scale sequence datasets from different application domains, as well as for the GPT-2 language model, we demonstrate the ability to make query answering tractable for arbitrary queries in exponentially-large predictive path-spaces, and find clear differences in cost-accuracy tradeoffs between search and sampling methods.
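
To make the marginalization problem in the abstract concrete, the sketch below answers an "A before B" query on a toy model in two ways: by summing over every future path, and by naive Monte Carlo sampling. It is an illustration under stated assumptions, not the paper's estimators: the three-event vocabulary, the `TRANSITIONS` table (a first-order Markov chain standing in for a neural autoregressive model), and all function names are hypothetical, and the sampler is plain forward sampling rather than the importance sampling or beam search methods the authors develop.

```python
import itertools
import random

VOCAB = ["A", "B", "C"]

# Hypothetical toy model: next-event probabilities given the previous event.
# A real autoregressive model would condition on the full history instead.
TRANSITIONS = {
    "A": {"A": 0.2, "B": 0.3, "C": 0.5},
    "B": {"A": 0.4, "B": 0.2, "C": 0.4},
    "C": {"A": 0.1, "B": 0.2, "C": 0.7},
}


def next_probs(history):
    """Return p(next event | history); this toy only uses the last event."""
    return TRANSITIONS[history[-1]]


def a_before_b(path):
    """True if A occurs in the path before any occurrence of B."""
    for event in path:
        if event == "A":
            return True
        if event == "B":
            return False
    return False  # neither event occurred within the horizon


def exact_query(history, horizon):
    """Sum path probabilities over all len(VOCAB) ** horizon future paths."""
    total = 0.0
    for path in itertools.product(VOCAB, repeat=horizon):
        prob, hist = 1.0, list(history)
        for event in path:
            prob *= next_probs(hist)[event]
            hist.append(event)
        if a_before_b(path):
            total += prob
    return total


def sampled_query(history, horizon, num_samples, seed=0):
    """Estimate the same probability by sampling future paths from the model."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        hist, path = list(history), []
        for _ in range(horizon):
            probs = next_probs(hist)
            event = rng.choices(VOCAB, weights=[probs[v] for v in VOCAB])[0]
            hist.append(event)
            path.append(event)
        hits += a_before_b(path)
    return hits / num_samples


if __name__ == "__main__":
    print("exact:  ", exact_query(["C"], horizon=6))
    print("sampled:", sampled_query(["C"], horizon=6, num_samples=20000))
```

The exact computation enumerates a number of paths that grows exponentially with the horizon, which is why it stops being practical for realistic vocabularies and horizons, while the sampled estimate trades that cost for variance.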
