Paper Title
Revisiting Softmax for Uncertainty Approximation in Text Classification
Paper Authors
Paper Abstract
Uncertainty approximation in text classification is an important area with applications in domain adaptation and interpretability. One of the most widely used uncertainty approximation methods is Monte Carlo (MC) Dropout, which is computationally expensive as it requires multiple forward passes through the model. A cheaper alternative is to simply use the softmax output of a single forward pass without dropout to estimate model uncertainty. However, prior work has indicated that these predictions tend to be overconfident. In this paper, we perform a thorough empirical analysis of these methods on five datasets with two base neural architectures in order to identify the trade-offs between the two. We compare both softmax and an efficient version of MC Dropout on their uncertainty approximations and downstream text classification performance, while weighing their runtime (cost) against performance (benefit). We find that, while MC Dropout produces the best uncertainty approximations, using a simple softmax leads to competitive, and in some cases better, uncertainty estimation for text classification at a much lower computational cost, suggesting that softmax can in fact be a sufficient uncertainty estimate when computational resources are a concern.
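
As a rough illustration of the cost difference the abstract describes, the following PyTorch sketch contrasts single-pass softmax confidence with MC Dropout averaged over several stochastic passes. The `ToyClassifier`, its dimensions, the dropout rate, the number of passes, and the use of predictive entropy as the uncertainty score are illustrative assumptions for this sketch, not the paper's actual models or code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyClassifier(nn.Module):
    """Hypothetical stand-in for a text classifier operating on pre-encoded features."""

    def __init__(self, input_dim=128, num_classes=4, p_drop=0.5):
        super().__init__()
        self.fc1 = nn.Linear(input_dim, 64)
        self.dropout = nn.Dropout(p_drop)
        self.fc2 = nn.Linear(64, num_classes)

    def forward(self, x):
        return self.fc2(self.dropout(torch.relu(self.fc1(x))))


def predictive_entropy(probs):
    # One common uncertainty score: entropy of the predictive distribution.
    return -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)


def softmax_uncertainty(model, x):
    # Single deterministic forward pass; dropout is disabled in eval mode.
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(x), dim=-1)
    return predictive_entropy(probs), probs


def mc_dropout_uncertainty(model, x, n_passes=20):
    # Multiple stochastic forward passes with dropout kept active at inference.
    # (In a real model you would enable only the dropout layers, since train()
    # also changes layers such as batch norm.)
    model.train()
    with torch.no_grad():
        probs = torch.stack(
            [F.softmax(model(x), dim=-1) for _ in range(n_passes)]
        )
    mean_probs = probs.mean(dim=0)
    return predictive_entropy(mean_probs), mean_probs


if __name__ == "__main__":
    model = ToyClassifier()
    x = torch.randn(8, 128)  # a batch of 8 pre-encoded examples
    h_softmax, _ = softmax_uncertainty(model, x)          # cost: 1 forward pass
    h_mc, _ = mc_dropout_uncertainty(model, x, n_passes=20)  # cost: 20 forward passes
    print(h_softmax, h_mc)
```

The cost asymmetry is visible directly in the sketch: the softmax estimate reuses the one forward pass already needed for prediction, while MC Dropout multiplies inference cost by the number of stochastic passes.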