Paper Title
A Universal Discriminator for Zero-Shot Generalization
Paper Authors
Paper Abstract
Generative modeling has been the dominant approach for large-scale pretraining and zero-shot generalization. In this work, we challenge this convention by showing that discriminative approaches perform substantially better than generative ones on a large number of NLP tasks. Technically, we train a single discriminator to predict whether a text sample comes from the true data distribution, similar to GANs. Since many NLP tasks can be formulated as selecting from a few options, we concatenate the input with each option and use the discriminator to predict which concatenation has the highest probability of coming from the true data distribution. This simple formulation achieves state-of-the-art zero-shot results on the T0 benchmark, outperforming T0 by 16.0%, 7.8%, and 11.5% at three different model scales, respectively. In the finetuning setting, our approach also achieves new state-of-the-art results on a wide range of NLP tasks, with only 1/4 of the parameters of previous methods. Meanwhile, our approach requires minimal prompting effort, which greatly improves robustness and is essential for real-world applications. Furthermore, we jointly train a generalized universal discriminator (UD) in combination with generative tasks; it maintains its advantage on discriminative tasks while simultaneously handling generative tasks.
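To make the option-selection formulation concrete, here is a minimal sketch of how such a discriminator could score candidate options. This is not the paper's released code: the backbone checkpoint, the choice of label index 1 as "real text", and the `pick_option` helper are all illustrative assumptions.

```python
# Minimal sketch of the option-scoring formulation: concatenate the input
# with each option, score each concatenation with a binary discriminator,
# and select the option judged most likely to come from the true data
# distribution. Backbone and label convention are placeholder assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "roberta-base"  # placeholder backbone, not the paper's checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()

def pick_option(task_input: str, options: list[str]) -> str:
    """Return the option whose concatenation with the input the
    discriminator scores as most likely to be real text."""
    texts = [f"{task_input} {opt}" for opt in options]
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits  # shape: (num_options, 2)
    # Assumes label 1 means "drawn from the true data distribution".
    scores = logits.softmax(dim=-1)[:, 1]
    return options[scores.argmax().item()]

# Example: a zero-shot sentiment decision framed as option selection.
print(pick_option("The movie was a delight to watch. It was", ["great", "terrible"]))
```

With an untrained classification head the scores are meaningless; the sketch only illustrates the inference-time formulation, where the discriminator itself would first be trained to separate real text from other samples.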