Cold-Start Data Selection for Few-shot Language Model Fine-tuning: A Prompt-Based Uncertainty Propagation Approach

Paper Title

Cold-Start Data Selection for Few-shot Language Model Fine-tuning: A Prompt-Based Uncertainty Propagation Approach

Authors

Yue Yu, Rongzhi Zhang, Ran Xu, Jieyu Zhang, Jiaming Shen, Chao Zhang

Abstract

Large Language Models have demonstrated remarkable few-shot performance, but the performance can be sensitive to the selection of few-shot instances. We propose PATRON, a new method that uses prompt-based uncertainty estimation for data selection for pre-trained language model fine-tuning under cold-start scenarios, i.e., no initial labeled data are available. In PATRON, we design (1) a prompt-based uncertainty propagation approach to estimate the importance of data points and (2) a partition-then-rewrite (PTR) strategy to promote sample diversity when querying for annotations. Experiments on six text classification datasets show that PATRON outperforms the strongest cold-start data selection baselines by up to 6.9%. Besides, with 128 labels only, PATRON achieves 91.0% and 92.1% of the fully supervised performance based on vanilla fine-tuning and prompt-based learning respectively. Our implementation of PATRON is available at https://github.com/yueyu1030/Patron.
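
To make the general idea of uncertainty-plus-diversity selection concrete, here is a minimal sketch. It is not PATRON's uncertainty propagation or its PTR strategy, whose details the abstract does not give. It assumes each unlabeled example already has a class-probability vector (for instance, read off a prompt's label-word predictions) and a cluster assignment, scores uncertainty with plain Shannon entropy, and picks the most uncertain example in each cluster. The function names `entropy` and `select_diverse_uncertain` are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_diverse_uncertain(probs, cluster_ids):
    """Return, for each cluster, the index of its highest-entropy example.

    probs:       list of class-probability vectors, one per unlabeled example
                 (assumed to come from a prompt-based zero-shot prediction).
    cluster_ids: list of cluster labels, one per example (assumed precomputed).
    """
    best = {}  # cluster id -> (entropy score, example index)
    for i, (p, c) in enumerate(zip(probs, cluster_ids)):
        score = entropy(p)
        if c not in best or score > best[c][0]:
            best[c] = (score, i)
    return sorted(i for _, i in best.values())


# Four unlabeled examples in two clusters; one example is chosen per cluster.
probs = [[0.9, 0.1], [0.5, 0.5], [0.6, 0.4], [0.99, 0.01]]
clusters = [0, 0, 1, 1]
selected = select_diverse_uncertain(probs, clusters)
print(selected)  # [1, 2]
```

Choosing one example per cluster is the simplest way to keep the queried set from concentrating in a single region of the data; the number of clusters then equals the labeling budget.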
