Paper Title

Cold-Start Data Selection for Few-shot Language Model Fine-tuning: A Prompt-Based Uncertainty Propagation Approach

Authors

Yue Yu, Rongzhi Zhang, Ran Xu, Jieyu Zhang, Jiaming Shen, Chao Zhang

Abstract

Large language models have demonstrated remarkable few-shot performance, but that performance can be sensitive to which few-shot instances are selected. We propose PATRON, a new method that uses prompt-based uncertainty estimation to select data for fine-tuning pre-trained language models under cold-start scenarios, i.e., when no initial labeled data are available. In PATRON, we design (1) a prompt-based uncertainty propagation approach to estimate the importance of data points and (2) a partition-then-rewrite (PTR) strategy to promote sample diversity when querying for annotations. Experiments on six text classification datasets show that PATRON outperforms the strongest cold-start data selection baselines by up to 6.9%. Moreover, with only 128 labels, PATRON achieves 91.0% and 92.1% of the fully supervised performance based on vanilla fine-tuning and prompt-based learning, respectively. Our implementation of PATRON is available at https://github.com/yueyu1030/Patron.
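
As a rough illustration of what prompt-based uncertainty estimation can look like, the sketch below ranks unlabeled examples by the entropy of their predicted class probabilities. It assumes a prompted language model has already assigned a probability to each class's label word; the probabilities and the function names `entropy` and `rank_by_uncertainty` are hypothetical. It does not implement PATRON's uncertainty propagation or its PTR strategy, whose details are not given in the abstract.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def rank_by_uncertainty(class_probs):
    """Return example indices ordered from most to least uncertain."""
    scores = [entropy(p) for p in class_probs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical per-example probabilities over two label words,
# standing in for the output of a prompted language model.
example_probs = [
    [0.95, 0.05],  # confident prediction, low entropy
    [0.55, 0.45],  # near-uniform prediction, high entropy
    [0.70, 0.30],
]

ranking = rank_by_uncertainty(example_probs)
print(ranking)
```

Entropy is only one possible uncertainty score. Ranking by uncertainty alone tends to pick similar examples, which is the kind of redundancy that a diversity-promoting step such as the PTR strategy named in the abstract is meant to counter.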
