Paper Title
Can Demographic Factors Improve Text Classification? Revisiting Demographic Adaptation in the Age of Transformers
Paper Authors
Paper Abstract
Demographic factors (e.g., gender or age) shape our language. Previous work has shown that incorporating demographic factors consistently improves performance across various NLP tasks with traditional NLP models. In this work, we investigate whether these findings still hold with state-of-the-art pretrained Transformer-based language models (PLMs). We use three common specialization methods proven effective for incorporating external knowledge (e.g., domain-specific or geographic knowledge) into pretrained Transformers. We adapt the language representations for the demographic dimensions of gender and age, using continued language modeling and dynamic multi-task learning, where we couple language modeling objectives with the prediction of demographic classes. Our results, when employing a multilingual PLM, show substantial gains in task performance across four languages (English, German, French, and Danish), consistent with previous findings. However, controlling for confounding factors, primarily the domain and language proficiency of Transformer-based PLMs, shows that the downstream performance gains from our demographic adaptation do not actually stem from demographic knowledge. Our results indicate that demographic specialization of PLMs, while holding promise for positive societal impact, remains an unsolved problem for (modern) NLP.