Paper Title

A Latent-Variable Model for Intrinsic Probing

Paper Authors

Karolina Stańczak, Lucas Torroba Hennigen, Adina Williams, Ryan Cotterell, Isabelle Augenstein

Paper Abstract

The success of pre-trained contextualized representations has prompted researchers to analyze them for the presence of linguistic information. Indeed, it is natural to assume that these pre-trained representations do encode some level of linguistic knowledge as they have brought about large empirical improvements on a wide variety of NLP tasks, which suggests they are learning true linguistic generalization. In this work, we focus on intrinsic probing, an analysis technique where the goal is not only to identify whether a representation encodes a linguistic attribute but also to pinpoint where this attribute is encoded. We propose a novel latent-variable formulation for constructing intrinsic probes and derive a tractable variational approximation to the log-likelihood. Our results show that our model is versatile and yields tighter mutual information estimates than two intrinsic probes previously proposed in the literature. Finally, we find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
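The abstract's central technical claim is a latent-variable formulation of the probe together with a tractable variational approximation to its log-likelihood. The following is a minimal sketch of what such a bound can look like, not the paper's exact derivation; the symbols pi (attribute value), h (contextual representation), C (latent subset of representation dimensions, with h_C the restriction of h to C), and q(C) (variational distribution over subsets) are illustrative assumptions.

% Illustrative latent-variable probe likelihood and a Jensen-inequality lower bound.
% pi: value of the linguistic attribute; h: a contextual representation;
% C: latent subset of dimensions; h_C: restriction of h to C;
% q(C): variational distribution over subsets of dimensions.
\begin{align*}
\log p(\pi \mid h)
  &= \log \sum_{C} p(C)\, p(\pi \mid h_C) \\
  &= \log \mathbb{E}_{q(C)}\!\left[\frac{p(C)\, p(\pi \mid h_C)}{q(C)}\right] \\
  &\ge \mathbb{E}_{q(C)}\!\left[\log p(\pi \mid h_C)\right]
     - \mathrm{KL}\!\left(q(C)\,\Vert\,p(C)\right).
\end{align*}

As is standard in the probing literature (and assuming the same setup as above), the mutual information I(Pi; H_C) = H(Pi) - H(Pi | H_C) can be lower-bounded by subtracting the probe's cross-entropy, which upper-bounds H(Pi | H_C), from the attribute's marginal entropy; this is why a tighter log-likelihood translates into a tighter mutual information estimate.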
