Paper title
A Tale of a Probe and a Parser
Paper authors
Paper abstract
Measuring what linguistic information is encoded in neural models of language has become popular in NLP. Researchers approach this enterprise by training "probes": supervised models designed to extract linguistic structure from another model's output. One such probe is the structural probe (Hewitt and Manning, 2019), designed to quantify the extent to which syntactic information is encoded in contextualised word representations. The structural probe has a novel design, unattested in the parsing literature, the precise benefit of which is not immediately obvious. To explore whether syntactic probes would do better to make use of existing techniques, we compare the structural probe to a more traditional parser with an identical lightweight parameterisation. The parser outperforms the structural probe on UUAS in seven of nine analysed languages, often by a substantial amount (e.g. by 11.1 points in English). Under a second, less common metric, however, there is the opposite trend: the structural probe outperforms the parser. This raises the question: which metric should we prefer?
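For readers unfamiliar with the first metric, UUAS (undirected unlabeled attachment score) is commonly computed as the fraction of gold dependency edges, ignoring direction and labels, that also appear in the predicted tree. The following is a minimal sketch under that assumption. The function name `uuas` and the toy edge lists are hypothetical illustrations, not taken from the paper, and evaluation details such as punctuation handling are not modelled here.

```python
from typing import Iterable, Tuple

Edge = Tuple[int, int]  # a pair of word indices; direction is ignored


def uuas(gold_edges: Iterable[Edge], pred_edges: Iterable[Edge]) -> float:
    """Fraction of gold edges recovered by the prediction, ignoring direction."""
    # frozenset makes (head, dependent) and (dependent, head) compare equal.
    gold = {frozenset(edge) for edge in gold_edges}
    pred = {frozenset(edge) for edge in pred_edges}
    if not gold:
        raise ValueError("gold tree has no edges")
    return len(gold & pred) / len(gold)


# Toy four-word sentence (hypothetical data): the prediction recovers
# two of the three gold edges once direction is ignored.
gold_tree = [(1, 0), (1, 2), (1, 3)]
predicted_tree = [(0, 1), (2, 1), (2, 3)]
score = uuas(gold_tree, predicted_tree)
print(f"UUAS = {score:.3f}")
```

Because the score ignores edge direction, a probe that predicts only an undirected tree and a parser that predicts a directed one can be compared on the same footing.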