Paper Title
CLAR: A Cross-Lingual Argument Regularizer for Semantic Role Labeling
Paper Authors
Paper Abstract
Semantic role labeling (SRL) identifies predicate-argument structure(s) in a given sentence. Although different languages have different argument annotations, polyglot training, the idea of training one model on multiple languages, has previously been shown to outperform monolingual baselines, especially for low-resource languages. In fact, even a simple combination of data has been shown to be effective with polyglot training by representing the distant vocabularies in a shared representation space. Meanwhile, despite the dissimilarity in argument annotations between languages, certain argument labels do share common semantic meaning across languages (e.g., adjuncts have more or less similar semantic meaning across languages). To leverage such similarity in annotation space across languages, we propose a method called Cross-Lingual Argument Regularizer (CLAR). CLAR identifies such linguistic annotation similarity across languages and exploits this information to map the target-language arguments using a transformation of the space on which source-language arguments lie. By doing so, our experimental results show that CLAR consistently improves SRL performance over monolingual and polyglot baselines on multiple low-resource languages.
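The abstract's core mechanism, mapping target-language argument representations into the space where source-language arguments lie and penalizing the distance between cross-lingually similar labels, can be illustrated with a minimal sketch. Everything below (the `ArgumentSpaceMapper` class, the `clar_loss` function, and the toy label pairing) is a hypothetical illustration assuming a learned linear map and an L2 penalty, not the authors' actual implementation.

```python
# Minimal sketch of the CLAR idea from the abstract: learn a transformation
# that maps target-language argument embeddings into the source-language
# argument space, and regularize arguments identified as cross-lingually
# similar (e.g. adjuncts) to lie close together after mapping.
# All names and the loss form are illustrative assumptions, not the paper's code.

import torch
import torch.nn as nn


class ArgumentSpaceMapper(nn.Module):
    """Linear map from the target-language argument embedding space
    into the source-language argument embedding space."""

    def __init__(self, dim: int):
        super().__init__()
        self.transform = nn.Linear(dim, dim, bias=False)

    def forward(self, target_arg_embeddings: torch.Tensor) -> torch.Tensor:
        return self.transform(target_arg_embeddings)


def clar_loss(source_args: torch.Tensor,
              target_args: torch.Tensor,
              mapper: ArgumentSpaceMapper) -> torch.Tensor:
    """Regularization term: row i of both tensors is a pair of argument
    labels identified as semantically similar across languages; after
    mapping, each pair should be close in the source space."""
    mapped = mapper(target_args)
    return ((mapped - source_args) ** 2).sum(dim=-1).mean()


# Toy usage: embeddings for 4 argument labels paired across languages.
dim = 64
mapper = ArgumentSpaceMapper(dim)
source = torch.randn(4, dim)   # e.g. high-resource source-language labels
target = torch.randn(4, dim)   # e.g. low-resource target-language labels

# In training, this term would be added to the SRL task loss so that the
# shared annotation structure regularizes the target-language model.
loss = clar_loss(source, target, mapper)
loss.backward()
print(loss.item())
```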