Title
Transition-based Abstract Meaning Representation Parsing with Contextual Embeddings
Authors
Abstract
The ability to understand and generate language sets human cognition apart from that of other known life forms. We study a way of combining two of the most successful routes to linguistic meaning--statistical language models and symbolic semantic formalisms--in the task of semantic parsing. Building on a transition-based Abstract Meaning Representation (AMR) parser, AmrEager, we explore the utility of incorporating pretrained context-aware word embeddings--such as BERT and RoBERTa--into AMR parsing, contributing a new parser we dub AmrBerger. Experiments show that these rich lexical features alone are not particularly helpful in improving the parser's overall performance, as measured by the Smatch score, compared to its non-contextual counterpart, while additional concept information empowers the system to outperform the baselines. Through a lesion study, we find that the use of contextual embeddings makes the system more robust to the removal of explicit syntactic features. These findings expose the strengths and weaknesses of contextual embeddings and language models in their current form, and motivate deeper understanding thereof.
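The Smatch score referenced above evaluates an AMR parse as the F1 over matching (relation, source, target) triples between the candidate and gold graphs; the real metric also searches over variable alignments. A minimal sketch of the triple-overlap F1, assuming variables are already aligned (an assumption made here for simplicity; the example sentence and triples are illustrative, not from the paper):

```python
def smatch_f1(pred_triples, gold_triples):
    """Simplified Smatch: F1 over overlapping AMR triples.
    Real Smatch performs a hill-climbing search over variable
    alignments; here variables are assumed pre-aligned."""
    pred, gold = set(pred_triples), set(gold_triples)
    if not pred or not gold:
        return 0.0
    matched = len(pred & gold)          # triples present in both graphs
    precision = matched / len(pred)
    recall = matched / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Illustrative example: gold graph for "The boy wants to go"
gold = [("instance", "w", "want-01"),
        ("instance", "b", "boy"),
        ("instance", "g", "go-01"),
        ("ARG0", "w", "b"),
        ("ARG1", "w", "g"),
        ("ARG0", "g", "b")]
pred = gold[:-1]  # hypothetical parse that misses the control edge
score = smatch_f1(pred, gold)  # precision 5/5, recall 5/6
```

Precision here is 1.0 and recall 5/6, giving an F1 of 10/11, roughly 0.91.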