Paper Title

Semi-automatic WordNet Linking using Word Embeddings

Authors

Kevin Patel, Diptesh Kanojia, Pushpak Bhattacharyya

Abstract

Wordnets are rich lexico-semantic resources. Linked wordnets extend them by connecting similar concepts across the wordnets of different languages. Such resources are extremely useful in many Natural Language Processing (NLP) applications, primarily those that take knowledge-based approaches, where they are treated as a gold standard or oracle. It is therefore crucial that they hold correct information, which is why they are created by human experts. However, maintaining such resources manually is tedious and costly, so techniques that can aid the experts are desirable. In this paper, we propose an approach to linking wordnets. Given a synset in the source language, the approach returns a ranked list of candidate synsets in the target language, from which the human expert can choose the correct one(s). Our technique retrieves a winner synset within the top 10 ranked candidates for 60% of all synsets and 70% of noun synsets.
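The abstract does not spell out how candidates are scored, so the following is only a minimal sketch of one way an embedding-based ranker of this kind could look, not the authors' method. It assumes that source and target words already live in a shared cross-lingual embedding space, that a synset can be represented by the mean of its member word vectors, and that candidates are ordered by cosine similarity. The function names, synset identifiers, words, and vectors are all hypothetical toy values.

```python
import math


def mean_vector(words, embeddings):
    """Average the vectors of the words that have an embedding (None if none do)."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(source_words, target_synsets, embeddings, k=10):
    """Return up to k target synset ids, most similar to the source synset first."""
    query = mean_vector(source_words, embeddings)
    if query is None:
        return []
    scored = []
    for synset_id, words in target_synsets.items():
        vec = mean_vector(words, embeddings)
        if vec is not None:
            scored.append((cosine(query, vec), synset_id))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [synset_id for _, synset_id in scored[:k]]


def hit_rate_at_k(rankings, gold, k=10):
    """Fraction of source synsets whose gold target appears in their top-k list."""
    if not gold:
        return 0.0
    hits = sum(1 for src, tgt in gold.items() if tgt in rankings.get(src, [])[:k])
    return hits / len(gold)


# Hypothetical toy data: 3-dimensional vectors in an assumed shared space.
embeddings = {
    "dog": [0.9, 0.1, 0.0],
    "hound": [0.8, 0.2, 0.0],
    "kutta": [0.85, 0.15, 0.0],
    "shwan": [0.9, 0.05, 0.05],
    "billi": [0.1, 0.9, 0.0],
    "ghar": [0.0, 0.1, 0.9],
}
target_synsets = {
    "hi-001": ["kutta", "shwan"],
    "hi-002": ["billi"],
    "hi-003": ["ghar"],
}

ranking = rank_candidates(["dog", "hound"], target_synsets, embeddings, k=10)
print(ranking)
```

In a semi-automatic workflow like the one described, a list such as `ranking` would be shown to the human expert, who confirms or rejects the candidates. A measure like `hit_rate_at_k` corresponds to the kind of top-10 figure quoted in the abstract, although the exact evaluation protocol is not given here.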
