Paper Title

NLP-CIC @ DIACR-Ita: POS and Neighbor Based Distributional Models for Lexical Semantic Change in Diachronic Italian Corpora

Authors

Angel, Jason; Rodriguez-Diaz, Carlos A.; Gelbukh, Alexander; Jimenez, Sergio

Abstract

We present our systems and findings on unsupervised lexical semantic change for the Italian language in the DIACR-Ita shared task at EVALITA 2020. The task is to determine whether a target word has changed its meaning over time, relying only on raw text from two time-specific datasets. We propose two models that represent the target words across the periods and predict the changing words using threshold and voting schemes. Our first model relies solely on part-of-speech usage and an ensemble of distance measures. The second model uses word embedding representations to extract the neighbors' relative distances across spaces and proposes "the average of absolute differences" to estimate lexical semantic change. Our models achieved competitive results, ranking third in the DIACR-Ita competition. Furthermore, we experiment with the k_neighbor parameter of our second model to compare the impact of using "the average of absolute differences" versus the cosine distance used in Hamilton et al. (2016).
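The abstract does not spell out how the second model is computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each period is given as a hypothetical dictionary mapping words to embedding vectors, takes the union of the target's k nearest neighbors in both periods, builds the target's cosine-similarity profile to those neighbors in each space, and then compares the two profiles in two ways: by the average of absolute differences, and by the cosine distance between the profiles (in the spirit of the local neighborhood measure of Hamilton et al. (2016)). The function names, the toy vectors, and the choice of k are all assumptions made for illustration.

```python
import numpy as np


def cosine_sim(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def nearest_neighbors(space, target, k):
    """Return the k words most similar to `target` inside one embedding space."""
    sims = {w: cosine_sim(space[target], vec)
            for w, vec in space.items() if w != target}
    return sorted(sims, key=sims.get, reverse=True)[:k]


def neighbor_profile(space, target, neighbors):
    """Similarity of `target` to each neighbor, in a fixed neighbor order."""
    return np.array([cosine_sim(space[target], space[w]) for w in neighbors])


def change_scores(space_a, space_b, target, k):
    """Compare the target's neighbor profiles across two periods.

    Returns (mean_abs_diff, cosine_dist). Both are illustrative scores under
    the assumptions stated above, not the published definitions.
    """
    # Restrict both spaces to the shared vocabulary so profiles are comparable.
    shared = set(space_a) & set(space_b)
    a = {w: space_a[w] for w in shared}
    b = {w: space_b[w] for w in shared}

    # Union of the k nearest neighbors found in each period.
    neighbors = sorted(set(nearest_neighbors(a, target, k))
                       | set(nearest_neighbors(b, target, k)))

    profile_a = neighbor_profile(a, target, neighbors)
    profile_b = neighbor_profile(b, target, neighbors)

    mean_abs_diff = float(np.mean(np.abs(profile_a - profile_b)))
    cosine_dist = 1.0 - cosine_sim(profile_a, profile_b)
    return mean_abs_diff, cosine_dist


# Toy two-dimensional spaces (made-up data): the target "t" moves between periods.
period_1 = {"t": np.array([1.0, 0.0]), "x": np.array([1.0, 0.1]),
            "y": np.array([0.0, 1.0]), "z": np.array([0.5, 0.5])}
period_2 = {"t": np.array([0.0, 1.0]), "x": np.array([1.0, 0.1]),
            "y": np.array([0.0, 1.0]), "z": np.array([0.5, 0.5])}

mad, cd = change_scores(period_1, period_2, "t", k=2)
print(f"mean absolute difference: {mad:.3f}, cosine distance: {cd:.3f}")
```

In a threshold scheme like the one the abstract mentions, a word would be flagged as changed when its score exceeds a cutoff; how that cutoff is chosen is not stated in the abstract and is left open here.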
