Paper Title
1Cademy at Semeval-2022 Task 1: Investigating the Effectiveness of Multilingual, Multitask, and Language-Agnostic Tricks for the Reverse Dictionary Task
Paper Authors
Paper Abstract
This paper describes our system for the SemEval-2022 task of matching dictionary glosses to word embeddings. We focus on the Reverse Dictionary Track of the competition, which maps multilingual glosses to reconstructed vector representations. More specifically, models convert input sentences into three types of embeddings: SGNS, Char, and Electra. We propose several experiments for applying neural network cells, general multilingual and multitask structures, and language-agnostic tricks to the task. We also provide comparisons over different types of word embeddings and ablation studies to suggest helpful strategies. Our initial transformer-based model achieves relatively low performance. However, trials on different retokenization methodologies indicate improved performance. Our proposed ELMo-based monolingual model achieves the highest outcome, and its multitask and multilingual varieties show competitive results as well.
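To illustrate the general reverse-dictionary setup described in the abstract, below is a minimal, hypothetical sketch of an encoder that reads a gloss and regresses a target word embedding (e.g., a 300-dimensional SGNS vector). The architecture, dimensions, and training step shown here are illustrative assumptions, not the authors' actual model.

```python
# Hypothetical sketch: a gloss encoder regressing a target word embedding.
# Not the authors' model; dimensions and architecture are assumptions.
import torch
import torch.nn as nn

class GlossEncoder(nn.Module):
    def __init__(self, vocab_size: int, emb_dim: int = 256, target_dim: int = 300):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # A recurrent encoder stands in for the ELMo-based / transformer variants.
        self.encoder = nn.LSTM(emb_dim, emb_dim, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * emb_dim, target_dim)  # map to an SGNS-sized vector

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        x = self.embed(token_ids)            # (batch, seq_len, emb_dim)
        _, (h, _) = self.encoder(x)          # h: (2, batch, emb_dim)
        h = torch.cat([h[0], h[1]], dim=-1)  # concatenate forward/backward states
        return self.proj(h)                  # (batch, target_dim)

# Toy usage: one training step that regresses placeholder target embeddings.
model = GlossEncoder(vocab_size=1000)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
gloss = torch.randint(1, 1000, (4, 12))      # batch of 4 glosses, 12 tokens each
target = torch.randn(4, 300)                 # placeholder "SGNS" targets
loss = nn.functional.mse_loss(model(gloss), target)
loss.backward()
optimizer.step()
```

The same skeleton extends to the multitask setting by attaching one projection head per embedding type (SGNS, Char, Electra), and to the multilingual setting by sharing the encoder across languages.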