Paper Title
Assessing the Limits of the Distributional Hypothesis in Semantic Spaces: Trait-based Relational Knowledge and the Impact of Co-occurrences
Paper Authors
Paper Abstract
The increase in performance in NLP due to the prevalence of distributional models and deep learning has brought with it a reciprocal decrease in interpretability. This has spurred a focus on what neural networks learn about natural language with less of a focus on how. Some work has focused on the data used to develop data-driven models, but typically this line of work aims to highlight issues with the data, e.g. highlighting and offsetting harmful biases. This work contributes to the relatively untrodden path of what is required in data for models to capture meaningful representations of natural language. This entails evaluating how well English and Spanish semantic spaces capture a particular type of relational knowledge, namely the traits associated with concepts (e.g. bananas-yellow), and exploring the role of co-occurrences in this context.