Paper title
Word Embedding for Social Sciences: An Interdisciplinary Survey
Authors
Abstract
To extract essential information from complex data, computer scientists have been developing machine learning models that learn low-dimensional representations. Social scientists, as well as computer scientists, have benefited from these advances and moved their research forward, because human behavior and social phenomena are embedded in complex data. However, this emerging trend is not well documented: different social science fields rarely cover each other's work, so knowledge in the literature is fragmented. To document the trend, we survey recent studies that apply word embedding techniques to human behavior mining. We build a taxonomy to illustrate the methods and procedures used in the surveyed papers, helping social science researchers situate their own work within the literature on word embedding applications. The survey also conducts a simple experiment to warn that common similarity measurements used in the literature can yield different results even when they return consistent results at an aggregate level.