Paper title
Mapping Complex Technologies via Science-Technology Linkages; The Case of Neuroscience -- A transformer based keyword extraction approach
Paper authors
Paper abstract
In this paper, we present an efficient deep-learning-based approach to extract technology-related topics and keywords from scientific literature and to identify the corresponding technologies in patent applications. Specifically, we use transformer-based language models tailored to scientific text to detect coherent topics over time and to describe them with relevant keywords extracted automatically from a large text corpus. We identify these keywords using Named Entity Recognition, distinguishing between keywords that describe methods, applications, and other scientific terminology. We create a large number of search queries from combinations of method and application keywords, which we use to conduct semantic search and identify related patents. In doing so, we aim to contribute to the growing body of research on text-based technology mapping and forecasting that leverages the latest advances in natural language processing and deep learning. We are able to map technologies identified in scientific literature to patent applications, thereby providing an empirical foundation for the study of science-technology linkages. We illustrate the workflow and the results obtained by mapping publications in the field of neuroscience to related patent applications.
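The query-building and retrieval step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the keyword lists, the patent texts, and the helper names (`embed`, `cosine`, `build_queries`, `search`) are all hypothetical, and a toy bag-of-words vector stands in for the transformer-based embeddings the paper relies on.

```python
from collections import Counter
from itertools import product
from math import sqrt

# Hypothetical keywords, standing in for NER output on scientific abstracts.
method_keywords = ["optogenetics", "deep brain stimulation"]
application_keywords = ["parkinson's disease", "memory"]

# Hypothetical patent abstracts, invented for illustration only.
patents = {
    "P1": "deep brain stimulation electrode for treating parkinson's disease tremor",
    "P2": "optogenetics light delivery device for probing memory circuits",
    "P3": "battery housing for a handheld power tool",
}


def embed(text):
    """Toy bag-of-words vector (assumption: a real pipeline would use a
    transformer encoder tuned for scientific text instead)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_queries(methods, applications):
    """One query per (method, application) keyword combination."""
    return [f"{m} {a}" for m, a in product(methods, applications)]


def search(query, docs, top_k=1):
    """Rank documents by similarity to the query and keep the top_k."""
    q = embed(query)
    scored = sorted(((cosine(q, embed(t)), pid) for pid, t in docs.items()), reverse=True)
    return [(pid, round(score, 3)) for score, pid in scored[:top_k]]


queries = build_queries(method_keywords, application_keywords)
results = {q: search(q, patents) for q in queries}
```

Combining every method keyword with every application keyword makes the number of queries grow multiplicatively, which is consistent with the abstract's mention of a large number of search queries. Swapping `embed` for a sentence-level transformer encoder would turn the lexical match above into an actual semantic search.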