Investigating the contribution of author- and publication-specific features to scholars' h-index prediction

Paper title

Investigating the contribution of author- and publication-specific features to scholars' h-index prediction

Authors

Momeni, Fakhri; Mayr, Philipp; Dietze, Stefan

Abstract

Evaluating researchers' output is vital for hiring committees and funding bodies, and it is usually measured via scientific productivity, citations, or a combined metric such as the h-index. Assessing young researchers is more critical because it takes time for citations to accumulate and for the h-index to grow. Hence, predicting the h-index can help to discover researchers' scientific impact. In addition, identifying the factors that influence the prediction of scientific impact is helpful for researchers seeking ways to improve it. This study investigates the effect of author-, paper-, and venue-specific features on the future h-index. For this purpose, we used machine learning methods to predict the h-index and feature analysis techniques to advance the understanding of feature impact. Using the bibliometric data in Scopus, we defined and extracted two main groups of features. The first group relates to prior scientific impact; we name it 'prior impact-based features', and it includes the number of publications, received citations, and h-index. The second group is 'non-impact-based features' and contains the features related to author, co-authorship, paper, and venue characteristics. We explored their importance in predicting the h-index for researchers in three different career phases. We also examined the temporal dimension of prediction performance for the different feature categories to find out which features are more reliable for long- and short-term prediction. We used the authors' gender to examine the role of this author characteristic in the prediction task. Our findings showed that gender has a very slight effect in predicting the h-index. We found that, in the short term, non-impact-based features are more robust predictors for younger scholars than for senior ones. Also, in the long term, prior impact-based features lose more of their predictive power than other features.
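
For readers unfamiliar with the metric being predicted, the standard definition of the h-index (the largest h such that h of a researcher's papers each have at least h citations) can be sketched as follows. This is only an illustration of the metric itself, not the authors' prediction pipeline; the function name `h_index` and the sample citation counts are assumptions made for the example.

```python
def h_index(citations):
    """Return the h-index for a list of per-paper citation counts.

    Illustrative helper (hypothetical name, not from the paper):
    h is the largest rank such that the paper at that rank, when
    papers are sorted by citations in descending order, has at
    least that many citations.
    """
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            # Counts only decrease from here, so no later rank can qualify.
            break
    return h


# Made-up citation counts for a hypothetical researcher.
sample = [10, 8, 5, 4, 3]
print(h_index(sample))
```

Because each new citation can raise the h-index only after enough papers cross the threshold, the metric grows slowly for early-career researchers, which is the motivation the abstract gives for predicting it.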
