Paper Title

Geographic Diversity in Public Code Contributions

Authors

Rossi, Davide; Zacchiroli, Stefano

Abstract

We conduct an exploratory, large-scale, longitudinal study of 50 years of commits to publicly available version control system repositories, in order to characterize the geographic diversity of contributors to public code and its evolution over time. In total, we analyze 2.2 billion commits collected by Software Heritage from 160 million projects and authored by 43 million authors during the 1971-2021 time period. We geolocate developers to 12 world regions derived from the United Nations geoscheme, using three signals: email top-level domains, author names compared with name distributions around the world, and UTC offsets mined from commit metadata. We find evidence of the early dominance of North America in open source software, later joined by Europe. After that period, the geographic diversity in public code has been constantly increasing. We also identify relevant historical shifts related to the UNIX wars, the increase of coding literacy in Central and South Asia, and broader phenomena like colonialism and the movement of people across countries (immigration/emigration).
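To make the geolocation signals more concrete, here is a minimal Python sketch of how two of them (email top-level domain and commit UTC offset) might be combined. It is not the authors' implementation: the function names, the tiny lookup tables, and the region labels are all illustrative assumptions, and the name-distribution signal is omitted entirely.

```python
from typing import Optional, Set

# Illustrative mapping only: a few country-code TLDs to coarse region labels.
# These labels are placeholders, not the 12 regions used in the paper.
TLD_TO_REGION = {
    "it": "Europe",
    "fr": "Europe",
    "de": "Europe",
    "jp": "Eastern Asia",
    "in": "Southern Asia",
    "br": "South America",
    "ca": "North America",
}

# Generic TLDs are treated as carrying no geographic signal.
GENERIC_TLDS = {"com", "org", "net", "info"}

# Illustrative mapping from UTC offset (in minutes) to candidate regions.
# Most offsets are shared by several regions, so this signal is ambiguous.
OFFSET_TO_REGIONS = {
    330: {"Southern Asia"},
    540: {"Eastern Asia"},
    60: {"Europe", "Africa"},
    -300: {"North America", "South America"},
}


def email_tld(email: str) -> Optional[str]:
    """Return the lowercased last label of the email's domain, or None."""
    _, sep, domain = email.rpartition("@")
    if not sep or "." not in domain:
        return None
    return domain.rsplit(".", 1)[1].lower() or None


def region_from_tld(email: str) -> Optional[str]:
    """Map an email address to a region via its country-code TLD, if any."""
    tld = email_tld(email)
    if tld is None or tld in GENERIC_TLDS:
        return None
    return TLD_TO_REGION.get(tld)


def regions_from_offset(offset_minutes: Optional[int]) -> Set[str]:
    """Return the candidate regions compatible with a commit's UTC offset."""
    if offset_minutes is None:
        return set()
    return set(OFFSET_TO_REGIONS.get(offset_minutes, set()))


def guess_region(email: str, offset_minutes: Optional[int] = None) -> Optional[str]:
    """Prefer the TLD signal; fall back to the offset only when unambiguous."""
    region = region_from_tld(email)
    if region is not None:
        return region
    candidates = regions_from_offset(offset_minutes)
    if len(candidates) == 1:
        return next(iter(candidates))
    return None
```

In this sketch the offset acts only as a fallback because a single offset usually spans several regions. A fuller approach would presumably weigh all three signals together rather than apply them in a fixed order.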
