Paper Title

The 'Letter' Distribution in the Chinese Language

Authors

Chen, Qinghua; Wang, Yan; Wang, Mengmeng; Li, Xiaomeng

Abstract

Corpus-based statistical analysis plays a significant role in linguistic research, and ample evidence shows that different languages obey some common statistical laws. Studies have found that the letters of several alphabetically written languages have strikingly similar usage frequency distributions. Does this also hold for Chinese, which uses an ideographic script? We obtained letter frequency data for a number of alphabetic languages and identified the law common to their letter distributions. In addition, we collected Chinese literary corpora from different historical periods, from the Tang Dynasty to the present, and decomposed written Chinese into three kinds of basic particles: characters, strokes, and constructive parts. The statistical analysis showed that the intensity with which the basic particles are used in Chinese writing varied across historical periods, but the form of the distribution remained consistent. In particular, the distributions of the Chinese constructive parts are consistent with those of the alphabetic languages. This study provides new evidence of the consistency of human languages.
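The kind of frequency comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `rank_frequency` and `letters_only` are hypothetical, and the sketch only computes a rank-ordered relative frequency list for any sequence of symbols (letters, characters, strokes, or constructive parts), which is the raw input such a distribution comparison would need.

```python
from collections import Counter


def rank_frequency(symbols):
    """Return (symbol, relative frequency) pairs, most frequent first.

    `symbols` is any iterable of hashable items, for example letters of an
    alphabetic text or the basic particles of a Chinese text.
    This is an illustrative helper, not the method used in the paper.
    """
    counts = Counter(symbols)
    total = sum(counts.values())
    if total == 0:
        return []
    # most_common() sorts by count, descending; ties keep first-seen order.
    return [(sym, n / total) for sym, n in counts.most_common()]


def letters_only(text):
    """Keep alphabetic characters only, lower-cased (hypothetical helper)."""
    return [ch.lower() for ch in text if ch.isalpha()]


if __name__ == "__main__":
    for rank, (sym, freq) in enumerate(rank_frequency(letters_only("Banana")), 1):
        print(rank, sym, round(freq, 3))
```

Plotting frequency against rank for several languages on the same axes would then show whether the curves share a common form; decomposing Chinese characters into strokes or constructive parts requires a separate lookup table that is not shown here.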
