Paper title
Automated Sex Classification of Children's Voices and Changes in Differentiating Factors with Age
Paper authors
Abstract
Sex classification of children's voices allows for an investigation of the development of secondary sex characteristics, which has been a key interest in the field of speech analysis. This research investigated a broad range of acoustic features from scripted and spontaneous speech and applied a hierarchical clustering-based machine learning model to distinguish the sex of children aged between 5 and 15 years. We proposed an optimal feature set, and our modelling achieved an average F1 score (the harmonic mean of precision and recall) of 0.84 across all ages. Our results suggest that sex classification is generally more accurate when a model is developed for each year group rather than for children in 4-year age bands, and that classification accuracy is better for older age groups. We found that spontaneous speech could provide more helpful cues for sex classification than scripted speech, especially for children younger than 7 years. For younger age groups, a broad range of acoustic factors contributed evenly to sex classification, while for older age groups, F0-related acoustic factors were generally the most critical predictors. Other important acoustic factors for older age groups include vocal tract length estimators, spectral flux, loudness and unvoiced features.
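The abstract describes the F1 score as the harmonic mean of precision and recall. The sketch below illustrates that definition only. The function name `f1_score` and the confusion-matrix counts are hypothetical and are not taken from the paper, and the paper's own evaluation code may differ.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall.

    tp, fp, fn are true-positive, false-positive and false-negative
    counts for one class (for example, one sex label).
    """
    if tp == 0:
        # With no true positives, precision and recall are zero or
        # undefined, so F1 is conventionally reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not results from the paper).
print(round(f1_score(tp=8, fp=2, fn=4), 3))  # 0.727
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a classifier scores a high F1 only when precision and recall are both high.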