Paper Title
Shouted Speech Compensation for Speaker Verification Robust to Vocal Effort Conditions
Paper Authors
Paper Abstract
The performance of speaker verification systems degrades when the vocal effort conditions between enrollment and test (e.g., shouted vs. normal speech) differ. This is a potential situation in non-cooperative speaker verification tasks. In this paper, we present a study on different methods for linear compensation of embeddings that make use of Gaussian mixture models to cluster the shouted and normal speech domains. These compensation techniques are borrowed from the area of robustness for automatic speech recognition and, in this work, we apply them to compensate the mismatch between shouted and normal conditions in speaker verification. Before compensation, the shouted condition is automatically detected by means of logistic regression. The process is computationally light and is performed in the back-end of an x-vector system. Experimental results show that applying the proposed approach in the presence of vocal effort mismatch yields a relative equal error rate improvement of up to 13.8% with respect to a system that applies neither shouted speech detection nor compensation.
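To make the described back-end pipeline concrete, the following is a minimal sketch of how shouted-speech detection and GMM-based linear compensation of x-vector embeddings could be wired together with scikit-learn. It is not the paper's implementation: the detection features, the number of GMM components, the diagonal covariances, and the SPLICE-style soft per-cluster shift (estimated here from assumed parallel shouted/normal embeddings) are all illustrative assumptions.

```python
# Minimal sketch of the abstract's pipeline; all modeling choices below are
# assumptions for illustration, not the paper's exact method.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.mixture import GaussianMixture

# --- 1) Shouted-speech detection via logistic regression --------------------
# X_det: utterance-level detection features (assumed), y_det: 1 = shouted.
def train_shout_detector(X_det, y_det):
    return LogisticRegression(max_iter=1000).fit(X_det, y_det)

# --- 2) GMM clustering of the shouted embedding domain ----------------------
# A GMM clusters shouted x-vectors; each component gets a linear shift toward
# the normal-speech domain, computed from assumed parallel shouted/normal
# embeddings of the same utterances (illustrative choice).
def fit_compensation(xvec_shouted, xvec_normal, n_components=8):
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="diag",
                          random_state=0).fit(xvec_shouted)
    labels = gmm.predict(xvec_shouted)
    shifts = np.zeros((n_components, xvec_shouted.shape[1]))
    for k in range(n_components):
        mask = labels == k
        if mask.any():
            shifts[k] = xvec_normal[mask].mean(0) - xvec_shouted[mask].mean(0)
    return gmm, shifts

# --- 3) Back-end compensation of a test embedding ---------------------------
def compensate(xvec, detector, det_feats, gmm, shifts):
    # Only compensate if the utterance is detected as shouted.
    if detector.predict(det_feats.reshape(1, -1))[0] == 1:
        post = gmm.predict_proba(xvec.reshape(1, -1))[0]  # soft cluster assignment
        xvec = xvec + post @ shifts                       # posterior-weighted linear shift
    return xvec
```

The compensated embedding would then be scored by the usual x-vector back-end (e.g., PLDA), which is why the operation stays computationally light: only a logistic-regression decision and a posterior-weighted vector addition are added per trial.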