How to Leverage DNN-based speech enhancement for multi-channel speaker verification?

Paper title

How to Leverage DNN-based speech enhancement for multi-channel speaker verification?

Authors

Sandipana Dowerah, Romain Serizel, Denis Jouvet, Mohammad Mohammadamini, Driss Matrouf

Abstract

Speaker verification (SV) performs poorly in far-field scenarios because of environmental noise and the adverse impact of room reverberation. This work presents a benchmark of multichannel speech enhancement for far-field speaker verification. One approach is purely deep neural network-based, and the other combines a deep neural network with signal processing. We integrated a DNN architecture with signal processing techniques to carry out various experiments. Our approach is compared to existing state-of-the-art approaches. We examine the importance of pre-processing the enrollment data, which has been largely overlooked in previous studies. Experimental evaluation shows that pre-processing can improve SV performance as long as the enrollment files are processed similarly to the test data and test and enrollment occur within similar SNR ranges. Considerable improvement is obtained on the generated and all the noise conditions of the VOiCES dataset.

Paper title