Paper title
Exploiting Hidden Representations from a DNN-based Speech Recogniser for Speech Intelligibility Prediction in Hearing-impaired Listeners
Paper authors
Abstract
Accurate objective speech intelligibility prediction algorithms are of great interest for many applications, such as speech enhancement for hearing aids. Most algorithms measure signal-to-noise ratios or correlations between the acoustic features of clean reference signals and degraded signals. However, these hand-picked acoustic features are usually not explicitly correlated with recognition. Meanwhile, deep neural network (DNN) based automatic speech recognisers (ASRs) are approaching human performance in some speech recognition tasks. This work leverages the hidden representations from a DNN-based ASR as features for speech intelligibility prediction in hearing-impaired listeners. Experiments on a hearing aid intelligibility database show that the proposed method can make better predictions than a widely used short-time objective intelligibility (STOI) based binaural measure.