Paper title
Blind Mask to Improve Intelligibility of Non-Stationary Noisy Speech
Paper authors
Paper abstract
This letter proposes a novel blind acoustic mask (BAM) designed to adaptively detect noise components and preserve target speech segments in the time domain. A robust standard deviation estimator is applied to the non-stationary noisy speech to identify noise masking elements. The main contribution of the proposed solution is the use of these noise statistics to derive adaptive information that defines and selects the samples with a lower noise proportion, thereby preserving speech intelligibility. Additionally, this non-ideal mask requires no prior information about the statistics of the target speech or noise signals. The BAM and three competing methods, Ideal Binary Mask (IBM), Target Binary Mask (TBM), and Non-stationary Noise Estimation for Speech Enhancement (NNESE), are evaluated on speech signals corrupted by three non-stationary acoustic noises at six signal-to-noise ratio (SNR) values. Results demonstrate that the BAM technique achieves intelligibility gains comparable to those of ideal masks while maintaining good speech quality.
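The abstract does not spell out the BAM algorithm, so the following is only an illustrative sketch of the general idea it describes: estimate a noise level with a robust standard deviation estimator, then build a time-domain binary mask that keeps the samples likely to have a lower noise proportion. Everything specific here is an assumption made for illustration and not taken from the letter: the median-absolute-deviation (MAD) estimator, the function names `robust_std`, `time_domain_mask`, and `make_demo_signal`, the parameters `frame_len`, `context`, and `k`, and the synthetic test signal.

```python
import math
import random
import statistics


def robust_std(samples):
    """Robust standard deviation estimate based on the MAD.

    The factor 1.4826 makes the MAD consistent with the standard
    deviation of Gaussian data. (Assumed estimator, not the paper's.)
    """
    med = statistics.median(samples)
    mad = statistics.median(abs(x - med) for x in samples)
    return 1.4826 * mad


def time_domain_mask(noisy, frame_len=64, context=1024, k=3.0):
    """Return a 0/1 mask with one entry per sample of `noisy`.

    For each frame, the noise level is estimated from a longer context
    window centred on the frame, so that short speech bursts do not
    dominate the estimate. A sample is kept (1) when its magnitude
    exceeds k times that estimate, and masked (0) otherwise.
    """
    mask = []
    for start in range(0, len(noisy), frame_len):
        lo = max(0, start + frame_len // 2 - context // 2)
        hi = min(len(noisy), lo + context)
        sigma = robust_std(noisy[lo:hi])
        frame = noisy[start:start + frame_len]
        mask.extend(1 if abs(x) > k * sigma else 0 for x in frame)
    return mask


def make_demo_signal(n=4000, noise_std=0.1, burst=(2000, 2200), seed=0):
    """Hypothetical test signal: Gaussian noise plus a short sinusoidal
    burst standing in for a speech segment."""
    rng = random.Random(seed)
    signal = []
    for i in range(n):
        x = rng.gauss(0.0, noise_std)
        if burst[0] <= i < burst[1]:
            x += math.sin(2.0 * math.pi * 0.05 * i)
        signal.append(x)
    return signal


if __name__ == "__main__":
    noisy = make_demo_signal()
    mask = time_domain_mask(noisy)
    enhanced = [m * x for m, x in zip(mask, noisy)]
    print("kept in burst:", sum(mask[2000:2200]) / 200)
    print("kept in noise:", sum(mask[:1000]) / 1000)
```

In this toy setting the mask retains most samples inside the burst and very few in the noise-only region. Estimating the noise level over a sliding context window is one simple way to follow a slowly varying noise floor; a real non-stationary noise tracker would need a more careful design.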