Paper Title
Parallel Gated Neural Network With Attention Mechanism For Speech Enhancement
Paper Authors
Abstract
Deep learning algorithms are increasingly used for speech enhancement (SE). In supervised methods, both global and local information is required for accurate spectral mapping. A key limitation is often the poor capture of crucial contextual information. To leverage the long-term dependencies of target speakers and to compensate for distortions in the cleaned speech, this paper adopts a sequence-to-sequence (S2S) mapping structure and proposes a novel monaural speech enhancement system consisting of a Feature Extraction Block (FEB), a Compensation Enhancement Block (ComEB), and a Mask Block (MB). In the FEB, a U-Net block is used to extract abstract features from the complex-valued spectra, with one path suppressing background noise in the magnitude domain using masking methods; the MB takes magnitude features from the FEB and compensates for the lost complex-domain features produced by the ComEB to restore the final cleaned speech. Experiments are conducted on the LibriSpeech dataset, and the results show that the proposed model outperforms recent models in terms of ESTOI and PESQ scores.
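The masking path described in the abstract operates on the magnitude of the complex spectrum. Below is a minimal NumPy sketch of that magnitude-domain masking step only — it is not the paper's network; the mask here would in practice be predicted by the model, and the function name and toy inputs are illustrative assumptions:

```python
import numpy as np

def apply_magnitude_mask(noisy_spec: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Suppress noise by scaling the magnitude of a complex spectrum with a
    real-valued mask while reusing the noisy phase (illustrative sketch)."""
    magnitude = np.abs(noisy_spec)
    phase = np.angle(noisy_spec)
    # Masks for magnitude-domain suppression are typically bounded to [0, 1].
    enhanced_magnitude = magnitude * np.clip(mask, 0.0, 1.0)
    # Recombine the enhanced magnitude with the original phase.
    return enhanced_magnitude * np.exp(1j * phase)

# Toy example: a 2x2 complex "spectrogram" and an all-pass (identity) mask.
noisy = np.array([[1 + 1j, 2 + 0j], [0 + 3j, 1 - 1j]])
mask = np.ones(noisy.shape, dtype=float)
enhanced = apply_magnitude_mask(noisy, mask)
```

With an all-ones mask the spectrum is returned unchanged; a learned mask would attenuate time–frequency bins dominated by noise. The complex-domain compensation performed by the ComEB is what restores the phase information this magnitude-only step discards.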