Paper Title

Deep Residual-Dense Lattice Network for Speech Enhancement

Paper Authors

Mohammad Nikzad, Aaron Nicolson, Yongsheng Gao, Jun Zhou, Kuldip K. Paliwal, Fanhua Shang

Paper Abstract

Convolutional neural networks (CNNs) with residual links (ResNets) and causal dilated convolutional units have been the network of choice for deep learning approaches to speech enhancement. While residual links improve gradient flow during training, feature diminution of shallow layer outputs can occur due to repetitive summations with deeper layer outputs. One strategy to improve feature re-usage is to fuse both ResNets and densely connected CNNs (DenseNets). DenseNets, however, over-allocate parameters for feature re-usage. Motivated by this, we propose the residual-dense lattice network (RDL-Net), which is a new CNN for speech enhancement that employs both residual and dense aggregations without over-allocating parameters for feature re-usage. This is managed through the topology of the RDL blocks, which limit the number of outputs used for dense aggregations. Our extensive experimental investigation shows that RDL-Nets are able to achieve a higher speech enhancement performance than CNNs that employ residual and/or dense aggregations. RDL-Nets also use substantially fewer parameters and have a lower computational requirement. Furthermore, we demonstrate that RDL-Nets outperform many state-of-the-art deep learning approaches to speech enhancement.
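
To make the contrast concrete, below is a minimal, illustrative PyTorch sketch (not the authors' RDL-Net definition) of the two aggregation styles the abstract refers to: residual aggregation, where shallow-layer outputs are repeatedly summed with deeper ones, and a dense-style block that concatenates only a bounded number of earlier outputs, in the spirit of limiting how many outputs feed a dense aggregation. The class names ResidualLayer and BoundedDenseBlock, the channel counts, the kernel width, and the bound max_inputs are assumptions made for this example only.

# Illustrative sketch, assuming arbitrary layer sizes; this is NOT the RDL block topology from the paper.
import torch
import torch.nn as nn


class ResidualLayer(nn.Module):
    """Residual aggregation: y = x + f(x), so shallow features are summed with deeper ones."""
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(channels, channels, kernel_size, padding=kernel_size // 2)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.act(self.conv(x))


class BoundedDenseBlock(nn.Module):
    """Dense-style aggregation in which each layer concatenates at most
    `max_inputs` of the most recent outputs, rather than all earlier
    outputs as in a standard DenseNet block."""
    def __init__(self, channels: int, num_layers: int, max_inputs: int, kernel_size: int = 3):
        super().__init__()
        self.max_inputs = max_inputs
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            # Input width grows until the bound `max_inputs` is reached.
            in_ch = channels * min(i + 1, max_inputs)
            self.layers.append(
                nn.Conv1d(in_ch, channels, kernel_size, padding=kernel_size // 2)
            )
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        outputs = [x]
        for layer in self.layers:
            # Concatenate only the most recent `max_inputs` feature maps.
            inp = torch.cat(outputs[-self.max_inputs:], dim=1)
            outputs.append(self.act(layer(inp)))
        return outputs[-1]


if __name__ == "__main__":
    x = torch.randn(1, 16, 100)  # (batch, channels, time) feature map
    print(ResidualLayer(16)(x).shape)                                   # torch.Size([1, 16, 100])
    print(BoundedDenseBlock(16, num_layers=4, max_inputs=2)(x).shape)   # torch.Size([1, 16, 100])

In this sketch, setting max_inputs equal to the number of layers recovers full DenseNet-style concatenation, while max_inputs = 1 reduces each layer to a plain convolution; intermediate values trade feature re-use against the number of parameters, which is the trade-off the abstract describes.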
