DBNET: DOA-driven beamforming network for end-to-end farfield sound source separation

Paper title

DBNET: DOA-driven beamforming network for end-to-end farfield sound source separation

Authors

Ali Aroudi, Sebastian Braun

Abstract

Many deep learning techniques are available to perform source separation and reduce background noise. However, designing an end-to-end multi-channel source separation method using deep learning and conventional acoustic signal processing techniques still remains challenging. In this paper we propose a direction-of-arrival-driven beamforming network (DBnet) consisting of direction-of-arrival (DOA) estimation and beamforming layers for end-to-end source separation. We propose to train DBnet using loss functions that are solely based on the distances between the separated speech signals and the target speech signals, without a need for the ground-truth DOAs of speakers. To improve the source separation performance, we also propose end-to-end extensions of DBnet which incorporate post masking networks. We evaluate the proposed DBnet and its extensions on a very challenging dataset, targeting realistic far-field sound source separation in reverberant and noisy environments. The experimental results show that the proposed extended DBnet using a convolutional-recurrent post masking network outperforms state-of-the-art source separation methods.
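To make the idea of a DOA-driven beamforming step concrete, here is a minimal NumPy sketch of a frequency-domain delay-and-sum beamformer steered by a given DOA. It is an illustration only. The abstract does not say which beamformer DBnet uses, so delay-and-sum is swapped in as the simplest case. The uniform linear array, the speed-of-sound constant, and the function names `steering_vector` and `delay_and_sum` are all assumptions. DBnet estimates the DOA inside the network, whereas here it is simply passed in.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def steering_vector(doa_rad, mic_positions, freqs):
    """Far-field steering vectors for a linear array (hypothetical helper).

    doa_rad:       direction of arrival in radians, measured from the array axis
    mic_positions: (M,) microphone positions along the array axis, in metres
    freqs:         (F,) frequencies in Hz
    returns:       (F, M) complex array of per-microphone phase shifts
    """
    # Relative propagation delay of a plane wave at each microphone.
    delays = mic_positions * np.cos(doa_rad) / SPEED_OF_SOUND
    return np.exp(-2j * np.pi * freqs[:, None] * delays[None, :])


def delay_and_sum(spectra, doa_rad, mic_positions, freqs):
    """Delay-and-sum beamformer for one frame of multi-channel spectra.

    spectra: (F, M) complex spectrum, one column per microphone
    returns: (F,) complex spectrum of the beamformed signal
    """
    d = steering_vector(doa_rad, mic_positions, freqs)
    # Undo the per-microphone phase shifts, then average across microphones.
    return np.sum(np.conj(d) * spectra, axis=1) / spectra.shape[1]
```

Steering at the true DOA of a plane wave recovers the source spectrum, while steering elsewhere attenuates it. That sensitivity to the steering direction is what allows a loss computed only on the separated signal to carry information about the DOA.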
