Paper Title

Improving Audio Anomalies Recognition Using Temporal Convolutional Attention Network

Authors

Huang, Qiang; Hain, Thomas

Abstract

Anomalous audio in speech recordings is often caused by speaker voice distortion, external noise, or even electrical interference. Such anomalies have become a serious problem in some fields, such as high-quality music mixing and speech processing. In this paper, a novel approach using a temporal convolutional attention network (TCAN) is proposed to tackle this problem. A temporal convolutional network (TCN) can capture long-range patterns using a hierarchy of temporal convolutional filters. To enhance the ability to handle audio anomalies in different acoustic conditions, an attention mechanism is used in the TCN: a self-attention block is added after each temporal convolutional layer. This aims to highlight the target-related features and mitigate interference from irrelevant information. To evaluate the performance of the proposed model, audio recordings are collected from the TIMIT dataset and then altered by adding five different types of audio distortion: Gaussian noise, magnitude drift, random dropout, reduction of temporal resolution, and time warping. Distortions are mixed at different signal-to-noise ratios (SNRs): 5dB, 10dB, 15dB, 20dB, 25dB, and 30dB. The experimental results show that the proposed model yields better classification performance than some strong baseline methods, such as the LSTM- and TCN-based models, with relative improvements of approximately 3$\sim$10\%.
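
The abstract does not say how the distortions are scaled to each SNR, so the following is only a minimal sketch of one common way to do it, assuming the standard definition SNR(dB) = 10 log10(P_signal / P_noise) with power taken as the mean squared amplitude. The function names `mix_at_snr` and `measured_snr_db`, and the synthetic sine-wave "recording", are hypothetical illustrations and not taken from the paper.

```python
import math
import random


def mean_power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def mix_at_snr(signal, noise, snr_db):
    """Scale `noise` so the mixture has the requested SNR, then add it.

    Assumption: SNR(dB) = 10 * log10(P_signal / P_noise).
    """
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same length")
    p_noise = mean_power(noise)
    if p_noise == 0:
        raise ValueError("noise must have non-zero power")
    # Noise power required to reach the target SNR.
    target_noise_power = mean_power(signal) / (10 ** (snr_db / 10))
    scale = math.sqrt(target_noise_power / p_noise)
    return [s + scale * n for s, n in zip(signal, noise)]


def measured_snr_db(clean, mixed):
    """SNR of `mixed` relative to `clean`, treating the difference as noise."""
    residual = [m - c for c, m in zip(clean, mixed)]
    return 10 * math.log10(mean_power(clean) / mean_power(residual))


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in for a speech recording: a plain sine wave (illustration only).
    clean = [math.sin(0.05 * i) for i in range(4000)]
    gaussian = [rng.gauss(0.0, 1.0) for _ in range(4000)]
    for snr in (5, 10, 15, 20, 25, 30):  # SNR levels listed in the abstract
        mixed = mix_at_snr(clean, gaussian, snr)
        print(snr, round(measured_snr_db(clean, mixed), 3))
```

Only the Gaussian-noise case is shown here. How an SNR would be defined for the other four distortion types (magnitude drift, random dropout, reduced temporal resolution, time warping) is not stated in the abstract and is left open.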
