Audio Impairment Recognition Using a Correlation-Based Feature Representation

Paper Title

Audio Impairment Recognition Using a Correlation-Based Feature Representation

Authors

Alessandro Ragano, Emmanouil Benetos, Andrew Hines

Abstract

Audio impairment recognition consists of detecting noise in audio files and categorising the impairment type. Recently, significant performance improvements have been obtained through the use of advanced deep learning models. However, feature robustness remains an unresolved issue, and it is one of the main reasons why powerful deep learning architectures are needed. In the presence of a variety of musical styles, hand-crafted features are less efficient at capturing audio degradation characteristics: they are prone to failure when recognising audio impairments and could mistakenly learn musical concepts rather than impairment types. In this paper, we propose a new representation of hand-crafted features that is based on the correlation of feature pairs. We experimentally compare the proposed correlation-based feature representation with a typical raw feature representation used in machine learning, and we show superior performance in terms of compact feature dimensionality and improved computational speed in the test stage whilst achieving comparable accuracy.
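
To make the idea of a representation "based on the correlation of feature pairs" concrete, the following is a minimal sketch and not the authors' implementation. It assumes that hand-crafted features are extracted frame by frame into a matrix of shape (frames, features), that the pairwise measure is the Pearson correlation computed over time, and that the final vector is the upper triangle of the resulting correlation matrix. The function name `correlation_representation` is hypothetical; the abstract does not specify any of these details.

```python
import numpy as np


def correlation_representation(frames: np.ndarray) -> np.ndarray:
    """Map a (n_frames, n_features) matrix to a vector of pairwise correlations.

    Assumption: Pearson correlation over the time axis; the paper may use
    a different correlation measure or a different pairing scheme.
    """
    frames = np.asarray(frames, dtype=float)
    if frames.ndim != 2 or frames.shape[0] < 2:
        raise ValueError("expected a 2-D array with at least two frames")

    # Standardise each feature over time.
    centered = frames - frames.mean(axis=0)
    std = centered.std(axis=0)
    # A constant feature has zero variance; its correlations are set to 0
    # here instead of NaN (an illustrative choice, not taken from the paper).
    safe_std = np.where(std > 0, std, 1.0)
    z = centered / safe_std

    # Pearson correlation matrix of shape (n_features, n_features).
    corr = (z.T @ z) / frames.shape[0]

    # Keep each unordered feature pair once (upper triangle, no diagonal).
    rows, cols = np.triu_indices(frames.shape[1], k=1)
    return corr[rows, cols]
```

Under these assumptions the output length depends only on the number of features, F(F-1)/2, and not on the number of frames, which is one way a correlation-based representation can be more compact than the raw frame-level feature matrix.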
