Efficient dynamic filter for robust and low computational feature extraction

Paper title

Efficient dynamic filter for robust and low computational feature extraction

Authors

Donghyeon Kim, Gwantae Kim, Bokyeung Lee, Jeong-gi Kwak, David K. Han, Hanseok Ko

Abstract

Unseen noise signals, which are not considered during model training, are difficult to anticipate and lead to performance degradation. Various methods have been investigated to mitigate unseen noise. In our previous work, an Instance-level Dynamic Filter (IDF) and a Pixel Dynamic Filter (PDF) were proposed to extract noise-robust features. However, because the IDF part uses simple feature pooling to reduce computational cost, the performance of the dynamic filter may be degraded. In this paper, we propose an efficient dynamic filter to enhance the performance of the dynamic filter. Instead of using the simple feature mean, we split Time-Frequency (T-F) features into non-overlapping chunks and apply separable convolutions along each feature direction (inter-chunk and intra-chunk). Additionally, we propose Dynamic Attention Pooling, which maps high-dimensional features to low-dimensional feature embeddings. These methods are applied to the IDF for keyword spotting and speaker verification tasks. We confirm that our proposed method performs better in unseen environments (unseen noise and unseen speakers) than state-of-the-art models.
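
The chunking and pooling steps named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`split_chunks`, `conv_along_axis`, `attention_pool`), the chunk length, the fixed averaging kernel, and the random scoring vector are all assumptions chosen for illustration. In the paper the filters and attention weights are presumably learned and generated per input; here they are fixed stand-ins so that only the data flow is shown.

```python
import numpy as np

def split_chunks(feat, chunk_len):
    """Split a (T, F) time-frequency feature into non-overlapping chunks of shape (N, chunk_len, F)."""
    num_frames, num_bins = feat.shape
    num_chunks = num_frames // chunk_len  # trailing frames that do not fill a chunk are dropped
    return feat[: num_chunks * chunk_len].reshape(num_chunks, chunk_len, num_bins)

def conv_along_axis(x, kernel, axis):
    """Apply the same 1-D kernel independently along one axis, keeping the axis length."""
    return np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="same"), axis, x)

def attention_pool(x, w):
    """Softmax-weighted average over the rows of x (N, D), scored by vector w (D,)."""
    scores = x @ w
    weights = np.exp(scores - scores.max())  # subtract max for numerical stability
    weights /= weights.sum()
    return weights @ x

rng = np.random.default_rng(0)
feat = rng.standard_normal((100, 40))      # assumed input: 100 frames, 40 frequency bins
kernel = np.ones(3) / 3                    # assumed fixed kernel, stands in for a learned filter

chunks = split_chunks(feat, chunk_len=10)          # (10, 10, 40)
intra = conv_along_axis(chunks, kernel, axis=1)    # convolve within each chunk
inter = conv_along_axis(intra, kernel, axis=0)     # convolve across chunks
embedding = attention_pool(inter.reshape(-1, 40), rng.standard_normal(40))  # (40,)
```

Splitting the convolution into an intra-chunk pass and an inter-chunk pass keeps each kernel short while still mixing information over the whole sequence, and the attention pooling replaces a plain mean with a weighted one.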
