Paper Title

Metric Learning for User-defined Keyword Spotting

Authors

Jaemin Jung, Youkyum Kim, Jihwan Park, Youshin Lim, Byeong-Yeol Kim, Youngjoon Jang, Joon Son Chung

Abstract

The goal of this work is to detect new spoken terms defined by users. Most previous works treat Keyword Spotting (KWS) as a closed-set classification problem, which limits their transferability to unseen terms. The ability to define custom keywords also improves the user experience. In this paper, we propose a metric learning-based training strategy for user-defined keyword spotting. In particular, we make the following contributions: (1) we construct a large-scale keyword dataset from an existing speech corpus and propose a filtering method to remove data that degrade model training; (2) we propose a metric learning-based two-stage training strategy and demonstrate that it improves performance on the user-defined keyword spotting task by enriching keyword representations; (3) to facilitate fair comparison in the user-defined KWS field, we propose a unified evaluation protocol and metrics. Our proposed system does not require incremental training on the user-defined keywords, and it outperforms previous works by a significant margin on the Google Speech Commands dataset under both the proposed and the existing metrics.
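
To make the "no incremental training" idea concrete, the following is a minimal sketch of how a metric learning-based keyword spotter can be used at inference time: a few enrollment utterances per keyword are embedded and averaged into a prototype, and a query is accepted when its cosine similarity to the closest prototype passes a threshold. This is a generic illustration, not the authors' implementation. The toy 3-dimensional vectors stand in for the output of a trained speech encoder, and the function names `enroll` and `detect`, the averaging step, and the threshold of 0.7 are all assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def enroll(keyword_embeddings):
    """Average the enrollment embeddings of each keyword into one prototype.

    Hypothetical helper: averaging is one common choice, assumed here.
    """
    prototypes = {}
    for keyword, vectors in keyword_embeddings.items():
        dim = len(vectors[0])
        prototypes[keyword] = [
            sum(v[i] for v in vectors) / len(vectors) for i in range(dim)
        ]
    return prototypes


def detect(query, prototypes, threshold=0.7):
    """Return (keyword, score) for the closest prototype.

    The keyword is None when the best score is below the threshold
    (the threshold value is an assumption for illustration).
    """
    best_keyword, best_score = None, -1.0
    for keyword, prototype in prototypes.items():
        score = cosine_similarity(query, prototype)
        if score > best_score:
            best_keyword, best_score = keyword, score
    if best_score >= threshold:
        return best_keyword, best_score
    return None, best_score


# Toy embeddings standing in for encoder outputs (assumed values).
prototypes = enroll({
    "hey_robot": [[0.9, 0.1, 0.0], [1.0, 0.0, 0.1]],
    "lights_on": [[0.0, 1.0, 0.1], [0.1, 0.9, 0.0]],
})

match, score = detect([0.95, 0.05, 0.05], prototypes)
unknown, unknown_score = detect([0.0, 0.0, 1.0], prototypes)
print(match, unknown)
```

Adding a new keyword in this setup only means calling `enroll` with its example embeddings; the encoder itself is left untouched, which is the property the abstract highlights.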
