Curriculum optimization for low-resource speech recognition

Paper title

Curriculum optimization for low-resource speech recognition

Authors

Kuznetsova, Anastasia; Kumar, Anurag; Fox, Jennifer Drexler; Tyers, Francis

Abstract

Modern end-to-end speech recognition models show astonishing results in transcribing audio signals into written text. However, conventional data feeding pipelines may be sub-optimal for low-resource speech recognition, which remains a challenging task. We propose an automated curriculum learning approach that optimizes the sequence of training examples based on both the progress of the model during training and prior knowledge about the difficulty of the training examples. We introduce a new difficulty measure called compression ratio that can be used as a scoring function for raw audio in various noise conditions. The proposed method improves speech recognition word error rate (WER) by up to 33% relative to the baseline system.
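To make the idea of a compression-based scoring function concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not say which compressor is used or how the ratio is defined, so this example assumes the ratio is compressed size divided by original size, uses Python's standard `zlib` as a stand-in compressor, and assumes that less compressible audio counts as harder. The helper names `compression_ratio` and `order_by_difficulty` are hypothetical, and the sketch covers only the static difficulty prior, not the model-progress signal that the proposed curriculum also relies on.

```python
import random
import zlib
from typing import Dict, List


def compression_ratio(raw: bytes) -> float:
    """Return compressed size / original size (assumed definition).

    Values near 0 mean highly compressible audio; values near 1 mean
    the bytes look almost random to the compressor.
    """
    if not raw:
        raise ValueError("empty audio buffer")
    return len(zlib.compress(raw, 9)) / len(raw)


def order_by_difficulty(utterances: Dict[str, bytes]) -> List[str]:
    """Sort utterance IDs from most to least compressible.

    Assumption: a higher ratio (less compressible) means a harder example.
    """
    return sorted(utterances, key=lambda uid: compression_ratio(utterances[uid]))


# Synthetic stand-ins for raw audio: pure silence vs. uniform random bytes.
rng = random.Random(0)
silence = bytes(16000)
noise = bytes(rng.randrange(256) for _ in range(16000))

demo = {"noisy": noise, "silent": silence}
print(order_by_difficulty(demo))  # ['silent', 'noisy']
```

In a real pipeline the byte buffers would come from decoded audio samples rather than synthetic data, and the resulting scores would only seed the ordering that the curriculum then adapts as training progresses.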
