Paper Title

InterMPL: Momentum Pseudo-Labeling with Intermediate CTC Loss

Authors

Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe

Abstract

This paper presents InterMPL, a semi-supervised learning method for end-to-end automatic speech recognition (ASR) that performs pseudo-labeling (PL) with intermediate supervision. Momentum PL (MPL) trains a connectionist temporal classification (CTC)-based model on unlabeled data by continuously generating pseudo-labels on the fly and improving their quality. In contrast to autoregressive formulations, such as the attention-based encoder-decoder and the transducer, CTC is well suited to MPL, and to PL-based semi-supervised ASR in general, owing to its simple and fast inference algorithm and its robustness against generating collapsed labels. However, CTC generally performs worse than autoregressive models because of its conditional independence assumption, which limits the performance of MPL. We propose to enhance MPL by introducing intermediate losses, inspired by recent advances in CTC-based modeling. Specifically, we focus on self-conditional and hierarchical conditional CTC, which apply auxiliary CTC losses to intermediate layers so that the conditional independence assumption is explicitly relaxed. We also explore how pseudo-labels should be generated and used as supervision for the intermediate losses. Experimental results in different semi-supervised settings demonstrate that the proposed approach outperforms MPL and improves an ASR model by up to 12.1% absolute. In addition, our detailed analysis validates the importance of the intermediate loss.
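
The three ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that pseudo-labels come from greedy CTC decoding, that the "momentum" in MPL is an exponential moving average of model weights, and that the intermediate CTC losses are averaged and mixed with the final-layer loss by a single weight. The function names, the blank index, and the default values of `alpha` and `weight` are all hypothetical.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank token


def ctc_greedy_decode(frame_ids, blank=BLANK):
    """Turn frame-level argmax ids into a label sequence.

    Standard CTC collapsing: merge consecutive repeats, then drop blanks.
    """
    collapsed = [token for token, _ in groupby(frame_ids)]
    return [token for token in collapsed if token != blank]


def momentum_update(offline, online, alpha=0.999):
    """Move the pseudo-label generator toward the trained model.

    Assumption: both models are dicts of scalar weights and the update is
    an exponential moving average with momentum coefficient `alpha`.
    """
    return {
        name: alpha * offline[name] + (1.0 - alpha) * online[name]
        for name in offline
    }


def combined_loss(final_loss, intermediate_losses, weight=0.5):
    """Mix the final-layer CTC loss with averaged intermediate CTC losses.

    Assumption: a single mixing weight; the abstract does not specify one.
    """
    inter = sum(intermediate_losses) / len(intermediate_losses)
    return (1.0 - weight) * final_loss + weight * inter


# Frame-level predictions -> pseudo-label (repeats merged, blanks removed).
pseudo_label = ctc_greedy_decode([0, 3, 3, 0, 3, 5, 5, 0])

# One momentum step with an exaggerated alpha so the effect is visible.
offline = momentum_update({"w": 1.0}, {"w": 0.0}, alpha=0.5)

# Final-layer loss 2.0 mixed with two intermediate losses.
loss = combined_loss(2.0, [4.0, 6.0], weight=0.5)
```

Greedy decoding needs only one pass over the frames, which is consistent with the abstract's point that CTC inference is simple and fast enough to regenerate pseudo-labels continuously during training.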
