RED-ACE: Robust Error Detection for ASR using Confidence Embeddings

Paper Title

RED-ACE: Robust Error Detection for ASR using Confidence Embeddings

Authors

Zorik Gekhman, Dina Zverinski, Jonathan Mallinson, Genady Beryozkin

Abstract

ASR Error Detection (AED) models aim to post-process the output of Automatic Speech Recognition (ASR) systems, in order to detect transcription errors. Modern approaches usually use text-based input, comprised solely of the ASR transcription hypothesis, disregarding additional signals from the ASR model. Instead, we propose to utilize the ASR system's word-level confidence scores for improving AED performance. Specifically, we add an ASR Confidence Embedding (ACE) layer to the AED model's encoder, allowing us to jointly encode the confidence scores and the transcribed text into a contextualized representation. Our experiments show the benefits of ASR confidence scores for AED, their complementary effect over the textual signal, as well as the effectiveness and robustness of ACE for combining these signals. To foster further research, we publish a novel AED dataset consisting of ASR outputs on the LibriSpeech corpus with annotated transcription errors.
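
The idea of jointly encoding confidence scores and transcribed text can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how scores enter the encoder, so the bucketing of scores into `NUM_BINS` integer bins, the element-wise sum with the token embedding, the toy table sizes, and every function name below are assumptions made for illustration only.

```python
import random

NUM_BINS = 10    # assumption: number of confidence buckets
EMBED_DIM = 8    # assumption: toy embedding width
VOCAB_SIZE = 100 # assumption: toy vocabulary size


def confidence_to_bin(score, num_bins=NUM_BINS):
    """Map a confidence score in [0, 1] to an integer bucket in [0, num_bins - 1]."""
    clipped = min(max(score, 0.0), 1.0)
    return min(int(clipped * num_bins), num_bins - 1)


def make_table(num_rows, dim, rng):
    """Create a randomly initialised embedding table (a stand-in for learned weights)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_rows)]


def embed_with_confidence(token_ids, confidences, token_table, conf_table):
    """Sum each token embedding with the embedding of its confidence bucket."""
    if len(token_ids) != len(confidences):
        raise ValueError("each token needs exactly one confidence score")
    outputs = []
    for token_id, score in zip(token_ids, confidences):
        token_vec = token_table[token_id]
        conf_vec = conf_table[confidence_to_bin(score)]
        outputs.append([t + c for t, c in zip(token_vec, conf_vec)])
    return outputs


rng = random.Random(0)
token_table = make_table(VOCAB_SIZE, EMBED_DIM, rng)
conf_table = make_table(NUM_BINS, EMBED_DIM, rng)

# Hypothetical ASR hypothesis: three token ids with word-level confidence scores.
token_ids = [12, 47, 3]
confidences = [0.97, 0.31, 0.85]

inputs = embed_with_confidence(token_ids, confidences, token_table, conf_table)
print(len(inputs), len(inputs[0]))
```

In a real model both tables would be trained together with the encoder, and the summed vectors would be passed to its contextual layers, so that a low-confidence token can influence the representation of its neighbours.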
