Paper Title

SubER: A Metric for Automatic Evaluation of Subtitle Quality

Paper Authors

Patrick Wilken, Panayota Georgakopoulou, Evgeny Matusov

Paper Abstract

This paper addresses the problem of evaluating the quality of automatically generated subtitles, which includes not only the quality of the machine-transcribed or translated speech, but also the quality of line segmentation and subtitle timing. We propose SubER - a single novel metric based on edit distance with shifts that takes all of these subtitle properties into account. We compare it to existing metrics for evaluating transcription, translation, and subtitle quality. A careful human evaluation in a post-editing scenario shows that the new metric has a high correlation with the post-editing effort and direct human assessment scores, outperforming baseline metrics considering only the subtitle text, such as WER and BLEU, and existing methods to integrate segmentation and timing features.
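To make the idea concrete, here is a minimal sketch of a SubER-style score in Python. It is an illustration, not the official implementation: following the abstract's description, subtitle text is flattened into words plus end-of-line (<eol>) and end-of-block (<eob>) break tokens so that segmentation errors count as edit operations, but the sketch uses plain Levenshtein distance in place of the paper's edit distance with time-constrained shifts. The `Subtitle` class and all function names below are hypothetical, chosen for this example.

```python
# Simplified SubER-like score (illustrative sketch, not the official metric).
# Real SubER scores edit distance with shifts over subtitle text including
# break tokens, where shifts are constrained by subtitle timing; this sketch
# omits shifts and uses plain word-level Levenshtein distance.

from dataclasses import dataclass


@dataclass
class Subtitle:
    lines: list[str]   # text lines of one subtitle block
    start: float       # start time in seconds (unused here; the real metric
    end: float         # uses timing to constrain which shifts are allowed)


def tokenize(subtitles: list[Subtitle]) -> list[str]:
    """Flatten subtitles into words plus <eol>/<eob> break tokens,
    so that line-segmentation differences become edit operations."""
    tokens: list[str] = []
    for sub in subtitles:
        for i, line in enumerate(sub.lines):
            tokens.extend(line.split())
            tokens.append("<eol>" if i < len(sub.lines) - 1 else "<eob>")
    return tokens


def edit_distance(hyp: list[str], ref: list[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, 1):
        curr = [i]
        for j, r in enumerate(ref, 1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (h != r)))   # substitution / match
        prev = curr
    return prev[-1]


def suber_like_score(hyp: list[Subtitle], ref: list[Subtitle]) -> float:
    """Edit operations per reference token (words + break tokens), in percent.
    Lower is better; 0 means identical text and segmentation."""
    hyp_tok, ref_tok = tokenize(hyp), tokenize(ref)
    return 100.0 * edit_distance(hyp_tok, ref_tok) / max(len(ref_tok), 1)


if __name__ == "__main__":
    ref = [Subtitle(["Hello there,", "how are you?"], 0.0, 2.0)]
    hyp = [Subtitle(["Hello there, how", "are you?"], 0.0, 2.0)]
    print(f"{suber_like_score(hyp, ref):.1f}")  # non-zero: line break misplaced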
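```

In the toy example the hypothesis contains exactly the same words as the reference, so a text-only metric such as WER would report no errors; the misplaced <eol> token is what a segmentation-aware metric in the spirit of SubER additionally penalizes.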
