Practical Perspectives on Quality Estimation for Machine Translation

Paper Title

Practical Perspectives on Quality Estimation for Machine Translation

Authors

Junpei Zhou, Ciprian Chelba, Yuezhang Li

Abstract

Sentence-level quality estimation (QE) for machine translation (MT) attempts to predict the translation edit rate (TER) cost of the post-editing work required to correct MT output. We describe our view on sentence-level QE as dictated by several practical setups encountered in the industry. We find consumers of MT output, whether human or algorithmic, to be primarily interested in a binary quality metric: is the translated sentence adequate as-is, or does it need post-editing? Motivated by this, we propose a quality classification (QC) view on sentence-level QE whereby we focus on maximizing recall at precision above a given threshold. We demonstrate that, while classical QE regression models fare poorly on this task, they can be re-purposed by replacing the output regression layer with a binary classification one, achieving 50-60% recall at 90% precision. For a high-quality MT system producing 75-80% correct translations, this promises a significant reduction in post-editing work indeed.
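The criterion named in the abstract, maximizing recall subject to precision staying above a given threshold, can be illustrated with a minimal sketch. The function name `recall_at_precision`, the label convention (1 = adequate as-is, 0 = needs post-editing), and the toy scores are assumptions made for illustration; this is not the authors' evaluation code.

```python
from typing import Sequence, Tuple


def recall_at_precision(
    scores: Sequence[float],
    labels: Sequence[int],
    min_precision: float = 0.9,
) -> Tuple[float, float]:
    """Return (best_recall, threshold) among score thresholds whose
    precision is at least min_precision.

    Assumed convention (not from the paper): label 1 means the translation
    is adequate as-is, label 0 means it needs post-editing, and a higher
    score means the classifier is more confident the sentence is adequate.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        return 0.0, float("inf")

    # Sweep thresholds from the most confident prediction downwards.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    best_recall, best_threshold = 0.0, float("inf")
    tp = fp = 0
    i, n = 0, len(ranked)
    while i < n:
        threshold = ranked[i][0]
        # Accept every sentence whose score ties with this threshold.
        while i < n and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        precision = tp / (tp + fp)
        recall = tp / total_pos
        if precision >= min_precision and recall > best_recall:
            best_recall, best_threshold = recall, threshold
    return best_recall, best_threshold


# Toy data: six sentences, four of which are adequate as-is.
toy_scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5]
toy_labels = [1, 1, 1, 0, 1, 0]
best_recall, best_threshold = recall_at_precision(toy_scores, toy_labels, 0.9)
```

On this toy data, accepting sentences scored 0.8 or higher keeps precision at 100% while recovering three of the four adequate translations. Lowering the threshold further admits a sentence that needs post-editing and drops precision below 90%.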
