Pronunciation Generation for Foreign Language Words in Intra-Sentential Code-Switching Speech Recognition

Paper Title

Pronunciation Generation for Foreign Language Words in Intra-Sentential Code-Switching Speech Recognition

Authors

Wang, Wei; Zhang, Chao; Wu, Xiaopei

Abstract

Code-switching refers to the phenomenon of switching languages within a sentence or discourse. However, limited code-switching data, differing phoneme sets across languages, and high rebuilding costs make it challenging to build a specialized acoustic model for code-switching speech recognition. In this paper, we use limited code-switching data as driving material and explore a shortcut for quickly adding intra-sentential code-switching recognition to a commissioned native-language acoustic model. We propose a data-driven method to build a seed lexicon, which is used to train a grapheme-to-phoneme model that predicts mapped pronunciations for foreign-language words in code-switching sentences. The core of the data-driven approach consists of a phonetic decoding method and several selection methods. To address the imbalance in word-level driving material, we introduce an internal assistance strategy: the grapheme-to-phoneme model learns good pronunciation rules from words with sufficient material and applies them to words with scarce material. Our experiments show that the Mixed Error Rate in intra-sentential Chinese-English code-switching recognition is reduced from 29.15%, obtained with the pure Chinese recognizer, to 12.13% by adding foreign-language word pronunciations through our data-driven approach, and the best result of 11.14% is achieved by combining different selection methods with the internal assistance tactic.
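The abstract does not say which selection rules are used, so the following is only a minimal sketch of one plausible rule, a majority vote over phonetic decodings, to illustrate how a seed lexicon might be assembled. The function name `build_seed_lexicon`, the `min_count` threshold, and the phone labels are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def build_seed_lexicon(decoded, min_count=2):
    """Pick one pronunciation per foreign word by majority vote.

    decoded: iterable of (word, phones) pairs, one per spoken occurrence,
             where phones is the native-language phone sequence produced
             by phonetic decoding (assumed input format).
    min_count: hypothetical threshold; words whose top pronunciation was
               seen fewer times are left out of the seed lexicon.
    """
    votes = defaultdict(Counter)
    for word, phones in decoded:
        votes[word.lower()][tuple(phones)] += 1

    lexicon = {}
    for word, counts in votes.items():
        phones, n = counts.most_common(1)[0]
        if n >= min_count:
            lexicon[word] = phones
    return lexicon


# Invented example decodings (phone labels are placeholders).
decoded = [
    ("ok", ("ou", "k", "ei")),
    ("ok", ("ou", "k", "ei")),
    ("ok", ("o", "k", "ei")),
    ("email", ("i", "m", "ei", "l")),
]
seed = build_seed_lexicon(decoded)
```

In this sketch, a word with too few occurrences (here `email`) is left out of the seed lexicon; under the internal assistance idea described in the abstract, its pronunciation would instead be predicted by a grapheme-to-phoneme model trained on the entries that were kept.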
