Comparison and Analysis of New Curriculum Criteria for End-to-End ASR

Paper title

Comparison and Analysis of New Curriculum Criteria for End-to-End ASR

Authors

Georgios Karakasidis, Tamás Grósz, Mikko Kurimo

Abstract

It is common knowledge that the quantity and quality of the training data play a significant role in the creation of a good machine learning model. In this paper, we take it one step further and demonstrate that the way the training examples are arranged is also of crucial importance. Curriculum Learning is built on the observation that organized and structured assimilation of knowledge has the ability to enable faster training and better comprehension. When humans learn to speak, they first try to utter basic phones and then gradually move towards more complex structures such as words and sentences. This methodology is known as Curriculum Learning, and we employ it in the context of Automatic Speech Recognition. We hypothesize that end-to-end models can achieve better performance when provided with an organized training set consisting of examples that exhibit an increasing level of difficulty (i.e. a curriculum). To impose structure on the training set and to define the notion of an easy example, we explored multiple scoring functions that either use feedback from an external neural network or incorporate feedback from the model itself. Empirical results show that with different curriculums we can balance the training times and the network's performance.
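
As a rough illustration of the idea in the abstract, the sketch below orders a training set from easy to hard using a scoring function and splits it into curriculum stages. It is not the authors' implementation: `build_curriculum`, the toy utterances, and the transcript-length score are all hypothetical stand-ins. The abstract describes scoring functions driven by feedback from an external neural network or from the model itself, and any such function could be passed in as `score`.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def build_curriculum(
    examples: Sequence[T],
    score: Callable[[T], float],
    num_stages: int,
) -> List[List[T]]:
    """Sort examples by ascending difficulty score and split them into stages."""
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    ordered = sorted(examples, key=score)  # low score = easy example
    if not ordered:
        return []
    stage_size = -(-len(ordered) // num_stages)  # ceiling division
    return [ordered[i:i + stage_size] for i in range(0, len(ordered), stage_size)]


# Hypothetical toy data: (utterance id, transcript).
utterances = [
    ("u1", "the quick brown fox"),
    ("u2", "hi"),
    ("u3", "good morning"),
    ("u4", "a b c d e f"),
]


def transcript_length(utterance) -> float:
    """Assumed stand-in difficulty score: number of words in the transcript."""
    return float(len(utterance[1].split()))


stages = build_curriculum(utterances, transcript_length, num_stages=2)
for index, stage in enumerate(stages):
    print(index, [utt_id for utt_id, _ in stage])
```

Training would then visit the stages in order, so the model sees the easier utterances first. How many stages to use, and whether earlier stages are revisited, are design choices this sketch leaves open.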
