Finstreder: Simple and fast Spoken Language Understanding with Finite State Transducers using modern Speech-to-Text models

Paper title

Finstreder: Simple and fast Spoken Language Understanding with Finite State Transducers using modern Speech-to-Text models

Authors

Daniel Bermuth, Alexander Poeppel, Wolfgang Reif

Abstract

In Spoken Language Understanding (SLU), the task is to extract important information from audio commands, such as the intent (what the user wants the system to do) and special entities like locations or numbers. This paper presents a simple method for embedding intents and entities into Finite State Transducers which, in combination with a pretrained general-purpose Speech-to-Text model, allows building SLU models without any additional training. Building those models is very fast and takes only a few seconds. The method is also completely language-independent. A comparison on different benchmarks shows that this method can outperform multiple other, more resource-demanding SLU approaches.
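To make the idea of embedding intents and entities into a transducer more concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: it works on an already transcribed sentence rather than on the Speech-to-Text model's output, it assumes single-word entity values, and every name in it (`expand`, `build_fst`, `parse`, the `{room}` template syntax, the example intents) is hypothetical and chosen only for illustration.

```python
import itertools
from typing import Dict, Iterator, List, Optional, Tuple

# Hypothetical toy transducer: (state, input word) -> (next state, output symbol).
Arc = Tuple[int, str]
Transitions = Dict[Tuple[int, str], Arc]


def expand(sentence: str, entities: Dict[str, List[str]]) -> Iterator[tuple]:
    """Yield every concrete path of (input word, output symbol) pairs.

    A template token such as "{room}" is replaced by each value of that
    entity and emits "room=<value>"; plain words emit an empty output.
    """
    options = []
    for token in sentence.split():
        if token.startswith("{") and token.endswith("}"):
            name = token[1:-1]
            options.append([(value, f"{name}={value}") for value in entities[name]])
        else:
            options.append([(token, "")])
    return itertools.product(*options)


def build_fst(templates: Dict[str, List[str]], entities: Dict[str, List[str]]):
    """Compile all templates into one trie-shaped transducer (assumed design)."""
    transitions: Transitions = {}
    finals: Dict[int, str] = {}  # accepting state -> intent name
    for intent, sentences in templates.items():
        for sentence in sentences:
            for path in expand(sentence, entities):
                state = 0  # state 0 is the shared start state
                for word, output in path:
                    key = (state, word)
                    if key not in transitions:
                        # Each new arc leads to a fresh, unique state id.
                        transitions[key] = (len(transitions) + 1, output)
                    state = transitions[key][0]
                finals[state] = intent
    return transitions, finals


def parse(text: str, transitions: Transitions, finals: Dict[int, str]) -> Optional[dict]:
    """Run the transducer over a transcript; return intent and slots, or None."""
    state = 0
    slots: Dict[str, str] = {}
    for word in text.lower().split():
        arc = transitions.get((state, word))
        if arc is None:
            return None  # the sentence leaves the grammar
        state, output = arc
        if output:
            name, value = output.split("=", 1)
            slots[name] = value
    if state not in finals:
        return None  # the sentence stopped before a complete command
    return {"intent": finals[state], "slots": slots}


# Illustrative data, not taken from the paper's benchmarks.
templates = {
    "light_on": ["turn on the light in the {room}"],
    "light_off": ["turn off the light in the {room}"],
}
entities = {"room": ["kitchen", "bedroom"]}

transitions, finals = build_fst(templates, entities)
print(parse("turn on the light in the kitchen", transitions, finals))
```

Building this structure only requires walking the templates once, which is in the spirit of the abstract's claim that no additional training is needed. A real system would presumably use a proper FST library and score paths against the recognizer's output instead of requiring an exact word match.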
