Paper title
Automatic Speech Summarisation: A Scoping Review
Paper authors
Paper abstract
Speech summarisation techniques take human speech as input and output an abridged version as text or speech. Speech summarisation has applications in many domains, from information technology to health care, for example improving speech archives or reducing the clinical documentation burden. This scoping review maps the speech summarisation literature, with no restrictions on time frame, language summarised, research method, or paper type. We reviewed 110 of the 153 papers found through a literature search and extracted the speech features used, methods, scope, and training corpora. Most studies employ one of four speech summarisation architectures: (1) sentence extraction and compaction; (2) feature extraction and classification or rank-based sentence selection; (3) sentence compression and compression summarisation; and (4) language modelling. We also discuss the strengths and weaknesses of these methods and speech features. Overall, supervised methods (e.g. hidden Markov support vector machines, ranking support vector machines, conditional random fields) performed better than unsupervised methods. Because supervised methods require manually annotated training data, which can be costly, there was more interest in unsupervised methods. Recent research into unsupervised methods focusses on extending language modelling, for example by combining unigram modelling with deep neural networks. Protocol registration: The protocol for this scoping review is registered at https://osf.io.
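As a rough illustration of the unsupervised, language-modelling style of extractive summarisation mentioned above, the sketch below ranks transcript sentences by their average log-probability under a smoothed unigram model built from the whole transcript. It is a minimal sketch and not a method taken from any reviewed paper: the function names `tokenize` and `unigram_summary`, the add-alpha smoothing, and the example transcript are all assumptions made for illustration, and the input is assumed to be already-transcribed, sentence-segmented text.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Hypothetical helper: lowercase the text and keep alphabetic tokens only.
    return re.findall(r"[a-z']+", text.lower())


def unigram_summary(sentences, k=1, alpha=1.0):
    """Return the k sentences that score highest under a document-level
    unigram model (add-alpha smoothed), in their original order."""
    tokenized = [tokenize(s) for s in sentences]
    counts = Counter(tok for toks in tokenized for tok in toks)
    total = sum(counts.values())
    vocab = len(counts)

    def score(toks):
        # Average log-probability per token; empty sentences rank last.
        if not toks:
            return float("-inf")
        log_prob = sum(
            math.log((counts[t] + alpha) / (total + alpha * vocab)) for t in toks
        )
        return log_prob / len(toks)

    ranked = sorted(
        range(len(sentences)), key=lambda i: score(tokenized[i]), reverse=True
    )
    return [sentences[i] for i in sorted(ranked[:k])]


# Illustrative transcript (invented for this sketch).
transcript = [
    "the patient reports chest pain",
    "um okay so yeah",
    "chest pain started after the patient exercised",
]
summary = unigram_summary(transcript, k=2)
```

Sentences dominated by words that recur across the transcript score higher, so the filler utterance is dropped here. A supervised alternative would replace the scoring function with a trained classifier or ranker over annotated sentences, at the annotation cost the abstract notes.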