PoLyScriber: Integrated Fine-tuning of Extractor and Lyrics Transcriber for Polyphonic Music

Paper Title

PoLyScriber: Integrated Fine-tuning of Extractor and Lyrics Transcriber for Polyphonic Music

Authors

Xiaoxue Gao, Chitralekha Gupta, Haizhou Li

Abstract

Lyrics transcription of polyphonic music is challenging because the background music degrades lyrics intelligibility. Typically, lyrics transcription can be performed by a two-step pipeline, i.e., a singing vocal extraction front end followed by a lyrics transcriber back end, where the front end and back end are trained separately. Such a two-step pipeline suffers from both imperfect vocal extraction and a mismatch between the front end and the back end. In this work, we propose a novel end-to-end integrated fine-tuning framework, which we call PoLyScriber, to globally optimize the vocal extractor front end and the lyrics transcriber back end for lyrics transcription in polyphonic music. Experimental results show that the proposed PoLyScriber achieves substantial improvements over existing approaches on publicly available test datasets.
