MuSFA: Improving Music Structural Function Analysis with Partially Labeled Data

Paper Title

MuSFA: Improving Music Structural Function Analysis with Partially Labeled Data

Authors

Wang, Ju-Chiang; Smith, Jordan B. L.; Hung, Yun-Ning

Abstract

Music structure analysis (MSA) systems aim to segment a song recording into non-overlapping sections with useful labels. Previous MSA systems typically predict abstract labels in a post-processing step and require the full context of the song. By contrast, we recently proposed a supervised framework, called "Music Structural Function Analysis" (MuSFA), that models and predicts meaningful labels like 'verse' and 'chorus' directly from audio, without requiring the full context of a song. However, the performance of this system depends on the amount and quality of training data. In this paper, we propose to repurpose a public dataset, HookTheory Lead Sheet Dataset (HLSD), to improve the performance. HLSD contains over 18K excerpts of music sections originally collected for studying automatic melody harmonization. We treat each excerpt as a partially labeled song and provide a label mapping, so that HLSD can be used together with other public datasets, such as SALAMI, RWC, and Isophonics. In cross-dataset evaluations, we find that including HLSD in training can improve state-of-the-art boundary detection and section labeling scores by ~3% and ~1% respectively.

Paper Title