SinTra: Learning an inspiration model from a single multi-track music segment

Paper title

SinTra: Learning an inspiration model from a single multi-track music segment

Authors

Qingwei Song, Qiwei Sun, Dongsheng Guo, Haiyong Zheng

Abstract

In this paper, we propose SinTra, an auto-regressive sequential generative model that learns from a single multi-track music segment to generate coherent, aesthetic, and variable polyphonic multi-instrument music with an arbitrary number of bars. To keep generated samples relevant to the training music, we present a novel pitch-group representation. SinTra consists of a pyramid of Transformer-XL models trained with a multi-scale strategy, and can learn both the musical structure and the relative positional relationships between notes of the single training segment. Additionally, to maintain inter-track correlation, we use a convolution operation to process multi-track music; when decoding, the tracks are independent of each other to prevent interference. We evaluate SinTra with both a subjective study and objective metrics. The comparison results show that our framework learns information from a single music segment more fully than Music Transformer. Also, the comparison between SinTra and its variant, i.e., the single-stage SinTra with the first stage only, shows that the pyramid structure can effectively suppress overly fragmented notes.
