Paper Title

Towards Context-Aware Neural Performance-Score Synchronisation

Author

Agrawal, Ruchit

Abstract

Music can be represented in multiple forms: in audio form as a recording of a performance, in symbolic form as a computer-readable score, or in image form as a scan of the sheet music. Music synchronisation provides a way to navigate among these representations in a unified manner by generating an accurate mapping between them, which makes it applicable to a myriad of domains such as music education, performance analysis, automatic accompaniment and music editing. Traditional synchronisation methods compute alignments using knowledge-driven and stochastic approaches, typically employing handcrafted features. These methods often fail to generalise well to different instruments, acoustic environments and recording conditions, and normally assume complete structural agreement between the performances and the scores. This PhD furthers the development of performance-score synchronisation research by proposing data-driven, context-aware alignment approaches on three fronts. Firstly, I replace the handcrafted features with a metric-learning-based approach that is adaptable to different acoustic settings and performs well in data-scarce conditions. Secondly, I address the handling of structural differences between the performances and the scores, which is a common limitation of standard alignment methods. Finally, I eschew the reliance on both feature engineering and dynamic programming, and propose a completely data-driven synchronisation method that computes alignments using a neural framework, whilst also being robust to structural differences between the performances and the scores.
