Optimal Condition Training for Target Source Separation

Paper Title

Optimal Condition Training for Target Source Separation

Authors

Efthymios Tzinis, Gordon Wichern, Paris Smaragdis, Jonathan Le Roux

Abstract

Recent research has shown remarkable performance in leveraging multiple extraneous conditional and non-mutually exclusive semantic concepts for sound source separation, allowing the flexibility to extract a given target source based on multiple different queries. In this work, we propose a new optimal condition training (OCT) method for single-channel target source separation, based on greedy parameter updates using the highest performing condition among equivalent conditions associated with a given target source. Our experiments show that the complementary information carried by the diverse semantic concepts significantly helps to disentangle and isolate sources of interest much more efficiently compared to single-conditioned models. Moreover, we propose a variation of OCT with condition refinement, in which an initial conditional vector is adapted to the given mixture and transformed to a more amenable representation for target source extraction. We showcase the effectiveness of OCT on diverse source separation experiments where it improves upon permutation invariant models with oracle assignment and obtains state-of-the-art performance in the more challenging task of text-based source separation, outperforming even dedicated text-only conditioned models.
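
The selection step at the core of OCT, as the abstract describes it, can be illustrated with a minimal sketch: evaluate every equivalent condition for a target source and keep the one with the lowest loss, which would then drive the parameter update. Everything below is an assumption made for illustration and is not the authors' implementation: the function names `neg_snr` and `oct_select`, the toy masking "model", the negative-SNR loss, and the condition names `text` and `energy` are all hypothetical.

```python
import numpy as np


def neg_snr(estimate, target, eps=1e-8):
    """Negative signal-to-noise ratio in dB (lower is better)."""
    noise = target - estimate
    return -10.0 * np.log10((np.sum(target ** 2) + eps) / (np.sum(noise ** 2) + eps))


def oct_select(model, mixture, target, conditions):
    """Score each equivalent condition and return the best-performing one.

    `conditions` maps a (hypothetical) condition name to its vector.
    Returns (best_name, best_loss, all_losses). In a real training loop
    only `best_loss` would be backpropagated (the greedy update).
    """
    losses = {
        name: neg_snr(model(mixture, cond), target)
        for name, cond in conditions.items()
    }
    best_name = min(losses, key=losses.get)
    return best_name, losses[best_name], losses


# Toy stand-in for a conditioned separation network (assumption):
# the condition vector is applied directly as a mask on the mixture.
def toy_model(mixture, cond):
    return cond * mixture


target = np.array([1.0, 0.0, 1.0, 0.0])
interference = np.array([0.0, 1.0, 0.0, 1.0])
mixture = target + interference

# Two equivalent conditions that both refer to the same target source.
conditions = {
    "text": np.array([1.0, 0.0, 1.0, 0.0]),    # happens to isolate the target
    "energy": np.array([1.0, 1.0, 1.0, 1.0]),  # passes the mixture unchanged
}

best_name, best_loss, losses = oct_select(toy_model, mixture, target, conditions)
print(best_name, best_loss)
```

In this toy setup the selection picks whichever condition separates the target best for the given mixture. The condition-refinement variant mentioned in the abstract, which adapts the initial condition vector to the mixture, is not covered by this sketch.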
