Paper Title

Boosting Star-GANs for Voice Conversion with Contrastive Discriminator

Authors

Shijing Si, Jianzong Wang, Xulong Zhang, Xiaoyang Qu, Ning Cheng, Jing Xiao

Abstract

Nonparallel multi-domain voice conversion methods such as the StarGAN-VC family have been widely applied in many scenarios. However, training these models is usually challenging because of their complicated adversarial network architectures. To address this, we leverage state-of-the-art contrastive learning techniques and incorporate an efficient Siamese network structure into the StarGAN discriminator. Our method, SimSiam-StarGAN-VC, improves training stability and effectively prevents discriminator overfitting during training. We conduct experiments on the Voice Conversion Challenge (VCC 2018) dataset, together with a user study, to validate the performance of our framework. Our experimental results show that SimSiam-StarGAN-VC significantly outperforms existing StarGAN-VC methods on both objective and subjective metrics.
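The abstract names SimSiam as the contrastive component but does not spell out its objective. As a rough illustration only, the sketch below computes the symmetrized negative cosine similarity loss that SimSiam is generally known for, on plain Python lists. The function names `negative_cosine` and `simsiam_loss` are hypothetical, and how this term is weighted or combined with the StarGAN-VC adversarial loss is not stated here, so nothing below should be read as the authors' implementation.

```python
import math


def negative_cosine(p, z):
    """Negative cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(p, z))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_z = math.sqrt(sum(b * b for b in z))
    return -dot / (norm_p * norm_z)


def simsiam_loss(p1, z1, p2, z2):
    """Symmetrized SimSiam-style loss for two views of the same input.

    p1, p2: predictor outputs for view 1 and view 2.
    z1, z2: encoder (projector) outputs for view 1 and view 2.
    Assumption: in a real training loop z1 and z2 would be wrapped in a
    stop-gradient; this sketch only evaluates the loss value, so no
    gradient handling appears here.
    """
    return 0.5 * negative_cosine(p1, z2) + 0.5 * negative_cosine(p2, z1)
```

The loss reaches its minimum of -1 when each predictor output points in the same direction as the other view's encoder output, and is 0 when they are orthogonal.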
