PPG-based Singing Voice Conversion With Adversarial Representation Learning
2020 · Zhonghao Li, Benlai Tang, Xiang Yin, et al.
Abstract
Singing voice conversion (SVC) aims to convert the voice of one singer to that of other singers while keeping the singing content and melody. Building on recent voice conversion work, we propose a novel model to steadily convert songs while keeping their naturalness and intonation. We build an end-to-end architecture that takes phonetic posteriorgrams (PPGs) as inputs and generates mel spectrograms. Specifically, we implement two separate encoders: one encodes PPGs as content, and the other compresses mel spectrograms to supply acoustic and musical information. To improve the performance on timbre and melody, an adversarial singer confusion module and a mel-regressive representation learning module are designed for the model. Objective and subjective experiments are conducted on our private Chinese singing corpus. Compared with the baselines, our methods significantly improve the conversion performance in terms of naturalness, melody, and voice similarity. Moreover, our PPG-based met
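The abstract names an adversarial singer confusion module but this listing gives no formulation for it. As a rough illustration only, the sketch below shows one common way such an objective is set up: a singer classifier is trained with ordinary cross-entropy on an encoder's output, while the encoder is trained to push the classifier's prediction toward a uniform distribution over singers. The function names, the three-singer setup, and the uniform-target loss are assumptions made for this example, not details taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classifier_loss(logits, singer_id):
    """Cross-entropy that a (hypothetical) singer classifier would minimize."""
    return -math.log(softmax(logits)[singer_id])


def confusion_loss(logits):
    """Cross-entropy against a uniform target over singers.

    Assumed adversarial objective for the encoder: it is smallest when the
    classifier cannot tell the singers apart, i.e. when the prediction is uniform.
    """
    probs = softmax(logits)
    return -sum(math.log(p) for p in probs) / len(probs)


# Hypothetical classifier logits for three singers.
confident = [4.0, 0.0, 0.0]  # encoding still leaks singer identity
confused = [0.0, 0.0, 0.0]   # encoding carries no singer cue

print(classifier_loss(confident, 0), confusion_loss(confident))
print(classifier_loss(confused, 0), confusion_loss(confused))
```

In this toy setting the two losses pull in opposite directions: logits that identify the singer give a low classifier loss but a high confusion loss, and uniform logits reverse that. A real system would compute both from network outputs and alternate or combine the updates.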
Related papers
- Phonetic Posteriorgrams Based Many-to-many Singing Voice Conversion Via Adversarial Training (2020)
- Towards High-fidelity Singing Voice Conversion With Acoustic Reference And Contrastive Predictive Coding (2021)
- RobustSVC: HuBERT-based Melody Extractor And Adversarial Learning For Robust Singing Voice Conversion (2024)
- Leveraging Diverse Semantic-based Audio Pretrained Models For Singing Voice Conversion (2023)
- PitchNet: Unsupervised Singing Voice Conversion With Pitch Adversarial Network (2019)
- Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-based Approach For One-shot Singing Voice Conversion (2023)
- Singing Voice Conversion With Disentangled Representations Of Singer And Vocal Technique Using Variational Autoencoders (2019)
- Singing Voice Conversion With Accompaniment Using Self-supervised Representation-based Melody Features (2025)