Generative Moment Matching Network-based Random Modulation Post-filter for DNN-based Singing Voice Synthesis and Neural Double-tracking
2019 · Hiroki Tamaru, Yuki Saito, Shinnosuke Takamichi, et al.
Abstract
This paper proposes a generative moment matching network (GMMN)-based post-filter that provides inter-utterance pitch variation for deep neural network (DNN)-based singing voice synthesis. The natural pitch variation of a human singing voice leads to a richer musical experience and is used in double-tracking, a recording method in which two performances of the same phrase are recorded and mixed to create a richer, layered sound. However, singing voices synthesized using conventional DNN-based methods never vary because the synthesis process is deterministic and only one waveform is synthesized from one musical score. To address this problem, we use a GMMN to model the variation of the modulation spectrum of the pitch contour of natural singing voices and add a randomized inter-utterance variation to the pitch contour generated by conventional DNN-based singing voice synthesis. Experimental evaluations suggest that 1) our approach can provide perceptible inter-utterance pitch variation
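The post-filter idea described in the abstract can be illustrated with a small sketch. It assumes the modulation spectrum is taken as the log-magnitude of the DFT of a log-F0 contour, and it substitutes a plain Gaussian draw for the trained GMMN, so it shows only where the random variation enters, not the paper's actual model. The function names and the `scale` and `n_bins` parameters are hypothetical.

```python
import numpy as np


def modulation_spectrum(log_f0):
    """Return the log-magnitude and phase of the DFT of a log-F0 contour."""
    spec = np.fft.rfft(log_f0)
    log_mag = np.log(np.abs(spec) + 1e-10)  # small floor avoids log(0)
    phase = np.angle(spec)
    return log_mag, phase


def random_modulation_postfilter(log_f0, scale=0.05, n_bins=10, rng=None):
    """Add a random perturbation to the low modulation-frequency bins.

    ASSUMPTION: the Gaussian noise below is a stand-in for a sample drawn
    from a trained GMMN; `scale` and `n_bins` are illustrative values.
    """
    rng = np.random.default_rng() if rng is None else rng
    log_mag, phase = modulation_spectrum(log_f0)

    delta = np.zeros_like(log_mag)
    hi = min(n_bins, len(log_mag))
    # Skip bin 0 (DC) so the mean pitch of the contour is preserved.
    delta[1:hi] = rng.normal(0.0, scale, size=max(hi - 1, 0))

    # Recombine the perturbed magnitude with the original phase.
    new_spec = np.exp(log_mag + delta) * np.exp(1j * phase)
    return np.fft.irfft(new_spec, n=len(log_f0))


# Two "takes" of the same synthetic contour differ slightly in their pitch
# modulation, which is the property double-tracking relies on.
t = np.arange(200)
contour = np.log(220.0) + sum(
    0.02 * np.sin(2 * np.pi * t / period) for period in (200, 100, 50, 40, 25)
)
take_a = random_modulation_postfilter(contour, rng=np.random.default_rng(0))
take_b = random_modulation_postfilter(contour, rng=np.random.default_rng(1))
```

Each call with a different random state yields a different contour from the same deterministic input. In the approach the abstract describes, that randomness would come from a GMMN trained on natural singing voices rather than from fixed-variance noise.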
Related papers
- Sampling-based Speech Parameter Generation Using Moment-matching Networks (2017)
- SingGAN: Generative Adversarial Network for High-fidelity Singing Voice Generation (2021)
- Adversarial Multi-task Learning for Disentangling Timbre and Pitch in Singing Voice Synthesis (2022)
- Singing Voice Synthesis Based on Convolutional Neural Networks (2019)
- WGANSing: A Multi-voice Singing Voice Synthesizer Based on the Wasserstein-GAN (2019)
- Adversarially Trained Multi-singer Sequence-to-sequence Singing Synthesizer (2020)
- Fast and High-quality Singing Voice Synthesis System Based on Convolutional Neural Networks (2019)
- PeriodGrad: Towards Pitch-controllable Neural Vocoder Based on a Diffusion Probabilistic Model (2024)