ACVAE-VC: Non-parallel Many-to-many Voice Conversion With Auxiliary Classifier Variational Autoencoder
2018 · Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, et al.
Abstract
This paper proposes a non-parallel many-to-many voice conversion (VC) method using a variant of the conditional variational autoencoder (VAE) called an auxiliary classifier VAE (ACVAE). The proposed method has three key features. First, it adopts fully convolutional architectures to construct the encoder and decoder networks so that the networks can learn conversion rules that capture time dependencies in the acoustic feature sequences of source and target speech. Second, it uses an information-theoretic regularization for the model training to ensure that the information in the attribute class label will not be lost in the conversion process. With regular CVAEs, the encoder and decoder are free to ignore the attribute class label input. This can be problematic since in such a situation, the attribute class label will have little effect on controlling the voice characteristics of input speech at test time. Such situations can be avoided by introducing an auxiliary classifier and training ...
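The regularization idea described in the abstract can be illustrated with a toy loss function. The following is a minimal sketch and not the authors' implementation: it assumes a unit-variance Gaussian reconstruction term, a standard-normal prior on the latent code, and a cross-entropy penalty from an auxiliary classifier applied to the decoder output. All function names and the weight `lam` are hypothetical, and the inputs are plain Python lists standing in for network outputs.

```python
import math

def gaussian_kl(mu, log_var):
    # KL( N(mu, exp(log_var)) || N(0, 1) ), summed over latent dimensions.
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))

def squared_error(x, x_hat):
    # Reconstruction term: unit-variance Gaussian negative log-likelihood, up to a constant.
    return 0.5 * sum((a - b) ** 2 for a, b in zip(x, x_hat))

def cross_entropy(logits, label):
    # -log softmax(logits)[label], computed with the log-sum-exp trick for stability.
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_norm - logits[label]

def acvae_style_loss(x, x_hat, mu, log_var, class_logits, label, lam=1.0):
    # Assumed form: the usual conditional-VAE terms plus a classifier penalty
    # that grows when the decoder output does not reflect the requested label.
    vae_term = squared_error(x, x_hat) + gaussian_kl(mu, log_var)
    classifier_term = cross_entropy(class_logits, label)
    return vae_term + lam * classifier_term

# Toy values (hypothetical): one feature frame, a 2-D latent code, two speaker classes.
x, x_hat = [0.2, -0.1, 0.4], [0.1, 0.0, 0.5]
mu, log_var = [0.3, -0.2], [0.0, -0.1]
loss_label_kept = acvae_style_loss(x, x_hat, mu, log_var, [5.0, 0.0], label=0)
loss_label_ignored = acvae_style_loss(x, x_hat, mu, log_var, [0.0, 5.0], label=0)
print(loss_label_kept < loss_label_ignored)
```

In this sketch, a decoder whose output is classified as the wrong speaker incurs a larger loss, which is the mechanism that discourages the encoder and decoder from ignoring the attribute class label.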
Related papers
- Conditional Deep Hierarchical Variational Autoencoder For Voice Conversion (2021)
- FastVC: Fast Voice Conversion With Non-parallel Data (2020)
- Non-parallel Voice Conversion With Cyclic Variational Autoencoder (2019)
- F0-consistent Many-to-many Non-parallel Voice Conversion Via Conditional Autoencoder (2020)
- Voice Conversion From Unaligned Corpora Using Variational Autoencoding Wasserstein Generative Adversarial Networks (2017)
- AC-VC: Non-parallel Low Latency Phonetic Posteriorgrams Based Voice Conversion (2021)
- Many-to-many Voice Conversion Based Feature Disentanglement Using Variational Autoencoder (2021)
- CVC: Contrastive Learning For Non-parallel Voice Conversion (2020)