Nonparallel Voice Conversion With Augmented Classifier Star Generative Adversarial Networks
2020 · Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, et al.
Abstract
We previously proposed a method that allows for nonparallel voice conversion (VC) by using a variant of generative adversarial networks (GANs) called StarGAN. The main features of our method, called StarGAN-VC, are as follows: First, it requires no parallel utterances, transcriptions, or time alignment procedures for speech generator training. Second, it can simultaneously learn mappings across multiple domains using a single generator network and thus fully exploit available training data collected from multiple domains to capture latent features that are common to all the domains. Third, it can generate converted speech signals quickly enough to allow real-time implementations and requires only several minutes of training examples to generate reasonably realistic-sounding speech. In this paper, we describe three formulations of StarGAN, including a newly introduced StarGAN variant called "Augmented classifier StarGAN (A-StarGAN)", and compare them in a nonparallel VC task. We a
Related papers
- StarGAN-VC: Non-parallel Many-to-many Voice Conversion With Star Generative Adversarial Networks (2018) · 18.09
- StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework For Natural-sounding Voice Conversion (2021) · 13.70
- StarGAN-VC+ASR: StarGAN-based Non-parallel Voice Conversion Regularized By Automatic Speech Recognition (2021) · 5.24
- StarGAN-VC2: Rethinking Conditional Methods For StarGAN-based Voice Conversion (2019) · 0.00
- CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion (2019) · 17.45
- CVC: Contrastive Learning For Non-parallel Voice Conversion (2020) · 7.50
- Voice Conversion From Unaligned Corpora Using Variational Autoencoding Wasserstein Generative Adversarial Networks (2017) · 16.34
- High-quality Nonparallel Voice Conversion Based On Cycle-consistent Adversarial Network (2018) · 0.00