Style Equalization: Unsupervised Learning Of Controllable Generative Sequence Models
2021 · Jen-Hao Rick Chang, Ashish Shrivastava, Hema Swetha Koppula, et al.
Abstract
Controllable generative sequence models with the capability to extract and replicate the style of specific examples enable many applications, including narrating audiobooks in different voices, auto-completing and auto-correcting written handwriting, and generating missing training samples for downstream recognition tasks. However, under an unsupervised-style setting, typical training algorithms for controllable sequence generative models suffer from the training-inference mismatch, where the same sample is used as content and style input during training but unpaired samples are given during inference. In this paper, we tackle the training-inference mismatch encountered during unsupervised learning of controllable generative sequence models. The proposed method is simple yet effective, where we use a style transformation module to transfer target style information into an unrelated style input. This method enables training using unpaired content and style samples and thereby mitigate t
Related papers
- Self-supervised Context-aware Style Representation For Expressive Speech Synthesis (2022)
- Learning Latent Representations For Style Control And Transfer In End-to-end Speech Synthesis (2018)
- GenerSpeech: Towards Style Transfer For Generalizable Out-of-domain Text-to-speech (2022)
- StyleTTS: A Style-based Generative Model For Natural And Diverse Text-to-speech Synthesis (2022)
- StyleBook: Content-dependent Speaking Style Modeling For Any-to-any Voice Conversion Using Only Speech Data (2023)
- Style Tokens: Unsupervised Style Modeling, Control And Transfer In End-to-end Speech Synthesis (2018)
- Text-driven Emotional Style Control And Cross-speaker Style Transfer In Neural TTS (2022)
- Fine-grained Style Control In Transformer-based Text-to-speech Synthesis (2021)