Stochastic WaveNet: A Generative Latent Variable Model For Sequential Data
2018 · Guokun Lai, Bohan Li, Guoqing Zheng, et al.
Abstract
How to model the distribution of sequential data, including but not limited to speech and human motion, is an important ongoing research problem. It has been demonstrated that model capacity can be significantly enhanced by introducing stochastic latent variables into the hidden states of recurrent neural networks. At the same time, WaveNet, equipped with dilated convolutions, achieves astonishing empirical performance on the natural speech generation task. In this paper, we combine the ideas of stochastic latent variables and dilated convolutions and propose a new architecture for modeling sequential data, termed Stochastic WaveNet, in which stochastic latent variables are injected into the WaveNet structure. We argue that Stochastic WaveNet enjoys powerful distribution modeling capacity and the advantage of parallel training from dilated convolutions. In order to efficiently infer the posterior distribution of the latent variables, a novel inference network structure is designed based on th
Related papers
- WaveNet: A Generative Model For Raw Audio (2016)
- Learning Waveform-based Acoustic Models Using Deep Variational Convolutional Neural Networks (2019)
- Sequential Neural Models With Stochastic Layers (2016)
- Parallel WaveNet: Fast High-fidelity Speech Synthesis (2017)
- Re-examination Of The Role Of Latent Variables In Sequence Modeling (2019)
- Regularized Sequential Latent Variable Models With Adversarial Neural Networks (2021)
- Semi-recurrent CNN-based VAE-GAN For Sequential Data Generation (2018)
- Multi-task WaveNet: A Multi-task Generative Model For Statistical Parametric Speech Synthesis Without Fundamental Frequency Conditions (2018)