Interpretable Timbre Synthesis Using Variational Autoencoders Regularized On Timbre Descriptors
2023 · Anastasia Natsiou, Luca Longo, Sean O'Leary
Abstract
Controllable timbre synthesis has been a subject of research for several decades, and deep neural networks have been the most successful in this area. Deep generative models such as Variational Autoencoders (VAEs) have the ability to generate a high-level representation of audio while providing a structured latent space. Despite their advantages, the interpretability of these latent spaces in terms of human perception is often limited. To address this limitation and enhance the control over timbre generation, we propose a regularized VAE-based latent space that incorporates timbre descriptors. Moreover, we suggest a more concise representation of sound by utilizing its harmonic content, in order to minimize the dimensionality of the latent space.
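The abstract does not spell out how the latent space is regularized, so the following is only a minimal sketch of one plausible setup, not the authors' implementation. It assumes the input is a vector of harmonic amplitudes, uses an amplitude-weighted harmonic centroid as a stand-in timbre descriptor, and borrows the pairwise attribute-based regularizer known from the VAE literature to tie one latent dimension to that descriptor. All function names, the choice of descriptor, and the weights `beta`, `gamma`, and `delta` are hypothetical.

```python
import numpy as np

def harmonic_centroid(amps):
    """Amplitude-weighted mean harmonic number per row.

    Assumption: a simple brightness-like descriptor computed from
    harmonic amplitudes; the paper's actual descriptors may differ.
    """
    amps = np.asarray(amps, dtype=float)
    k = np.arange(1, amps.shape[1] + 1)  # harmonic numbers 1..K
    return (amps * k).sum(axis=1) / (amps.sum(axis=1) + 1e-12)

def kl_to_standard_normal(mu, logvar):
    """Batch mean of KL(N(mu, exp(logvar)) || N(0, I)), the usual VAE term."""
    return np.mean(-0.5 * np.sum(1 + logvar - mu**2 - np.exp(logvar), axis=1))

def attribute_regularization(z_dim, attr, delta=10.0):
    """Pairwise regularizer on a single latent dimension.

    Penalizes pairs of examples whose ordering along z_dim disagrees
    with their ordering by the descriptor value attr.
    """
    z_dim = np.asarray(z_dim, dtype=float)
    attr = np.asarray(attr, dtype=float)
    dz = z_dim[:, None] - z_dim[None, :]  # pairwise latent differences
    da = attr[:, None] - attr[None, :]    # pairwise descriptor differences
    return np.mean(np.abs(np.tanh(delta * dz) - np.sign(da)))

def regularized_vae_loss(x, x_hat, mu, logvar, z, descriptor,
                         beta=1.0, gamma=1.0, reg_dim=0):
    """Reconstruction + beta * KL + gamma * descriptor regularization.

    Hypothetical weighting; reg_dim is the latent dimension meant to
    track the descriptor.
    """
    recon = np.mean((x - x_hat) ** 2)
    kl = kl_to_standard_normal(mu, logvar)
    reg = attribute_regularization(z[:, reg_dim], descriptor)
    return recon + beta * kl + gamma * reg
```

In this sketch the regularization term is zero when the chosen latent dimension orders the batch exactly as the descriptor does, so moving along that dimension at synthesis time should change the descriptor monotonically. Encoder and decoder networks are omitted; `x_hat`, `mu`, `logvar`, and `z` are assumed to come from whatever model is being trained.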
Related papers
- Neural Music Synthesis For Flexible Timbre Control (2018) 9.92
- Timbre Transfer With Variational Auto Encoding And Cycle-consistent Adversarial Networks (2021) 0.00
- Wavetable Synthesis Using CVAE For Timbre Control Based On Semantic Label (2024) 0.00
- RAVE: A Variational Autoencoder For Fast And High-quality Neural Audio Synthesis (2021) 0.00
- Learning And Controlling The Source-filter Representation Of Speech With A Variational Autoencoder (2022) 7.50
- Conditioning Autoencoder Latent Spaces For Real-time Timbre Interpolation And Synthesis (2020) 3.58
- Deep Encoder-decoder Models For Unsupervised Learning Of Controllable Speech Synthesis (2018) 0.00
- Conditional Variational Autoencoder To Improve Neural Audio Synthesis For Polyphonic Music Sound (2022) 0.00