Multi-view MidiVAE: Fusing Track- and Bar-view Representations for Long Multi-track Symbolic Music Generation
2024 · Zhiwei Lin, Jun Chen, Boshi Tang, et al.
Abstract
Variational Autoencoders (VAEs) are a crucial component of neural symbolic music generation, and some VAE-based works have yielded outstanding results and attracted considerable attention. Nevertheless, previous VAEs still suffer from overly long feature sequences and generate results that lack contextual coherence, so the challenge of modeling long multi-track symbolic music remains unaddressed. To this end, we propose Multi-view MidiVAE, one of the first VAE methods to effectively model and generate long multi-track symbolic music. Multi-view MidiVAE uses the two-dimensional (2-D) representation OctupleMIDI to capture relationships among notes while reducing the feature sequence length. Moreover, we focus on instrumental characteristics and harmony, as well as global and local information about the musical composition, by employing a hybrid variational encoding-decoding strategy that integrates both Track- and Bar-view MidiVAE features. Objective a
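The two ideas named in the abstract, a compact per-note representation and two complementary groupings of the same notes, can be illustrated with a minimal sketch. This is not the authors' implementation. The `OctupleNote` field order, the helper names `track_view` and `bar_view`, the toy note values, and the one-token-per-attribute baseline used for the length comparison are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class OctupleNote(NamedTuple):
    """One note as a single 8-attribute token (field order is illustrative)."""
    time_sig: str
    tempo: int
    bar: int
    position: int
    instrument: int
    pitch: int
    duration: int
    velocity: int


def track_view(notes):
    """Group notes by instrument, keeping each track in time order."""
    tracks = defaultdict(list)
    for n in sorted(notes, key=lambda n: (n.bar, n.position, n.pitch)):
        tracks[n.instrument].append(n)
    return dict(tracks)


def bar_view(notes):
    """Group notes by bar, so each group holds every instrument in that bar."""
    bars = defaultdict(list)
    for n in sorted(notes, key=lambda n: (n.position, n.instrument, n.pitch)):
        bars[n.bar].append(n)
    return dict(bars)


# Toy two-bar excerpt with two instruments (values are made up).
notes = [
    OctupleNote("4/4", 120, 0, 0, 0, 60, 4, 80),
    OctupleNote("4/4", 120, 0, 0, 32, 36, 8, 90),
    OctupleNote("4/4", 120, 0, 4, 0, 64, 4, 80),
    OctupleNote("4/4", 120, 1, 0, 0, 67, 4, 80),
    OctupleNote("4/4", 120, 1, 0, 32, 43, 8, 90),
]

# 2-D representation: one token per note, i.e. a (num_notes x 8) array.
octuple_len = len(notes)
# Hypothetical flat baseline: one token per attribute of every note.
flat_len = len(notes) * len(OctupleNote._fields)

tracks = track_view(notes)  # per-instrument sequences
bars = bar_view(notes)      # per-bar groups across instruments
```

In this toy setting the track grouping keeps each instrument's line intact, while the bar grouping keeps together the notes that sound at the same time. That is one plausible reading of why the abstract pairs instrumental characteristics with harmony.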
Related papers
- Midi-sandwich: Multi-model Multi-task Hierarchical Conditional VAE-GAN Networks For Symbolic Single-track Music Generation (2019)
- A Multimodal Dynamical Variational Autoencoder For Audiovisual Speech Representation Learning (2023)
- RAVE: A Variational Autoencoder For Fast And High-quality Neural Audio Synthesis (2021)
- Conditional Variational Autoencoder To Improve Neural Audio Synthesis For Polyphonic Music Sound (2022)
- Learning Style-aware Symbolic Music Representations By Adversarial Autoencoders (2020)
- Emotion-conditioned Melody Harmonization With Hierarchical Variational Autoencoder (2023)
- Rethinking Recurrent Latent Variable Model For Music Composition (2018)
- Interpretable Timbre Synthesis Using Variational Autoencoders Regularized On Timbre Descriptors (2023)