SymphonyGen: 3D Hierarchical Orchestral Generation With Controllable Harmony Skeleton
2026 · Xuzheng He, Nan Nan, Zhilin Wang, et al.
Abstract
arXiv:2604.25498v1
Generating symphonic music requires simultaneously managing high-level structural form and dense, multi-track orchestration. Existing symbolic models often struggle with a "complexity-control imbalance", in which scaling bottlenecks limit long-term granular steerability. We present SymphonyGen, a 3D hierarchical framework for contemporary cinematic orchestration. SymphonyGen employs a cascading decoder architecture that decomposes the Bar, Track, and Event axes, improving computational efficiency and scalability over conventional 1D or 2D models. We introduce "short-score" conditioning via a beat-quantized multi-voice harmony skeleton, enabling outline control while preserving textural diversity. The model is further refined using Group Relative Policy Optimization (GRPO) with a cross-modal audio-perceptual reward, aligning symbolic output with modern acoustic expectations. Additionally, we implement a dissonance-averse sampling algorith
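As a rough illustration of the GRPO step mentioned in the abstract, the sketch below shows the group-relative advantage that GRPO is generally built on: each sampled output's reward is normalized against the mean and standard deviation of its own sampling group. This is not the paper's implementation. The function name, the `eps` constant, the use of the population standard deviation, and the reward values are all assumptions made for illustration; the audio-perceptual reward itself is not modeled here.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    Hypothetical helper: a positive value means the sample scored above
    the group average, a negative value means it scored below.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption, not from the paper
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up reward scores for four pieces sampled from the same condition.
rewards = [0.2, 0.5, 0.9, 0.4]
advantages = group_relative_advantages(rewards)
```

Because the baseline is the group's own mean, no separate value network is needed; samples are only ranked against their siblings from the same prompt.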
Related papers
- Imposing Higher-level Structure In Polyphonic Music Generation Using Convolutional Restricted Boltzmann Machines And Constraints (2016) 6.77
- Polyphonic Music Generation With Sequence Generative Adversarial Networks (2017) 2.26
- Amadeus: Autoregressive Model With Bidirectional Attribute Modelling For Symbolic Music (2025) 0.00
- End-to-end Real-world Polyphonic Piano Audio-to-score Transcription With Hierarchical Decoding (2024) 0.00
- Hierarchical Generative Modeling Of Melodic Vocal Contours In Hindustani Classical Music (2024) 0.00
- Gesture2music: A Low-latency Real-time Framework For Continuous Gesture-driven Music Generation (2026) 0.00
- MIDI-Sandwich: Multi-model Multi-task Hierarchical Conditional VAE-GAN Networks For Symbolic Single-track Music Generation (2019) 0.00
- Diff-A-Riff: Musical Accompaniment Co-creation Via Latent Diffusion Models (2024) 0.00