CSL-L2M: Controllable Song-level Lyric-to-melody Generation Based On Conditional Transformer With Fine-grained Lyric And Musical Controls
2024 · Li Chai, Donglin Wang
Abstract
Lyric-to-melody generation is a highly challenging task in the field of AI music generation. Due to the difficulty of learning strict yet weak correlations between lyrics and melodies, previous methods have suffered from weak controllability, low-quality and poorly structured generation. To address these challenges, we propose CSL-L2M, a controllable song-level lyric-to-melody generation method based on an in-attention Transformer decoder with fine-grained lyric and musical controls, which is able to generate full-song melodies matched with the given lyrics and user-specified musical attributes. Specifically, we first introduce REMI-Aligned, a novel music representation that incorporates strict syllable- and sentence-level alignments between lyrics and melodies, facilitating precise alignment modeling. Subsequently, sentence-level semantic lyric embeddings independently extracted from a sentence-wise Transformer encoder are combined with word-level part-of-speech embeddings and syllabl
Related papers
- Conditional LSTM-GAN For Melody Generation From Lyrics (2019) 14.69
- SegTune: Structured And Fine-grained Control For Song Generation (2025) 0.00
- SongGLM: Lyric-to-melody Generation With 2D Alignment Encoding And Multi-task Pre-training (2024) 3.58
- Unsupervised Melody-to-lyric Generation (2023) 0.00
- Melody-conditioned Lyrics Generation With SeqGANs (2020) 7.50
- Interpretable Melody Generation From Lyrics With Discrete-valued Adversarial Training (2022) 6.34
- Joint Learning Of Wording And Formatting For Singable Melody-to-lyric Generation (2023) 0.00
- A Syllable-structured, Contextually-based Conditionally Generation Of Chinese Lyrics (2019) 7.16