SongPrep: A Preprocessing Framework And End-to-end Model For Full-song Structure Parsing And Lyrics Transcription
2025 · Wei Tan, Shun Lei, Huaicheng Zhang, et al.
Abstract
Artificial Intelligence Generated Content (AIGC) is currently a popular research area. Among its various branches, song generation has attracted growing interest. Despite the abundance of available songs, effective data preparation remains a significant challenge. Converting these songs into training-ready datasets typically requires extensive manual labeling, which is both time-consuming and costly. To address this issue, we propose SongPrep, an automated preprocessing pipeline designed specifically for song data. This framework streamlines key processes such as source separation, structure analysis, and lyric recognition, producing structured data that can be directly used to train song generation models. Furthermore, we introduce SongPrepE2E, an end-to-end structured lyrics recognition model based on pretrained language models. Without the need for additional source separation, SongPrepE2E is able to analyze the structure and lyrics of entire songs and provide precise timestamps. By
Related papers
- SongTrans: An Unified Song Transcription And Alignment Method For Lyrics And Notes (2024) 0.00
- SegTune: Structured And Fine-grained Control For Song Generation (2025) 0.00
- SongMASS: Automatic Song Writing With Pre-training And Alignment Constraint (2020) 11.39
- SongGen: A Single Stage Auto-regressive Transformer For Text-to-song Generation (2025) 4.98
- SongGLM: Lyric-to-melody Generation With 2D Alignment Encoding And Multi-task Pre-training (2024) 3.58
- Unsupervised Melody-to-lyric Generation (2023) 0.00
- Joint Learning Of Wording And Formatting For Singable Melody-to-lyric Generation (2023) 0.00
- Transcribing Lyrics From Commercial Song Audio: The First Step Towards Singing Content Processing (2018) 7.16