LIWhiz: A Non-Intrusive Lyric Intelligibility Prediction System for the Cadenza Challenge
2025 · Ram C. M. C. Shekar, Iván López-Espejo
Abstract
We present LIWhiz, a non-intrusive lyric intelligibility prediction system submitted to the ICASSP 2026 Cadenza Challenge. LIWhiz leverages Whisper for robust feature extraction and a trainable back-end for score prediction. On the Cadenza Lyric Intelligibility Prediction (CLIP) evaluation set, LIWhiz achieves a root mean square error (RMSE) of 27.07%, a 22.4% relative reduction over the STOI-based baseline, and also substantially improves normalized cross-correlation.
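The metrics quoted in the abstract can be illustrated with a minimal sketch. It assumes that intelligibility scores are percentages in [0, 100] and that normalized cross-correlation is computed as the Pearson correlation between predicted and listener scores; neither detail is stated in the abstract. The function names are hypothetical and are not taken from the challenge toolkit.

```python
import math

def rmse(pred, target):
    """Root mean square error between predicted and reference scores."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))

def ncc(pred, target):
    """Normalized cross-correlation, assumed here to be Pearson correlation."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    num = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    den = math.sqrt(
        sum((p - mean_p) ** 2 for p in pred) * sum((t - mean_t) ** 2 for t in target)
    )
    return num / den

def relative_reduction(baseline_rmse, system_rmse):
    """Fraction by which the system lowers RMSE relative to the baseline."""
    return (baseline_rmse - system_rmse) / baseline_rmse

def implied_baseline(system_rmse, rel_reduction):
    """Baseline RMSE implied by a system RMSE and a relative reduction."""
    return system_rmse / (1.0 - rel_reduction)

# Baseline RMSE implied by the abstract's figures (27.07% and 22.4%).
baseline = implied_baseline(27.07, 0.224)
```

Under these assumptions, an RMSE of 27.07% together with a 22.4% relative reduction implies a baseline RMSE of roughly 34.9%. This figure is derived arithmetically here and is not quoted from the source.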
Related papers
- Leveraging Whisper Embeddings for Audio-Based Lyrics Matching (2025)
- LyricWhiz: Robust Multilingual Zero-Shot Lyrics Transcription by Whispering to ChatGPT (2023)
- HCLAS-X: Hierarchical and Cascaded Lyrics Alignment System Using Multimodal Cross-Correlation (2023)
- Adapting Pretrained Speech Model for Mandarin Lyrics Transcription and Alignment (2023)
- WhisperVC: Decoupled Cross-Domain Alignment and Speech Generation for Low-Resource Whisper-to-Normal Conversion (2025)
- Probing the Hidden Talent of ASR Foundation Models for L2 English Oral Assessment (2025)
- LSTM-Based Whisper Detection (2018)
- Whistle: Data-Efficient Multilingual and Crosslingual Speech Recognition via Weakly Phonetic Supervision (2024)