Disentangling Score Content And Performance Style For Joint Piano Rendering And Transcription
2025 · Wei Zeng, Junchuan Zhao, Ye Wang
Abstract
Expressive performance rendering (EPR) and automatic piano transcription (APT) are fundamental yet inverse tasks in music information retrieval: EPR generates expressive performances from symbolic scores, while APT recovers scores from performances. Despite their dual nature, prior work has addressed them independently. In this paper we propose a unified framework that jointly models EPR and APT by disentangling note-level score content and global performance style representations from both paired and unpaired data. Our framework is built on a transformer-based sequence-to-sequence architecture and is trained using only sequence-aligned data, without requiring fine-grained note-level alignment. To automate the rendering process while ensuring stylistic compatibility with the score, we introduce an independent diffusion-based performance style recommendation module that generates style embeddings directly from score content. This modular component supports both style transfer and flexib
Related papers
- End-to-end Real-world Polyphonic Piano Audio-to-score Transcription With Hierarchical Decoding (2024)
- Dexter: Learning And Controlling Performance Expression With Diffusion Models (2024)
- Expressivity-aware Music Performance Retrieval Using Mid-level Perceptual Features And Emotion Word Embeddings (2024)
- Audio-to-score Alignment Of Piano Music Using RNN-based Automatic Music Transcription (2017)
- A Convolutional-attentional Neural Framework For Structure-aware Performance-score Synchronization (2022)
- Play As You Like: Timbre-enhanced Multi-modal Music Style Transfer (2018)
- Unified Cross-modal Translation Of Score Images, Symbolic Music, And Performance Audio (2025)
- Piano Transcription By Hierarchical Language Modeling With Pretrained Roll-based Encoders (2025)