DExter: Learning And Controlling Performance Expression With Diffusion Models
2024 · Huan Zhang, Shreyan Chowdhury, Carlos Eduardo Cancino-Chacón, et al.
Abstract
In the pursuit of developing expressive music performance models using artificial intelligence, this paper introduces DExter, a new approach leveraging diffusion probabilistic models to render Western classical piano performances. In this approach, performance parameters are represented in a continuous expression space and a diffusion model is trained to predict these continuous parameters while being conditioned on the musical score. Furthermore, DExter also enables the generation of interpretations (expressive variations of a performance) guided by perceptually meaningful features by conditioning jointly on score and perceptual feature representations. Consequently, we find that our model is useful for learning expressive performance, generating perceptually steered performances, and transferring performance styles. We assess the model through quantitative and qualitative analyses, focusing on specific performance metrics regarding dimensions like asynchrony and articulation, as well
Related papers
- D3RM: A Discrete Denoising Diffusion Refinement Model For Piano Transcription (2025)
- Extract And Diffuse: Latent Integration For Improved Diffusion-based Speech And Vocal Enhancement (2024)
- Diff-A-Riff: Musical Accompaniment Co-creation Via Latent Diffusion Models (2024)
- Disentangling Score Content And Performance Style For Joint Piano Rendering And Transcription (2025)
- Conditional Diffusion As Latent Constraints For Controllable Symbolic Music Generation (2025)
- MUSIC: Learning Muscle-driven Dexterous Hand Control (2026)
- DEX-TTS: Diffusion-based Expressive Text-to-speech With Style Modeling On Time Variability (2024)
- DiffSHEG: A Diffusion-based Approach For Real-time Speech-driven Holistic 3D Expression And Gesture Generation (2024)