DiffusionTalker: Efficient And Compact Speech-driven 3D Talking Head Via Personalizer-guided Distillation
2025 · Peng Chen, Xiaobao Wei, Ming Lu, et al.
Abstract
Real-time speech-driven 3D facial animation has attracted interest in both academia and industry. Traditional methods mainly focus on learning a deterministic mapping from speech to animation. Recent approaches have begun to account for the non-deterministic nature of speech-driven 3D face animation and employ diffusion models for the task. Existing diffusion-based methods improve the diversity of facial animation. However, personalized speaking styles that convey accurate lip language are still lacking, and efficiency and compactness still need improvement. In this work, we propose DiffusionTalker to address these limitations via personalizer-guided distillation. For personalization, we introduce a contrastive personalizer that learns identity and emotion embeddings to capture speaking styles from audio. We further propose a personalizer enhancer during distillation to strengthen the influence of these embeddings on facial animation. For efficiency, we use iterative distillation to reduce
Related papers
- FaceDiffuser: Speech-driven 3D Facial Animation Synthesis Using Diffusion (2023) 13.79
- DiffSpeaker: Speech-driven 3D Facial Animation With Diffusion Transformer (2024) 5.24
- EmotiveTalk: Expressive Talking Head Generation Through Audio Information Decoupling And Emotional Video Diffusion (2024) 0.00
- DF-3DFace: One-to-many Speech Synchronized 3D Face Animation With Diffusion (2023) 0.00
- FreeTalker: Controllable Speech And Text-driven Gesture Generation Based On Diffusion Models For Enhanced Speaker Naturalness (2024) 9.59
- Controllable Expressive 3D Facial Animation Via Diffusion In A Unified Multimodal Space (2025) 0.00
- FADA: Fast Diffusion Avatar Synthesis With Mixed-supervised Multi-cfg Distillation (2024) 2.26
- SAiD: Speech-driven Blendshape Facial Animation With Diffusion (2023) 0.00