Probabilistic Speech-driven 3D Facial Motion Synthesis: New Benchmarks, Methods, And Applications
2023 · Karren D. Yang, Anurag Ranjan, Jen-Hao Rick Chang, et al.
Abstract
We consider the task of animating 3D facial geometry from a speech signal. Existing works are primarily deterministic, focusing on learning a one-to-one mapping from the speech signal to 3D face meshes on small datasets with limited speakers. While these models can achieve high-quality lip articulation for speakers in the training set, they are unable to capture the full and diverse distribution of 3D facial motions that accompany speech in the real world. Importantly, the relationship between speech and facial motion is one-to-many, containing both inter-speaker and intra-speaker variations and necessitating a probabilistic approach. In this paper, we identify and address key challenges that have so far limited the development of probabilistic models: the lack of datasets and metrics suitable for training and evaluating them, and the difficulty of designing a model that generates diverse results while remaining faithful to a conditioning signal as strong as speech. We first propose
Related papers
- DF-3DFace: One-to-many Speech Synchronized 3D Face Animation With Diffusion (2023) · 0.00
- ProbTalk3D: Non-deterministic Emotion Controllable Speech-driven 3D Facial Animation Synthesis Using VQ-VAE (2024) · 11.53
- FaceDiffuser: Speech-driven 3D Facial Animation Synthesis Using Diffusion (2023) · 13.79
- SAiD: Speech-driven Blendshape Facial Animation With Diffusion (2023) · 0.00
- CSTalk: Correlation Supervised Speech-driven 3D Emotional Facial Animation Generation (2024) · 3.58
- Fake It To Make It: Using Synthetic Data To Remedy The Data Shortage In Joint Multimodal Speech-and-gesture Synthesis (2024) · 6.34
- PMMTalk: Speech-driven 3D Facial Animation From Complementary Pseudo Multi-modal Features (2023) · 3.58
- FaceXHuBERT: Text-less Speech-driven E(x)pressive 3D Facial Animation Synthesis Using Self-supervised Speech Representation Learning (2023) · 11.49