ZSV2C-MLLM: Zero-Shot Visual Voice Cloning Via Multimodal Large Language Models
Yanling Zhang · Linqin Wang · Shengxiang Gao · 2026
Citations: 0 · GitHub: 0 · HF: 0
arXiv:s2_b755d2dfd70a
Voice Cloning
Abstract
(no abstract)
Related papers
X-Voice: Enabling Everyone to Speak 30 Languages via Zero-Shot Cross-Lingual Voice Cloning (2026)
MM-Sonate: Multimodal Controllable Audio-Video Generation with Zero-Shot Voice Cloning (2026)
LM-VC: Zero-shot Voice Conversion via Speech Generation based on Language Models (2023)
Multi-modal Adversarial Training for Zero-Shot Voice Cloning (2024)
Low-Resource Multilingual and Zero-Shot Multispeaker TTS (2022)
MimicLM: Zero-Shot Voice Imitation through Autoregressive Modeling of Pseudo-Parallel Speech Corpora (2026)
OneVoice: One Model, Triple Scenarios-Towards Unified Zero-shot Voice Conversion (2026)
The Codec Language Model-based Zero-Shot Spontaneous Style TTS System for CoVoC Challenge 2024 (2025)