DialogueAgents: A Hybrid Agent-Based Speech Synthesis Framework for Multi-Party Dialogue
2025 · Xiang Li, Duyi Pan, Hongru Xiao, et al.
Abstract
Speech synthesis is crucial for human-computer interaction, enabling natural and intuitive communication. However, existing datasets involve high construction costs due to manual annotation and suffer from limited character diversity, contextual scenarios, and emotional expressiveness. To address these issues, we propose DialogueAgents, a novel hybrid agent-based speech synthesis framework, which integrates three specialized agents -- a script writer, a speech synthesizer, and a dialogue critic -- to collaboratively generate dialogues. Grounded in a diverse character pool, the framework iteratively refines dialogue scripts and synthesizes speech based on speech review, boosting emotional expressiveness and paralinguistic features of the synthesized dialogues. Using DialogueAgents, we contribute MultiTalk, a bilingual, multi-party, multi-turn speech dialogue dataset covering diverse topics. Extensive experiments demonstrate the effectiveness of our framework and the high quality of the M
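The write, synthesize, review loop described in the abstract can be pictured with a minimal sketch. Everything named here is hypothetical and not taken from the paper: the `Review` type, the `refine_dialogue` function, the `max_rounds` and `target_score` parameters, and the stub agents in `demo` are illustrative assumptions. The abstract does not state the actual stopping criterion or agent interfaces.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Review:
    """Hypothetical critic output: a quality score and free-text feedback."""
    score: float
    feedback: str


def refine_dialogue(
    characters: List[str],
    write_script: Callable[[List[str], str], List[str]],
    synthesize: Callable[[List[str]], bytes],
    critique: Callable[[List[str], bytes], Review],
    max_rounds: int = 3,        # assumed cap on refinement rounds
    target_score: float = 0.8,  # assumed acceptance threshold
) -> Tuple[List[str], bytes, List[Review]]:
    """Iterate script writer -> speech synthesizer -> dialogue critic."""
    feedback = ""
    history: List[Review] = []
    script: List[str] = []
    audio = b""
    for _ in range(max_rounds):
        script = write_script(characters, feedback)  # draft or revise the script
        audio = synthesize(script)                   # render the script as speech
        review = critique(script, audio)             # review the synthesized speech
        history.append(review)
        if review.score >= target_score:
            break
        feedback = review.feedback                   # feed the critique into the next draft
    return script, audio, history


def demo() -> Tuple[List[str], bytes, List[Review]]:
    """Run the loop with stub agents; no real LLM or TTS system is involved."""
    calls = {"n": 0}

    def write_script(characters: List[str], feedback: str) -> List[str]:
        calls["n"] += 1
        return [f"{c}: line (draft {calls['n']})" for c in characters]

    def synthesize(script: List[str]) -> bytes:
        # Placeholder bytes standing in for audio.
        return b"".join(line.encode() for line in script)

    def critique(script: List[str], audio: bytes) -> Review:
        # Stub critic whose score rises with each draft.
        return Review(score=0.5 * calls["n"], feedback="add more emotion")

    return refine_dialogue(["Alice", "Bob"], write_script, synthesize, critique)
```

Passing the agents in as plain callables keeps the loop independent of any particular language model or speech synthesis backend.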
Related papers
- SpeechDialogueFactory: Generating High-Quality Speech Dialogue Data to Accelerate Your Speech-LLM Development (2025) 0.00
- SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems (2024) 3.87
- Fake It to Make It: Using Synthetic Data to Remedy the Data Shortage in Joint Multimodal Speech-and-Gesture Synthesis (2024) 6.34
- A Framework for Synthetic Audio Conversations Generation Using Large Language Models (2024) 3.58
- SpeechRole: A Large-Scale Dataset and Benchmark for Evaluating Speech Role-Playing Agents (2025) 1.91
- Investigating the Effects of Large-Scale Pseudo-Stereo Data and Different Speech Foundation Model on Dialogue Generative Spoken Language Model (2024) 0.00
- SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words (2024) 11.32
- Property-Aware Multi-Speaker Data Simulation: A Probabilistic Modelling Technique for Synthetic Data Generation (2023) 6.34