DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech
2022 · Keon Lee, Kyumin Park, Daeyoung Kim
Abstract
The majority of current Text-to-Speech (TTS) datasets, which are collections of individual utterances, contain few conversational aspects. In this paper, we introduce DailyTalk, a high-quality conversational speech dataset designed for conversational TTS. We sampled, modified, and recorded 2,541 dialogues from the open-domain dialogue dataset DailyDialog, inheriting its annotated attributes. On top of our dataset, we extend prior work as our baseline, in which a non-autoregressive TTS model is conditioned on historical information in a dialogue. From the baseline experiment with both general and our novel metrics, we show that DailyTalk can be used as a general TTS dataset and, beyond that, that our baseline can represent contextual information from DailyTalk. The DailyTalk dataset and baseline code are freely available for academic use under the CC-BY-SA 4.0 license.
Related papers
- SpokenWOZ: A Large-Scale Speech-Text Benchmark for Spoken Task-Oriented Dialogue Agents (2023)
- SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words (2024)
- SpeechDialogueFactory: Generating High-Quality Speech Dialogue Data to Accelerate Your Speech-LLM Development (2025)
- DialogueAgents: A Hybrid Agent-Based Speech Synthesis Framework for Multi-Party Dialogue (2025)
- An Automated End-to-End Open-Source Software for High-Quality Text-to-Speech Dataset Generation (2024)
- The People's Speech: A Large-Scale Diverse English Speech Recognition Dataset for Commercial Usage (2021)
- TDASS: Target Domain Adaptation Speech Synthesis Framework for Multi-Speaker Low-Resource TTS (2022)
- Text-to-Speech Synthesis in the Wild (2024)