Common Voice
Canonical
45 papers using it
First seen: 2022
Papers using Common Voice (44)
- Whistle: Data-efficient Multilingual And Crosslingual Speech Recognition Via Weakly Phonetic Supervision
- GLOBE: A High-quality English Corpus With Global Accents For Zero-shot Speaker Adaptive Text-to-speech
- Pretraining Approaches For Spoken Language Recognition: Taltech Submission To The OLR 2021 Challenge
- Tranusr: Phoneme-to-word Transcoder Based Unified Speech Representation Learning For Cross-lingual Speech Recognition
- Some Voices Are Too Common: Building Fair Speech Recognition Systems Using The Common Voice Dataset
- A Comparative Analysis Of Bilingual And Trilingual Wav2vec Models For Automatic Speech Recognition In Multilingual Oral History Archives
- Swedish Whispers; Leveraging a Massive Speech Corpus for Swedish Speech Recognition
- FLEURS-Kobani: Extending the FLEURS Dataset for Northern Kurdish
- M-CIF: Multi-Scale Alignment For CIF-Based Non-Autoregressive ASR
- Scalable Controllable Accented TTS
- DeRAGEC: Denoising Named Entity Candidates with Synthetic Rationale for ASR Error Correction
- Robust Unsupervised Adaptation of a Speech Recogniser Using Entropy Minimisation and Speaker Codes
- CMU's IWSLT 2025 Simultaneous Speech Translation System
- Dysarthria Normalization Via Local Lie Group Transformations For Robust ASR
- Evaluation of LLMs in Speech is Often Flawed: Test Set Contamination in Large Language Models for Speech Recognition
- Dysarthria Normalization via Local Lie Group Transformations for Robust ASR
- An Exhaustive Evaluation of TTS- and VC-based Data Augmentation for ASR
- Whistle: Data-Efficient Multilingual and Crosslingual Speech Recognition via Weakly Phonetic Supervision
- Custom Data Augmentation For Low Resource ASR Using Bark And Retrieval-based Voice Conversion
- Low-resourced Speech Recognition For Iu Mien Language Via Weakly-supervised Phoneme-based Multilingual Pre-training
- Textless Speech-to-Speech Translation With Limited Parallel Data
- Custom Data Augmentation for low resource ASR using Bark and Retrieval-Based Voice Conversion
- GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement
- Speech Corpora Divergence Based Unsupervised Data Selection for ASR
- Some voices are too common: Building fair speech recognition systems using the Common Voice dataset
- XLSR-Transducer: Streaming ASR for Self-Supervised Pretrained Models
- Distilling a Pretrained Language Model to a Multilingual ASR Model
- ASR2K: Speech Recognition for Around 2000 Languages without Audio
- MeWEHV: Mel and Wave Embeddings for Human Voice Tasks
- Can we use Common Voice to train a Multi-Speaker TTS system?
- Iterative pseudo-forced alignment by acoustic CTC loss for self-supervised ASR domain adaptation
- Unsupervised ASR via Cross-Lingual Pseudo-Labeling
- TranUSR: Phoneme-to-word Transcoder Based Unified Speech Representation Learning for Cross-lingual Speech Recognition
- Connecting Speech Encoder and Large Language Model for ASR
- SSHR: Leveraging Self-supervised Hierarchical Representations for Multilingual Automatic Speech Recognition
- LUPET: Incorporating Hierarchical Information Path into Multilingual ASR
- GLOBE: A High-quality English Corpus with Global Accents for Zero-shot Speaker Adaptive Text-to-Speech
- Low-Resourced Speech Recognition for Iu Mien Language via Weakly-Supervised Phoneme-based Multilingual Pre-training
- Improving noisy student training for low-resource languages in End-to-End ASR using CycleGAN and inter-domain losses
- Large Language Model Should Understand Pinyin for Chinese ASR Error Correction
- Fast Streaming Transducer ASR Prototyping via Knowledge Distillation with Whisper
- Exploring Capabilities of Monolingual Audio Transformers using Large Datasets in Automatic Speech Recognition of Czech
- Indonesian Automatic Speech Recognition with XLSR-53
- A Comparative Analysis of Bilingual and Trilingual Wav2Vec Models for Automatic Speech Recognition in Multilingual Oral History Archives