End To End Hindi To English Speech Conversion Using Bark, mBART And A Finetuned XLSR Wav2Vec2
2024 · Aniket Tathe, Anand Kamble, Suyash Kumbharkar, et al.
Abstract
Speech has long been a barrier to effective communication and connection, persisting as a challenge in our increasingly interconnected world. This research paper introduces a solution to this persistent obstacle: an end-to-end speech conversion framework tailored for Hindi-to-English translation, culminating in the synthesis of English audio. By integrating XLSR Wav2Vec2 for automatic speech recognition (ASR), mBART for neural machine translation (NMT), and a Text-to-Speech (TTS) synthesis component, this framework offers a unified approach to cross-lingual communication. We describe each component in detail, explaining its individual contribution and how the components combine to enable a fluid transition from spoken Hindi to synthesized English audio.
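The three-stage cascade described in the abstract (ASR, then NMT, then TTS) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `SpeechConversionPipeline`, the `Audio` type alias, and the three toy stage functions are hypothetical placeholders. In a real system the stages would wrap a fine-tuned XLSR Wav2Vec2 model, an mBART model, and a TTS model such as Bark (named in the title).

```python
from dataclasses import dataclass
from typing import Callable, List

# Placeholder waveform type (assumption): a flat list of float samples.
Audio = List[float]


@dataclass
class SpeechConversionPipeline:
    """Hypothetical cascade: Hindi audio -> Hindi text -> English text -> English audio."""
    asr: Callable[[Audio], str]        # e.g. a wrapper around XLSR Wav2Vec2
    translate: Callable[[str], str]    # e.g. a wrapper around mBART
    tts: Callable[[str], Audio]        # e.g. a wrapper around a TTS model

    def __call__(self, hindi_audio: Audio) -> Audio:
        hindi_text = self.asr(hindi_audio)          # stage 1: speech recognition
        english_text = self.translate(hindi_text)   # stage 2: machine translation
        return self.tts(english_text)               # stage 3: speech synthesis


# Toy stand-ins so the sketch runs without any model downloads.
def fake_asr(audio: Audio) -> str:
    return "नमस्ते"


def fake_translate(text: str) -> str:
    return {"नमस्ते": "hello"}.get(text, text)


def fake_tts(text: str) -> Audio:
    # Encodes each character as a float; a real TTS would return a waveform.
    return [float(ord(ch)) for ch in text]


pipeline = SpeechConversionPipeline(fake_asr, fake_translate, fake_tts)
english_audio = pipeline([0.0, 0.1, -0.1])
```

Keeping each stage behind a plain callable means any one model can be swapped or evaluated in isolation, at the cost that recognition errors propagate into translation and synthesis.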
Related papers
- Transcription And Translation Of Videos Using Fine-tuned XLSR Wav2Vec2 On Custom Dataset And mBART (2024)
- Custom Data Augmentation For Low Resource ASR Using Bark And Retrieval-based Voice Conversion (2023)
- End-to-end ASR For Code-switched Hindi-English Speech (2019)
- Towards Developing State-of-the-art TTS Synthesisers For 13 Indian Languages With Signal Processing Aided Alignments (2022)
- Rapid Speaker Adaptation In Low Resource Text To Speech Systems Using Synthetic Data And Transfer Learning (2023)
- ELAICHI: Enhancing Low-resource TTS By Addressing Infrequent And Low-frequency Character Bigrams (2024)
- Attention Based End To End Speech Recognition For Voice Search In Hindi And English (2021)
- An Automatic Speech Recognition System For Bengali Language Based On Wav2Vec2 And Transfer Learning (2022)