Massively Multilingual Neural Grapheme-to-phoneme Conversion
2017 · Ben Peters, Jon Dehdari, Josef van Genabith
Abstract
Grapheme-to-phoneme conversion (g2p) is necessary for text-to-speech and automatic speech recognition systems. Most g2p systems are monolingual: they require language-specific data or handcrafted rules. Such systems are difficult to extend to low-resource languages, for which data and handcrafted rules are not available. As an alternative, we present a neural sequence-to-sequence approach to g2p which is trained on spelling–pronunciation pairs in hundreds of languages. The system shares a single encoder and decoder across all languages, allowing it to utilize the intrinsic similarities between different writing systems. We show an 11% improvement in phoneme error rate over an approach based on adapting high-resource monolingual g2p models to low-resource languages. Our model is also much more compact than previous approaches.
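The abstract reports results in phoneme error rate (PER) without defining it. A minimal sketch of one common definition follows, assuming PER is the total Levenshtein distance between predicted and reference phoneme sequences divided by the total number of reference phonemes. The function names `edit_distance` and `phoneme_error_rate` and the sample words are illustrative and do not come from the paper.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs: Sequence[Sequence[str]],
                       hyps: Sequence[Sequence[str]]) -> float:
    """Corpus-level PER: total edits over total reference phonemes (assumed definition)."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_len = sum(len(r) for r in refs)
    return total_edits / total_len


# Hypothetical reference and predicted pronunciations, one phoneme per list item.
refs = [["k", "æ", "t"], ["d", "ɔ", "g"]]
hyps = [["k", "æ", "t"], ["d", "o", "g", "z"]]
per = phoneme_error_rate(refs, hyps)
```

Here the second prediction has one substitution and one insertion, so the sketch counts 2 edits against 6 reference phonemes. Lower values are better.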
Related papers
- One Model To Pronounce Them All: Multilingual Grapheme-to-phoneme Conversion With A Transformer Ensemble (2020)
- LiteG2P: A Fast, Light And High Accuracy Model For Grapheme-to-phoneme Conversion (2023)
- Data-driven Grapheme-to-phoneme Representations For A Lexicon-free Text-to-speech (2024)
- R-g2p: Evaluating And Enhancing Robustness Of Grapheme To Phoneme Conversion By Controlled Noise Introducing And Contextual Information Incorporation (2022)
- Transformer Based Grapheme-to-phoneme Conversion (2020)
- Improving Grapheme-to-phoneme Conversion Through In-context Knowledge Retrieval With Large Language Models (2024)
- Token-level Ensemble Distillation For Grapheme-to-phoneme Conversion (2019)
- G2G: TTS-driven Pronunciation Learning For Graphemic Hybrid ASR (2019)