BERT-LID: Leveraging BERT To Improve Spoken Language Identification
2022 · Yuting Nie, Junhong Zhao, Wei-Qiang Zhang, et al.
Abstract
Language identification is the task of automatically determining the language conveyed by a spoken segment. It has a profound impact on the multilingual interoperability of intelligent speech systems. Although language identification attains high accuracy on medium or long utterances (>3 s), performance on short utterances (<=1 s) is still far from satisfactory. We propose a BERT-based language identification system (BERT-LID) to improve language identification performance, especially on short-duration speech segments. We extend the original BERT model by taking as input the phonetic posteriorgrams (PPG) derived from a front-end phone recognizer. The extended BERT is then followed by the optimal deep classifier for language identification. Our BERT-LID model improves the baseline accuracy by about 6.5% on long-segment identification and 19.9% on short-segment identification, demonstrating its effectiveness for language identification.
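The data flow described in the abstract (phone recognizer, then PPG, then encoder, then classifier) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a PPG is a per-frame probability distribution over phones, replaces BERT's token-embedding lookup with a linear projection of each PPG frame, and uses plain mean pooling as a stand-in for the BERT encoder stack. All sizes, weights, and function names (`make_ppg`, `identify_language`) are hypothetical, and the weights are random rather than trained.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def make_ppg(frame_logits):
    """Turn per-frame phone logits into a PPG: one probability row per frame."""
    return [softmax(frame) for frame in frame_logits]


def matvec(vec, weight):
    """Multiply a row vector of length n by an n x m weight matrix."""
    cols = len(weight[0])
    return [sum(v * row[j] for v, row in zip(vec, weight)) for j in range(cols)]


def random_matrix(rows, cols, rng):
    """Small random weights; a real system would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def identify_language(ppg, w_embed, w_cls):
    """Return language probabilities for one utterance given its PPG."""
    # Project each PPG frame to the model dimension (assumption: this
    # takes the place of BERT's token-embedding lookup).
    embedded = [matvec(frame, w_embed) for frame in ppg]
    # Placeholder for the BERT encoder: average the frame vectors.
    pooled = [sum(col) / len(embedded) for col in zip(*embedded)]
    # Linear classifier head followed by softmax over languages.
    return softmax(matvec(pooled, w_cls))


# Hypothetical sizes: 40 phones, 16 hidden units, 5 languages, 100 frames.
NUM_PHONES, HIDDEN, NUM_LANGS, NUM_FRAMES = 40, 16, 5, 100

rng = random.Random(0)
frame_logits = random_matrix(NUM_FRAMES, NUM_PHONES, rng)  # fake recognizer output
ppg = make_ppg(frame_logits)
probs = identify_language(
    ppg,
    random_matrix(NUM_PHONES, HIDDEN, rng),
    random_matrix(HIDDEN, NUM_LANGS, rng),
)
predicted = max(range(NUM_LANGS), key=probs.__getitem__)
print("predicted language index:", predicted)
```

Because the PPG is a sequence of continuous vectors rather than discrete token IDs, some projection into the model's hidden size is needed before an encoder can consume it. The linear map here is one simple choice under that assumption.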
Related papers
- BERTphone: Phonetically-aware Encoder Representations For Utterance-level Speaker And Language Recognition (2019)
- Phonetic Temporal Neural Model For Language Identification (2017)
- ST-BERT: Cross-modal Language Model Pre-training For End-to-end Spoken Language Understanding (2020)
- Towards Relevance And Sequence Modeling In Language Recognition (2020)
- A Compact End-to-end Model With Local And Global Context For Spoken Language Identification (2022)
- WaBERT: A Low-resource End-to-end Model For Spoken Language Understanding And Speech-to-BERT Alignment (2022)
- What BERT Based Language Models Learn In Spoken Transcripts: An Empirical Study (2021)
- Language Identification With Deep Bottleneck Features (2018)