A Discriminative Hierarchical PLDA-based Model for Spoken Language Recognition
2022 · Luciana Ferrer, Diego Castan, Mitchell McLaren, et al.
Abstract
Spoken language recognition (SLR) refers to the automatic process used to determine the language present in a speech sample. SLR is an important task in its own right, for example, as a tool to analyze or categorize large amounts of multilingual data. Further, it is also an essential tool for selecting downstream applications in a workflow, for example, to choose appropriate speech recognition or machine translation models. SLR systems are usually composed of two stages: one where an embedding representing the audio sample is extracted, and a second one that computes the final scores for each language. In this work, we approach the SLR task as a detection problem and implement the second stage as a probabilistic linear discriminant analysis (PLDA) model. We show that discriminative training of the PLDA parameters gives large gains with respect to the usual generative training. Further, we propose a novel hierarchical approach where two PLDA models are trained, one to generate scores f
Related papers
- A Generalized Framework For Domain Adaptation Of PLDA In Speaker Recognition (2020)
- Blind Score Normalization Method For PLDA Based Speaker Recognition (2016)
- Local Training For PLDA In Speaker Verification (2016)
- Multiobjective Optimization Training Of PLDA For Speaker Verification (2018)
- Discriminative Speech Recognition Rescoring With Pre-trained Language Models (2023)
- Probabilistic Spherical Discriminant Analysis: An Alternative To PLDA For Length-normalized Embeddings (2022)
- Generalized Domain Adaptation Framework For Parametric Back-end In Speaker Recognition (2023)
- Subspace-based Representation And Learning For Phonotactic Spoken Language Recognition (2022)