Subspace-based Representation And Learning For Phonotactic Spoken Language Recognition
2022 · Hung-Shin Lee, Yu Tsao, Shyh-Kang Jeng, et al.
Abstract
Phonotactic constraints can be used to distinguish languages by representing a speech utterance as a multinomial distribution of phone events. In this study, we propose a new learning mechanism based on subspace-based representation, which can extract concealed phonotactic structures from utterances, for language verification and dialect/accent identification. The framework consists of two successive parts. The first part is subspace construction: it decodes each utterance into a sequence of phone-posterior vectors and transforms that sequence into a linear orthogonal subspace using low-rank matrix factorization or dynamic linear modeling. The second part is subspace learning based on kernel machines, such as support vector machines and the newly developed subspace-based neural networks (SNNs). The input layer of SNNs is specifically designed for samples represented by subspaces. The topology ensures that the same output
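
The subspace-construction step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a truncated SVD as the low-rank matrix factorization, uses random Dirichlet draws as a stand-in for real phone-posterior vectors, and compares subspaces with a projection kernel, which is one common choice for subspace inputs to kernel machines and is not named in the abstract. The function names `utterance_subspace` and `projection_kernel`, the phone-set size, and the rank are all hypothetical.

```python
import numpy as np


def utterance_subspace(posteriors: np.ndarray, rank: int) -> np.ndarray:
    """Map a (frames, num_phones) posteriorgram to an orthonormal basis.

    Assumption: truncated SVD stands in for the low-rank factorization.
    Returns a (num_phones, rank) matrix with orthonormal columns.
    """
    # Frames become columns, so the left singular vectors live in phone space.
    u, _, _ = np.linalg.svd(posteriors.T, full_matrices=False)
    return u[:, :rank]


def projection_kernel(basis_a: np.ndarray, basis_b: np.ndarray) -> float:
    """Similarity between two subspaces: squared Frobenius norm of A^T B.

    The value depends only on the spanned subspaces, not on the
    particular orthonormal bases chosen to represent them.
    """
    return float(np.linalg.norm(basis_a.T @ basis_b, "fro") ** 2)


rng = np.random.default_rng(0)
num_phones, rank = 40, 5  # hypothetical sizes

# Toy stand-ins for decoder output: each row is a posterior over phones.
post_a = rng.dirichlet(np.ones(num_phones), size=200)
post_b = rng.dirichlet(np.ones(num_phones), size=150)

U_a = utterance_subspace(post_a, rank)
U_b = utterance_subspace(post_b, rank)

k_ab = projection_kernel(U_a, U_b)  # cross-utterance similarity
k_aa = projection_kernel(U_a, U_a)  # equals the rank for identical subspaces
```

Utterances of different lengths (200 and 150 frames here) map to bases of the same shape, so a kernel matrix over such values could be passed to a kernel machine such as an SVM.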
Related papers
- Exploiting Cross-lingual Speaker And Phonetic Diversity For Unsupervised Subword Modeling (2019) 6.77
- Self-supervised Predictive Coding Models Encode Speaker And Phonetic Information In Orthogonal Subspaces (2023) 7.16
- Unsupervised Acoustic Unit Discovery By Leveraging A Language-independent Subword Discriminative Feature Representation (2021) 5.84
- A Discriminative Hierarchical Plda-based Model For Spoken Language Recognition (2022) 5.24
- Unsupervised Representation Learning Of Speech For Dialect Identification (2018) 7.16
- Learning Invariant Representation And Risk Minimized For Unsupervised Accent Domain Adaptation (2022) 2.26
- On Structured Sparsity Of Phonological Posteriors For Linguistic Parsing (2016) 5.24
- Learning Utterance-level Representations Through Token-level Acoustic Latents Prediction For Expressive Speech Synthesis (2022) 0.00