On Structured Sparsity Of Phonological Posteriors For Linguistic Parsing
2016 · Milos Cernak, Afsaneh Asaei, Hervé Bourlard
Abstract
The speech signal conveys information on different time scales, from the short (segmental) time scale, associated with phonological and phonetic information, to the long (supra-segmental) time scale, associated with syllabic and prosodic information. Linguistic and neurocognitive studies recognize the phonological classes at the segmental level as the essential and invariant representations used in speech temporal organization. In the context of speech processing, a deep neural network (DNN) is an effective computational method to infer the probability of individual phonological classes from a short segment of the speech signal. A vector of all phonological class probabilities is referred to as a phonological posterior. Only a few classes are present in a short-term speech signal; hence, the phonological posterior is a sparse vector. Although the phonological posteriors are estimated at the segmental level, we claim that they convey supra-segmental information. Specifically, we demonstrate that phonolo
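To make the notion of a sparse phonological posterior concrete, here is a minimal Python sketch. The class names, the probability values, and the 0.5 threshold are all illustrative assumptions and are not taken from the paper. The sketch also assumes each entry is an independent per-class probability, so the vector need not sum to one.

```python
# Hypothetical phonological classes and per-class probabilities for one
# short speech segment (illustrative values, not results from the paper).
CLASSES = ["vowel", "nasal", "fricative", "voiced",
           "labial", "high", "round", "silence"]
posterior = [0.02, 0.91, 0.01, 0.88, 0.79, 0.03, 0.04, 0.01]


def support(probs, threshold=0.5):
    """Return a binary pattern marking entries above the (assumed) threshold."""
    return [1 if p > threshold else 0 for p in probs]


# Binary pattern of active classes for this segment.
pattern = support(posterior)

# Names of the classes considered active.
active = [name for name, bit in zip(CLASSES, pattern) if bit]

# Fraction of classes that are inactive: a simple sparsity measure.
sparsity = 1 - sum(pattern) / len(pattern)

print(pattern, active, sparsity)
```

Only three of the eight hypothetical classes are active here, which is the sense in which the abstract calls the posterior sparse. The binary pattern is one simple way to expose which classes co-occur in a segment.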
Related papers
- Subspace-based Representation And Learning For Phonotactic Spoken Language Recognition (2022) · 0.00
- Learning-based A Posteriori Speech Presence Probability Estimation And Applications (2025) · 0.00
- Analyzing Analytical Methods: The Case Of Phonology In Neural Models Of Spoken Language (2020) · 6.77
- Low-rank And Sparse Soft Targets To Learn Better DNN Acoustic Models (2016) · 3.58
- Composition Of Deep And Spiking Neural Networks For Very Low Bit Rate Speech Coding (2016) · 9.92
- Parsing Speech: A Neural Approach To Integrating Lexical And Acoustic-prosodic Information (2017) · 8.60
- Non-verbal Information In Spontaneous Speech -- Towards A New Framework Of Analysis (2024) · 0.00
- Feature Joint-state Posterior Estimation In Factorial Speech Processing Models Using Deep Neural Networks (2017) · 3.58