Unsupervised Neural And Bayesian Models For Zero-resource Speech Processing
2017 · Herman Kamper
Abstract
In settings where only unlabelled speech data is available, zero-resource speech technology needs to be developed without transcriptions, pronunciation dictionaries, or language modelling text. There are two central problems in zero-resource speech processing: (i) finding frame-level feature representations which make it easier to discriminate between linguistic units (phones or words), and (ii) segmenting and clustering unlabelled speech into meaningful units. In this thesis, we argue that a combination of top-down and bottom-up modelling is advantageous in tackling these two problems. To address the problem of frame-level representation learning, we present the correspondence autoencoder (cAE), a neural network trained with weak top-down supervision from an unsupervised term discovery system. By combining this top-down supervision with unsupervised bottom-up initialization, the cAE yields much more discriminative features than previous approaches. We then present our unsupervised s
Related papers
- Improving Unsupervised Subword Modeling Via Disentangled Speech Representation Learning And Transformation (2019)
- Multilingual And Unsupervised Subword Modeling For Zero-resource Languages (2018)
- Self-supervised Language Learning From Raw Audio: Lessons From The Zero Resource Speech Challenge (2022)
- Unsupervised Feature Learning For Speech Using Correspondence And Siamese Networks (2020)
- Truly Unsupervised Acoustic Word Embeddings Using Weak Top-down Constraints In Encoder-decoder Models (2018)
- The Zero Resource Speech Benchmark 2021: Metrics And Baselines For Unsupervised Spoken Language Modeling (2020)
- Almost-unsupervised Speech Recognition With Close-to-zero Resource Based On Phonetic Structures Learned From Very Small Unpaired Speech And Text Data (2018)
- Multilingual Acoustic Word Embedding Models For Processing Zero-resource Languages (2020)