Spoken Term Detection Methods For Sparse Transcription In Very Low-resource Settings
2021 · Éric Le Ferrand, Steven Bird, Laurent Besacier
Abstract
We investigate the efficiency of two very different spoken term detection approaches for transcription when the available data is insufficient to train a robust ASR system. This work is grounded in a very low-resource language documentation scenario where only a few minutes of recording have been transcribed for a given language so far. Experiments on two oral languages show that a pretrained universal phone recognizer, fine-tuned with only a few minutes of target language speech, can be used for spoken term detection with better overall performance than a dynamic time warping approach. In addition, we show that representing phoneme recognition ambiguity in a graph structure can further boost recall while maintaining high precision in the low-resource spoken term detection task.
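For readers unfamiliar with the dynamic time warping baseline mentioned above, the following is a minimal sketch of the textbook DTW cost between a spoken query and a candidate audio segment. It is not the authors' implementation: the function name `dtw_distance`, the Euclidean frame distance, and the toy one-dimensional "features" are all assumptions chosen for illustration. A real system would compare acoustic feature vectors (for example MFCCs) and slide the query over a longer recording.

```python
import math


def dtw_distance(query, segment):
    """Cumulative DTW cost between two sequences of feature vectors.

    Illustrative only: uses Euclidean frame distance and the standard
    three-way step pattern (insertion, deletion, match).
    """
    n, m = len(query), len(segment)
    inf = float("inf")
    # cost[i][j] = best cost of aligning query[:i] with segment[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(query[i - 1], segment[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # query frame repeated
                cost[i][j - 1],      # segment frame repeated
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Toy one-dimensional frames (hypothetical data, not from the paper).
query = [(0.0,), (1.0,), (2.0,)]
stretched = [(0.0,), (0.0,), (1.0,), (1.0,), (2.0,)]  # same term, spoken slower
other = [(5.0,), (5.0,), (5.0,)]                      # unrelated audio

print(dtw_distance(query, stretched))
print(dtw_distance(query, other))
```

In this sketch the slower rendition of the query aligns at zero cost, while the unrelated segment accumulates a large cost; a detector would report a hit when the cost falls below a tuned threshold.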
Related papers
- Cross-lingual And Multilingual Spoken Term Detection For Low-resource Indian Languages (2020)
- A Nonparametric Bayesian Approach For Spoken Term Detection By Example Query (2016)
- Leveraging Translations For Speech Transcription In Low-resource Settings (2018)
- Pretraining By Backtranslation For End-to-end ASR In Low-resource Settings (2018)
- Domain Robust Feature Extraction For Rapid Low Resource ASR Development (2018)
- Towards Speech-to-text Translation Without Speech Recognition (2017)
- Almost-unsupervised Speech Recognition With Close-to-zero Resource Based On Phonetic Structures Learned From Very Small Unpaired Speech And Text Data (2018)
- Exploring End-to-end Techniques For Low-resource Speech Recognition (2018)