Homophone-based Label Smoothing In End-to-end Automatic Speech Recognition
2020 · Yi Zheng, Xianjie Yang, Xuyong Dang
Abstract
This paper proposes a new label smoothing method for automatic speech recognition (ASR) that exploits homophones, a form of prior linguistic knowledge available to human speakers. Compared with its forerunners, the proposed method uses the pronunciation knowledge carried by homophones in a more complex way. The method requires an end-to-end ASR model that learns the acoustic model and the language model jointly and that uses characters as modelling units. Experiments with a hybrid CTC sequence-to-sequence model show that the new method reduces character error rate (CER) by 0.4% absolute.
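The abstract does not give the smoothing formula, so the following is only a minimal sketch of one plausible reading: the probability mass that ordinary label smoothing spreads uniformly is given to the homophones of the target character instead. The function name `homophone_smoothed_target`, the `epsilon` value, the one-hot fallback, and the toy Mandarin homophone table are all assumptions made for illustration, not details taken from the paper.

```python
from typing import Dict, List


def homophone_smoothed_target(
    target: str,
    vocab: List[str],
    homophones: Dict[str, List[str]],
    epsilon: float = 0.1,
) -> List[float]:
    """Build a smoothed target distribution over `vocab` for one character.

    Assumed scheme (not confirmed by the source): the target keeps
    1 - epsilon, and epsilon is split evenly among its homophones.
    """
    # Homophones of the target that are actually in the output vocabulary.
    peers = [c for c in homophones.get(target, []) if c in vocab and c != target]

    dist = [0.0] * len(vocab)
    target_idx = vocab.index(target)

    if not peers:
        # Assumption: with no known homophones, keep the plain one-hot label.
        dist[target_idx] = 1.0
        return dist

    dist[target_idx] = 1.0 - epsilon
    share = epsilon / len(peers)
    for c in peers:
        dist[vocab.index(c)] = share
    return dist


# Toy example: three characters that share the pronunciation "ta" (first tone)
# in Mandarin, plus one unrelated character.
vocab = ["他", "她", "它", "好"]
homophones = {"他": ["她", "它"]}
dist = homophone_smoothed_target("他", vocab, homophones, epsilon=0.1)
```

In this sketch the unrelated character receives zero probability, whereas uniform label smoothing would give every non-target character the same small share. The resulting distribution would stand in for the one-hot label in the cross-entropy term of a character-level end-to-end model.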
Related papers
- Right Label Context In End-to-end Training Of Time-synchronous ASR Models (2025)
- Multiple-hypothesis CTC-based Semi-supervised Adaptation Of End-to-end Speech Recognition (2021)
- Combining Frame-synchronous And Label-synchronous Systems For Speech Recognition (2021)
- Adaptive Frequency Cepstral Coefficients For Word Mispronunciation Detection (2016)
- Audio-attention Discriminative Language Model For ASR Rescoring (2019)
- Alternating Weak Triphone/BPE Alignment Supervision From Hybrid Model Improves End-to-end ASR (2024)
- An Improved Hybrid CTC-attention Model For Speech Recognition (2018)
- SlimIPL: Language-model-free Iterative Pseudo-labeling (2020)