Mandarin Tone Modeling Using Recurrent Neural Networks
2017 · Hao Huang, Ying Hu, Haihua Xu
Abstract
We propose an Encoder-Classifier framework to model Mandarin tones using recurrent neural networks (RNN). In this framework, the frames of features extracted for tone classification are fed into the RNN and cast into a fixed-dimensional vector (the tone embedding), which is then classified into tone types by a softmax layer along with other auxiliary inputs. We investigate various configurations that help to improve the model, including pooling, feature splicing, and the use of syllable-level tone embeddings. In addition, the tone embeddings and durations of the contextual syllables are exploited to facilitate tone classification. Experimental results on Mandarin tone classification show that the proposed network setups improve tone classification accuracy. The results indicate that the RNN encoder-classifier based tone model flexibly accommodates heterogeneous inputs (sequential and segmental) and hence has the advantages of both the sequential classification tone models and segmental classificat
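The encoder-classifier idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain tanh RNN (the paper may use a gated cell), mean pooling over time, a single duration value as the auxiliary input, and five tone classes (four lexical tones plus the neutral tone). All names, dimensions, and the random, untrained weights are hypothetical and chosen only to show how variable-length frame sequences become one fixed-size tone embedding before the softmax layer.

```python
import math
import random


def init_matrix(rows, cols, rng, scale=0.1):
    """Small random weight matrix stored as a list of rows (untrained)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def encode(frames, w_in, w_rec, pooling="mean"):
    """Run a tanh RNN over a non-empty frame sequence and pool the hidden
    states into one fixed-size vector (the tone embedding)."""
    hidden = [0.0] * len(w_rec)
    states = []
    for frame in frames:
        pre = [a + b for a, b in zip(matvec(w_in, frame), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
        states.append(hidden)
    if pooling == "last":
        return states[-1]
    # Assumed default: mean pooling over all time steps.
    return [sum(column) / len(states) for column in zip(*states)]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(frames, aux, params):
    """Concatenate the tone embedding with auxiliary inputs, then softmax."""
    embedding = encode(frames, params["w_in"], params["w_rec"])
    features = embedding + list(aux)
    return softmax(matvec(params["w_out"], features))


# Hypothetical sizes: 3 features per frame, 8 hidden units,
# 1 auxiliary input (duration in seconds), 5 tone classes.
FEAT_DIM, HIDDEN_DIM, AUX_DIM, NUM_TONES = 3, 8, 1, 5
rng = random.Random(0)
params = {
    "w_in": init_matrix(HIDDEN_DIM, FEAT_DIM, rng),
    "w_rec": init_matrix(HIDDEN_DIM, HIDDEN_DIM, rng),
    "w_out": init_matrix(NUM_TONES, HIDDEN_DIM + AUX_DIM, rng),
}

# Two synthetic syllables of different lengths (12 and 30 frames).
short_syllable = [[rng.random() for _ in range(FEAT_DIM)] for _ in range(12)]
long_syllable = [[rng.random() for _ in range(FEAT_DIM)] for _ in range(30)]

probs_short = classify(short_syllable, [0.12], params)
probs_long = classify(long_syllable, [0.30], params)
```

Both syllables yield an embedding of the same length regardless of how many frames they contain, which is what lets segmental inputs such as duration be appended before classification. Contextual information could be added the same way, by concatenating the embeddings and durations of neighboring syllables to `features`.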
Related papers
- End-to-end Mandarin Tone Classification With Short Term Context Information (2021)
- Research On Modeling Units Of Transformer Transducer For Mandarin Speech Recognition (2020)
- Cascade RNN-Transducer: Syllable Based Streaming On-device Mandarin Speech Recognition With A Syllable-to-character Converter (2020)
- Generating Mandarin And Cantonese F0 Contours With Decision Trees And BLSTMs (2018)
- Frame Stacking And Retaining For Recurrent Neural Network Acoustic Model (2017)
- Pronunciation-aware Unique Character Encoding For RNN Transducer-based Mandarin Speech Recognition (2022)
- ToneUnit: A Speech Discretization Approach For Tonal Language Speech Synthesis (2024)
- Investigation Of Deep Neural Network Acoustic Modelling Approaches For Low Resource Accented Mandarin Speech Recognition (2022)