LSTM-TDNN With Convolutional Front-end For Dialect Identification In The 2019 Multi-genre Broadcast Challenge
2019 · Xiaoxiao Miao, Ian McLoughlin
Abstract
This paper presents a novel Dialect Identification (DID) system developed for the Fifth Edition of the Multi-Genre Broadcast challenge, the task of Fine-grained Arabic Dialect Identification (MGB-5 ADI Challenge). The system improves upon traditional DNN x-vector performance by employing a Convolutional and Long Short Term Memory-Recurrent (CLSTM) architecture, which combines a convolutional neural network front-end for feature extraction with a recurrent neural network back-end that captures longer temporal dependencies. Furthermore, we investigate intensive augmentation of one low-resource dialect in the highly unbalanced training set using time-scale modification (TSM). This converts an utterance to several time-stretched or time-compressed versions, which are subsequently used to train the CLSTM system without using any other corpus. In this paper, we also investigate speech augmentation using MUSAN and the RIR datasets to increase the quantity and diversity of the existing training data in th
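The abstract does not say which TSM algorithm the authors used, so the following is only a minimal sketch of the general idea, assuming a plain overlap-add (OLA) scheme in NumPy. The function names, frame length, hop size, and stretch rates are all hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np


def ola_time_stretch(signal, rate, frame_len=1024, synth_hop=256):
    """Naive overlap-add time-scale modification (illustrative only).

    rate > 1 compresses the signal in time (shorter output);
    rate < 1 stretches it (longer output).
    All parameter defaults are assumptions, not values from the paper.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < frame_len:
        # Zero-pad very short inputs so at least one frame fits.
        signal = np.pad(signal, (0, frame_len - len(signal)))

    window = np.hanning(frame_len)
    analysis_hop = synth_hop * rate  # frames are read at this (fractional) hop
    n_frames = int((len(signal) - frame_len) / analysis_hop) + 1
    out_len = (n_frames - 1) * synth_hop + frame_len

    out = np.zeros(out_len)
    norm = np.zeros(out_len)
    for i in range(n_frames):
        a = int(round(i * analysis_hop))  # read position in the input
        s = i * synth_hop                 # write position in the output
        out[s:s + frame_len] += signal[a:a + frame_len] * window
        norm[s:s + frame_len] += window

    # Avoid division by zero where the window sum vanishes (the two edges).
    norm[norm < 1e-8] = 1.0
    return out / norm


def augment_with_tsm(signal, rates=(0.8, 0.9, 1.1, 1.2)):
    """Return one time-scaled copy of `signal` per rate (hypothetical rates)."""
    return {rate: ola_time_stretch(signal, rate) for rate in rates}


if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    utterance = np.sin(2 * np.pi * 220.0 * t)  # 1 s synthetic stand-in for speech
    for rate, version in augment_with_tsm(utterance).items():
        print(rate, len(version))
```

Plain OLA ignores waveform alignment between frames, so it tends to produce audible artifacts on real speech; alignment-based methods such as WSOLA or a phase vocoder are the usual practical choices. The sketch only shows how a single utterance can yield several stretched and compressed training copies.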
Related papers
- Convolutional Neural Networks And Language Embeddings For End-to-end Dialect Recognition (2018)
- MIT-QCRI Arabic Dialect Identification System For The 2017 Multi-genre Broadcast Challenge (2017)
- UTD-CRSS Submission For MGB-3 Arabic Dialect Identification: Front-end And Back-end Advancements On Broadcast Speech (2017)
- Transformer-based Arabic Dialect Identification (2020)
- A Deep Learning Approach For Similar Languages, Varieties And Dialects (2019)
- Hybrid Deep Learning And Signal Processing For Arabic Dialect Recognition In Low-resource Settings (2025)
- Low-resource Speech Recognition And Dialect Identification Of Irish In A Multi-task Framework (2024)
- The MGB-2 Challenge: Arabic Multi-dialect Broadcast Media Recognition (2016)