Transformer-based Arabic Dialect Identification
2020 · Wanqiu Lin, Maulik Madhavi, Rohan Kumar Das, et al.
Abstract
This paper presents a dialect identification (DID) system based on the transformer neural network architecture. Conventional convolutional neural network (CNN)-based systems use relatively short receptive fields. We believe that long-range information is equally important for language and dialect identification, and the self-attention mechanism in the transformer captures these long-range dependencies. In addition, to reduce computational complexity, self-attention with downsampling is used to process the acoustic features. This process extracts sparse yet informative features. Our experimental results show that the transformer outperforms CNN-based networks on the Arabic dialect identification (ADI) dataset. We also report that score-level fusion of the CNN and transformer-based systems obtains an overall accuracy of 86.29% on the ADI17 database.
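To make the two mechanisms named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes single-head scaled dot-product self-attention, downsampling by keeping every second frame after each attention layer, and score-level fusion as a weighted average of class posteriors. The layer count, dimensions, fusion weight, random weights, and the stand-in CNN scores are all hypothetical; the abstract does not specify them.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over (T, d) frames."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # (T, T) score matrix: every frame can attend to every other frame,
    # which is what gives access to long-range context.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores, axis=-1) @ v

def downsample(x, factor=2):
    """Assumed scheme: keep every `factor`-th frame to shorten the sequence."""
    return x[::factor]

def fuse_scores(p_a, p_b, weight=0.5):
    """Score-level fusion as a weighted average (weight is hypothetical)."""
    return weight * p_a + (1.0 - weight) * p_b

rng = np.random.default_rng(0)
T, d = 200, 16          # hypothetical: 200 acoustic frames, 16-dim features
n_dialects = 17         # assumption based on the ADI17 name

x = rng.standard_normal((T, d))
for _ in range(2):      # two attention + downsampling stages: 200 -> 100 -> 50
    w_q, w_k, w_v = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    x = downsample(self_attention(x, w_q, w_k, w_v), factor=2)

# Mean-pool over time, then project to dialect posteriors.
w_out = rng.standard_normal((d, n_dialects))
p_transformer = softmax(x.mean(axis=0) @ w_out)

# Stand-in for the posteriors of a separately trained CNN system.
p_cnn = softmax(rng.standard_normal(n_dialects))

p_fused = fuse_scores(p_transformer, p_cnn)
predicted = int(np.argmax(p_fused))
```

Because attention cost grows with the square of the sequence length, halving the number of frames after each stage cuts the cost of the next attention layer by roughly a factor of four, which is one plausible reading of how downsampling reduces computational complexity here.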
Related papers
- Convolutional Neural Networks And Language Embeddings For End-to-end Dialect Recognition (2018)
- LSTM-TDNN With Convolutional Front-end For Dialect Identification In The 2019 Multi-genre Broadcast Challenge (2019)
- Hybrid Deep Learning And Signal Processing For Arabic Dialect Recognition In Low-resource Settings (2025)
- Target Speaker Voice Activity Detection With Transformers And Its Integration With End-to-end Neural Diarization (2022)
- MIT-QCRI Arabic Dialect Identification System For The 2017 Multi-genre Broadcast Challenge (2017)
- Transformer Attractors For Robust And Efficient End-to-end Neural Diarization (2023)
- Efficient Arabic Emotion Recognition Using Deep Neural Networks (2020)
- UTD-CRSS Submission For MGB-3 Arabic Dialect Identification: Front-end And Back-end Advancements On Broadcast Speech (2017)