Adversarial Joint Training With Self-attention Mechanism For Robust End-to-end Speech Recognition
2021 · Lujun Li, Yikai Kang, Yuchen Shi, et al.
Abstract
The self-attention mechanism has recently marked a new milestone in automatic speech recognition (ASR). Nevertheless, its performance is susceptible to environmental interference, because the system predicts each output symbol from the full input sequence and its previous predictions. Inspired by the extensive use of generative adversarial networks (GANs) in speech enhancement and ASR, we propose an adversarial joint training framework with the self-attention mechanism to boost the noise robustness of the ASR system. It consists of a self-attention speech enhancement GAN and a self-attention end-to-end ASR model. Two features of the framework are worth noting. First, it benefits from advances in both the self-attention mechanism and GANs. Second, the discriminator of the GAN acts as the global discriminant network during adversarial joint training, which guides the enhan
Related papers
- Boosting Noise Robustness Of Acoustic Model Via Deep Adversarial Training (2018) 9.23
- Fine-tuning Of Pre-trained End-to-end Speech Recognition With Generative Adversarial Networks (2021) 5.84
- Robust Speech Recognition Using Generative Adversarial Networks (2017) 11.29
- Exploring Speech Enhancement With Generative Adversarial Networks For Robust Speech Recognition (2017) 16.14
- Self-attention Generative Adversarial Network For Speech Enhancement (2020) 11.85
- Channel-aware Domain-adaptive Generative Adversarial Network For Robust Speech Recognition (2024) 4.52
- Gated Recurrent Fusion With Joint Training Framework For Robust End-to-end Speech Recognition (2020) 14.55
- Efficient Acoustic Feature Transformation In Mismatched Environments Using A Guided-GAN (2022) 2.26