T-GSA: Transformer With Gaussian-weighted Self-attention For Speech Enhancement
2019 · Jaeyoung Kim, Mostafa El-Khamy, Jungwon Lee
Abstract
Transformer neural networks (TNNs) have demonstrated state-of-the-art performance on many natural language processing (NLP) tasks, replacing recurrent neural networks (RNNs) such as LSTMs and GRUs. However, TNNs have not performed well in speech enhancement, whose contextual nature differs from that of NLP tasks such as machine translation. Self-attention is a core building block of the Transformer: it not only enables parallelization of sequence computation, but also provides a constant path length between symbols, which is essential to learning long-range dependencies. In this paper, we propose a Transformer with Gaussian-weighted self-attention (T-GSA), whose attention weights are attenuated according to the distance between target and context symbols. The experimental results show that the proposed T-GSA significantly improves speech-enhancement performance compared to the Transformer and RNNs.
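The idea of attenuating attention weights by distance can be illustrated with a minimal NumPy sketch. This is not the paper's exact formulation: here a Gaussian penalty on the frame distance is subtracted from the attention logits, which is equivalent to multiplying the softmax weights by a Gaussian window and renormalizing. The function name `gaussian_weighted_attention` and the fixed width `sigma` are assumptions made for illustration only.

```python
import numpy as np

def gaussian_weighted_attention(q, k, v, sigma):
    """Single-head self-attention with a distance-based Gaussian attenuation.

    q, k, v: arrays of shape (T, d), one row per frame.
    sigma:   width of the Gaussian in frames (hypothetical parameter,
             fixed here for simplicity).
    Returns the attended output (T, d) and the attention weights (T, T).
    """
    T, d = q.shape
    # Standard scaled dot-product scores.
    scores = q @ k.T / np.sqrt(d)
    # Squared distance between target frame i and context frame j.
    idx = np.arange(T)
    dist2 = (idx[:, None] - idx[None, :]) ** 2
    # Subtracting dist2 / (2 sigma^2) from the logits multiplies the
    # unnormalized weights by exp(-dist2 / (2 sigma^2)).
    logits = scores - dist2 / (2.0 * sigma ** 2)
    # Numerically stable softmax over the context axis.
    logits = logits - logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights
```

With a small `sigma`, each frame attends mostly to its neighbors; as `sigma` grows, the penalty vanishes and the function approaches ordinary scaled dot-product attention.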
Related papers
- Transformer-based End-to-end Speech Recognition With Residual Gaussian-based Self-attention (2021) · 5.84
- Transformer-transducer: End-to-end Speech Recognition With Self-attention (2019) · 0.00
- Neural Speech Synthesis With Transformer Network (2018) · 19.95
- GraphSpeech: Syntax-aware Graph Attention Network For Neural Speech Synthesis (2020) · 7.50
- S-transformer: Segment-transformer For Robust Neural Speech Synthesis (2020) · 0.00
- Gaussian Kernelized Self-attention For Long Sequence Data And Its Application To CTC-based Speech Recognition (2021) · 4.52
- Study Of Lightweight Transformer Architectures For Single-channel Speech Enhancement (2025) · 3.58
- Simplified Self-attention For Transformer-based End-to-end Speech Recognition (2020) · 10.61