End-to-end Spoken Language Understanding: Performance Analyses Of A Voice Command Task In A Low Resource Setting
2022 · Thierry Desot, François Portet, Michel Vacher
Abstract
Spoken Language Understanding (SLU) is a core task in most human-machine interaction systems. With the emergence of smart homes, smartphones and smart speakers, SLU has become a key technology for the industry. In a classical SLU approach, an Automatic Speech Recognition (ASR) module transcribes the speech signal into a textual representation from which a Natural Language Understanding (NLU) module extracts semantic information. Recently, End-to-End SLU (E2E SLU) based on Deep Neural Networks has gained momentum because it benefits from the joint optimization of the ASR and NLU parts, thereby limiting the cascading-error effect of the pipeline architecture. However, little is known about the actual linguistic properties used by E2E models to predict concepts and intents from speech input. In this paper, we present a study identifying the signal features and other linguistic properties used by an E2E model to perform the SLU task. The study is carried out in the application domain of a
Related papers
- End-to-end Spoken Language Understanding For Generalized Voice Assistants (2021) · 6.34
- Speech-language Pre-training For End-to-end Spoken Language Understanding (2021) · 9.41
- End-to-end Architectures For ASR-free Spoken Language Understanding (2019) · 8.60
- Recent Advances In End-to-end Spoken Language Understanding (2019) · 8.09
- Modality Confidence Aware Training For Robust End-to-end Spoken Language Understanding (2023) · 2.26
- Exploring Transfer Learning For End-to-end Spoken Language Understanding (2020) · 5.24
- End-to-end Spoken Language Understanding Using Transformer Networks And Self-supervised Pre-trained Features (2020) · 5.24
- Attentive Contextual Carryover For Multi-turn End-to-end Spoken Language Understanding (2021) · 7.16