Effectiveness Of Text, Acoustic, And Lattice-based Representations In Spoken Language Understanding Tasks
2022 · Esaú Villatoro-Tello, Srikanth Madikeri, Juan Zuluaga-Gomez, et al.
Abstract
In this paper, we perform an exhaustive evaluation of different representations to address the intent classification problem in a Spoken Language Understanding (SLU) setup. We benchmark three types of systems on the SLU intent detection task: 1) text-based, 2) lattice-based, and 3) a novel multimodal approach. Our work provides a comprehensive analysis of the performance achievable by different state-of-the-art SLU systems under different circumstances, e.g., automatically- vs. manually-generated transcripts. We evaluate the systems on the publicly available SLURP spoken language resource corpus. Our results indicate that using richer forms of Automatic Speech Recognition (ASR) output, namely word-consensus-networks, allows the SLU system to improve over the 1-best setup (5.5% relative improvement). However, crossmodal approaches, i.e., learning from acoustic and text embeddings, obtain performance similar to the oracle setup, a relative improvement
Related papers
- Towards ASR Robust Spoken Language Understanding Through In-context Learning With Word Confusion Networks (2024)
- Multimodal Audio-textual Architecture For Robust Spoken Language Understanding (2023)
- A Study On The Integration Of Pre-trained SSL, ASR, LM And SLU Models For Spoken Language Understanding (2022)
- Modality Confidence Aware Training For Robust End-to-end Spoken Language Understanding (2023)
- Towards Reducing The Need For Speech Training Data To Build Spoken Language Understanding Systems (2022)
- Adapting Pretrained Transformer To Lattices For Spoken Language Understanding (2020)
- Speech To Semantics: Improve ASR And NLU Jointly Via All-neural Interfaces (2020)
- Integrating Pretrained ASR And LM To Perform Sequence Generation For Spoken Language Understanding (2023)