A Study On The Integration Of Pipeline And E2E SLU Systems For Spoken Semantic Parsing Toward STOP Quality Challenge
2023 · Siddhant Arora, Hayato Futami, Shih-Lun Wu, et al.
Abstract
Recently, there have been efforts to introduce new benchmark tasks for spoken language understanding (SLU), such as semantic parsing. In this paper, we describe our proposed spoken semantic parsing system for the quality track (Track 1) of the Spoken Language Understanding Grand Challenge, which is part of the ICASSP Signal Processing Grand Challenge 2023. We experiment with both end-to-end and pipeline systems for this task. Strong automatic speech recognition (ASR) models like Whisper and pretrained language models (LMs) like BART are utilized inside our SLU framework to boost performance. We also investigate the output-level combination of various models to reach an exact match accuracy of 80.8, which won first place in the challenge.
Related papers
- Modality Confidence Aware Training For Robust End-to-end Spoken Language Understanding (2023)
- Integrating Pretrained ASR And LM To Perform Sequence Generation For Spoken Language Understanding (2023)
- Recent Advances In End-to-end Spoken Language Understanding (2019)
- Speech-language Pre-training For End-to-end Spoken Language Understanding (2021)
- End-to-end Spoken Language Understanding: Performance Analyses Of A Voice Command Task In A Low Resource Setting (2022)
- End-to-end Spoken Language Understanding For Generalized Voice Assistants (2021)
- Towards End-to-end Spoken Language Understanding (2018)
- A Study On The Integration Of Pre-trained SSL, ASR, LM And SLU Models For Spoken Language Understanding (2022)