Adapting Pretrained Transformer To Lattices For Spoken Language Understanding
2020 · Chao-Wei Huang, Yun-Nung Chen
Abstract
Lattices are compact representations that encode multiple hypotheses, such as speech recognition results or different word segmentations. It has been shown that encoding lattices, as opposed to the 1-best results generated by an automatic speech recognizer (ASR), boosts the performance of spoken language understanding (SLU). Recently, pretrained language models with the transformer architecture have achieved state-of-the-art results on natural language understanding, but their ability to encode lattices has not been explored. Therefore, this paper aims to adapt pretrained transformers to lattice inputs in order to perform understanding tasks specifically for spoken language. Our experiments on the benchmark ATIS dataset show that fine-tuning pretrained transformers with lattice inputs yields a clear improvement over fine-tuning with 1-best results. Further evaluation demonstrates the effectiveness of our methods under different acoustic conditions. Our code is available at https://github.com/M
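To make the distinction between a lattice and a 1-best result concrete, the following is a minimal sketch of a word lattice as a directed acyclic graph. It is not the paper's method or code: the names `TOY_LATTICE`, `enumerate_hypotheses`, and `one_best` are hypothetical, the words and probabilities are invented, and scoring a path as the product of its edge probabilities is a simplifying assumption.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lattice: node -> list of (next_node, word, probability).
# All words and probabilities here are invented for illustration only.
Lattice = Dict[int, List[Tuple[int, str, float]]]

TOY_LATTICE: Lattice = {
    0: [(1, "flights", 0.7), (1, "lights", 0.3)],
    1: [(2, "to", 0.6), (2, "two", 0.4)],
    2: [(3, "boston", 1.0)],
    3: [],  # final node, no outgoing edges
}


def enumerate_hypotheses(
    lattice: Lattice, start: int, final: int
) -> List[Tuple[List[str], float]]:
    """Return every (word sequence, path score) from start to final.

    Assumption: a path score is the product of its edge probabilities.
    """
    if start == final:
        return [([], 1.0)]
    results = []
    for next_node, word, prob in lattice[start]:
        for words, score in enumerate_hypotheses(lattice, next_node, final):
            results.append(([word] + words, prob * score))
    return results


def one_best(lattice: Lattice, start: int, final: int) -> List[str]:
    """Return only the highest-scoring word sequence (the 1-best result)."""
    hypotheses = enumerate_hypotheses(lattice, start, final)
    return max(hypotheses, key=lambda h: h[1])[0]


if __name__ == "__main__":
    for words, score in enumerate_hypotheses(TOY_LATTICE, 0, 3):
        print(" ".join(words), round(score, 2))
    print("1-best:", " ".join(one_best(TOY_LATTICE, 0, 3)))
```

In this toy example the four-node graph stores four competing hypotheses, while the 1-best result keeps only one of them and discards the alternatives that a lattice-aware model could still use.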
Related papers
- Effectiveness Of Text, Acoustic, And Lattice-based Representations In Spoken Language Understanding Tasks (2022)
- Towards ASR Robust Spoken Language Understanding Through In-context Learning With Word Confusion Networks (2024)
- Latent Speech-text Transformer (2025)
- End-to-end Spoken Language Understanding Using Transformer Networks And Self-supervised Pre-trained Features (2020)
- Lattice-based Lightly-supervised Acoustic Model Training (2019)
- Speech-language Pre-training For End-to-end Spoken Language Understanding (2021)
- Voice Trigger Detection From LVCSR Hypothesis Lattices Using Bidirectional Lattice Recurrent Neural Networks (2020)
- Lattice Rescoring Strategies For Long Short Term Memory Language Models In Speech Recognition (2017)