Label-aware Multi-level Contrastive Learning For Cross-lingual Spoken Language Understanding
2022 · Shining Liang, Linjun Shou, Jian Pei, et al.
Abstract
Despite the great success of spoken language understanding (SLU) in high-resource languages, it remains challenging in low-resource languages mainly due to the lack of labeled training data. The recent multilingual code-switching approach achieves better alignments of model representations across languages by constructing a mixed-language context in zero-shot cross-lingual SLU. However, current code-switching methods are limited to implicit alignment and disregard the inherent semantic structure in SLU, i.e., the hierarchical inclusion of utterances, slots, and words. In this paper, we propose to model the utterance-slot-word structure by a multi-level contrastive learning framework at the utterance, slot, and word levels to facilitate explicit alignment. Novel code-switching schemes are introduced to generate hard negative examples for our contrastive learning framework. Furthermore, we develop a label-aware joint model leveraging label semantics to enhance the implicit alignment and
Related papers
- HC\(^2\)L: Hybrid And Cooperative Contrastive Learning For Cross-lingual Spoken Language Understanding (2024) · 4.52
- GL-CLeF: A Global-local Contrastive Learning Framework For Cross-lingual Spoken Language Understanding (2022) · 10.35
- Aligning Speech To Languages To Enhance Code-switching Speech Recognition (2024) · 5.84
- ML-LMCL: Mutual Learning And Large-margin Contrastive Learning For Improving ASR Robustness In Spoken Language Understanding (2023) · 0.00
- Language Modeling For Code-switching: Evaluation, Integration Of Monolingual Data, And Discriminative Training (2018) · 5.24
- Contrastive Learning For Improving ASR Robustness In Spoken Language Understanding (2022) · 6.34
- Enhancing Code-switching Speech Recognition With Interactive Language Biases (2023) · 9.92
- Exploring Fine-tuning Of Large Audio Language Models For Spoken Language Understanding Under Limited Speech Data (2025) · 0.00