Acoustic Word Embedding System For Code-switching Query-by-example Spoken Term Detection
2020 · Murong Ma, Haiwei Wu, Xuyang Wang, et al.
Abstract
In this paper, we propose a deep convolutional neural network-based acoustic word embedding system for code-switching query-by-example spoken term detection. Unlike previous configurations, we combine audio data in two languages for training instead of using only a single language. We transform the acoustic features of keyword templates and searching content into fixed-dimensional vectors and calculate the distances between keyword segments and searching content segments obtained in a sliding manner. An auxiliary variability-invariant loss is also applied to training data of the same word spoken by different speakers. This strategy prevents the extractor from encoding undesired speaker- or accent-related information into the acoustic word embeddings. Experimental results show that our proposed system produces promising searching results in the code-switching test scenario. With the increased number of templates and the employment of variability-invariant loss, the search
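The sliding search described in the abstract can be illustrated with a minimal sketch. It assumes a stand-in `embed` function (plain mean pooling over frames, not the paper's convolutional extractor), cosine distance as the metric, and hypothetical `win_len` and `hop` parameters; none of these choices are confirmed by the abstract.

```python
import math


def embed(frames):
    """Stand-in embedding: mean over frames (NOT the paper's CNN extractor)."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]


def cosine_distance(a, b):
    """Cosine distance between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # all-zero embedding: no direction to compare
    return 1.0 - dot / (norm_a * norm_b)


def sliding_search(query_frames, content_frames, win_len, hop):
    """Return (start_frame, distance) of the content window closest to the query.

    win_len and hop are hypothetical parameters chosen for illustration.
    Returns (None, inf) if the content is shorter than one window.
    """
    query_vec = embed(query_frames)
    best_start, best_dist = None, float("inf")
    for start in range(0, len(content_frames) - win_len + 1, hop):
        window_vec = embed(content_frames[start:start + win_len])
        dist = cosine_distance(query_vec, window_vec)
        if dist < best_dist:
            best_start, best_dist = start, dist
    return best_start, best_dist


# Toy 2-dimensional "features": the keyword occupies frames 4 and 5.
query = [[1.0, 0.0], [1.0, 0.0]]
content = [[0.0, 1.0]] * 4 + [[1.0, 0.0]] * 2 + [[0.0, 1.0]] * 4
start, dist = sliding_search(query, content, win_len=2, hop=1)
```

In this toy case the best-matching window starts at frame 4. A real system would replace `embed` with a trained extractor and would need some rule for combining several keyword templates, which the abstract does not specify.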
Related papers
- Query-by-example Search With Discriminative Neural Acoustic Word Embeddings (2017)
- Learning Acoustic Word Embeddings With Temporal Context For Query-by-example Speech Search (2018)
- Query-by-example Keyword Spotting Using Spectral-temporal Graph Attentive Pooling And Multi-task Learning (2024)
- An Effective Mixture-of-experts Approach For Code-switching Speech Recognition Leveraging Encoder Disentanglement (2024)
- Discriminative Acoustic Word Embeddings: Recurrent Neural Network-based Approaches (2016)
- Learning Word Embeddings From Speech (2017)
- Additional Shared Decoder On Siamese Multi-view Encoders For Learning Acoustic Word Embeddings (2019)
- Multi-modal Transformers Utterance-level Code-switching Detection (2020)