Query-by-example Search With Discriminative Neural Acoustic Word Embeddings
2017 · Shane Settle, Keith Levin, Herman Kamper, et al.
Abstract
Query-by-example search often uses dynamic time warping (DTW) for comparing queries and proposed matching segments. Recent work has shown that comparing speech segments by representing them as fixed-dimensional vectors (acoustic word embeddings) and measuring their vector distance (e.g., cosine distance) can discriminate between words more accurately than DTW-based approaches. We consider an approach to query-by-example search that embeds both the query and database segments according to a neural model, followed by nearest-neighbor search to find the matching segments. Earlier work on embedding-based query-by-example, using template-based acoustic word embeddings, achieved competitive performance. We find that our embeddings, based on recurrent neural networks trained to optimize word discrimination, achieve substantial improvements in performance and run-time efficiency over the previous approaches.
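The search step described in the abstract (embed the query and the database segments, then rank segments by vector distance) can be illustrated with a minimal sketch. It assumes the embeddings have already been computed; random vectors stand in for the output of the neural model, and the function names, the embedding dimension of 32, and the brute-force ranking are illustrative choices, not details taken from the paper.

```python
import numpy as np

def cosine_distance_to_all(query_emb, segment_embs):
    """Cosine distance between one query embedding and each row of segment_embs."""
    q = query_emb / np.linalg.norm(query_emb)
    s = segment_embs / np.linalg.norm(segment_embs, axis=1, keepdims=True)
    return 1.0 - s @ q

def nearest_segments(query_emb, segment_embs, k=3):
    """Return indices and distances of the k closest segments, nearest first."""
    dists = cosine_distance_to_all(query_emb, segment_embs)
    order = np.argsort(dists)[:k]
    return order, dists[order]

# Stand-in data (assumption): 100 database segments with 32-dimensional embeddings.
rng = np.random.default_rng(0)
segment_embs = rng.normal(size=(100, 32))

# Hypothetical query: a slightly perturbed copy of segment 42, mimicking
# another spoken instance of the same word.
query_emb = segment_embs[42] + 0.01 * rng.normal(size=32)

idx, dists = nearest_segments(query_emb, segment_embs, k=3)
print(idx, dists)
```

Because every segment is reduced to a fixed-dimensional vector, each comparison is a single dot product rather than a DTW alignment over frame sequences. For large databases the exhaustive ranking shown here would typically be replaced by an approximate nearest-neighbor index.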
Related papers
- Learning Acoustic Word Embeddings With Temporal Context For Query-by-example Speech Search (2018)
- Acoustic Word Embedding System For Code-switching Query-by-example Spoken Term Detection (2020)
- Discriminative Acoustic Word Embeddings: Recurrent Neural Network-based Approaches (2016)
- Semantic Query-by-example Speech Search Using Visual Grounding (2019)
- Query-by-example Spoken Term Detection Using Attention-based Multi-hop Networks (2017)
- Neural Network Based End-to-end Query By Example Spoken Term Detection (2019)
- Query-by-example Keyword Spotting Using Spectral-temporal Graph Attentive Pooling And Multi-task Learning (2024)
- Do Acoustic Word Embeddings Capture Phonological Similarity? An Empirical Study (2021)