Discriminative Acoustic Word Embeddings: Recurrent Neural Network-based Approaches
2016 · Shane Settle, Karen Livescu
Abstract
Acoustic word embeddings --- fixed-dimensional vector representations of variable-length spoken word segments --- have begun to be considered for tasks such as speech recognition and query-by-example search. Such embeddings can be learned discriminatively so that they are similar for speech segments corresponding to the same word, while being dissimilar for segments corresponding to different words. Recent work has found that acoustic word embeddings can outperform dynamic time warping on query-by-example search and related word discrimination tasks. However, the space of embedding models and training approaches is still relatively unexplored. In this paper we present new discriminative embedding models based on recurrent neural networks (RNNs). We consider training losses that have been successful in prior work, in particular a cross entropy loss for word classification and a contrastive loss that explicitly aims to separate same-word and different-word pairs in a "Siamese network" tr
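As an illustration of the contrastive idea described in the abstract, here is a minimal sketch of one common form of such a loss: a triplet hinge loss over cosine distances between embeddings. The function names, the margin value, the use of cosine distance, and the toy vectors are all assumptions made for illustration; the abstract as shown here does not specify the exact formulation the authors use.

```python
import math

def cosine_distance(u, v):
    """Cosine distance 1 - cos(u, v) between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def triplet_hinge_loss(anchor, same, diff, margin=0.5):
    """Hinge loss over one (anchor, same-word, different-word) triplet.

    The loss is zero once the different-word embedding is at least
    `margin` farther from the anchor than the same-word embedding.
    The margin of 0.5 is an arbitrary illustrative choice.
    """
    return max(0.0, margin
               + cosine_distance(anchor, same)
               - cosine_distance(anchor, diff))

# Toy 3-dimensional embeddings (made-up values, not from the paper).
anchor = [1.0, 0.0, 0.0]   # embedding of one spoken segment
same = [0.9, 0.1, 0.0]     # another segment of the same word
diff = [0.0, 1.0, 0.0]     # a segment of a different word

loss_good = triplet_hinge_loss(anchor, same, diff)  # well-separated triplet
loss_bad = triplet_hinge_loss(anchor, diff, same)   # roles swapped: violates the margin
```

In a real system the three vectors would be produced by the same embedding network applied to three variable-length segments, and the loss would be averaged over many triplets and minimized by gradient descent. The pure-Python version above only shows what the objective rewards and penalizes.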
Related papers
- Improved Acoustic Word Embeddings For Zero-resource Languages Using Multilingual Transfer (2020) 7.81
- Query-by-example Search With Discriminative Neural Acoustic Word Embeddings (2017) 12.40
- Additional Shared Decoder On Siamese Multi-view Encoders For Learning Acoustic Word Embeddings (2019) 6.34
- Acoustic Neighbor Embeddings (2020) 0.00
- Learning Word Embeddings From Speech (2017) 0.00
- Multilingual Acoustic Word Embedding Models For Processing Zero-resource Languages (2020) 8.09
- Learning Acoustic Word Embeddings With Phonetically Associated Triplet Network (2018) 0.00
- Truly Unsupervised Acoustic Word Embeddings Using Weak Top-down Constraints In Encoder-decoder Models (2018) 0.00