Unsupervised Word Discovery: Boundary Detection With Clustering Vs. Dynamic Programming
2024 · Simon Malan, Benjamin van Niekerk, Herman Kamper
Abstract
We look at the long-standing problem of segmenting unlabeled speech into word-like segments and clustering these into a lexicon. Several previous methods use a scoring model coupled with dynamic programming to find an optimal segmentation. Here we propose a much simpler strategy: we predict word boundaries using the dissimilarity between adjacent self-supervised features, then we cluster the predicted segments to construct a lexicon. For a fair comparison, we update the older ES-KMeans dynamic programming method with better features and boundary constraints. On the five-language ZeroSpeech benchmarks, our simple approach gives state-of-the-art results similar to those of the new ES-KMeans+ method, while being almost five times faster. Project webpage: https://s-malan.github.io/prom-seg-clus.
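The boundary-then-cluster strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the cosine dissimilarity measure, the threshold-based peak picking, the mean pooling of segments, and the plain k-means step are all assumptions chosen to keep the example short and dependency-free. Real self-supervised features would replace the toy vectors.

```python
import math
import random


def cosine_dissimilarity(a, b):
    """1 - cosine similarity between two feature vectors (hypothetical choice of measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return 1.0 - dot / (norm_a * norm_b)


def predict_boundaries(frames, threshold=0.5):
    """Return frame indices where a new segment starts.

    A boundary is placed at a local maximum of the adjacent-frame
    dissimilarity curve that also reaches `threshold` (an assumed,
    simplified stand-in for a proper peak detector).
    """
    curve = [cosine_dissimilarity(frames[i - 1], frames[i])
             for i in range(1, len(frames))]
    boundaries = []
    for j, value in enumerate(curve):
        left = curve[j - 1] if j > 0 else -math.inf
        right = curve[j + 1] if j + 1 < len(curve) else -math.inf
        if value >= threshold and value > left and value >= right:
            boundaries.append(j + 1)  # curve[j] compares frames j and j + 1
    return boundaries


def mean_pool(frames, boundaries):
    """Average the frames of each predicted segment into one vector."""
    edges = [0] + list(boundaries) + [len(frames)]
    pooled = []
    for start, end in zip(edges[:-1], edges[1:]):
        segment = frames[start:end]
        pooled.append([sum(col) / len(segment) for col in zip(*segment)])
    return pooled


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: recompute each non-empty cluster's centroid.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def discover_lexicon(frames, k, threshold=0.5):
    """Segment by dissimilarity peaks, then cluster the pooled segments."""
    boundaries = predict_boundaries(frames, threshold)
    labels = kmeans(mean_pool(frames, boundaries), k)
    return boundaries, labels
```

Calling `discover_lexicon(frames, k)` on a list of per-frame feature vectors returns the predicted boundary indices and one cluster label per segment; segments that share a label would be treated as instances of the same lexicon entry. No dynamic programming is involved, which is the simplification the abstract contrasts with ES-KMeans-style methods.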
Related papers
- Unsupervised Lexicon Learning From Speech Is Limited By Representations Rather Than Clustering (2025)
- An Embedded Segmental K-means Model For Unsupervised Segmentation And Clustering Of Speech (2017)
- Word Segmentation On Discovered Phone Units With Dynamic Programming And Self-supervised Scoring (2022)
- Unsupervised Word Segmentation And Lexicon Discovery Using Acoustic Word Embeddings (2016)
- Towards Unsupervised Phone And Word Segmentation Using Self-supervised Vector-quantized Neural Networks (2020)
- Segmental Contrastive Predictive Coding For Unsupervised Word Segmentation (2021)
- Unsupervised Spoken Term Discovery Based On Re-clustering Of Hypothesized Speech Segments With Siamese And Triplet Networks (2020)
- Back To Supervision: Boosting Word Boundary Detection Through Frame Classification (2024)