Word Segmentation On Discovered Phone Units With Dynamic Programming And Self-supervised Scoring
2022 · Herman Kamper
Abstract
Recent work on unsupervised speech segmentation has used self-supervised models with phone and word segmentation modules that are trained jointly. This paper instead revisits an older approach to word segmentation: bottom-up phone-like unit discovery is performed first, and symbolic word segmentation is then performed on top of the discovered units (without influencing the lower level). To do this, I propose a new unit discovery model, a new symbolic word segmentation model, and then chain the two models to segment speech. Both models use dynamic programming to minimize segment costs from a self-supervised network with an additional duration penalty that encourages longer units. Concretely, for acoustic unit discovery, duration-penalized dynamic programming (DPDP) is used with a contrastive predictive coding model as the scoring network. For word segmentation, DPDP is applied with an autoencoding recurrent neural network as the scoring network. The two models are chained in order to segment sp
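The duration-penalized dynamic programming idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `dpdp_segment`, the toy 1-D "frames", the two-entry codebook, and the scoring function `toy_cost` are all hypothetical stand-ins for a self-supervised scoring network. The linear penalty `-(duration - 1)`, which rewards longer segments, is likewise an assumption made for illustration.

```python
from typing import Callable, List, Optional, Tuple


def dpdp_segment(
    n: int,
    seg_cost: Callable[[int, int], float],
    dur_weight: float = 1.0,
    max_dur: Optional[int] = None,
) -> Tuple[List[int], float]:
    """Split positions 0..n into segments minimizing the summed
    seg_cost(start, end) plus a weighted duration penalty.

    Returns the segment end indices (exclusive) and the total cost.
    """
    if max_dur is None:
        max_dur = n
    best = [0.0] + [float("inf")] * n  # best[e]: minimal cost of segmenting [0, e)
    back = [0] * (n + 1)               # back[e]: start of the last segment ending at e
    for end in range(1, n + 1):
        for start in range(max(0, end - max_dur), end):
            duration = end - start
            # Assumed penalty: more negative (cheaper) for longer segments.
            penalty = -(duration - 1)
            cost = best[start] + seg_cost(start, end) + dur_weight * penalty
            if cost < best[end]:
                best[end] = cost
                back[end] = start
    # Backtrack from the end to recover the boundaries.
    bounds = []
    end = n
    while end > 0:
        bounds.append(end)
        end = back[end]
    bounds.reverse()
    return bounds, best[n]


# Toy stand-in for a scoring network: each segment is assigned to the
# closest entry of a tiny hypothetical codebook, and the cost is the
# summed squared distance of its frames to that entry.
frames = [0.0, 0.1, 0.0, 5.0, 5.1, 5.0]
codebook = [0.0, 5.0]


def toy_cost(start: int, end: int) -> float:
    return min(sum((x - c) ** 2 for x in frames[start:end]) for c in codebook)


bounds, total = dpdp_segment(len(frames), toy_cost, dur_weight=1.0)
print(bounds)  # [3, 6]
```

In this toy case the two recovered segments cover frames 0 to 2 and 3 to 5. Raising `dur_weight` pushes the search toward fewer, longer segments, while a small value leaves little incentive to merge frames that already fit the codebook well.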
Authors
Herman Kamper
Related papers
- Towards Unsupervised Phone And Word Segmentation Using Self-supervised Vector-quantized Neural Networks (2020)
- Unsupervised Speech Segmentation And Variable Rate Representation Learning Using Segmental Contrastive Predictive Coding (2021)
- Unsupervised Word Discovery: Boundary Detection With Clustering Vs. Dynamic Programming (2024)
- Segmental Contrastive Predictive Coding For Unsupervised Word Segmentation (2021)
- Unsupervised Acoustic Unit Discovery By Leveraging A Language-independent Subword Discriminative Feature Representation (2021)
- An Investigation Of Phone-based Subword Units For End-to-end Speech Recognition (2020)
- Speech Decomposition Based On A Hybrid Speech Model And Optimal Segmentation (2021)
- Unsupervised Speech Segmentation: A General Approach Using Speech Language Models (2025)