Span Classification With Structured Information For Disfluency Detection In Spoken Utterances
2022 · Sreyan Ghosh, Sonal Kumar, Yaman Kumar Singla, et al.
Abstract
Existing approaches in disfluency detection focus on solving a token-level classification task for identifying and removing disfluencies in text. Moreover, most works leverage only the contextual information captured by the linear sequences in text, thus ignoring the structured information in text, which is efficiently captured by dependency trees. In this paper, building on the span classification paradigm of entity recognition, we propose a novel architecture for detecting disfluencies in transcripts of spoken utterances, incorporating both contextual information through transformers and long-distance structured information captured by dependency trees, through graph convolutional networks (GCNs). Experimental results show that our proposed model achieves state-of-the-art results on the widely used English Switchboard corpus for disfluency detection and outperforms prior art by a significant margin. We make all our code publicly available on GitHub (https://github.com/Sreyan88/Disf
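The abstract names three ingredients: candidate spans instead of per-token labels, a dependency tree as a graph, and graph convolution over that graph. The following is a minimal sketch of those ideas in plain Python, not the authors' implementation. All function names, the toy head indices, and the one-dimensional features are hypothetical; the graph step here is an unweighted neighbour average, whereas a real GCN layer would add learned weights and a nonlinearity, and the features would come from a transformer encoder.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end), both inclusive token indices


def enumerate_spans(num_tokens: int, max_width: int) -> List[Span]:
    """List every candidate span of at most max_width tokens."""
    return [
        (start, end)
        for start in range(num_tokens)
        for end in range(start, min(start + max_width, num_tokens))
    ]


def dependency_adjacency(heads: List[int]) -> List[List[float]]:
    """Build a symmetric adjacency matrix with self-loops.

    heads[i] is the index of token i's head; -1 marks the root (assumed convention).
    """
    n = len(heads)
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child][head] = 1.0
            adj[head][child] = 1.0
    return adj


def graph_average(adj: List[List[float]], feats: List[List[float]]) -> List[List[float]]:
    """One simplified graph-convolution step: average each node with its neighbours.

    Assumption: no learned weight matrix and no activation, to keep the sketch small.
    """
    dim = len(feats[0])
    out = []
    for row in adj:
        degree = sum(row)
        out.append([
            sum(row[j] * feats[j][d] for j in range(len(feats))) / degree
            for d in range(dim)
        ])
    return out


def span_representation(feats: List[List[float]], span: Span) -> List[float]:
    """Represent a span by concatenating its boundary token vectors (one common choice)."""
    start, end = span
    return feats[start] + feats[end]


# Toy input: four tokens, hypothetical dependency heads, one feature per token.
heads = [1, -1, 1, 2]
token_feats = [[1.0], [2.0], [3.0], [4.0]]

spans = enumerate_spans(len(heads), max_width=2)
structured = graph_average(dependency_adjacency(heads), token_feats)
span_vectors = {span: span_representation(structured, span) for span in spans}
```

In a full model, each span vector would be passed to a classifier that decides whether the span is disfluent; that classifier is omitted here.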
Related papers
- A Novel Multimodal Dynamic Fusion Network For Disfluency Detection In Spoken Utterances (2022) 0.00
- Streaming Joint Speech Recognition And Disfluency Detection (2022) 0.00
- Dismo: A Morphosyntactic, Disfluency And Multi-word Unit Annotator. An Evaluation On A Corpus Of French Spontaneous And Read Speech (2018) 0.00
- Controllable Time-delay Transformer For Real-time Punctuation Prediction And Disfluency Detection (2020) 10.48
- Stutter-solver: End-to-end Multi-lingual Dysfluency Detection (2024) 5.24
- Unconstrained Dysfluency Modeling For Dysfluent Speech Transcription And Detection (2023) 7.16
- A Character-level Span-based Model For Mandarin Prosodic Structure Prediction (2022) 5.24
- Improving Unsupervised Subword Modeling Via Disentangled Speech Representation Learning And Transformation (2019) 5.24