Sequence Model With Self-adaptive Sliding Window For Efficient Spoken Document Segmentation
2021 · Qinglin Zhang, Qian Chen, Yali Li, et al.
Abstract
Transcripts generated by automatic speech recognition (ASR) systems for spoken documents lack structural annotations such as paragraphs, which significantly reduces their readability. Automatically predicting paragraph segmentation for spoken documents may improve both readability and performance on downstream NLP tasks such as summarization and machine reading comprehension. We propose a sequence model with a self-adaptive sliding window for accurate and efficient paragraph segmentation. We also propose an approach to exploit phonetic information, which significantly improves the robustness of spoken document segmentation to ASR errors. Evaluations are conducted on the English Wiki-727K document segmentation benchmark, a Chinese Wikipedia-based document segmentation dataset we created, and an in-house Chinese spoken document dataset. Our proposed model outperforms the state-of-the-art (SOTA) model based on the same BERT-Base, increasing segmentation F1 on the English benchmark by 4.2 points and on Chines
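The abstract names a self-adaptive sliding window but does not spell out its mechanics, so the following is only a minimal sketch of one plausible reading: a window of sentences is scored, and the next window restarts just after the last boundary predicted in the current one. The function name `segment_with_adaptive_window`, the `predict_boundaries` callback, and the `window_size` default are all hypothetical and not taken from the paper; the fallback of advancing by a full window when no boundary is found is likewise an assumption.

```python
from typing import Callable, List, Sequence


def segment_with_adaptive_window(
    sentences: Sequence[str],
    predict_boundaries: Callable[[Sequence[str]], List[bool]],
    window_size: int = 8,
) -> List[int]:
    """Return indices of sentences predicted to end a paragraph.

    predict_boundaries is a stand-in for a trained sequence model: given a
    window of sentences, it returns one flag per sentence (True = boundary).
    """
    boundaries: List[int] = []
    start = 0
    while start < len(sentences):
        window = sentences[start:start + window_size]
        flags = predict_boundaries(window)
        hits = [i for i, flag in enumerate(flags) if flag]
        if hits:
            # Record every boundary found in this window (as global indices).
            boundaries.extend(start + i for i in hits)
            # Assumed "self-adaptive" step: restart right after the last
            # predicted boundary instead of using a fixed stride.
            start += hits[-1] + 1
        else:
            # Assumed fallback: no boundary found, skip the whole window.
            start += len(window)
    return boundaries


# Toy stand-in for the model: a sentence ending in "#" closes a paragraph.
toy_model = lambda window: [s.endswith("#") for s in window]
result = segment_with_adaptive_window(
    ["a", "b#", "c", "d", "e#", "f"], toy_model, window_size=3
)
```

With the toy predictor, the first window covers sentences 0 to 2 and finds a boundary at index 1, so the second window starts at index 2 rather than index 3. Restarting there keeps each new window aligned with the start of a paragraph.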
Related papers
- Toward Unifying Text Segmentation And Long Document Summarization (2022)
- Simultaneous Translation For Unsegmented Input: A Sliding Window Approach (2022)
- Don't Discard Fixed-window Audio Segmentation In Speech-to-text Translation (2022)
- Subtitles To Segmentation: Improving Low-resource Speech-to-text Translation Pipelines (2020)
- Smart Speech Segmentation Using Acousto-linguistic Features With Look-ahead (2022)
- Sequence Segmentation Using Joint RNN And Structured Prediction Models (2016)
- Reading Between The Waves: Robust Topic Segmentation Using Inter-sentence Audio Features (2026)
- Speech Segmentation Optimization Using Segmented Bilingual Speech Corpus For End-to-end Speech Translation (2022)