Non-verbal Information In Spontaneous Speech -- Towards A New Framework Of Analysis
2024 · Tirza Biron, Moshe Barboy, Eran Ben-Artzy, et al.
Abstract
Non-verbal signals in speech are encoded by prosody and carry information that ranges from conversation action to attitude and emotion. Despite the importance of prosody, the principles that govern prosodic structure are not yet adequately understood. This paper offers an analytical schema and a technological proof-of-concept for the categorization of prosodic signals and their association with meaning. The schema interprets surface representations of multi-layered prosodic events. As a first step towards implementation, we present a classification process that disentangles prosodic phenomena of three orders. It relies on fine-tuning a pre-trained speech recognition model, enabling simultaneous multi-class/multi-label detection. It generalizes over a large variety of spontaneous data, performing on a par with, or superior to, human annotation. In addition to a standardized formalization of prosody, disentangling prosodic patterns can direct a theory of communication and speech organization.
Related papers
- Disentangling Prosody Representations With Unsupervised Speech Reconstruction (2022) 0.00
- Parsing Speech: A Neural Approach To Integrating Lexical And Acoustic-prosodic Information (2017) 8.60
- Perception Of Prosodic Variation For Speech Synthesis Using An Unsupervised Discrete Representation Of F0 (2020) 7.81
- Prosody-controllable Spontaneous TTS With Neural HMMs (2022) 8.09
- Topic Identification For Spontaneous Speech: Enriching Audio Features With Embedded Linguistic Information (2023) 4.52
- Disentangling Speech And Non-speech Components For Building Robust Acoustic Models From Found Data (2019) 0.00
- Learning Spontaneity To Improve Emotion Recognition In Speech (2017) 8.09
- CAMP: A Two-stage Approach To Modelling Prosody In Context (2020) 0.00