Evaluation Of The Speech Resynthesis Capabilities Of The VoicePrivacy Challenge Baseline B1
2023 · Ünal Ege Gaznepoglu, Nils Peters
Abstract
Speaker anonymization systems continue to improve their ability to obfuscate the original speaker characteristics in a speech signal, but often create processing artifacts and unnatural-sounding voices as a tradeoff. Many of those systems stem from the VoicePrivacy Challenge (VPC) Baseline B1, which uses a neural vocoder to synthesize speech from a representation based on F0, x-vectors, and bottleneck features. Inspired by this, we investigate the reproduction capabilities of this baseline to assess how successful the shared methodology is in synthesizing human-like speech. We use four objective metrics to measure speech quality, waveform similarity, and F0 similarity. Our findings indicate that both the speech representation and the vocoder introduce artifacts, causing an unnatural perception. A MUSHRA-like listening test with 18 subjects corroborates our findings, motivating further research on the analysis and synthesis components of the VPC Baseline B1.
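The abstract does not name the four objective metrics. As a rough illustration of what an F0 similarity measurement could look like, the sketch below compares two frame-aligned F0 tracks using Pearson correlation and RMSE over jointly voiced frames, plus a voicing disagreement rate. The function name `f0_similarity`, the choice of these three measures, and the convention that 0 Hz marks an unvoiced frame are assumptions made for this example, not details taken from the paper.

```python
import math


def f0_similarity(f0_ref, f0_syn):
    """Compare two frame-aligned F0 tracks in Hz (hypothetical helper).

    Assumption: a value of 0 marks an unvoiced frame in either track.
    """
    if len(f0_ref) != len(f0_syn):
        raise ValueError("F0 tracks must be frame-aligned and equally long")

    # Correlation and RMSE only make sense where both signals are voiced.
    pairs = [(r, s) for r, s in zip(f0_ref, f0_syn) if r > 0 and s > 0]
    if len(pairs) < 2:
        raise ValueError("need at least two jointly voiced frames")

    n = len(pairs)
    mean_ref = sum(r for r, _ in pairs) / n
    mean_syn = sum(s for _, s in pairs) / n
    cov = sum((r - mean_ref) * (s - mean_syn) for r, s in pairs)
    var_ref = sum((r - mean_ref) ** 2 for r, _ in pairs)
    var_syn = sum((s - mean_syn) ** 2 for _, s in pairs)

    # A flat F0 track has zero variance, so correlation is undefined.
    if var_ref == 0 or var_syn == 0:
        pearson = float("nan")
    else:
        pearson = cov / math.sqrt(var_ref * var_syn)

    rmse_hz = math.sqrt(sum((r - s) ** 2 for r, s in pairs) / n)

    # Fraction of frames where the voiced/unvoiced decisions disagree.
    mismatches = sum((r > 0) != (s > 0) for r, s in zip(f0_ref, f0_syn))
    voicing_error_rate = mismatches / len(f0_ref)

    return {
        "pearson": pearson,
        "rmse_hz": rmse_hz,
        "voicing_error_rate": voicing_error_rate,
    }
```

In a resynthesis setting, `f0_ref` would come from the original utterance and `f0_syn` from the vocoder output, both extracted with the same pitch tracker and frame rate so that frames line up.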
Related papers
- VoicePrivacy 2022 System Description: Speaker Anonymization With Feature-matched F0 Trajectories (2022)
- The VoicePrivacy 2022 Challenge: Progress And Perspectives In Voice Anonymisation (2024)
- A Spoofing Benchmark For The 2018 Voice Conversion Challenge: Leveraging From Spoofing Countermeasures For Speech Artifact Assessment (2018)
- Improving Voice Quality In Speech Anonymization With Just Perception-informed Losses (2024)
- Speaker Independence Of Neural Vocoders And Their Effect On Parametric Resynthesis Speech Enhancement (2019)
- Predictions Of Subjective Ratings And Spoofing Assessments Of Voice Conversion Challenge 2020 Submissions (2020)
- Speaker Anonymization Using Neural Audio Codec Language Models (2023)
- Exploring The Importance Of F0 Trajectories For Speaker Anonymization Using X-vectors And Neural Waveform Models (2021)