A Spoofing Benchmark For The 2018 Voice Conversion Challenge: Leveraging From Spoofing Countermeasures For Speech Artifact Assessment

Abstract

Voice conversion (VC) aims to convert speaker characteristics without altering content. Because of training data limitations and modeling imperfections, it is difficult to achieve believable speaker mimicry without introducing processing artifacts; performance assessment of VC therefore usually involves both speaker similarity and quality evaluation by a human panel. Because this human evaluation is time-consuming, expensive, and non-reproducible, it hinders rapid prototyping of new VC technology. We address artifact assessment with an alternative, objective approach that leverages prior work on spoofing countermeasures (CMs) for automatic speaker verification. There, CMs are used to reject 'fake' inputs such as replayed, synthetic, or converted speech, but their potential for automatic speech artifact assessment remains unknown. This study serves to fill that gap. As a supplement to subjective results for the 2018 Voice Conversion Challenge (VCC'18) data, we configure a standard constant-Q cepst
