Voice Watermarking for Authentication and Copyright Protection Using Neural Models

Abstract

Voice data is increasingly vulnerable to misuse, from traditional copyright theft to emerging threats in the era of artificial intelligence (AI), where voice cloning and deepfake synthesis can easily bypass conventional verification methods. This paper addresses voice verification and protection against copyright infringement by developing a neural watermarking model inspired by AudioSeal, an advanced deep learning watermarking system. We train and fine-tune AudioSeal for offline, pre-recorded speech, embedding watermarks that are both imperceptible and resilient to typical audio processing attacks. The system was developed for a proof-of-concept application that supports end-to-end watermark embedding and detection within actual user workflows. Experimental results show that our model attains high imperceptibility ($\text{PESQ} \approx 4.272$, SI-SNR $\approx 39.335$) with competitive robustness against compression, resampling, and noise attacks. These findings indicate the potential of deep-learning-based watermarking to secure voice data in security-critical and copyright-related applications.
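
For readers unfamiliar with the SI-SNR figure quoted above, the following is a minimal sketch of how scale-invariant signal-to-noise ratio is commonly computed between an original signal and its watermarked version. It is not the authors' evaluation code: the function name `si_snr`, the `eps` stabilizer, and the synthetic sine-wave signals are all assumptions made for illustration, and a small added tone stands in for a watermark.

```python
import math


def si_snr(reference, estimate, eps=1e-12):
    """Scale-invariant SNR in dB (hypothetical helper, standard definition)."""
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    n = len(reference)

    # Remove the mean of each signal so a DC offset does not affect the score.
    ref_mean = sum(reference) / n
    est_mean = sum(estimate) / n
    ref = [x - ref_mean for x in reference]
    est = [x - est_mean for x in estimate]

    # Project the estimate onto the reference to get the "target" component.
    dot = sum(r * e for r, e in zip(ref, est))
    ref_energy = sum(r * r for r in ref)
    scale = dot / (ref_energy + eps)
    target = [scale * r for r in ref]

    # Whatever is left after the projection is treated as distortion.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(v * v for v in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


# Synthetic stand-ins (assumptions): one second of a 440 Hz tone at 16 kHz,
# plus a low-amplitude 3 kHz tone playing the role of a watermark.
SAMPLE_RATE = 16000
clean = [math.sin(2 * math.pi * 440 * i / SAMPLE_RATE) for i in range(SAMPLE_RATE)]
marked = [
    c + 0.01 * math.sin(2 * math.pi * 3000 * i / SAMPLE_RATE)
    for i, c in enumerate(clean)
]

score = si_snr(clean, marked)
print(f"SI-SNR: {score:.2f} dB")
```

In this toy setup the added tone has 1% of the amplitude of the clean tone and is orthogonal to it, so the score comes out at about 40 dB. Because the estimate is projected onto the reference before the ratio is taken, rescaling the watermarked signal leaves the score unchanged, which is why the metric is called scale-invariant.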
