SEF-PNet: Speaker Encoder-free Personalized Speech Enhancement With Local And Global Contexts Aggregation
2025 · Ziling Huang, Haixin Guan, Haoran Wei, et al.
Abstract
Personalized speech enhancement (PSE) methods typically rely on pre-trained speaker verification models or self-designed speaker encoders to extract target speaker clues, guiding the PSE model in isolating the desired speech. However, these approaches suffer from significant model complexity and often underutilize enrollment speaker information, limiting the potential performance of the PSE model. To address these limitations, we propose a novel Speaker Encoder-Free PSE network, termed SEF-PNet, which fully exploits the information present in both the enrollment speech and noisy mixtures. SEF-PNet incorporates two key innovations: Interactive Speaker Adaptation (ISA) and Local-Global Context Aggregation (LCA). ISA dynamically modulates the interactions between enrollment and noisy signals to enhance the speaker adaptation, while LCA employs advanced channel attention within the PSE encoder to effectively integrate local and global contextual information, thus improving feature learning
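The abstract does not spell out how LCA's channel attention is built, so the following is only a minimal sketch of a generic squeeze-and-excitation style channel attention, not the paper's LCA module. It illustrates the general idea of letting a global, per-channel summary reweight local feature values. The function name `channel_attention`, the weight matrices `w1` and `w2`, and the toy inputs are all assumptions made for illustration.

```python
import math


def channel_attention(features, w1, w2):
    """Generic squeeze-and-excitation style channel attention (illustrative only).

    features: list of C channels, each a flat list of floats.
    w1: R x C bottleneck weights; w2: C x R expansion weights (both hypothetical).
    Returns the reweighted features and the per-channel gates.
    """
    # Squeeze: one global descriptor per channel (mean over all positions).
    squeezed = [sum(ch) / len(ch) for ch in features]
    # Excite: bottleneck with ReLU, then expand back through a sigmoid gate.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: every local value is multiplied by its channel's global gate.
    scaled = [[g * x for x in ch] for g, ch in zip(gates, features)]
    return scaled, gates


# Toy input: two channels with two positions each (made-up values).
features = [[1.0, 3.0], [0.0, 0.0]]
w1 = [[1.0, 0.0]]      # bottleneck size R = 1
w2 = [[1.0], [-1.0]]   # expand back to C = 2 channels
scaled, gates = channel_attention(features, w1, w2)
```

In this toy setup the first channel receives the larger gate because the hypothetical weights favor it; in a trained network the weights would be learned jointly with the rest of the encoder.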
Related papers
- Personalized Speech Enhancement Without A Separate Speaker Embedding Model (2024)
- A Lightweight Dual-stage Framework For Personalized Speech Enhancement Based On DeepFilterNet2 (2024)
- Real-time Joint Personalized Speech Enhancement And Acoustic Echo Cancellation (2022)
- Personalized PercepNet: Real-time, Low-complexity Target Voice Separation And Enhancement (2021)
- The Potential Of Neural Speech Synthesis-based Data Augmentation For Personalized Speech Enhancement (2022)
- Dynamic Acoustic Compensation And Adaptive Focal Training For Personalized Speech Enhancement (2022)
- Magnitude-phase Dual-path Speech Enhancement Network Based On Self-supervised Embedding And Perceptual Contrast Stretch Boosting (2025)
- Parallel Gated Neural Network With Attention Mechanism For Speech Enhancement (2022)