The IBM Speaker Recognition System: Recent Advances And Error Analysis
2016 · Seyed Omid Sadjadi, Jason Pelecanos, Sriram Ganapathy
Abstract
We present the recent advances along with an error analysis of the IBM speaker recognition system for conversational speech. Some of the key advancements that contribute to our system include: a nearest-neighbor discriminant analysis (NDA) approach (as opposed to LDA) for intersession variability compensation in the i-vector space, the application of speaker and channel-adapted features derived from an automatic speech recognition (ASR) system for speaker recognition, and the use of a DNN acoustic model with a very large number of output units (~10k senones) to compute the frame-level soft alignments required in the i-vector estimation process. We evaluate these techniques on the NIST 2010 SRE extended core conditions (C1-C9), as well as the 10sec-10sec condition. To our knowledge, results achieved by our system represent the best performances published to date on these conditions. For example, on the extended tel-tel condition (C5) the system achieves an EER of 0.59%. To garner furthe
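The abstract reports performance as an equal error rate (EER), the operating point at which the false-accept rate on non-target trials equals the false-reject rate on target trials. As a rough illustration of that metric only, the sketch below sweeps a decision threshold over a handful of made-up scores and returns the point where the two error rates are closest. The function name `equal_error_rate` and the example scores are assumptions for illustration; this is not the scoring tool used in the NIST evaluation or by the authors, and it does no interpolation between thresholds.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by a simple threshold sweep.

    A trial is accepted when its score is >= the threshold.
    Returns (eer, threshold) at the threshold where the false-accept
    and false-reject rates are closest; the EER is their average there.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every distinct observed score, in ascending order.
    thresholds = sorted(set(target_scores) | set(nontarget_scores))

    best = None  # (gap, eer, threshold)
    for th in thresholds:
        # False rejects: target trials scored below the threshold.
        frr = sum(s < th for s in target_scores) / len(target_scores)
        # False accepts: non-target trials scored at or above the threshold.
        far = sum(s >= th for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, th)

    return best[1], best[2]


# Hypothetical scores, chosen only to exercise the function.
targets = [2.0, 1.5, 0.3, 1.1]
nontargets = [-1.0, 0.5, -0.2, 0.1]
eer, threshold = equal_error_rate(targets, nontargets)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With real trial lists, which are far larger than this toy example, an interpolated or convex-hull-based estimate is commonly preferred over a raw sweep.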
Related papers
- UTD-CRSS Systems For 2016 NIST Speaker Recognition Evaluation (2016)
- The Intelligent Voice 2016 Speaker Recognition System (2016)
- LIA System Description For NIST SRE 2016 (2016)
- DNN Based Speaker Recognition On Short Utterances (2016)
- HLT-NUS Submission For NIST 2019 Multimedia Speaker Recognition Evaluation (2020)
- STC Speaker Recognition Systems For The Voices From A Distance Challenge (2019)
- Integration Of Speech Separation, Diarization, And Recognition For Multi-speaker Meetings: System Description, Comparison, And Analysis (2020)
- Length- And Noise-aware Training Techniques For Short-utterance Speaker Recognition (2020)