cluster #2
50 papers in this cluster (ordered by heat_score)
Papers
- End-to-end Neural Speaker Diarization With Permutation-free Objectives (2019). Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, et al. Heat score: 21.98
- Multi-speaker DOA Estimation Using Deep Convolutional Networks Trained With Noise Signals (2018). Soumitro Chakrabarty, Emanuël A. P. Habets. Heat score: 18.46
- ASVspoof 2021: Towards Spoofed And Deepfake Speech Detection In The Wild (2022). Xuechen Liu, Xin Wang, Md Sahidullah, et al. Heat score: 17.95
- Speaker Diarization With LSTM (2017). Quan Wang, Carlton Downey, Li Wan, et al. Heat score: 17.48
- Automatic Speaker Verification Spoofing And Deepfake Detection Using wav2vec 2.0 And Data Augmentation (2022). Hemlata Tak, Massimiliano Todisco, Xin Wang, et al. Heat score: 17.35
- One-class Learning Towards Synthetic Voice Spoofing Detection (2020). You Zhang, Fei Jiang, Zhiyao Duan. Heat score: 17.31
- Advances In Integration Of End-to-end Neural And Clustering-based Diarization For Real Conversational Speech (2021). Keisuke Kinoshita, Marc Delcroix, Naohiro Tawara. Heat score: 16.48
- Adversarial Attacks Against Automatic Speech Recognition Systems Via Psychoacoustic Hiding (2018). Lea Schönherr, Katharina Kohls, Steffen Zeiler, et al. Heat score: 16.45
- Replay And Synthetic Speech Detection With Res2Net Architecture (2020). Xu Li, Na Li, Chao Weng, et al. Heat score: 16.32
- Who Is Real Bob? Adversarial Attacks On Speaker Recognition Systems (2019). Guangke Chen, Sen Chen, Lingling Fan, et al. Heat score: 16.28
- Target-speaker Voice Activity Detection: A Novel Approach For Multi-speaker Diarization In A Dinner Party Scenario (2020). Ivan Medennikov, Maxim Korenevsky, Tatiana Prisyach, et al. Heat score: 16.19
- Fully Supervised Speaker Diarization (2018). Aonan Zhang, Quan Wang, Zhenyao Zhu, et al. Heat score: 15.80
- Targeted Adversarial Examples For Black Box Audio Systems (2018). Rohan Taori, Amog Kamsetty, Brenton Chu, et al. Heat score: 15.75
- ASSERT: Anti-spoofing With Squeeze-excitation And Residual Networks (2019). Cheng-I Lai, Nanxin Chen, Jesús Villalba, et al. Heat score: 15.40
- TristouNet: Triplet Loss For Speaker Turn Embedding (2016). Hervé Bredin. Heat score: 14.80
- Auto-tuning Spectral Clustering For Speaker Diarization Using Normalized Maximum Eigengap (2020). Tae Jin Park, Kyu J. Han, Manoj Kumar, et al. Heat score: 14.58
- Audio-visual Speaker Diarization Based On Spatiotemporal Bayesian Fusion (2016). Israel D. Gebru, Silèye Ba, Xiaofei Li, et al. Heat score: 14.51
- Deep Learning Based Multi-source Localization With Source Splitting And Its Effectiveness In Multi-talker Speech Recognition (2021). Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, et al. Heat score: 14.23
- Universal Adversarial Perturbations For Speech Recognition Systems (2019). Paarth Neekhara, Shehzeen Hussain, Prakhar Pandey, et al. Heat score: 14.11
- The Vicomtech Audio Deepfake Detection System Based On Wav2vec2 For The 2022 ADD Challenge (2022). Juan M. Martín-Doñas, Aitor Álvarez. Heat score: 14.06
- The PartialSpoof Database And Countermeasures For The Detection Of Short Fake Speech Segments Embedded In An Utterance (2022). Lin Zhang, Xin Wang, Erica Cooper, et al. Heat score: 14.06
- Light Convolutional Neural Network With Feature Genuinization For Detection Of Synthetic Speech Attacks (2020). Zhenzong Wu, Rohan Kumar Das, Jichen Yang, et al. Heat score: 13.97
- MIMO-SPEECH: End-to-end Multi-channel Multi-speaker Speech Recognition (2019). Xuankai Chang, Wangyou Zhang, Yanmin Qian, et al. Heat score: 13.93
- LSTM Based Similarity Measurement With Spectral Clustering For Speaker Diarization (2019). Qingjian Lin, Ruiqing Yin, Ming Li, et al. Heat score: 13.79
- Integrating End-to-end Neural And Clustering-based Diarization: Getting The Best Of Both Worlds (2020). Keisuke Kinoshita, Marc Delcroix, Naohiro Tawara. Heat score: 13.74
- Adversarial Attacks On GMM I-vector Based Speaker Verification Systems (2019). Xu Li, Jinghua Zhong, Xixin Wu, et al. Heat score: 13.65
- Voice Activity Detection: Merging Source And Filter-based Information (2019). Thomas Drugman, Yannis Stylianou, Yusuke Kida, et al. Heat score: 13.50
- Adversarial Attack And Defense Strategies For Deep Speaker Recognition Systems (2020). Arindam Jati, Chin-Cheng Hsu, Monisankha Pal, et al. Heat score: 13.39
- MLAAD: The Multi-language Audio Anti-spoofing Dataset (2024). Nicolas M. Müller, Piotr Kawa, Wei Herng Choong, et al. Heat score: 13.34
- Overlap-aware Diarization: Resegmentation Using Neural End-to-end Overlapped Speech Detection (2019). Latané Bullock, Hervé Bredin, Leibny Paola Garcia-Perera. Heat score: 13.17
- Personal VAD: Speaker-conditioned Voice Activity Detection (2019). Shaojin Ding, Quan Wang, Shuo-Yiin Chang, et al. Heat score: 13.05
- Encoder-decoder Based Attractors For End-to-end Neural Diarization (2021). Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, et al. Heat score: 13.05
- End-to-end Integration Of Speech Recognition, Speech Enhancement, And Self-supervised Learning Representation (2022). Xuankai Chang, Takashi Maekaku, Yuya Fujita, et al. Heat score: 12.54
- UR Channel-robust Synthetic Speech Detection System For ASVspoof 2021 (2021). Xinhui Chen, You Zhang, Ge Zhu, et al. Heat score: 12.54
- Joint Speaker Counting, Speech Recognition, And Speaker Identification For Overlapped Speech Of Any Number Of Speakers (2020). Naoyuki Kanda, Yashesh Gaur, Xiaofei Wang, et al. Heat score: 12.54
- Turn-to-diarize: Online Speaker Diarization Constrained By Transformer Transducer Speaker Turn Detection (2021). Wei Xia, Han Lu, Quan Wang, et al. Heat score: 12.40
- Universal Adversarial Perturbations Generative Network For Speaker Recognition (2020). Jiguo Li, Xinfeng Zhang, Chuanmin Jia, et al. Heat score: 12.33
- Inaudible Adversarial Perturbations For Targeted Attack In Speaker Recognition (2020). Qing Wang, Pengcheng Guo, Lei Xie. Heat score: 12.33
- The DKU Replay Detection System For The ASVspoof 2019 Challenge: On Data Augmentation, Feature Representation, Classification, And Fusion (2019). Weicheng Cai, Haiwei Wu, Danwei Cai, et al. Heat score: 12.25
- The CHiME-7 DASR Challenge: Distant Meeting Transcription With Multiple Devices In Diverse Scenarios (2023). Samuele Cornell, Matthew Wiesner, Shinji Watanabe, et al. Heat score: 12.25
- A Purely End-to-end System For Multi-speaker Speech Recognition (2018). Hiroshi Seki, Takaaki Hori, Shinji Watanabe, et al. Heat score: 12.25
- Introduction To Voice Presentation Attack Detection And Recent Advances (2019). Md Sahidullah, Hector Delgado, Massimiliano Todisco, et al. Heat score: 12.17
- A Survey On Speech Deepfake Detection (2024). Menglu Li, Yasaman Ahmadiadli, Xiao-Ping Zhang. Heat score: 12.10
- Can We Steal Your Vocal Identity From The Internet?: Initial Investigation Of Cloning Obama's Voice Using GAN, WaveNet And Low-quality Found Data (2018). Jaime Lorenzo-Trueba, Fuming Fang, Xin Wang, et al. Heat score: 12.02
- End-to-end Monaural Multi-speaker ASR System Without Pretraining (2018). Xuankai Chang, Yanmin Qian, Kai Yu, et al. Heat score: 11.93
- Spoofed Training Data For Speech Spoofing Countermeasure Can Be Efficiently Created Using Neural Vocoders (2022). Xin Wang, Junichi Yamagishi. Heat score: 11.93
- SingFake: Singing Voice Deepfake Detection (2023). Yongyi Zang, You Zhang, Mojtaba Heydari, et al. Heat score: 11.93
- USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction (2024). Bang Zeng, Ming Li. Heat score: 11.88
- Multitask Detection Of Speaker Changes, Overlapping Speech And Voice Activity Using wav2vec 2.0 (2022). Marie Kunešová, Zbyněk Zajíc. Heat score: 11.86
- Enhancing Partially Spoofed Audio Localization With Boundary-aware Attention Mechanism (2024). Jiafeng Zhong, Bin Li, Jiangyan Yi. Heat score: 11.86