cluster #0
50 papers in this cluster (ordered by heat_score)
Papers
- HuBERT: Self-supervised Speech Representation Learning By Masked Prediction Of Hidden Units (2021) Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, et al. (heat_score 25.30)
- Robust wav2vec 2.0: Analyzing Domain Shift In Self-supervised Pre-training (2021) Wei-Ning Hsu, Anuroop Sriram, Alexei Baevski, et al. (heat_score 25.07)
- WavLM: Large-scale Self-supervised Pre-training For Full Stack Speech Processing (2021) Sanyuan Chen, Chengyi Wang, Zhengyang Chen, et al. (heat_score 24.00)
- Unsupervised Cross-lingual Representation Learning For Speech Recognition (2020) Alexis Conneau, Alexei Baevski, Ronan Collobert, et al. (heat_score 18.91)
- w2v-BERT: Combining Contrastive Learning And Masked Language Modeling For Self-supervised Speech Pre-training (2021) Yu-An Chung, Yu Zhang, Wei Han, et al. (heat_score 17.78)
- TERA: Self-supervised Learning Of Transformer Encoder Representation For Speech (2020) Andy T. Liu, Shang-Wen Li, Hung-Yi Lee (heat_score 17.61)
- An Unsupervised Autoregressive Model For Speech Representation Learning (2019) Yu-An Chung, Wei-Ning Hsu, Hao Tang, et al. (heat_score 17.26)
- Mockingjay: Unsupervised Speech Representation Learning With Deep Bidirectional Transformer Encoders (2019) Andy T. Liu, Shu-Wen Yang, Po-Han Chi, et al. (heat_score 17.26)
- Layer-wise Analysis Of A Self-supervised Speech Representation Model (2021) Ankita Pasad, Ju-Chieh Chou, Karen Livescu (heat_score 17.07)
- WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus For Speech Recognition (2021) Binbin Zhang, Hang Lv, Pengcheng Guo, et al. (heat_score 16.12)
- Multilingual Speech Recognition With A Single End-to-end Model (2017) Shubham Toshniwal, Tara N. Sainath, Ron J. Weiss, et al. (heat_score 16.05)
- BigSSL: Exploring The Frontier Of Large-scale Semi-supervised Learning For Automatic Speech Recognition (2021) Yu Zhang, Daniel S. Park, Wei Han, et al. (heat_score 15.73)
- Audio ALBERT: A Lite BERT For Self-supervised Learning Of Audio Representation (2020) Po-Han Chi, Pei-Hung Chung, Tsung-Han Wu, et al. (heat_score 15.54)
- Learning Problem-agnostic Speech Representations From Multiple Self-supervised Tasks (2019) Santiago Pascual, Mirco Ravanelli, Joan Serrà, et al. (heat_score 15.54)
- Self-training For End-to-end Speech Recognition (2019) Jacob Kahn, Ann Lee, Awni Hannun (heat_score 15.48)
- DistilHuBERT: Speech Representation Learning By Layer-wise Distillation Of Hidden-unit BERT (2021) Heng-Jui Chang, Shu-Wen Yang, Hung-Yi Lee (heat_score 15.06)
- Generative Pre-training For Speech With Autoregressive Predictive Coding (2019) Yu-An Chung, James Glass (heat_score 14.73)
- Deep Contextualized Acoustic Representations For Semi-supervised Speech Recognition (2019) Shaoshi Ling, Yuzong Liu, Julian Salazar, et al. (heat_score 14.62)
- Towards Learning A Universal Non-semantic Representation Of Speech (2020) Joel Shor, Aren Jansen, Ronnie Maor, et al. (heat_score 14.43)
- Transfer Learning For Speech Recognition On A Budget (2017) Julius Kunze, Louis Kirsch, Ilia Kurenkov, et al. (heat_score 14.27)
- Speech2Vec: A Sequence-to-sequence Framework For Learning Word Embeddings From Speech (2018) Yu-An Chung, James Glass (heat_score 14.15)
- VoxLingua107: A Dataset For Spoken Language Recognition (2020) Jörgen Valk, Tanel Alumäe (heat_score 14.15)
- Self-training And Pre-training Are Complementary For Speech Recognition (2020) Qiantong Xu, Alexei Baevski, Tatiana Likhomanenko, et al. (heat_score 14.15)
- Speech Slytherin: Examining The Performance And Efficiency Of Mamba For Speech Separation, Recognition, And Synthesis (2024) Xilin Jiang, Yinghao Aaron Li, Adrian Nicolas Florea, et al. (heat_score 13.88)
- Multilingual Sequence-to-sequence Speech Recognition: Architecture, Transfer Learning, And Language Modeling (2018) Jaejin Cho, Murali Karthick Baskar, Ruizhi Li, et al. (heat_score 13.84)
- Towards Robust Voice Pathology Detection (2019) Pavol Harar, Zoltan Galaz, Jesus B. Alonso-Hernandez, et al. (heat_score 13.74)
- MixSpeech: Data Augmentation For Low-resource Automatic Speech Recognition (2021) Linghui Meng, Jin Xu, Xu Tan, et al. (heat_score 13.60)
- Unsupervised Automatic Speech Recognition: A Review (2021) Hanan Aldarmaki, Asad Ullah, Nazar Zaki (heat_score 13.50)
- Pre-training Transformer Decoder For End-to-end ASR Model With Unpaired Speech Data (2022) Junyi Ao, Ziqiang Zhang, Long Zhou, et al. (heat_score 13.47)
- SUPERB-SG: Enhanced Speech Processing Universal Performance Benchmark For Semantic And Generative Capabilities (2022) Hsiang-Sheng Tsai, Heng-Jui Chang, Wen-Chin Huang, et al. (heat_score 13.34)
- BERTphone: Phonetically-aware Encoder Representations For Utterance-level Speaker And Language Recognition (2019) Shaoshi Ling, Julian Salazar, Yuzong Liu, et al. (heat_score 13.27)
- The Zero Resource Speech Challenge 2019: TTS Without T (2019) Ewan Dunbar, Robin Algayres, Julien Karadayi, et al. (heat_score 13.17)
- Wav2vec-Switch: Contrastive Learning From Original-noisy Speech Pairs For Robust Speech Recognition (2021) Yiming Wang, Jinyu Li, Heming Wang, et al. (heat_score 12.93)
- Analysing Discrete Self Supervised Speech Representation For Spoken Language Modeling (2023) Amitay Sicherman, Yossi Adi (heat_score 12.86)
- Data Augmenting Contrastive Learning Of Speech Representations In The Time Domain (2020) Eugene Kharitonov, Morgane Rivière, Gabriel Synnaeve, et al. (heat_score 12.81)
- Self-supervised Contrastive Learning For Unsupervised Phoneme Segmentation (2020) Felix Kreuk, Joseph Keshet, Yossi Adi (heat_score 12.68)
- Efficient Adapter Transfer Of Self-supervised Speech Models For Automatic Speech Recognition (2022) Bethan Thomas, Samuel Kessler, Salah Karout (heat_score 12.68)
- Non-autoregressive Predictive Coding For Learning Speech Representations From Local Dependencies (2020) Alexander H. Liu, Yu-An Chung, James Glass (heat_score 12.47)
- ML-SUPERB: Multilingual Speech Universal Performance Benchmark (2023) Jiatong Shi, Dan Berrebbi, William Chen, et al. (heat_score 12.47)
- Query-by-example Search With Discriminative Neural Acoustic Word Embeddings (2017) Shane Settle, Keith Levin, Herman Kamper, et al. (heat_score 12.40)
- Convolutional Neural Networks And Language Embeddings For End-to-end Dialect Recognition (2018) Suwon Shon, Ahmed Ali, James Glass (heat_score 12.40)
- Unsupervised Pre-training Of Bidirectional Speech Encoders Via Masked Reconstruction (2020) Weiran Wang, Qingming Tang, Karen Livescu (heat_score 12.33)
- An Exploration Of Self-supervised Pretrained Representations For End-to-end Speech Recognition (2021) Xuankai Chang, Takashi Maekaku, Pengcheng Guo, et al. (heat_score 12.25)
- Knowledge Distillation For Small-footprint Highway Networks (2016) Liang Lu, Michelle Guo, Steve Renals (heat_score 12.25)
- Unsupervised Word Segmentation And Lexicon Discovery Using Acoustic Word Embeddings (2016) Herman Kamper, Aren Jansen, Sharon Goldwater (heat_score 12.10)
- Exploration Of Efficient End-to-end ASR Using Discretized Input From Self-supervised Learning (2023) Xuankai Chang, Brian Yan, Yuya Fujita, et al. (heat_score 12.02)
- Massively Multilingual Adversarial Speech Recognition (2019) Oliver Adams, Matthew Wiesner, Shinji Watanabe, et al. (heat_score 11.93)
- Neural Predictive Coding Using Convolutional Neural Networks Towards Unsupervised Learning Of Speaker Characteristics (2018) Arindam Jati, Panayiotis Georgiou (heat_score 11.85)
- The MGB-2 Challenge: Arabic Multi-dialect Broadcast Media Recognition (2016) Ahmed Ali, Peter Bell, James Glass, et al. (heat_score 11.76)
- Learning Efficient Representations For Keyword Spotting With Triplet Loss (2021) Roman Vygon, Nikolay Mikhaylovskiy (heat_score 11.76)