cluster #5
50 papers in this cluster (ordered by heat_score)
Papers
- State-of-the-art Speech Recognition With Sequence-to-sequence Models (2017) | Chung-Cheng Chiu, Tara N. Sainath, Yonghui Wu, et al. | 21.01
- Joint CTC-attention Based End-to-end Speech Recognition Using Multi-task Learning (2016) | Suyoun Kim, Takaaki Hori, Shinji Watanabe | 20.43
- A Comparative Study On Transformer Vs RNN In Speech Applications (2019) | Shigeki Karita, Nanxin Chen, Tomoki Hayashi, et al. | 20.07
- Light Gated Recurrent Units For Speech Recognition (2018) | Mirco Ravanelli, Philemon Brakel, Maurizio Omologo, et al. | 18.90
- Streaming End-to-end Speech Recognition For Mobile Devices (2018) | Yanzhang He, Tara N. Sainath, Rohit Prabhavalkar, et al. | 18.87
- Recent Advances In End-to-end Automatic Speech Recognition (2021) | Jinyu Li | 18.62
- Transformer Transducer: A Streamable Speech Recognition Model With Transformer Encoders And RNN-T Loss (2020) | Qian Zhang, Han Lu, Hasim Sak, et al. | 18.58
- Wenet: Production Oriented Streaming And Non-streaming End-to-end Speech Recognition Toolkit (2021) | Zhuoyuan Yao, Di Wu, Xiong Wang, et al. | 17.27
- Contextnet: Improving Convolutional Neural Networks For Automatic Speech Recognition With Global Context (2020) | Wei Han, Zhengdong Zhang, Yu Zhang, et al. | 17.24
- Phone-to-audio Alignment Without Text: A Semi-supervised Approach (2021) | Jian Zhu, Cong Zhang, David Jurgens | 16.74
- Advances In Joint CTC-attention Based End-to-end Speech Recognition With A Deep CNN Encoder And RNN-LM (2017) | Takaaki Hori, Shinji Watanabe, Yu Zhang, et al. | 16.49
- Transformer-based Acoustic Modeling For Hybrid Speech Recognition (2019) | Yongqiang Wang, Abdelrahman Mohamed, Duc Le, et al. | 16.30
- An Analysis Of Incorporating An External Language Model Into A Sequence-to-sequence Model (2017) | Anjuli Kannan, Yonghui Wu, Patrick Nguyen, et al. | 16.25
- Exploring Architectures, Data And Units For Streaming End-to-end Speech Recognition With RNN-transducer (2018) | Kanishka Rao, Haşim Sak, Rohit Prabhavalkar | 16.21
- Whispering Llama: A Cross-modal Generative Error Correction Framework For Speech Recognition (2023) | Srijith Radhakrishnan, Chao-Han Huck Yang, Sumeer Ahmad Khan, et al. | 16.15
- Towards Better Decoding And Language Model Integration In Sequence To Sequence Models (2016) | Jan Chorowski, Navdeep Jaitly | 15.67
- Deep Context: End-to-end Contextual Speech Recognition (2018) | Golan Pundak, Tara N. Sainath, Rohit Prabhavalkar, et al. | 15.57
- Lighthubert: Lightweight And Configurable Speech Representation Learning With Once-for-all Hidden-unit BERT (2022) | Rui Wang, Qibing Bai, Junyi Ao, et al. | 15.51
- Personalized Speech Recognition On Mobile Devices (2016) | Ian McGraw, Rohit Prabhavalkar, Raziel Alvarez, et al. | 15.37
- RWTH ASR Systems For Librispeech: Hybrid Vs Attention -- W/o Data Augmentation (2019) | Christoph Lüscher, Eugen Beck, Kazuki Irie, et al. | 15.34
- Neural Speech Recognizer: Acoustic-to-word LSTM Model For Large Vocabulary Speech Recognition (2016) | Hagen Soltau, Hank Liao, Hasim Sak | 15.16
- Paraformer: Fast And Accurate Parallel Transformer For Non-autoregressive End-to-end Speech Recognition (2022) | Zhifu Gao, Shiliang Zhang, Ian McLoughlin, et al. | 15.10
- A Streaming On-device End-to-end Model Surpassing Server-side Conventional Model Quality And Latency (2020) | Tara N. Sainath, Yanzhang He, Bo Li, et al. | 15.00
- Exploring Neural Transducers For End-to-end Speech Recognition (2017) | Eric Battenberg, Jitong Chen, Rewon Child, et al. | 14.90
- Mamba-360: Survey Of State Space Models As Transformer Alternative For Long Sequence Modelling: Methods, Applications, And Challenges (2024) | Badri Narayana Patro, Vijay Srinivas Agneeswaran | 14.90
- E-branchformer: Branchformer With Enhanced Merging For Speech Recognition (2022) | Kwangyoun Kim, Felix Wu, Yifan Peng, et al. | 14.66
- A Spelling Correction Model For End-to-end Speech Recognition (2019) | Jinxi Guo, Tara N. Sainath, Ron J. Weiss | 14.62
- Deep LSTM For Large Vocabulary Continuous Speech Recognition (2017) | Xu Tian, Jun Zhang, Zejun Ma, et al. | 14.58
- Self-attention Networks For Connectionist Temporal Classification In Speech Recognition (2019) | Julian Salazar, Katrin Kirchhoff, Zhiheng Huang | 14.55
- Fast Conformer With Linearly Scalable Attention For Efficient Speech Recognition (2023) | Dima Rekesh, Nithin Rao Koluguri, Samuel Kriman, et al. | 14.47
- A Comparison Of Techniques For Language Model Integration In Encoder-decoder Speech Recognition (2018) | Shubham Toshniwal, Anjuli Kannan, Chung-Cheng Chiu, et al. | 14.39
- Minimum Word Error Rate Training For Attention-based Sequence-to-sequence Models (2017) | Rohit Prabhavalkar, Tara N. Sainath, Yonghui Wu, et al. | 14.35
- Unsupervised Domain Adaptation For Robust Speech Recognition Via Variational Autoencoder-based Data Augmentation (2017) | Wei-Ning Hsu, Yu Zhang, James Glass | 14.23
- Transformer-based Online CTC/attention End-to-end Speech Recognition Architecture (2020) | Haoran Miao, Gaofeng Cheng, Changfeng Gao, et al. | 14.06
- Two-pass End-to-end Speech Recognition (2019) | Tara N. Sainath, Ruoming Pang, David Rybach, et al. | 13.97
- Large-scale Domain Adaptation Via Teacher-student Learning (2017) | Jinyu Li, Michael L. Seltzer, Xi Wang, et al. | 13.93
- Efficient Conformer: Progressive Downsampling And Grouped Attention For Automatic Speech Recognition (2021) | Maxime Burchi, Valentin Vielzeuf | 13.79
- Multi-dialect Speech Recognition With A Single Sequence-to-sequence Model (2017) | Bo Li, Tara N. Sainath, Khe Chai Sim, et al. | 13.79
- Recognizing Long-form Speech Using Streaming End-to-end Models (2019) | Arun Narayanan, Rohit Prabhavalkar, Chung-Cheng Chiu, et al. | 13.74
- FPGA-based Low-power Speech Recognition With Recurrent Neural Networks (2016) | Minjae Lee, Kyuyeon Hwang, Jinhwan Park, et al. | 13.50
- E-RNN: Design Optimization For Efficient Recurrent Neural Networks In FPGAs (2018) | Zhe Li, Caiwen Ding, Siyue Wang, et al. | 13.50
- Learning From Flawed Data: Weakly Supervised Automatic Speech Recognition (2023) | Dongji Gao, Hainan Xu, Desh Raj, et al. | 13.45
- Contextualized Streaming End-to-end Speech Recognition With Trie-based Deep Biasing And Shallow Fusion (2021) | Duc Le, Mahaveer Jain, Gil Keren, et al. | 13.44
- Internal Language Model Estimation For Domain-adaptive End-to-end Speech Recognition (2020) | Zhong Meng, Sarangarajan Parthasarathy, Eric Sun, et al. | 13.44
- End-to-end ASR-free Keyword Search From Speech (2017) | Kartik Audhkhasi, Andrew Rosenberg, Abhinav Sethy, et al. | 13.39
- Toward Domain-invariant Speech Recognition Via Large Scale Training (2018) | Arun Narayanan, Ananya Misra, Khe Chai Sim, et al. | 13.39
- Relaxing The Conditional Independence Assumption Of CTC-based ASR By Conditioning On Intermediate Predictions (2021) | Jumon Nozaki, Tatsuya Komatsu | 13.34
- Developing RNN-T Models Surpassing High-performance Hybrid Models With Customization Capability (2020) | Jinyu Li, Rui Zhao, Zhong Meng, et al. | 13.28
- Back-translation-style Data Augmentation For End-to-end ASR (2018) | Tomoki Hayashi, Shinji Watanabe, Yu Zhang, et al. | 13.11
- Towards A Unified Conformer Structure: From ASR To ASV Task (2022) | Dexin Liao, Tao Jiang, Feng Wang, et al. | 13.11