Improved Conformer-based End-to-end Speech Recognition Using Neural Architecture Search
2021 · Yukun Liu, Ta Li, Pengyuan Zhang, et al.
Abstract
Recently, neural architecture search (NAS) has been successfully used in image classification, natural language processing, and automatic speech recognition (ASR) tasks to find state-of-the-art (SOTA) architectures that outperform human-designed ones. NAS can derive a SOTA, data-specific architecture over validation data from a pre-defined search space with a search algorithm. Inspired by the success of NAS in ASR tasks, we propose a NAS-based ASR framework consisting of one search space and one differentiable search algorithm, Differentiable Architecture Search (DARTS). Our search space follows the convolution-augmented transformer (Conformer) backbone, which is a more expressive ASR architecture than those used in existing NAS-based ASR frameworks. To improve the performance of our method, a regulation method called Dynamic Search Schedule (DSS) is employed. On the widely used Mandarin benchmark AISHELL-1, our best-searched architecture outperforms the baseline Conformer model
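As a rough illustration of the differentiable search idea the abstract refers to, the sketch below shows the continuous relaxation generally associated with DARTS: each candidate operation gets a learnable architecture weight, the layer output is the softmax-weighted sum of all candidates, and the final architecture keeps the highest-weighted one. The candidate operations, the names `CANDIDATE_OPS`, `mixed_op`, and `derive_op`, and the plain-list feature vectors are all assumptions made for illustration; they are not the paper's Conformer search space, and the Dynamic Search Schedule is not modeled here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate operations on a feature vector (illustrative only;
# a real search space would hold convolution, attention, or feed-forward modules).
CANDIDATE_OPS = {
    "identity": lambda x: list(x),
    "double": lambda x: [2.0 * v for v in x],
    "zero": lambda x: [0.0 for _ in x],
}


def mixed_op(x, alphas):
    """Continuous relaxation: softmax-weighted sum of every candidate's output."""
    weights = softmax(alphas)
    out = [0.0] * len(x)
    for w, op in zip(weights, CANDIDATE_OPS.values()):
        y = op(x)
        out = [o + w * v for o, v in zip(out, y)]
    return out


def derive_op(alphas):
    """Discretization after search: keep the candidate with the largest weight."""
    names = list(CANDIDATE_OPS)
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return names[best]
```

During search, the architecture weights passed as `alphas` would be updated by gradient descent on validation data alongside the ordinary model weights; here they are fixed numbers so the mixing and selection steps can be inspected in isolation.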
Related papers
- Efficient Neural Architecture Search For End-to-end Speech Recognition Via Straight-through Gradients (2020)
- DARTS-ASR: Differentiable Architecture Search For Multilingual Speech Recognition And Adaptation (2020)
- Enhancing Speech Emotion Recognition Through Differentiable Architecture Search (2023)
- EffectiveASR: A Single-step Non-autoregressive Mandarin Speech Recognition Architecture With High Accuracy And Inference Speed (2024)
- Towards A Unified Conformer Structure: From ASR To ASV Task (2022)
- Latency-controlled Neural Architecture Search For Streaming Speech Recognition (2021)
- EfficientTDNN: Efficient Architecture Search For Speaker Recognition (2021)
- Conformer-based Hybrid ASR System For Switchboard Dataset (2021)