
Aishell-1

Emerging

76 papers using it

2021 first seen

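AISHELL-1 is an open-source Mandarin Chinese speech corpus, with roughly 178 hours of read speech from 400 speakers, that is widely used as a benchmark for evaluating automatic speech recognition (ASR) systems. Most of the papers listed here report general Mandarin recognition accuracy on it; a smaller group uses it to evaluate contextual biasing and the recognition and correction of named entities.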


Papers using Aishell-1 (76)

PAC: Pronunciation-Aware Contextualized Large Language Model-based Automatic Speech Recognition · 2025 · 1 cite

Breaking Through the Spike: Spike Window Decoding for Accelerated and Precise Automatic Speech Recognition · 2025 · 1 cite

JSPG: Dynamic Dictionary Filtering via Joint Semantic-Pinyin-Glyph Retrieval for Chinese Contextual ASR · 2026

Retrieval-Augmented Self-Taught Reasoning Model with Adaptive Chain-of-Thought for ASR Named Entity Correction · 2026

IKFST: IOO and KOO Algorithms for Accelerated and Precise WFST-based End-to-End Automatic Speech Recognition · 2026

Streaming Speech Recognition with Decoder-Only Large Language Models and Latency Optimization · 2026

End-to-end Speech Recognition with similar length speech and text · 2025

A Bottom-up Framework with Language-universal Speech Attribute Modeling for Syllable-based ASR · 2025

Objective Soups: Multilingual Multi-Task Modeling for Speech Processing · 2025

IML-Spikeformer: Input-aware Multi-Level Spiking Transformer for Speech Processing · 2025

CR-CTC: Consistency regularization on CTC for improved speech recognition · 2024

M2R-Whisper: Multi-stage and Multi-scale Retrieval Augmentation for Enhancing Whisper · 2024

EffectiveASR: A Single-Step Non-Autoregressive Mandarin Speech Recognition Architecture with High Accuracy and Inference Speed · 2024

Zipformer: A faster and better encoder for automatic speech recognition · 2023 · 28 cites

Improving non-autoregressive end-to-end speech recognition with pre-trained acoustic and language models · 2022 · 24 cites

Exploring the Integration of Large Language Models into Automatic Speech Recognition Systems: An Empirical Study · 2023 · 15 cites

Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets · 2024 · 12 cites

Improving Hybrid CTC/Attention End-to-end Speech Recognition with Pretrained Acoustic and Language Model · 2021 · 6 cites

Multi-Level Modeling Units for End-to-End Mandarin Speech Recognition · 2022 · 4 cites

Speaker-Aware Mixture of Mixtures Training for Weakly Supervised Speaker Extraction · 2022 · 3 cites

Nextformer: A ConvNeXt Augmented Conformer For End-To-End Speech Recognition · 2022 · 3 cites

Improving Mandarin Speech Recogntion with Block-augmented Transformer · 2022 · 3 cites

Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI · 2021 · 2 cites

Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition · 2022 · 2 cites

Towards Unified All-Neural Beamforming for Time and Frequency Domain Speech Separation · 2022 · 2 cites

UniEnc-CASSNAT: An Encoder-only Non-autoregressive ASR for Speech SSL Models · 2024 · 2 cites

EfficientASR: Speech Recognition Network Compression via Attention Redundancy and Chunk-Level FFN Optimization · 2024 · 2 cites

Decoupling recognition and transcription in Mandarin ASR · 2021 · 1 cite

Non-autoregressive Transformer with Unified Bidirectional Decoder for Automatic Speech Recognition · 2021 · 1 cite

On the Effectiveness of Pinyin-Character Dual-Decoding for End-to-End Mandarin Chinese ASR · 2022 · 1 cite

Transformer-based Streaming ASR with Cumulative Attention · 2022 · 1 cite

Improving CTC-based ASR Models with Gated Interlayer Collaboration · 2022 · 1 cite

Linguistic-Enhanced Transformer with CTC Embedding for Speech Recognition · 2022 · 1 cite

Conformer-based End-to-end Speech Recognition With Rotary Position Embedding · 2021

Cross-domain Single-channel Speech Enhancement Model with Bi-projection Fusion Module for Noise-robust ASR · 2021

FastCorrect 2: Fast Error Correction on Multiple Candidates for Automatic Speech Recognition · 2021

Improving CTC-based speech recognition via knowledge transferring from pre-trained language models · 2022

Shifted Chunk Encoder for Transformer Based Streaming End-to-End ASR · 2022

CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR · 2022

An Empirical Study of Language Model Integration for Transducer based Speech Recognition · 2022

Memory-Efficient Training of RNN-Transducer with Sampled Softmax · 2022

A CTC Triggered Siamese Network with Spatial-Temporal Dropout for Speech Recognition · 2022

Knowledge Transfer and Distillation from Autoregressive to Non-Autoregressive Speech Recognition · 2022

PSVRF: Learning to restore Pitch-Shifted Voice without reference · 2022

A context-aware knowledge transferring strategy for CTC-based ASR · 2022

SAN: a robust end-to-end ASR model architecture · 2022

Fast-U2++: Fast and Accurate End-to-End Speech Recognition in Joint CTC/Attention Frames · 2022

Improving Noisy Student Training on Non-target Domain Data for Automatic Speech Recognition · 2022

SSCFormer: Push the Limit of Chunk-wise Conformer for Streaming ASR Using Sequentially Sampled Chunks and Chunked Causal Convolution · 2022

Knowledge Transfer from Pre-trained Language Models to Cif-based Speech Recognizers via Hierarchical Distillation · 2023

Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognition · 2023

Pyramid Multi-branch Fusion DCNN with Multi-Head Self-Attention for Mandarin Speech Recognition · 2023

Self-regularised Minimum Latency Training for Streaming Transformer-based Speech Recognition · 2023

A Lexical-aware Non-autoregressive Transformer-based ASR Model · 2023

GNCformer Enhanced Self-attention for Automatic Speech Recognition · 2023

Rethinking Speech Recognition with A Multimodal Perspective via Acoustic and Semantic Cooperative Decoding · 2023

Enhancing the Unified Streaming and Non-streaming Model with Contrastive Learning · 2023

Research on an improved Conformer end-to-end Speech Recognition Model with R-Drop Structure · 2023

TST: Time-Sparse Transducer for Automatic Speech Recognition · 2023

CIF-T: A Novel CIF-based Transducer Architecture for Automatic Speech Recognition · 2023

ApproBiVT: Lead ASR Models to Generalize Better Using Approximated Bias-Variance Tradeoff Guided Early Stopping and Checkpoint Averaging · 2023

HypR: A comprehensive study for ASR hypothesis revising with a reference corpus · 2023

Cross-modal Alignment with Optimal Transport for CTC-based ASR · 2023

Hierarchical Cross-Modality Knowledge Transfer with Sinkhorn Attention for CTC-based ASR · 2023

Skipformer: A Skip-and-Recover Strategy for Efficient Speech Recognition · 2024

Multi-Channel Multi-Speaker ASR Using Target Speaker's Solo Segment · 2024

Streaming Decoder-Only Automatic Speech Recognition with Discrete Speech Units: A Pilot Study · 2024

CUSIDE-T: Chunking, Simulating Future and Decoding for Transducer based Streaming ASR · 2024

HydraFormer: One Encoder For All Subsampling Rates · 2024

An Effective Context-Balanced Adaptation Approach for Long-Tailed Speech Recognition · 2024

Large Language Model Should Understand Pinyin for Chinese ASR Error Correction · 2024

Bridging Speech and Text: Enhancing ASR with Pinyin-to-Character Pre-training in LLMs · 2024

Deep CLAS: Deep Contextual Listen, Attend and Spell · 2024

Sample adaptive data augmentation with progressive scheduling · 2024

Improving Mandarin End-to-End Speech Recognition with Word N-gram Language Model · 2022

UCorrect: An Unsupervised Framework for Automatic Speech Recognition Error Correction · 2024
