Open ASR Leaderboard (Scripted vs Conversational)
Conversational ASR: average WER split across scripted vs conversational and US vs non-US English. Tests robustness across speaking styles. Metric: Average WER (lower is better).
| # | Model | Average WER | Paper |
|---|---|---|---|
| 1 | microsoft/azure-speech-06-2026 | 8.08 | link |
| 2 | Qwen/Qwen3-ASR-1.7B | 8.08 | link |
| 3 | reson8/resonant-1 | 8.42 | link |
| 4 | assemblyai/universal-3-pro | 8.44 | link |
| 5 | reson8/resonant-1-flash | 8.49 | link |
| 6 | AutoArk-AI/ARK-ASR-3B | 8.56 | link |
| 7 | AutoArk-AI/ARK-ASR-0.6B | 8.66 | link |
| 8 | zoom/scribe_v1 | 8.66 | link |
| 9 | Qwen/Qwen3-ASR-0.6B | 8.79 | link |
| 10 | CohereLabs/cohere-transcribe-03-2026 | 8.95 | link |
| 11 | nvidia/canary-qwen-2.5b | 9.05 | link |
| 12 | microsoft/Phi-4-multimodal-instruct | 9.09 | link |
| 13 | aquavoice/avalon-v1-en | 9.14 | link |
| 14 | smallestai/pulse | 9.21 | link |
| 15 | zai-org/GLM-ASR-Nano-2512 | 9.21 | link |
| 16 | ibm-granite/granite-speech-4.1-2b | 9.23 | link |
| 17 | nvidia/canary-1b | 9.24 | link |
| 18 | nvidia/parakeet-tdt-0.6b-v2 | 9.24 | link |
| 19 | speechmatics/enhanced | 9.28 | link |
| 20 | ibm-granite/granite-4.0-1b-speech | 9.41 | link |
| 21 | nvidia/parakeet-tdt-0.6b-v3 | 9.46 | link |
| 22 | nvidia/canary-1b-flash | 9.49 | link |
| 23 | bosonai/higgs-audio-v3-stt | 9.54 | link |
| 24 | nvidia/parakeet-tdt-1.1b | 9.55 | link |
| 25 | nvidia/parakeet-rnnt-1.1b | 9.58 | link |
| 26 | mistralai/Voxtral-Small-24B-2507 | 9.60 | link |
| 27 | ibm-granite/granite-speech-4.1-2b-nar | 9.80 | link |
| 28 | openai/whisper-large-v3 | 9.96 | link |
| 29 | distil-whisper/distil-large-v3.5 | 10.00 | link |
| 30 | nvidia/canary-180m-flash | 10.02 | link |
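
For readers who want to see how an "Average WER" figure like those in the table might be put together, here is a minimal sketch. It assumes that WER is computed per split as total word-level edit distance divided by total reference words, and that the headline number is the unweighted mean over the four scripted/conversational and US/non-US splits. Both are assumptions, since the leaderboard does not spell out its averaging or text normalization here. The split names, the helper functions (`word_errors`, `split_wer`, `average_wer`), and the toy transcripts are all hypothetical.

```python
from statistics import mean


def word_errors(reference: str, hypothesis: str) -> tuple[int, int]:
    """Return (word-level edit distance, number of reference words)."""
    # Assumption: lowercasing plus whitespace splitting stands in for the
    # (unspecified) text normalization used by the real leaderboard.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # Rolling-row dynamic program for Levenshtein distance over words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,              # deletion
                curr[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),   # substitution (or match, cost 0)
            ))
        prev = curr
    return prev[-1], len(ref)


def split_wer(pairs) -> float:
    """WER in percent for one split: total errors / total reference words."""
    errors = words = 0
    for ref, hyp in pairs:
        e, n = word_errors(ref, hyp)
        errors += e
        words += n
    # Raises ZeroDivisionError if the split has no reference words.
    return 100.0 * errors / words


def average_wer(splits) -> float:
    """Unweighted (macro) mean of per-split WER. This is an assumption."""
    return mean(split_wer(pairs) for pairs in splits.values())


# Hypothetical toy data: one (reference, hypothesis) pair per split.
splits = {
    "scripted_us": [("the cat sat on the mat", "the cat sat on the mat")],
    "scripted_non_us": [("turn the lights off", "turn the light off")],
    "conversational_us": [("yeah i think so", "yeah i think")],
    "conversational_non_us": [("i do not know", "i dunno")],
}

for name, pairs in splits.items():
    print(f"{name}: {split_wer(pairs):.2f}")
print(f"average: {average_wer(splits):.2f}")  # prints "average: 31.25"
```

A macro average like this weights every split equally regardless of how much audio it contains, so a model that degrades sharply on conversational or non-US speech cannot hide behind a large, easy scripted set. Pooling all utterances before dividing would generally give a different number.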