H_eval: A New Hybrid Evaluation Metric For Automatic Speech Recognition Tasks
2022 · Zitha Sasindran, Harsha Yelchuri, T. V. Prabhakar, et al.
Abstract
Many studies have examined the shortcomings of word error rate (WER) as an evaluation metric for automatic speech recognition (ASR) systems. Because WER considers only literal word-level correctness, new evaluation metrics based on semantic similarity, such as semantic distance (SD) and BERTScore, have been developed. However, we found that these metrics have their own limitations, such as a tendency to overly prioritise keywords. We propose H_eval, a new hybrid evaluation metric for ASR systems that considers both semantic correctness and error rate and performs well in scenarios where WER and SD perform poorly. Because it is computationally lighter than BERTScore, it reduces metric computation time by a factor of 49. Furthermore, we show that H_eval correlates strongly with downstream NLP tasks. To further reduce the metric calculation time, we also built multiple fast and lightweight models using distillation techniques.
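To illustrate the abstract's point that WER looks only at literal word-level correctness, here is a minimal Python sketch of the standard WER definition (word-level edit distance divided by reference length). It is not the paper's H_eval metric, whose formula is not given in this listing; the function name `word_error_rate` and the example sentences are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: (substitutions + deletions + insertions) / reference length.

    Illustrative sketch only; whitespace tokenisation is an assumption.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


reference = "the cat sat on the mat"
minor_error = "a cat sat on the mat"      # meaning largely preserved
major_error = "the cat sat on the hat"    # meaning changed

# Both hypotheses contain one substitution, so WER cannot tell them apart.
print(word_error_rate(reference, minor_error))
print(word_error_rate(reference, major_error))
```

Both hypotheses receive the same WER even though only the second changes the meaning of the sentence, which is the kind of case that motivates semantic and hybrid metrics.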
Related papers
- Semantic-WER: A Unified Metric For The Evaluation Of ASR Transcript For End Usability (2021) · 0.00
- Evaluating User Perception Of Speech Recognition System Quality With Semantic Distance Metric (2021) · 6.77
- A Reference-less Quality Metric For Automatic Speech Recognition Via Contrastive-learning Of A Multi-language Model With Self-supervision (2023) · 2.51
- WER-BERT: Automatic WER Estimation With BERT In A Balanced Ordinal Classification Paradigm (2021) · 0.00
- On Minimum Word Error Rate Training Of The Hybrid Autoregressive Transducer (2020) · 4.52
- SpeechBERTScore: Reference-aware Automatic Evaluation Of Speech Generation Leveraging NLP Evaluation Metrics (2024) · 10.74
- Acoustics-guided Evaluation (AGE): A New Measure For Estimating Performance Of Speech Enhancement Algorithms For Robust ASR (2018) · 0.00
- Beyond Levenshtein: Leveraging Multiple Algorithms For Robust Word Error Rate Computations And Granular Error Classifications (2024) · 2.26