Tamil
Emerging
6 papers using it
First seen: 2024
The Tamil dataset is a speech benchmark used to evaluate automatic speech recognition (ASR) systems through fine-grained, part-of-speech (PoS)-wise error analysis, with a particular focus on aligning ASR hypotheses with reference transcriptions in non-Latin scripts.
Papers using Tamil (6)
- MultiGen: Child-Friendly Multilingual Speech Generator with LLMs
- Breaking the Script Barrier: Enabling Automatic Alignment for PoS-based ASR Error Analysis in Non-Latin Scripts
- Goodness-of-pronunciation without phoneme time alignment
- Dynamic Multi-Expert Projectors with Stabilized Routing for Multilingual Speech Recognition
- Textless NLP -- Zero Resource Challenge with Low Resource Compute
- Multistage Fine-tuning Strategies for Automatic Speech Recognition in Low-resource Languages