Mandarin

Emerging

16 papers using it

First seen: 2022

The 'Mandarin' dataset/benchmark is used to evaluate speech representation learning methods, particularly for tonal languages, by assessing how well they handle speaker variation and tone distinctions.

Papers using Mandarin (16)

Language Barriers: Evaluating Cross-Lingual Performance of CNN and Transformer Architectures for Speech Quality Estimation · 2025 · 1 cites

Toward Unified Chinese Multi-Dialectal Speech Recognition via Pinyin Intermediate Representation · 2026

Prosodic ABX: A Language-Agnostic Method for Measuring Prosodic Contrast in Speech Representations · 2026

Rethinking Entropy Allocation in LLM-based ASR: Understanding the Dynamics between Speech Encoders and LLMs · 2026

SITA: Learning Speaker-Invariant and Tone-Aware Speech Representations for Low-Resource Tonal Languages · 2026

Unsupervised lexicon learning from speech is limited by representations rather than clustering · 2025

HENT-SRT: Hierarchical Efficient Neural Transducer with Self-Distillation for Joint Speech Recognition and Translation · 2025

A Self-Refining Framework for Enhancing ASR Using TTS-Synthesized Data · 2025

Analyzing Acoustic Word Embeddings from Pre-trained Self-supervised Speech Models · 2022 · 1 cites

Multi-pass Training and Cross-information Fusion for Low-resource End-to-end Accented Speech Recognition · 2023

Effects of Convolutional Autoencoder Bottleneck Width on StarGAN-based Singing Technique Conversion · 2023

Accent-VITS:accent transfer for end-to-end TTS · 2023

Period Singer: Integrating Periodic and Aperiodic Variational Autoencoders for Natural-Sounding End-to-End Singing Voice Synthesis · 2024

PRESENT: Zero-Shot Text-to-Prosody Control · 2024

LA-RAG:Enhancing LLM-based ASR Accuracy with Retrieval-Augmented Generation · 2024

Do Discrete Self-Supervised Representations of Speech Capture Tone Distinctions? · 2024