SUPERB

Emerging

52 papers using it

2021 first seen

SUPERB (Speech processing Universal PERformance Benchmark) is a benchmark that covers a variety of speech processing tasks and is used to evaluate speech foundation models.


Papers using SUPERB (52)

Rethinking Speech Foundation Model Fine-tuning: Better SFT or Better Match? · 2026

Fast Speech Foundation Model Distillation Using Interleaved Stacking · 2026

Codec2Vec: Self-Supervised Speech Representation Learning Using Neural Speech Codecs · 2025 · 1 cite

WavCube: Unifying Speech Representation for Understanding and Generation via Semantic-Acoustic Joint Modeling · 2026

SPEAR: A Unified SSL Framework for Learning Speech and Audio Representations · 2025

An Exploration of Mamba for Speech Self-Supervised Models · 2025

USAD: Universal Speech and Audio Representation via Distillation · 2025

Task-Agnostic Structured Pruning of Speech Representation Models · 2023 · 10 cites

Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond · 2023 · 2 cites

Self-Supervised Representation Learning for Speech Using Visual Grounding and Masked Language Modeling · 2022 · 19 cites

Ensemble knowledge distillation of self-supervised speech models · 2023 · 16 cites

SpeechLM: Enhanced Speech Pre-Training with Unpaired Textual Data · 2022 · 13 cites

What Do Self-Supervised Speech and Speaker Models Learn? New Findings From a Cross Model Layer-Wise Analysis · 2024 · 10 cites

UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training · 2021 · 9 cites

An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition · 2021 · 8 cites

FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning · 2022 · 5 cites

Efficient Speech Representation Learning with Low-Bit Quantization · 2023 · 4 cites

Don't speak too fast: The impact of data bias on self-supervised speech models · 2021 · 3 cites

SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities · 2022 · 3 cites

Improving the Robustness of DistilHuBERT to Unseen Noisy Conditions via Data Augmentation, Curriculum Learning, and Multi-Task Enhancement · 2022 · 3 cites

SCORE: Self-supervised Correspondence Fine-tuning for Improved Content Representations · 2024 · 3 cites

Cocktail HuBERT: Generalized Self-Supervised Pre-training for Mixture and Single-Source Speech · 2023 · 2 cites

Self-supervised Neural Factor Analysis for Disentangling Utterance-level Speech Representations · 2023 · 2 cites

Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation · 2022 · 1 cite

Deep versus Wide: An Analysis of Student Architectures for Task-Agnostic Knowledge Distillation of Self-Supervised Speech Models · 2022 · 1 cite

Evaluating context-invariance in unsupervised speech representations · 2022 · 1 cite

ML-SUPERB: Multilingual Speech Universal PERformance Benchmark · 2023 · 1 cite

Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation · 2023 · 1 cite

On the Transferability of Whisper-based Representations for "In-the-Wild" Cross-Task Downstream Speech Applications · 2023 · 1 cite

DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models · 2023 · 1 cite

Are Paralinguistic Representations all that is needed for Speech Emotion Recognition? · 2024 · 1 cite

A Large-Scale Evaluation of Speech Foundation Models · 2024 · 1 cite

LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT · 2022

CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning · 2022

SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning · 2022

Exploring Effective Fusion Algorithms for Speech Based Self-Supervised Learning Models · 2022

Masked Modeling Duo for Speech: Specializing General-Purpose Audio Representation to Speech using Denoising Distillation · 2023

MCR-Data2vec 2.0: Improving Self-supervised Speech Pre-training via Model-level Consistency Regularization · 2023

Selective HuBERT: Self-Supervised Pre-Training for Target Speaker in Clean and Mixture Speech · 2023

An Experimental Study: Assessing the Combined Framework of WavLM and BEST-RQ for Text-to-Speech Synthesis · 2023

STaR: Distilling Speech Temporal Relation for Lightweight Speech Self-Supervised Learning Models · 2023

Can you Remove the Downstream Model for Speaker Recognition with Self-Supervised Speech Features? · 2024

SKILL: Similarity-aware Knowledge distILLation for Speech Self-Supervised Learning · 2024

Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition · 2024

Removing Speaker Information from Speech Representation using Variable-Length Soft Pooling · 2024

Refining Self-Supervised Learnt Speech Representation using Brain Activations · 2024

LASER: Learning by Aligning Self-supervised Representations of Speech for Improving Content-related Tasks · 2024

GenDistiller: Distilling Pre-trained Language Models based on an Autoregressive Generative Model · 2024

JOOCI: a Framework for Learning Comprehensive Speech Representations · 2024

EH-MAM: Easy-to-Hard Masked Acoustic Modeling for Self-Supervised Speech Representation Learning · 2024

WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing · 2021

An Adapter-Based Unified Model for Multiple Spoken Language Processing Tasks · 2024