SUPERB
Emerging · 63 papers using it
First seen: 2022
Papers using SUPERB (63)
- ML-SUPERB: Multilingual Speech Universal Performance Benchmark
- SUPERB @ SLT 2022: Challenge On Generalization And Efficiency Of Self-supervised Speech Representation Learning
- Deep Versus Wide: An Analysis Of Student Architectures For Task-agnostic Knowledge Distillation Of Self-supervised Speech Models
- What Do Self-supervised Speech And Speaker Models Learn? New Findings From A Cross Model Layer-wise Analysis
- Masked Modeling Duo For Speech: Specializing General-purpose Audio Representation To Speech Using Denoising Distillation
- Can You Remove The Downstream Model For Speaker Recognition With Self-supervised Speech Features?
- Cocktail Hubert: Generalized Self-supervised Pre-training For Mixture And Single-source Speech
- Recycle-and-distill: Universal Compression Strategy For Transformer-based Speech SSL Models With Attention Map Reusing And Masking Distillation
- Star: Distilling Speech Temporal Relation For Lightweight Speech Self-supervised Learning Models
- LASER: Learning By Aligning Self-supervised Representations Of Speech For Improving Content-related Tasks
- Mcr-data2vec 2.0: Improving Self-supervised Speech Pre-training Via Model-level Consistency Regularization
- Gendistiller: Distilling Pre-trained Language Models Based On An Autoregressive Generative Model
- Codec2Vec: Self-Supervised Speech Representation Learning Using Neural Speech Codecs
- SPEAR: A Unified SSL Framework for Learning Speech and Audio Representations
- An Exploration of Mamba for Speech Self-Supervised Models
- USAD: Universal Speech and Audio Representation via Distillation
- Task-Agnostic Structured Pruning of Speech Representation Models
- Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond
- Are Paralinguistic Representations All That Is Needed For Speech Emotion Recognition?
- SCORE: Self-supervised Correspondence Fine-tuning For Improved Content Representations
- Findings Of The 2023 ML-SUPERB Challenge: Pre-training And Evaluation Over More Languages And Beyond
- An Adapter-based Unified Model For Multiple Spoken Language Processing Tasks
- On The Transferability Of Whisper-based Representations For "in-the-wild" Cross-task Downstream Speech Applications
- Evaluating Context-invariance In Unsupervised Speech Representations
- An Empirical Analysis Of Speech Self-supervised Learning At Multiple Resolutions
- An Experimental Study: Assessing The Combined Framework Of Wavlm And BEST-RQ For Text-to-speech Synthesis
- Exploring Effective Fusion Algorithms For Speech Based Self-supervised Learning Models
- JOOCI: A Framework For Learning Comprehensive Speech Representations
- SpeechLM: Enhanced Speech Pre-Training with Unpaired Textual Data
- FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning
- Efficient Speech Representation Learning with Low-Bit Quantization
- Improving the Robustness of DistilHuBERT to Unseen Noisy Conditions via Data Augmentation, Curriculum Learning, and Multi-Task Enhancement
- Cocktail HuBERT: Generalized Self-Supervised Pre-training for Mixture and Single-Source Speech
- Self-supervised Neural Factor Analysis for Disentangling Utterance-level Speech Representations
- Deep versus Wide: An Analysis of Student Architectures for Task-Agnostic Knowledge Distillation of Self-Supervised Speech Models
- Evaluating context-invariance in unsupervised speech representations
- ML-SUPERB: Multilingual Speech Universal PERformance Benchmark
- Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation
- On the Transferability of Whisper-based Representations for "In-the-Wild" Cross-Task Downstream Speech Applications
- DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models
- Are Paralinguistic Representations all that is needed for Speech Emotion Recognition?
- A Large-Scale Evaluation of Speech Foundation Models
- CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning
- SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning
- Exploring Effective Fusion Algorithms for Speech Based Self-Supervised Learning Models
- Ensemble knowledge distillation of self-supervised speech models
- Masked Modeling Duo for Speech: Specializing General-Purpose Audio Representation to Speech using Denoising Distillation
- MCR-Data2vec 2.0: Improving Self-supervised Speech Pre-training via Model-level Consistency Regularization
- Selective HuBERT: Self-Supervised Pre-Training for Target Speaker in Clean and Mixture Speech
- An Experimental Study: Assessing the Combined Framework of WavLM and BEST-RQ for Text-to-Speech Synthesis
- STaR: Distilling Speech Temporal Relation for Lightweight Speech Self-Supervised Learning Models
- Can you Remove the Downstream Model for Speaker Recognition with Self-Supervised Speech Features?
- SKILL: Similarity-aware Knowledge distILLation for Speech Self-Supervised Learning
- SCORE: Self-supervised Correspondence Fine-tuning for Improved Content Representations
- Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition
- Removing Speaker Information from Speech Representation using Variable-Length Soft Pooling
- Refining Self-Supervised Learnt Speech Representation using Brain Activations
- LASER: Learning by Aligning Self-supervised Representations of Speech for Improving Content-related Tasks
- GenDistiller: Distilling Pre-trained Language Models based on an Autoregressive Generative Model
- JOOCI: a Framework for Learning Comprehensive Speech Representations
- EH-MAM: Easy-to-Hard Masked Acoustic Modeling for Self-Supervised Speech Representation Learning
- Fast Speech Foundation Model Distillation Using Interleaved Stacking
- An Adapter-Based Unified Model for Multiple Spoken Language Processing Tasks