AudioSet
Canonical
24 papers using it
First seen: 2022
Papers using AudioSet (24)
- Efficient Large-scale Audio Tagging Via Transformer-to-cnn Knowledge Distillation
- AST-SED: An Effective Sound Event Detection Method Based On Audio Spectrogram Transformer
- Audio Mamba: Selective State Spaces For Self-supervised Audio Representations
- Soloaudio: Target Sound Extraction With Language-oriented Audio Diffusion Transformer
- Audio-JEPA: Joint-Embedding Predictive Architecture for Audio Representation Learning
- Collap: Contrastive Long-form Language-audio Pretraining With Musical Temporal Structure Augmentation
- Streaming Audio Transformers For Online Audio Tagging
- Axlstms: Learning Self-supervised Audio Representations With Xlstms
- SAM: A Mamba-2 State-Space Audio-Language Model
- AudioMAE++: learning better masked audio representations with SwiGLU FFNs
- Self-supervised learning method using multiple sampling strategies for general-purpose audio representation
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer
- Audio Language Modeling Using Perceptually-guided Discrete Representations
- Improving Self-supervised Learning For Audio Representations By Feature Diversity And Decorrelation
- Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation
- Audio Language Modeling using Perceptually-Guided Discrete Representations
- Multiscale Audio Spectrogram Transformer for Efficient Audio Classification
- Enhancing Zero-shot Audio Classification using Sound Attribute Knowledge from Large Language Models
- AST-SED: An Effective Sound Event Detection Method Based on Audio Spectrogram Transformer
- Improving Self-Supervised Learning for Audio Representations by Feature Diversity and Decorrelation
- Streaming Audio Transformers for Online Audio Tagging
- Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations
- Efficient Autoregressive Audio Modeling via Next-Scale Prediction
- AxLSTMs: learning self-supervised audio representations with xLSTMs