Investigating The Sensitivity Of Pre-trained Audio Embeddings To Common Effects
2025 · Victor Deng, Changhong Wang, Gael Richard, et al.
Abstract
In recent years, foundation models have significantly advanced data-driven systems across various domains. Yet their underlying properties, especially when they function as feature extractors, remain under-explored. In this paper, we investigate the sensitivity to audio effects of audio embeddings extracted from widely used foundation models, including OpenL3, PANNs, and CLAP. We focus on audio effects as the source of sensitivity because they are prevalent in large audio datasets. By applying parameterized audio effects (gain, low-pass filtering, reverberation, and bitcrushing), we analyze the correlation between the deformation trajectories and the effect strength in the embedding space. We propose to quantify the dimensionality and linearizability of the deformation trajectories induced by audio effects using canonical correlation analysis. We find that there exists a direction along which the embeddings move monotonically as the audio effect strength increases, but that the sub
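The canonical correlation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `canonical_direction`, the synthetic embeddings, and the noise level are all assumptions made for illustration, and no real OpenL3, PANNs, or CLAP outputs are used. When one side of the analysis is a single scalar (the effect strength), the first canonical correlation equals the multiple correlation from an ordinary least-squares fit, which is what the sketch relies on.

```python
import numpy as np


def canonical_direction(embeddings, strength):
    """Return (direction, correlation) for embeddings (n, d) vs. a scalar strength (n,).

    With a single variable on one side, the first canonical pair reduces to
    least-squares regression of the centered strength on the centered embeddings.
    """
    X = embeddings - embeddings.mean(axis=0)
    y = strength - strength.mean()
    w, *_ = np.linalg.lstsq(X, y, rcond=None)  # direction in embedding space
    rho = np.corrcoef(X @ w, y)[0, 1]          # first canonical correlation
    return w, rho


# Synthetic stand-in for real embeddings (assumption): the points drift along
# one hidden direction as the effect strength grows, plus isotropic noise.
rng = np.random.default_rng(0)
n, d = 200, 16
strength = np.linspace(0.0, 1.0, n)
hidden = rng.normal(size=d)
hidden /= np.linalg.norm(hidden)
embeddings = np.outer(strength, hidden) + 0.05 * rng.normal(size=(n, d))

w, rho = canonical_direction(embeddings, strength)
cosine = abs(w @ hidden) / np.linalg.norm(w)
print(f"canonical correlation: {rho:.3f}, cosine to hidden direction: {cosine:.3f}")
```

On real data, `embeddings` would hold one row per processed clip and `strength` the corresponding effect parameter; a correlation near 1 would indicate a direction along which the embeddings track the effect strength almost linearly.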
Related papers
- An Empirical Study Of Weakly Supervised Audio Tagging Embeddings For General Audio Representations (2022) · 0.00
- Diverse Audio Embeddings -- Bringing Features Back Outperforms CLAP! (2023) · 0.00
- Transformation Of Audio Embeddings Into Interpretable, Concept-based Representations (2025) · 2.26
- An Investigation On Selecting Audio Pre-trained Models For Audio Captioning (2022) · 0.00
- Advancing Audio Emotion And Intent Recognition With Large Pre-trained Models And Bayesian Inference (2023) · 5.24
- Investigating Design Choices In Joint-embedding Predictive Architectures For General Audio Representation Learning (2024) · 2.26
- Towards Evaluating Generative Audio: Insights From Neural Audio Codec Embedding Distances (2025) · 0.00
- Establishing Degrees Of Closeness Between Audio Recordings Along Different Dimensions Using Large-scale Cross-lingual Models (2024) · 0.00