Exploring Modality-agnostic Representations For Music Classification
2021 · Ho-Hsiang Wu, Magdalena Fuentes, Juan P. Bello
Abstract
Music information is often conveyed or recorded across multiple data modalities, including but not limited to audio, images, text and scores. However, music information retrieval research has focused almost exclusively on single-modality recognition, which requires a separate model for each modality. Some multi-modal works require multiple coexisting modalities as model inputs, which restricts these models to the few cases where data from all modalities are available. To the best of our knowledge, no existing model can take inputs from varying modalities, e.g. images or sounds, and classify them into unified music categories. We explore the use of cross-modal retrieval as a pretext task to learn modality-agnostic representations, which can then be used as inputs to classifiers that are independent of modality. We select instrument classification as an example task for our study as both visual and audio components provide relevant semantic inf
Related papers
- Contrastive Learning For Cross-modal Artist Retrieval (2023) 0.00
- Cross-modal Music Retrieval And Applications: An Overview Of Key Methodologies (2019) 12.68
- Musictm-dataset For Joint Representation Learning Among Sheet Music, Lyrics, And Musical Audio (2020) 3.58
- Towards Robust And Truly Large-scale Audio-sheet Music Retrieval (2023) 4.52
- Multimodal Metric Learning For Tag-based Music Retrieval (2020) 9.76
- Metric Learning Vs Classification For Disentangled Music Representation Learning (2020) 0.00
- Representation Learning For Image-based Music Recommendation (2018) 0.00
- Contrastive Audio-language Learning For Music (2022) 0.00