Self-supervised Contrastive Learning For Robust Audio-sheet Music Retrieval Systems
2023 · Luis Carvalho, Tobias Washüttl, Gerhard Widmer
Abstract
Linking sheet music images to audio recordings remains a key problem for the development of efficient cross-modal music retrieval systems. One of the fundamental approaches to this task is to learn, via deep neural networks, a cross-modal embedding space that connects short snippets of audio and sheet music. However, the scarcity of annotated data from real musical content limits the ability of such methods to generalize to real retrieval scenarios. In this work, we investigate whether self-supervised contrastive learning can mitigate this limitation: as a pre-training step, a network is exposed to a large amount of real music data and trained by contrasting randomly augmented views of snippets from both modalities, namely audio and sheet music images. Through a number of experiments on synthetic and real piano data, we show that pre-trained models retrieve snippets with better precision in all scenarios and pre-training configurations. Encouraged by these results, we
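To make the pre-training idea concrete, the following is a minimal sketch of a contrastive loss over pairs of augmented views. It assumes an NT-Xent (SimCLR-style) objective; the abstract does not name the exact loss, so this choice, the function names `l2_normalize` and `nt_xent_loss`, and the `temperature` value are all illustrative assumptions. The embeddings are plain lists of floats standing in for the output of an encoder applied to two random augmentations of the same snippet.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def nt_xent_loss(views_a, views_b, temperature=0.5):
    """NT-Xent loss (assumed objective, not confirmed by the abstract).

    views_a[i] and views_b[i] are embeddings of two augmented views of
    snippet i. Each view is pulled toward its partner and pushed away
    from every other view in the batch.
    """
    n = len(views_a)
    z = [l2_normalize(v) for v in list(views_a) + list(views_b)]
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same snippet
        # Temperature-scaled cosine similarities to every view except itself.
        sims = {
            k: sum(a * b for a, b in zip(z[i], z[k])) / temperature
            for k in range(2 * n)
            if k != i
        }
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += log_denom - sims[pos]
    return total / (2 * n)


# Hypothetical 2-D embeddings: two snippets, two views each.
aligned = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < mismatched)
```

The loss is lower when the two views of each snippet embed close together than when they are swapped, which is the signal that drives pre-training on unlabeled data.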
Related papers
- Towards Robust And Truly Large-scale Audio-sheet Music Retrieval (2023) · 4.52
- Learning Soft-attention Models For Tempo-invariant Audio-sheet Music Retrieval (2019) · 0.00
- Passage Summarization With Recurrent Models For Audio-sheet Music Retrieval (2023) · 0.00
- Towards End-to-end Audio-sheet-music Retrieval (2016) · 0.00
- Contrastive Learning For Cross-modal Artist Retrieval (2023) · 0.00
- Contrastive Audio-language Learning For Music (2022) · 0.00
- Self-supervised Auxiliary Loss For Metric Learning In Music Similarity-based Retrieval And Auto-tagging (2023) · 0.00
- On The Effect Of Data-augmentation On Local Embedding Properties In The Contrastive Learning Of Music Audio Representations (2024) · 5.24