Passage Summarization With Recurrent Models For Audio-sheet Music Retrieval
2023 · Luis Carvalho, Gerhard Widmer
Abstract
Many applications of cross-modal music retrieval are related to connecting sheet music images to audio recordings. A typical and recent approach to this is to learn, via deep neural networks, a joint embedding space that correlates short fixed-size snippets of audio and sheet music by means of an appropriate similarity structure. However, two challenges that arise out of this strategy are the requirement of strongly aligned data to train the networks, and the inherent discrepancies of musical content between audio and sheet music snippets caused by local and global tempo differences. In this paper, we address these two shortcomings by designing a cross-modal recurrent network that learns joint embeddings that can summarize longer passages of corresponding audio and sheet music. The benefits of our method are that it only requires weakly aligned audio-sheet music pairs, as well as that the recurrent network handles the non-linearities caused by tempo variations between audio and sheet m
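The retrieval step implied by the abstract can be sketched as a nearest-neighbour search in the joint embedding space. This is a minimal illustration, not the authors' implementation: it assumes each audio or sheet music passage has already been summarized into a fixed-size vector, and it assumes cosine similarity as the similarity measure. The names `cosine_similarity`, `rank_candidates`, and the toy 3-dimensional embeddings are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_candidates(query, candidates):
    """Return candidate ids sorted from most to least similar to the query."""
    scores = {cid: cosine_similarity(query, emb) for cid, emb in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical embeddings (assumption): in practice these would come from
# the trained audio and sheet music encoders, one vector per passage.
audio_query = [0.9, 0.1, 0.2]
sheet_passages = {
    "piece_a": [0.8, 0.2, 0.1],
    "piece_b": [0.1, 0.9, 0.3],
    "piece_c": [-0.5, 0.4, 0.7],
}

# Sheet music passages ordered by similarity to the audio query.
ranking = rank_candidates(audio_query, sheet_passages)
print(ranking)
```

Because both modalities map into the same space, the same function would serve the opposite direction (sheet music query, audio candidates) by swapping the arguments.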
Related papers
- Towards Robust And Truly Large-scale Audio-sheet Music Retrieval (2023) · 4.52
- Self-supervised Contrastive Learning For Robust Audio-sheet Music Retrieval Systems (2023) · 5.24
- Learning Soft-attention Models For Tempo-invariant Audio-sheet Music Retrieval (2019) · 0.00
- Towards End-to-end Audio-sheet-music Retrieval (2016) · 0.00
- Deep Cross-modal Correlation Learning For Audio And Lyrics In Music Retrieval (2017) · 14.06
- MusicTM-Dataset For Joint Representation Learning Among Sheet Music, Lyrics, And Musical Audio (2020) · 3.58
- Contrastive Learning For Cross-modal Artist Retrieval (2023) · 0.00
- Content-based Video-music Retrieval Using Soft Intra-modal Structure Constraint (2017) · 3.60