Emotion Embedding Spaces For Matching Music To Stories
2021 · Minz Won, Justin Salamon, Nicholas J. Bryan, et al.
Abstract
Content creators often use music to enhance their stories, as it can be a powerful tool to convey emotion. In this paper, our goal is to help creators find music to match the emotion of their story. We focus on text-based stories that can be auralized (e.g., books), use multiple sentences as input queries, and automatically retrieve matching music. We formalize this task as a cross-modal text-to-music retrieval problem. Both the music and text domains have existing datasets with emotion labels, but mismatched emotion vocabularies prevent us from using mood or emotion annotations directly for matching. To address this challenge, we propose and investigate several emotion embedding spaces, both manually defined (e.g., valence/arousal) and data-driven (e.g., Word2Vec and metric learning), to bridge this gap. Our experiments show that by leveraging these embedding spaces, we are able to successfully bridge the gap between modalities to facilitate cross-modal retrieval. We show that our meth
Related papers
- Expressivity-aware Music Performance Retrieval Using Mid-level Perceptual Features And Emotion Word Embeddings (2024) · 0.00
- VMCML: Video And Music Matching Via Cross-modality Lifting (2023) · 2.26
- Contrastive Learning For Cross-modal Artist Retrieval (2023) · 0.00
- Audio-visual Embedding For Cross-modal Musicvideo Retrieval Through Supervised Deep CCA (2019) · 11.93
- Wikimute: A Web-sourced Dataset Of Semantic Descriptions For Music Audio (2023) · 5.24
- Content-based Video-music Retrieval Using Soft Intra-modal Structure Constraint (2017) · 3.60
- Contrastive Audio-language Learning For Music (2022) · 0.00
- Towards Robust And Truly Large-scale Audio-sheet Music Retrieval (2023) · 4.52