Introducing Auxiliary Text Query-modifier To Content-based Audio Retrieval
2022 · Daiki Takeuchi, Yasunori Ohishi, Daisuke Niizumi, et al.
Abstract
The amount of audio data available on public websites is growing rapidly, and an efficient mechanism for accessing the desired data is necessary. We propose a content-based audio retrieval method that can retrieve a target audio that is similar to but slightly different from the query audio by introducing auxiliary textual information which describes the difference between the query and target audio. While the range of conventional content-based audio retrieval is limited to audio that is similar to the query audio, the proposed method can adjust the retrieval range by adding an embedding of the auxiliary text query-modifier to the embedding of the query sample audio in a shared latent space. To evaluate our method, we built a dataset comprising two different audio clips and the text that describes the difference. The experimental results show that the proposed method retrieves the paired audio more accurately than the baseline. We also confirmed based on visualization that the propose
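The core idea in the abstract, adding the text query-modifier embedding to the query audio embedding in a shared latent space and then ranking candidates by similarity, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names (`l2_normalize`, `modify_query`, `retrieve`), the use of cosine similarity, the normalization step, and the hand-written 3-dimensional toy vectors. The paper's actual encoders, training objective, and scoring function are not described in this excerpt.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def modify_query(audio_emb, text_emb):
    """Hypothetical helper: shift the audio query by the text-modifier embedding.

    Both embeddings are assumed to live in the same latent space already.
    """
    return l2_normalize([a + t for a, t in zip(audio_emb, text_emb)])


def cosine(u, v):
    """Cosine similarity between two vectors (an assumed scoring function)."""
    u, v = l2_normalize(u), l2_normalize(v)
    return sum(a * b for a, b in zip(u, v))


def retrieve(query_emb, candidate_embs, top_k=1):
    """Return indices of the top_k candidates most similar to the query."""
    scores = [cosine(query_emb, c) for c in candidate_embs]
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return order[:top_k]


# Toy embeddings, written by hand rather than produced by any real encoder.
query_audio = [1.0, 0.0, 0.0]
modifier_text = [0.0, 1.0, 0.0]
candidates = [
    [1.0, 0.0, 0.0],  # 0: nearly identical to the query audio
    [0.7, 0.7, 0.0],  # 1: similar, but shifted along the modifier direction
    [0.0, 0.0, 1.0],  # 2: unrelated
]

baseline_top = retrieve(query_audio, candidates)
modified_top = retrieve(modify_query(query_audio, modifier_text), candidates)
```

In this toy setup the audio-only query ranks candidate 0 first, while the modified query ranks candidate 1 first, which mirrors the abstract's claim that the modifier adjusts the retrieval range toward audio that is similar to, but slightly different from, the query.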
Related papers
- Enhancing Retrieval-augmented Audio Captioning With Generation-assisted Multimodal Querying And Progressive Learning (2024)
- Improving Natural-language-based Audio Retrieval With Transfer Learning And Audio & Text Augmentations (2022)
- Estimated Audio-caption Correspondences Improve Language-based Audio Retrieval (2024)
- Advancing Natural-language Based Audio Retrieval With Passt And Large Audio-caption Data Sets (2023)
- Improving Audio-text Retrieval Via Hierarchical Cross-modal Interaction And Auxiliary Captions (2023)
- Contrastive Latent Space Reconstruction Learning For Audio-text Retrieval (2023)
- Audio Difference Captioning Utilizing Similarity-discrepancy Disentanglement (2023)
- Text-based Audio Retrieval By Learning From Similarities Between Audio Captions (2024)