Learning Shared Semantic Space With Correlation Alignment For Cross-modal Event Retrieval
2019 · Zhenguo Yang, Zehang Lin, Peipei Kang, et al.
Abstract
In this paper, we propose to learn shared semantic space with correlation alignment (\(S^{3}CA\)) for multimodal data representations, which aligns nonlinear correlations of multimodal data distributions in deep neural networks designed for heterogeneous data. In the context of cross-modal (event) retrieval, we design a neural network with convolutional layers and fully-connected layers to extract features for images, including images on Flickr-like social media. Simultaneously, we exploit a fully-connected neural network to extract semantic features for texts, including news articles from news media. In particular, nonlinear correlations of layer activations in the two neural networks are aligned with correlation alignment during the joint training of the networks. Furthermore, we project the multimodal data into a shared semantic space for cross-modal (event) retrieval, where the distances between heterogeneous data samples can be measured directly. In addition, we contribute a
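As an illustration of the correlation-alignment idea mentioned in the abstract, the sketch below computes a standard linear CORAL-style loss between two batches of layer activations: the squared Frobenius distance between their covariance matrices, scaled by 1/(4 d^2). This is an assumption-laden sketch, not the paper's formulation; the abstract refers to nonlinear correlations and does not give the loss. The names `covariance`, `coral_loss`, `image_acts`, and `text_acts` are hypothetical, and the activations are random stand-ins rather than real network outputs.

```python
import numpy as np


def covariance(features):
    """Sample covariance (d x d) of an (n x d) batch of activations."""
    centered = features - features.mean(axis=0, keepdims=True)
    return centered.T @ centered / (features.shape[0] - 1)


def coral_loss(image_acts, text_acts):
    """Linear CORAL-style loss between two activation batches.

    Both batches must share the feature dimension d; the batch sizes
    may differ. This is the generic linear form, assumed here for
    illustration only.
    """
    d = image_acts.shape[1]
    diff = covariance(image_acts) - covariance(text_acts)
    return float(np.sum(diff ** 2) / (4.0 * d * d))


# Hypothetical activations standing in for one layer of each network.
rng = np.random.default_rng(0)
image_acts = rng.normal(size=(32, 8))        # image-branch layer output
text_acts = rng.normal(size=(16, 8)) * 2.0   # text-branch layer output

print(coral_loss(image_acts, image_acts))  # identical statistics give zero loss
print(coral_loss(image_acts, text_acts))   # mismatched statistics give a positive loss
```

In joint training, a term of this kind would typically be added to each network's task loss so that the second-order statistics of the paired layers are pulled together; how the paper weights or extends it is not stated in the abstract.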
Related papers
- Learning Joint Embedding For Cross-modal Retrieval (2019) · 5.84
- Preserving Semantic Neighborhoods For Robust Cross-modal Retrieval (2020) · 10.07
- Multimodal Representation Alignment For Cross-modal Information Retrieval (2025) · 0.00
- Learning From Multiview Correlations In Open-domain Videos (2018) · 5.84
- Discriminative Semantic Transitive Consistency For Cross-modal Learning (2021) · 0.00
- End-to-end Cross-modality Retrieval With CCA Projections And Pairwise Ranking Loss (2017) · 14.68
- Adversarial Cross-modal Retrieval Via Learning And Transferring Single-modal Similarities (2019) · 8.60
- Multimodal Representation Learning Conditioned On Semantic Relations (2025) · 0.00