Modeling Text With Graph Convolutional Network For Cross-modal Information Retrieval
2018 · Jing Yu, Yuhang Lu, Zengchang Qin, et al.
Abstract
Cross-modal information retrieval aims to find heterogeneous data of various modalities given a query in one modality. The main challenge is to map different modalities into a common semantic space in which the distance between concepts in different modalities can be modeled well. For cross-modal information retrieval between images and texts, existing work mostly uses an off-the-shelf Convolutional Neural Network (CNN) for image feature extraction. For texts, word-level features such as bag-of-words or word2vec are used to build deep learning models that represent texts. Beyond word-level semantics, the semantic relations between words are also informative but less explored. In this paper, we model texts as graphs using a similarity measure based on word2vec. A dual-path neural network model is proposed for couple feature learning in cross-modal information retrieval. One path utilizes a Graph Convolutional Network (GCN) for text modeling based on graph representations. The other path u
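The text path described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the graph is built or which GCN variant is used, so everything here is an assumption: words are linked to their k nearest neighbours by cosine similarity, random vectors stand in for word2vec embeddings, each node carries a single scalar feature, and the layer uses the common symmetric normalization D^{-1/2}(A + I)D^{-1/2}. The names `knn_graph` and `gcn_layer` are hypothetical helpers, not the authors' code.

```python
import numpy as np

def knn_graph(emb, k):
    """Symmetric k-nearest-neighbour adjacency from cosine similarity."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)           # a word is not its own neighbour
    n = emb.shape[0]
    nbrs = np.argsort(-sim, axis=1)[:, :k]   # indices of the k most similar words
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return np.maximum(adj, adj.T)            # make the graph undirected

def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])       # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm @ feats @ weight, 0.0)

rng = np.random.default_rng(0)
word_vecs = rng.normal(size=(6, 8))   # stand-in for word2vec vectors (assumption)
adj = knn_graph(word_vecs, k=2)
node_feats = rng.random((6, 1))       # one scalar per word node (assumption)
weight = rng.normal(size=(1, 4))      # untrained layer weights
hidden = gcn_layer(adj, node_feats, weight)
```

In a trained model `weight` would be learned jointly with the image path; here it is random, so `hidden` only demonstrates the shapes and the neighbourhood averaging.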
Related papers
- InvGC: Robust Cross-modal Retrieval By Inverse Graph Convolution (2023)
- Revisiting Cross Modal Retrieval (2018)
- Multi-modal Reasoning Graph For Scene-text Based Fine-grained Image Classification And Retrieval (2020)
- Look, Imagine And Match: Improving Textual-visual Cross-modal Retrieval With Generative Models (2017)
- Scene Graph Based Fusion Network For Image-text Retrieval (2023)
- Image Retrieval For Structure-from-motion Via Graph Convolutional Network (2020)
- Multi-modal Retrieval Using Graph Neural Networks (2020)
- Deep Multimodal Image-text Embeddings For Automatic Cross-media Retrieval (2020)