MSCOCO

Emerging

23 papers using it

First seen in 2018

MSCOCO is a dataset of images paired with text descriptions, used to evaluate cross-modal retrieval systems.


Papers using MSCOCO (23)

Visualsparta: An Embarrassingly Simple Approach To Large-scale Text-to-image Search With Weighted Bag-of-words · 2021 · 22 cites

Learning To Embed Semantic Similarity For Joint Image-text Retrieval · 2022 · 9 cites

Leveraging High-resolution Features For Improved Deep Hashing-based Image Retrieval · 2024 · 3 cites

Feature Fusion Mamba Hashing via Decoupling for Cross-Modal Retrieval · 2026

UniHash: Unifying Pointwise and Pairwise Hashing Paradigms · 2026

Sparse and Dense Retrievers Learn Better Together: Joint Sparse-Dense Optimization for Text-Image Retrieval · 2025

Cross-modal Scene Graph Matching for Relationship-aware Image-Text Retrieval · 2019 · 11 cites

Deep Priority Hashing · 2018 · 4 cites

Deep Triplet Quantization · 2019 · 3 cites

Deep Unsupervised Image Hashing by Maximizing Bit Entropy · 2020 · 3 cites

VisualSparta: An Embarrassingly Simple Approach to Large-scale Text-to-Image Search with Weighted Bag-of-words · 2021 · 3 cites

Multi-Modal Retrieval using Graph Neural Networks · 2020 · 2 cites

Instance-weighted Central Similarity for Multi-label Image Retrieval · 2021 · 2 cites

Constructing Phrase-level Semantic Labels to Form Multi-Grained Supervision for Image-Text Retrieval · 2021 · 1 cite

LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Retrieval · 2023 · 1 cite

Leveraging High-Resolution Features for Improved Deep Hashing-based Image Retrieval · 2024 · 1 cite

HHF: Hashing-guided Hinge Function for Deep Hashing Retrieval · 2021

ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval · 2022

Deep Metric Multi-View Hashing for Multimedia Retrieval · 2023

Central Similarity Multi-View Hashing for Multimedia Retrieval · 2023

Nearest Neighbor Normalization Improves Multimodal Retrieval · 2024

Do Cross Modal Systems Leverage Semantic Relationships? · 2019

Unifying Two-Stream Encoders with Transformers for Cross-Modal Retrieval · 2023