MSCOCO
Emerging
23 papers using it
2018 first seen
MSCOCO is a dataset that contains images and their corresponding text descriptions, used to evaluate the performance of cross-modal retrieval systems.
Papers using MSCOCO (23)
- Visualsparta: An Embarrassingly Simple Approach To Large-scale Text-to-image Search With Weighted Bag-of-words
- Learning To Embed Semantic Similarity For Joint Image-text Retrieval
- Leveraging High-resolution Features For Improved Deep Hashing-based Image Retrieval
- Feature Fusion Mamba Hashing via Decoupling for Cross-Modal Retrieval
- UniHash: Unifying Pointwise and Pairwise Hashing Paradigms
- Sparse and Dense Retrievers Learn Better Together: Joint Sparse-Dense Optimization for Text-Image Retrieval
- Cross-modal Scene Graph Matching for Relationship-aware Image-Text Retrieval
- Deep Priority Hashing
- Deep Triplet Quantization
- Deep Unsupervised Image Hashing by Maximizing Bit Entropy
- VisualSparta: An Embarrassingly Simple Approach to Large-scale Text-to-Image Search with Weighted Bag-of-words
- Multi-Modal Retrieval using Graph Neural Networks
- Instance-weighted Central Similarity for Multi-label Image Retrieval
- Constructing Phrase-level Semantic Labels to Form Multi-Grained Supervision for Image-Text Retrieval
- LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Retrieval
- Leveraging High-Resolution Features for Improved Deep Hashing-based Image Retrieval
- HHF: Hashing-guided Hinge Function for Deep Hashing Retrieval
- ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval
- Deep Metric Multi-View Hashing for Multimedia Retrieval
- Central Similarity Multi-View Hashing for Multimedia Retrieval
- Nearest Neighbor Normalization Improves Multimodal Retrieval
- Do Cross Modal Systems Leverage Semantic Relationships?
- Unifying Two-Stream Encoders with Transformers for Cross-Modal Retrieval