Where Does The Performance Improvement Come From? -- A Reproducibility Concern About Image-text Retrieval
2022 · Jun Rao, Fei Wang, Liang Ding, et al.
Abstract
This article offers the information retrieval community some reflections on recent advances in retrieval learning by analyzing the reproducibility of image-text retrieval models. With the growth of multimodal data over the last decade, image-text retrieval has steadily become a major research direction in information retrieval. Numerous researchers train and evaluate image-text retrieval algorithms on benchmark datasets such as MS-COCO and Flickr30k. Past research has mostly focused on performance, with multiple state-of-the-art methods proposed in a variety of ways. Their authors claim that these techniques provide improved modality interactions and hence more precise multimodal representations. In contrast to previous work, we focus on the reproducibility of these approaches and examine the elements that lead to improved performance in pretrained and nonpretrained models when retrieving images and text. To be more sp
Related papers
- Rethinking Benchmarks For Cross-modal Image-text Retrieval (2023) 13.11
- Benchmark Granularity And Model Robustness For Image-text Retrieval (2024) 0.00
- Scene-centric Vs. Object-centric Image-text Cross-modal Retrieval: A Reproducibility Study (2023) 5.24
- Benchmarking Robustness Of Text-image Composed Retrieval (2023) 2.23
- Performance Evaluation In Multimedia Retrieval (2024) 8.82
- Image-text Retrieval: A Survey On Recent Research And Development (2022) 13.93
- Improving Image Recognition By Retrieving From Web-scale Image-text Data (2023) 9.41
- Cross-modal Coherence For Text-to-image Retrieval (2021) 6.77