
Flickr30k

Canonical
23 papers using it
First seen: 2023

Flickr30k is a dataset of roughly 30,000 images (31,783 in the standard release), each paired with five human-written captions. It is used to evaluate cross-modal retrieval (image-to-text and text-to-image) and the alignment between visual and textual information.
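Retrieval on a dataset with this structure is usually scored with Recall@K: an image counts as a hit if any of its five captions appears among its K highest-scoring captions. The following is a minimal sketch of that metric, not code from any particular paper. The function names, the toy similarity scores, and the assumption that caption `j` belongs to image `j // 5` are all hypothetical and chosen only for illustration.

```python
# Assumed layout: captions are stored image by image, five per image,
# so caption j belongs to image j // CAPTIONS_PER_IMAGE.
CAPTIONS_PER_IMAGE = 5


def recall_at_k(similarity, k):
    """Image-to-text Recall@K.

    similarity[i][j] is the score between image i and caption j.
    An image is a hit if at least one of its own captions is in its top k.
    """
    hits = 0
    for i, row in enumerate(similarity):
        # Caption indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if any(j // CAPTIONS_PER_IMAGE == i for j in ranked[:k]):
            hits += 1
    return hits / len(similarity)


def toy_similarity():
    """Made-up scores for 3 images and 15 captions (illustrative only)."""
    n_images = 3
    n_captions = n_images * CAPTIONS_PER_IMAGE
    sim = [[0.1] * n_captions for _ in range(n_images)]
    for i in range(n_images):
        for j in range(i * CAPTIONS_PER_IMAGE, (i + 1) * CAPTIONS_PER_IMAGE):
            sim[i][j] = 0.9
    # Image 1 ranks one caption of image 2 above its own captions.
    for j in range(5, 10):
        sim[1][j] = 0.5
    sim[1][10] = 0.8
    return sim


if __name__ == "__main__":
    sim = toy_similarity()
    for k in (1, 5):
        print(f"R@{k} = {recall_at_k(sim, k):.3f}")
```

In the toy data, image 1 misses at K=1 because a caption from another image outranks its own, but it is recovered at K=5. Text-to-image recall can be computed the same way by ranking images for each caption instead.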

Papers using Flickr30k (23)
