
Flickr-8k

Emerging
3 papers using it
First seen in 2024

Flickr-8k is a dataset of images paired with textual descriptions, used to evaluate multimodal speech recognition systems.

Papers using Flickr-8k (3)

Flickr-8k · datasets · speech-audio