Image To Text Datasets

There are 18 image-to-text datasets in our directory. Each entry links to its source, paper, and download. Browse the full list below or filter by language.

Image-to-text is the task of generating textual descriptions or captions from images.

Updated June 2026

What languages do image to text datasets cover?
