
TextVQA

Question Answering, Visual, Commonsense, English

Created by Singh et al. in 2019, TextVQA is an English question answering dataset containing 36,602 records in JSON and PNG formats.

About TextVQA

TextVQA requires models to read and reason about text in images to answer questions about them. Specifically, models need to treat the text present in an image as an additional modality and reason over it to answer TextVQA questions.
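
As a rough illustration of the task setup, the sketch below models one question-answer record in Python. The field names (`question`, `image_file`, `ocr_tokens`, `answers`), the helper functions, and the sample values are assumptions made for illustration, not the dataset's documented schema; check the released JSON files for the actual keys.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class TextVQARecord:
    # All field names here are hypothetical, chosen for illustration only.
    question: str          # natural-language question about the image
    image_file: str        # PNG file the question refers to
    ocr_tokens: List[str]  # text assumed to have been read from the image
    answers: List[str]     # reference answers


def parse_records(raw_json: str) -> List[TextVQARecord]:
    """Parse a JSON array of objects into TextVQARecord instances."""
    return [TextVQARecord(**item) for item in json.loads(raw_json)]


def answers_found_in_text(record: TextVQARecord) -> List[str]:
    """Return the reference answers that match an OCR token, ignoring case."""
    tokens = {token.lower() for token in record.ocr_tokens}
    return [answer for answer in record.answers if answer.lower() in tokens]


# Made-up sample record, not taken from the dataset.
SAMPLE_JSON = """
[
  {
    "question": "What word is on the sign?",
    "image_file": "example.png",
    "ocr_tokens": ["STOP"],
    "answers": ["stop"]
  }
]
"""

records = parse_records(SAMPLE_JSON)
```

The `answers_found_in_text` check only illustrates why reading the image text matters: an answer such as "stop" can be recovered only if the model has access to the text in the image.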

Details

Task: Question Answering, Visual, Commonsense
Language: English
Format: JSON, PNG
Rows / instances: 36,602
Creator: Singh et al.
Year: 2019
