ByteDance/MTVQA

Visual Question Answering | Image To Text | MULTILINGUAL, AR, DE | cc-by-nc-4.0

Created by ByteDance in 2024, ByteDance/MTVQA is a multilingual visual question answering dataset (language tags: MULTILINGUAL, AR, DE) containing 8,794 records in Parquet format. It has 333 downloads and 42 likes. It is released under the cc-by-nc-4.0 license and falls in the 1K<n<10K size category.

About ByteDance/MTVQA

Dataset Card: The dataset is oriented toward visual question answering on multilingual text scenes in nine languages: Korean, Japanese, Italian, Russian, German, French, Thai, Arabic, and Vietnamese. The question-answer pairs are labe...

Details

Task: Visual Question Answering, Image To Text
Language: MULTILINGUAL, AR, DE
Format: Parquet
Rows / instances: 8,794
Size: 1K<n<10K
Creator: ByteDance
Year: 2024
License: cc-by-nc-4.0
Downloads: 333
Likes: 42
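
The details above can be checked programmatically once the data is loaded. The following is a minimal sketch, not an official loader. It assumes the dataset is published on the Hugging Face Hub under the identifier ByteDance/MTVQA and that the `datasets` library is installed. The helper names (`load_mtvqa`, `total_rows`, `matches_listing`) are hypothetical and introduced only for illustration. No split or column names are assumed, since the listing does not state them.

```python
from typing import Mapping

DATASET_ID = "ByteDance/MTVQA"  # identifier as given in the listing
LISTED_ROWS = 8794              # row count as given in the listing


def load_mtvqa():
    """Load every available split of the dataset.

    Assumption: the dataset is hosted on the Hugging Face Hub under
    DATASET_ID. Requires `pip install datasets` and network access.
    """
    # Imported lazily so the pure helpers below work without the library.
    from datasets import load_dataset
    return load_dataset(DATASET_ID)


def total_rows(split_sizes: Mapping[str, int]) -> int:
    """Sum the row counts over all splits."""
    return sum(split_sizes.values())


def matches_listing(split_sizes: Mapping[str, int]) -> bool:
    """Check whether the combined split sizes equal the listed row count."""
    return total_rows(split_sizes) == LISTED_ROWS


# Example usage (needs network access, so it is left commented out):
# ds = load_mtvqa()
# sizes = {name: split.num_rows for name, split in ds.items()}
# print(sizes, matches_listing(sizes))
```

Because the license is cc-by-nc-4.0, any use of the loaded data should stay non-commercial and include attribution.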