google-research-datasets/tydiqa

Question Answering | AR, BN, EN | Benchmark | apache-2.0

google-research-datasets/tydiqa is a question answering benchmark dataset published by google-research-datasets and tagged with the languages AR, BN, and EN. It contains 240,544 records in Parquet format, is distributed under the apache-2.0 license, falls in the 100K<n<1M size category, and has been downloaded about 3K times (2,967).

This dataset is used as an LLM benchmark.

About google-research-datasets/tydiqa

Dataset Card for "tydiqa"

Dataset Summary

TyDi QA is a question answering dataset covering 11 typologically diverse languages with 204K question-answer pairs. The languages of TyDi QA are diverse with regard to their typology -- the ...

Details

Task: Question Answering
Language: AR, BN, EN
Format: Parquet
Rows / instances: 240,544
Size: 100K<n<1M
Creator: google-research-datasets
Year: 2022
License: apache-2.0
Downloads: 2,967
Likes: 38
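
As a rough illustration of how the dataset described above might be loaded, the sketch below uses the Hugging Face `datasets` library. This page does not document any loading procedure, so several things here are assumptions: that the `datasets` package is installed, that the repository exposes a configuration named `secondary_task` (the other configuration is assumed to be `primary_task`), and that a `train` split exists. The helper name `load_tydiqa` is hypothetical.

```python
# Minimal sketch, not an official loading recipe.
# Assumptions: the `datasets` package is installed; the config name
# "secondary_task" and the split name "train" exist for this repository.

DATASET_ID = "google-research-datasets/tydiqa"  # identifier from this page


def load_tydiqa(config="secondary_task", split="train", streaming=True):
    """Return the requested split of TyDi QA (hypothetical helper).

    With streaming=True the records are read lazily, so the full
    Parquet files do not have to be downloaded up front.
    """
    # Imported lazily so this module can be imported without the dependency.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, config, split=split, streaming=streaming)


# Example usage (requires network access):
#   ds = load_tydiqa()
#   first = next(iter(ds))
#   print(sorted(first.keys()))
```

Streaming is chosen here only to keep a first look at the data cheap; pass `streaming=False` to download and cache the split instead.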