
Arabic Reading Comprehension Dataset (ARCD)

Question Answering, Reading Comprehension, Arabic, Benchmark

The Arabic Reading Comprehension Dataset (ARCD) is an Arabic question answering resource released by Mozannar et al. in 2019, comprising ~50,000 examples.

This dataset is used as an LLM benchmark.

About Arabic Reading Comprehension Dataset (ARCD)

The dataset contains 1,395 questions posed by crowdworkers on Wikipedia articles, together with a machine translation of the Stanford Question Answering Dataset (Arabic-SQuAD) containing 48,344 questions, for 49,739 questions in total.

Details

Task: Question Answering, Reading Comprehension
Language: Arabic
Format: JSON
Rows / instances: ~50,000 (1,395 crowdsourced + 48,344 machine-translated)
Creator: Mozannar et al.
Year: 2019
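
As a rough illustration of working with the JSON format, the sketch below counts questions in a SQuAD v1.1-style structure. It assumes the files follow the SQuAD nesting (`data` → `paragraphs` → `qas`), which this page does not state explicitly; the `count_questions` helper and the one-question sample are hypothetical and are not taken from the dataset.

```python
import json
from typing import Any


def count_questions(squad_like: dict[str, Any]) -> int:
    """Count question entries in a SQuAD v1.1-style dictionary.

    Assumes the nesting data -> paragraphs -> qas; adjust if the
    actual ARCD files are laid out differently.
    """
    return sum(
        len(paragraph["qas"])
        for article in squad_like["data"]
        for paragraph in article["paragraphs"]
    )


# Hypothetical one-question sample in the assumed layout (not real ARCD data).
sample = json.loads("""
{
  "data": [
    {
      "title": "example",
      "paragraphs": [
        {
          "context": "بيروت هي عاصمة لبنان.",
          "qas": [
            {
              "id": "example-0",
              "question": "ما هي عاصمة لبنان؟",
              "answers": [{"text": "بيروت", "answer_start": 0}]
            }
          ]
        }
      ]
    }
  ]
}
""")

print(count_questions(sample))

# The two parts described on this page add up to the "~50,000" figure.
arcd_questions = 1_395           # crowdsourced questions
arabic_squad_questions = 48_344  # machine-translated SQuAD questions
print(arcd_questions + arabic_squad_questions)
```

With a real file, the same helper could be applied to the result of `json.load` on the downloaded JSON, provided the layout assumption holds.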

