1.5 billion Words Arabic Corpus

1.5 billion Words Arabic Corpus is an Arabic text corpus from El-khair et al. with about 5M records in XML format.

About 1.5 billion Words Arabic Corpus

The data were collected from newspaper articles published by ten major news sources in eight Arab countries over a period of fourteen years.

Details

Task: Text Corpora
Language: Arabic
Format: XML
Rows / instances: 5M
Creator: El-khair et al.
Year: 2016