KALIMAT Multipurpose Arabic Corpus

Summarization, Named Entity Recognition (NER), Part-of-Speech (POS), Arabic

Created by El-Haj et al. in 2013, the KALIMAT Multipurpose Arabic Corpus is an Arabic summarization dataset containing 20,291 records in plain-text format.

About KALIMAT Multipurpose Arabic Corpus

The dataset contains 20,291 Arabic articles collected from the Omani newspaper Alwatan. It also includes extractive single-document and multi-document system summaries, along with versions of the articles annotated with named entities. The articles fall into six categories: culture, economy, local news, international news, religion, and sports.

Details

Task: Summarization, Named Entity Recognition (NER), Part-of-Speech (POS)
Language: Arabic
Format: Text
Rows / instances: 20,291
Creator: El-Haj et al.
Year: 2013