KALIMAT Multipurpose Arabic Corpus
Summarization · Named Entity Recognition (NER) · Part-of-Speech (POS) · Arabic
Created by El-Haj et al. in 2013, the KALIMAT Multipurpose Arabic Corpus is an Arabic summarization dataset containing 20,291 records in text format.
About KALIMAT Multipurpose Arabic Corpus
The dataset contains 20,291 Arabic articles collected from the Omani newspaper Alwatan. It also includes extractive single-document and multi-document system summaries, as well as versions of the articles annotated with named entities. The articles fall into six categories: culture, economy, local-news, international-news, religion, and sports.
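As a rough illustration of working with the six categories, the sketch below counts articles per category. It assumes a hypothetical local layout with one subfolder per category, each holding UTF-8 `.txt` files; the actual distribution of the corpus may be organised differently, and the function names here are illustrative only.

```python
from pathlib import Path

# The six categories named in the dataset description.
CATEGORIES = (
    "culture",
    "economy",
    "local-news",
    "international-news",
    "religion",
    "sports",
)


def count_articles_by_category(root: Path) -> dict[str, int]:
    """Count .txt files in each category subfolder of `root`.

    Assumption: articles are stored as root/<category>/<name>.txt.
    A missing category folder is reported as zero articles.
    """
    counts: dict[str, int] = {}
    for category in CATEGORIES:
        folder = root / category
        if folder.is_dir():
            counts[category] = sum(1 for _ in folder.glob("*.txt"))
        else:
            counts[category] = 0
    return counts


def read_article(path: Path) -> str:
    """Read one article, decoding explicitly as UTF-8 so Arabic text is preserved."""
    return path.read_text(encoding="utf-8")
```

Calling `count_articles_by_category(Path("kalimat"))` on a folder laid out this way returns a dictionary keyed by category name, and the values can be summed to check them against the stated 20,291 articles.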
Details
- Task: Summarization, Named Entity Recognition (NER), Part-of-Speech (POS)
- Language: Arabic
- Format: Text
- Rows / instances: 20,291
- Creator: El-Haj et al.
- Year: 2013