Corpus for Knowledge-Enhanced Language Model Pre-training (KELM)
Data-To-Text Generation, English
Created by Agarwal et al. in 2020, the Corpus for Knowledge-Enhanced Language Model Pre-training (KELM) is an English data-to-text generation dataset containing about 18M records in TSV format.
About Corpus for Knowledge-Enhanced Language Model Pre-training (KELM)
The dataset consists of ∼18M sentences covering ∼45M triples and ∼1,500 distinct relations from English Wikidata.
Details
- Task: Data-To-Text Generation
- Language: English
- Format: TSV
- Rows / instances: 18M
- Creator: Agarwal et al.
- Year: 2020
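
Since the corpus is distributed as TSV with roughly 18M rows, it is usually more practical to stream it than to load it whole. The following is a minimal sketch of that idea. The file name `kelm.tsv` and the helper names `iter_tsv_rows` and `preview` are hypothetical, and no particular column layout is assumed, so each row is returned as a plain list of fields.

```python
import csv
from itertools import islice
from typing import Iterator, List


def iter_tsv_rows(path: str) -> Iterator[List[str]]:
    """Lazily yield each line of a tab-separated file as a list of fields."""
    with open(path, newline="", encoding="utf-8") as handle:
        # QUOTE_NONE treats quotation marks inside sentences as literal text.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            yield row


def preview(path: str, limit: int = 3) -> List[List[str]]:
    """Return at most `limit` rows without reading the rest of the file."""
    return list(islice(iter_tsv_rows(path), limit))


if __name__ == "__main__":
    # "kelm.tsv" is a placeholder path, not the official file name.
    for fields in preview("kelm.tsv"):
        print(fields)
```

Inspecting the first few rows this way is a cheap check of how many columns the file actually has before writing any column-specific parsing.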