
Corpus for Knowledge-Enhanced Language Model Pre-training (KELM)

Data-To-Text Generation, English

Created by Agarwal et al. in 2020, the Corpus for Knowledge-Enhanced Language Model Pre-training (KELM) is an English data-to-text generation dataset containing about 18M records in TSV format.

About Corpus for Knowledge-Enhanced Language Model Pre-training (KELM)

The dataset consists of ∼18M sentences spanning ∼45M triples with ∼1,500 distinct relations from English Wikidata.

Details

Task: Data-To-Text Generation
Language: English
Format: TSV
Rows / instances: 18M
Creator: Agarwal et al.
Year: 2020
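
Because the corpus is distributed as TSV, a minimal Python sketch of reading it may be useful. The column layout is an assumption not stated on this page: each row is taken to hold two tab-separated fields, a serialized set of triples followed by the generated sentence. The function name `read_pairs` and the sample row are hypothetical and exist only for illustration.

```python
import csv
import io


def read_pairs(handle):
    """Yield (triples, sentence) tuples from a tab-separated stream.

    Assumes two columns per row; this layout is an assumption,
    not something confirmed by the dataset page.
    """
    # QUOTE_NONE keeps quote characters inside sentences as literal text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 2:
            continue  # skip rows that do not match the assumed layout
        yield row[0], row[1]


# Made-up sample row for illustration only; it is not taken from the corpus.
sample = "Example Entity occupation writer\tExample Entity is a writer.\n"
pairs = list(read_pairs(io.StringIO(sample)))
```

To read a real file, the same function could be given a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` points at the downloaded TSV. Since the corpus has roughly 18M rows, iterating over the generator is likely preferable to collecting everything into a list.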
