Cornell Newsroom
Text Corpora, Summarization, English
Cornell Newsroom is an English text corpus for summarization that provides 1.3 million labeled examples distributed in JSON format.
About Cornell Newsroom
The dataset contains 1.3 million articles and summaries written by authors and editors in the newsrooms of 38 major publications. The summaries were obtained from search and social metadata between 1998 and 2017.
Details
- Task: Text Corpora, Summarization
- Language: English
- Format: JSON
- Rows / instances: 1.3M
- Creator: Grusky et al.
- Year: 2018
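Since the examples are distributed as JSON, a minimal sketch of reading article and summary pairs might look like the following. It assumes one JSON object per line and uses the field names "text" and "summary"; both the layout and the field names are assumptions for illustration, not something this page confirms, so check them against the actual files before relying on them.

```python
import json
from typing import Iterable, Iterator


def iter_examples(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty line, assuming one JSON object per line."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


# Hypothetical records: the field names "text" and "summary" are assumptions.
sample_lines = [
    '{"text": "Full article body goes here.", "summary": "Short summary."}',
    '',
    '{"text": "Another article body.", "summary": "Another summary."}',
]

examples = list(iter_examples(sample_lines))

# Collect (article, summary) pairs for a summarization task.
pairs = [(ex["text"], ex["summary"]) for ex in examples]
```

The function accepts any iterable of strings, so an open file handle can be passed in place of sample_lines. Reading line by line keeps memory use low, which matters for a corpus with 1.3M rows.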