
stanford-oval/ccnews

Text Classification | Question Answering | Text Generation | MULTILINGUAL, AF, AM

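stanford-oval/ccnews is a multilingual dataset (listed language tags: MULTILINGUAL, AF, AM) from stanford-oval, distributed in Parquet format and tagged for text classification, question answering, and text generation.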

About stanford-oval/ccnews

This dataset is the result of processing all WARC files in the CCNews Corpus, from its beginning (2016) to June 2024. The data has been cleaned and deduplicated, and the language of each article has been detected and added. The process is similar to w...

Details

Task
Text Classification, Question Answering, Text Generation
Language
MULTILINGUAL, AF, AM
Format
Parquet
Rows / instances
N/A
Creator
stanford-oval
Year
2024
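
Since the details above list Parquet as the format and a detected language per article, a minimal Python sketch of streaming the data and keeping one language might look like the following. Several things here are assumptions rather than facts from this page: that the dataset loads through the Hugging Face `datasets` library under the ID `stanford-oval/ccnews`, that a `train` split exists, that a configuration name may or may not be required, and that the detected language is stored in a column called `language` with lowercase codes such as `af`. The helper names `stream_ccnews`, `filter_by_language`, and `take` are hypothetical.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List, Optional


def stream_ccnews(config: Optional[str] = None) -> Iterable[Dict[str, Any]]:
    """Lazily stream records from the Hub (needs `pip install datasets` and network).

    Assumptions: the dataset ID, the "train" split, and the optional config
    name are not confirmed by this page.
    """
    # Imported here so the helpers below can be used without the library.
    from datasets import load_dataset

    return load_dataset(
        "stanford-oval/ccnews", name=config, split="train", streaming=True
    )


def filter_by_language(
    records: Iterable[Dict[str, Any]],
    lang_code: str,
    field: str = "language",  # assumed column name for the detected language
) -> Iterator[Dict[str, Any]]:
    """Yield only the records whose language field equals lang_code."""
    for record in records:
        if record.get(field) == lang_code:
            yield record


def take(records: Iterable[Dict[str, Any]], n: int) -> List[Dict[str, Any]]:
    """Materialize at most n records from a (possibly very large) stream."""
    return list(islice(records, n))
```

Under those assumptions, `take(filter_by_language(stream_ccnews(), "af"), 5)` would fetch the first five Afrikaans articles without downloading the whole corpus. Streaming is used because the corpus spans 2016 to June 2024 and is likely too large to load into memory at once.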