uonlp/CulturaX

Text Generation, Fill Mask; AF, ALS, AM

Created by uonlp in 2023, uonlp/CulturaX is a text generation dataset in Parquet format, listed for the languages AF, ALS, AM.

About uonlp/CulturaX

CulturaX

Cleaned, Enormous, and Public: The Multilingual Fuel to Democratize Large Language Models for 167 Languages

Dataset Summary

We present CulturaX, a substantial multilingual dataset with 6.3 trillion tokens in 167 lan...

Details

Task: Text Generation, Fill Mask
Language: AF, ALS, AM
Format: Parquet
Rows / instances: N/A
Creator: uonlp
Year: 2023
