tiiuae/falcon-refinedweb
Text Generation · EN
The tiiuae/falcon-refinedweb dataset is an English text generation resource published by tiiuae in 2023.
About tiiuae/falcon-refinedweb
📀 Falcon RefinedWeb
Falcon RefinedWeb is a massive English web dataset built by TII and released under an ODC-By 1.0 license.
See the 📓 paper on arXiv for more details.
RefinedWeb is built through stringent filtering and large-scale deduplicat...
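The description above names two ideas, filtering and deduplication. The following is a minimal sketch of those two ideas only, and it is not the RefinedWeb pipeline: the word-count threshold `MIN_WORDS`, the `normalize` helper, and the exact-match hashing strategy are all assumptions chosen for illustration. The paper describes the actual method.

```python
import hashlib
import re

MIN_WORDS = 5  # hypothetical threshold, chosen only for this illustration


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def filter_and_dedup(docs):
    """Keep documents that pass a toy length filter, dropping exact duplicates."""
    seen = set()  # hashes of normalized documents already kept
    kept = []
    for doc in docs:
        norm = normalize(doc)
        if len(norm.split()) < MIN_WORDS:
            continue  # toy quality filter: document is too short
        digest = hashlib.sha256(norm.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # same normalized text was already kept
        seen.add(digest)
        kept.append(doc)
    return kept


docs = [
    "Falcon RefinedWeb is a massive English web dataset.",
    "falcon  refinedweb is a massive english web dataset.",
    "Click here",
    "A different page with enough words to pass the filter.",
]
cleaned = filter_and_dedup(docs)
print(len(cleaned))
```

Hashing the normalized text catches only copies that are identical after case and whitespace folding; near-duplicate detection at web scale generally needs fuzzy techniques, which this sketch does not attempt.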
Details
- Task: Text Generation
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Creator: tiiuae
- Year: 2023