allenai/dolma3_dolmino_mix-100B-1025
Text Generation | EN | odc-by
allenai/dolma3_dolmino_mix-100B-1025 is an English-language text generation dataset from allenai, distributed in Parquet format under the odc-by license. It falls in the 10M<n<100M size category and has been downloaded 24.6K times.
About allenai/dolma3_dolmino_mix-100B-1025
Dolma 3 Dolmino Mix (100B)
Dolma 3 Dolmino Mix (100B) is the mixture of high-quality data used for the second stage of training of the Olmo 3 7B model.
Dataset Sources
Source | Category | Tokens | Documents
TinyMATH Mind | Math ...
Details
- Task: Text Generation
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Size: 10M<n<100M
- Creator: allenai
- Year: 2025
- License: odc-by
- Downloads: 24582
- Likes: 10
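Because the dataset is published as Parquet on the Hugging Face Hub, it can be inspected without downloading all of it. The following is a minimal sketch, not an official loading recipe: it assumes the Hugging Face `datasets` library is installed and that the repository exposes a default configuration with a `train` split, neither of which this page confirms. The helper name `peek` is hypothetical.

```python
from itertools import islice

# Repository id as given on this page.
REPO_ID = "allenai/dolma3_dolmino_mix-100B-1025"


def peek(n=3, split="train"):
    """Return the first n records by streaming, without a full download.

    Assumption: the repository has a default config with the given split.
    """
    # Imported lazily so that defining this helper does not require the library.
    from datasets import load_dataset

    # streaming=True yields records lazily instead of materializing the dataset.
    stream = load_dataset(REPO_ID, split=split, streaming=True)
    return list(islice(stream, n))
```

Calling `peek()` and printing the keys of each returned record is a quick way to discover the column names, which are not listed here.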