allenai/dolma3_dolmino_mix-100B-1125
General NLP | EN | odc-by
Created by allenai in 2025, allenai/dolma3_dolmino_mix-100B-1125 is a General NLP dataset in English (EN), distributed in Parquet format. It has 36.7K downloads and 21 likes, and it is released under the odc-by license.
About allenai/dolma3_dolmino_mix-100B-1125
Dolma 3 Dolmino dataset pool for Olmo 3 stage 2 annealing training
This dataset contains the high-quality pool of data considered for the second stage of Olmo 3 32B.
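Because the pool is large and stored as Parquet, it can help to look at a few rows before committing to a full download. The sketch below assumes the Hugging Face `datasets` package is installed and that the repository exposes a split named "train"; neither is stated on this page. The helper name `peek_rows` is hypothetical.

```python
from itertools import islice

# Dataset identifier as given on this page.
DATASET_ID = "allenai/dolma3_dolmino_mix-100B-1125"


def peek_rows(n=3, split="train"):
    """Return the first n rows by streaming, without a full download."""
    # Imported lazily so this module loads even where `datasets` is absent.
    from datasets import load_dataset

    # Assumption: a split named "train" exists; change `split` if it does not.
    stream = load_dataset(DATASET_ID, split=split, streaming=True)
    # A streaming dataset is iterable, so islice stops after n rows.
    return list(islice(stream, n))
```

Calling `peek_rows(2)` needs network access. The column names are not listed here, so inspecting `rows[0].keys()` on the result is a reasonable first step.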
Dataset Sources
Source | Category
TinyMATH Mind | Math (syn...
Details
- Task: General NLP
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Creator: allenai
- Year: 2025
- License: odc-by
- Downloads: 36691
- Likes: 21