NovaSky-AI/Sky-T1_data_17k
General NLP, English
The NovaSky-AI/Sky-T1_data_17k dataset is an English general NLP resource published by NovaSky-AI in 2025.
About NovaSky-AI/Sky-T1_data_17k
Sky-T1_data_17k.json: The 17k training data used to train Sky-T1-32B-Preview. The final data contains 5k coding data from APPs and TACO, and 10k math data from AIME, MATH, and Olympiads subsets of the NuminaMATH dataset. In addition, we maintain 1...
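Because the row count is not listed on this page, a reader may want to inspect the data directly. The following is a minimal sketch, assuming the dataset is hosted on the Hugging Face Hub under the repository name in the title and exposes a split named "train" (both are assumptions, not confirmed by this page). The helper names `load_rows` and `summarize` are hypothetical and exist only for illustration.

```python
from collections.abc import Iterable, Mapping

# Repository name taken from the page title; assumed to be a Hugging Face Hub ID.
DATASET_ID = "NovaSky-AI/Sky-T1_data_17k"


def load_rows(split: str = "train"):
    """Download the dataset from the Hub (requires network access).

    Assumption: the repository exposes a split named "train".
    """
    # Imported lazily so the rest of the module works without the package
    # installed (pip install datasets).
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=split)


def summarize(rows: Iterable[Mapping]) -> tuple[int, list[str]]:
    """Return (row count, sorted column names) for any iterable of dict-like rows."""
    count = 0
    columns: set[str] = set()
    for row in rows:
        count += 1
        columns.update(row.keys())
    return count, sorted(columns)
```

Calling `summarize(load_rows())` would report the number of rows and the column names actually present, which can then be compared against the composition described above.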
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Creator: NovaSky-AI
- Year: 2025