a-m-team/AM-DeepSeek-Distilled-40M
Text Generation | ZH, EN | cc-by-nc-4.0
a-m-team/AM-DeepSeek-Distilled-40M is a text-generation dataset in Chinese (ZH) and English (EN), distributed in Parquet format under the cc-by-nc-4.0 license. It falls in the 10M<n<100M size category and has been downloaded about 1.5K times.
About a-m-team/AM-DeepSeek-Distilled-40M
For more open-source datasets, models, and methodologies, please visit our GitHub repository and paper: DeepDistill: Enhancing LLM Reasoning Capabilities via Large-Scale Difficulty-Graded Data Training.
Due to certain constraints, we are only able...
Details
- Task: Text Generation
- Language: ZH, EN
- Format: Parquet
- Rows / instances: N/A
- Size: 10M<n<100M
- Creator: a-m-team
- Year: 2025
- License: cc-by-nc-4.0
- Downloads: 1547
- Likes: 56