MegaScale (Production)
ByteDance | Peking University | Language modeling/generation
MegaScale (Production) is a language modeling/generation model published by ByteDance and Peking University in 2024, with 530 billion parameters.
About MegaScale (Production)
We present the design, implementation and engineering experience in building and deploying MegaScale, a production system for training large language models (LLMs) at the scale of more than 10,000 GPUs. Training LLMs at this scale brings unprecedented...
Details
- Provider: ByteDance, Peking University
- Task: Language modeling/generation
- Parameters: 530,000,000,000 (530B)
- Released: 2024-02-23
- Open weights: No