Nemotron 3 Ultra
NVIDIA · Language modeling/generation · Question answering · Open weights
Developed by NVIDIA in 2026, Nemotron 3 Ultra is a language modeling/generation model with 550 billion parameters and openly available weights.
About Nemotron 3 Ultra
We introduce Nemotron 3 Ultra, a 550 billion total and 55 billion active parameter Mixture-of-Experts Hybrid Mamba-Attention language model. We pre-trained Nemotron 3 Ultra on 20 trillion text tokens, then extended the context length to 1M tokens, an
LLM pricing & performance
Nemotron 3 Ultra is available via API. Live cost, context, and benchmark data:
- Input (per 1M tokens): $0.60
- Output (per 1M tokens): $2.40
- Context: 1M tokens
- Tokens/sec: not listed
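To make the per-token pricing concrete, here is a minimal sketch that estimates the cost of a single request from the listed input and output rates. The function name `estimate_cost` and the example token counts are hypothetical, and the sketch assumes the listed prices are in US dollars and that billing is strictly linear in token count; actual billing may differ.

```python
from decimal import Decimal

# Listed rates per 1M tokens, taken from the pricing figures above.
# Assumption: the "$" amounts are US dollars.
INPUT_PRICE_PER_M = Decimal("0.60")
OUTPUT_PRICE_PER_M = Decimal("2.40")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request, assuming purely linear per-token billing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / TOKENS_PER_MILLION


# Hypothetical request: a 200,000-token prompt with a 4,000-token reply.
print(estimate_cost(200_000, 4_000))  # 0.1296
```

At these rates an output token costs four times as much as an input token, so a long prompt with a short reply is billed mostly for its input.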
Details
- Provider: NVIDIA
- Task: Language modeling/generation, Question answering
- Parameters: 550 billion
- Released: 2026-06-04
- Open weights: Yes