NVIDIA Launches Nemotron 3 Super: Open 120B-Param Agentic AI Model with 5× Throughput and 1M-Token Context
NVIDIA just dropped something that’s going to matter for anyone building real agentic AI systems. Nemotron 3 Super is a 120-billion-parameter open-weight model, but here’s the key detail that separates it from the crowd: only 12 billion of those parameters are active at inference time, thanks to a hybrid Mamba-Transformer Mixture-of-Experts (MoE) architecture. The result? Five times higher throughput than comparably sized models, plus a one-million-token context window that changes how agents can actually operate in the wild. ...
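To make the "120 billion total, 12 billion active" idea concrete, here is a minimal Python sketch of the arithmetic and of generic top-k expert routing, the mechanism MoE layers commonly use to run only a few experts per token. The two parameter counts come from the article. Everything else (the function names, the four-expert toy router, the choice of k = 2, and the softmax over the selected experts) is an illustrative assumption, not Nemotron 3 Super's actual configuration or NVIDIA's implementation.

```python
import math

# Figures quoted in the article.
TOTAL_PARAMS = 120e9
ACTIVE_PARAMS = 12e9


def active_fraction(total_params: float, active_params: float) -> float:
    """Share of the model's weights that take part in one forward pass."""
    return active_params / total_params


def top_k_route(router_logits: list[float], k: int) -> list[tuple[int, float]]:
    """Pick the k highest-scoring experts and normalize their weights.

    Hypothetical toy router: real MoE layers differ in details such as
    load balancing, shared experts, and how weights are normalized.
    """
    # Indices of the k largest logits, highest first.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i],
                 reverse=True)[:k]
    # Softmax restricted to the selected experts (numerically stable form).
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"Active share of parameters: {frac:.0%}")

    # Assumed toy example: 4 experts, only 2 run for this token.
    logits = [0.1, 2.0, -1.0, 1.5]
    for expert, weight in top_k_route(logits, k=2):
        print(f"expert {expert} runs with weight {weight:.3f}")
```

The point of the sketch is that the experts a token is not routed to contribute no compute for that token. That is how a model can store far more parameters than it uses on any single forward pass.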