The Nemotron-4 340B release
On 14 June 2024, NVIDIA released Nemotron-4 340B, a family of language models with 340 billion parameters published in three coordinated variants: Base (the pre-trained model), Instruct (aligned to follow instructions) and Reward (a classifier used in RLHF pipelines). Pre-training covered approximately 9 trillion tokens, with a data mix balancing code, web text and multilingual content.
The decision to release the reward model at the same time was unusual: it made explicit the tool with which NVIDIA evaluates responses during fine-tuning, and it lets third parties reuse that tool in their own preference-modelling pipelines without having to train a classifier from scratch.
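To make the reuse concrete, here is a minimal sketch of how a reward model slots into a preference pipeline: score several candidate responses and emit a (chosen, rejected) pair, the format consumed by DPO/RLHF-style preference training. The `reward_score` function below is a toy stand-in, not the real Nemotron-4 340B Reward API; a real pipeline would replace it with a call to the served reward model.

```python
def reward_score(prompt: str, response: str) -> float:
    # Placeholder scorer (word overlap plus a small length bonus).
    # In practice this would be a query to the reward model.
    overlap = len(set(prompt.lower().split()) & set(response.lower().split()))
    return overlap + 0.01 * len(response.split())

def build_preference_pair(prompt: str, candidates: list[str]) -> dict:
    """Rank candidates by reward and return the best and worst as a
    (chosen, rejected) pair for preference training."""
    ranked = sorted(candidates, key=lambda r: reward_score(prompt, r), reverse=True)
    return {"prompt": prompt, "chosen": ranked[0], "rejected": ranked[-1]}

pair = build_preference_pair(
    "Explain what a reward model does.",
    [
        "A reward model scores how good a response is to a prompt.",
        "No idea.",
    ],
)
```

The same scoring loop also supports best-of-n sampling at inference time: generate n responses and return only the top-scored one.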
NVIDIA Open Model License
Nemotron is distributed under the NVIDIA Open Model License, which permits commercial use of both the weights and the model outputs, explicitly including the generation of synthetic data for training other models. There are some caveats on acceptable use and attribution, but none of the non-commercial clauses or user-count thresholds typical of other "open weights" licences.
The declared commercial positioning is consistent with this choice: NVIDIA pitches Nemotron primarily as a synthetic-data generator for training downstream models, rather than as an end-user conversational assistant. The reward model's strong showing on public benchmarks such as RewardBench at the time of release supported this positioning.
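The synthetic-data workflow the release describes pairs the two aligned models: the Instruct model drafts responses and the Reward model filters them, so that only high-scoring (prompt, response) pairs reach the downstream training set. The sketch below illustrates that loop under stated assumptions; `generate` and `score` are hypothetical placeholders for real calls to the Instruct and Reward models.

```python
def generate(prompt: str, n: int) -> list[str]:
    # Stand-in for sampling n responses from the Instruct model.
    return [f"draft {i} for: {prompt}" for i in range(n)]

def score(prompt: str, response: str) -> float:
    # Stand-in for the Reward model; returns a scalar quality score.
    return float(len(response))

def distill(prompts: list[str], n: int = 4, threshold: float = 0.0) -> list[dict]:
    """For each prompt, keep the best of n sampled responses if it
    clears the threshold; survivors form a synthetic SFT dataset."""
    dataset = []
    for p in prompts:
        best = max(generate(p, n), key=lambda r: score(p, r))
        if score(p, best) >= threshold:
            dataset.append({"prompt": p, "response": best})
    return dataset

data = distill(["What is RLHF?"])
```

Raising `threshold` trades dataset size for quality, which is the central tuning knob in this kind of reward-filtered distillation.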
Evolution of the family
In October 2024, NVIDIA published Llama-3.1-Nemotron-70B-Instruct, a more compact model obtained by fine-tuning Llama 3.1 70B with the Nemotron reward pipeline, which achieved notable results on alignment and instruction-following benchmarks. The family evolved while keeping the same philosophy: providing quality models together with the infrastructural components (reward models, synthetic datasets) needed to train further models on NVIDIA's ecosystem. NVIDIA's goal is not to compete head-on with proprietary API providers, but to consolidate its position as a platform for the entire training pipeline.
