A model family designed for local execution
On 16 July 2024 Hugging Face released the SmolLM family: three small language models (135 million, 360 million and 1.7 billion parameters) explicitly designed to run on edge devices, laptops and consumer hardware, without depending on dedicated GPUs.
The family fits into a technical trend that prioritises efficiency: rather than increasing model size, SmolLM invests in dataset quality to achieve competitive performance with compact architectures. All models were released under the Apache 2.0 licence.
The SmolLM-Corpus dataset
The training dataset, SmolLM-Corpus, is itself published on Hugging Face and combines three main components: Cosmopedia v2, high-quality synthetic content on educational topics generated by large models; FineWeb-Edu, an education-oriented filtered subset of the FineWeb web corpus; and Python-Edu, a selection of well-documented Python code.
Dataset curation is an integral part of the contribution: alongside the models, Hugging Face published both the dataset and the methodology behind its construction, making the whole process reproducible.
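The idea of blending several curated sources into one training stream can be sketched as weighted sampling over components. The mixing ratios and placeholder documents below are invented for illustration; they are not the actual SmolLM-Corpus proportions.

```python
import random

# Toy stand-ins for the three corpus components; the weights are
# illustrative only, NOT the real SmolLM-Corpus mixture.
sources = {
    "cosmopedia_v2": (["synthetic textbook passage"], 0.5),
    "fineweb_edu":   (["filtered educational web page"], 0.4),
    "python_edu":    (["documented python snippet"], 0.1),
}

def sample_mixture(sources, n, seed=0):
    """Draw n training examples by weighted sampling over the components."""
    rng = random.Random(seed)
    names = list(sources)
    weights = [sources[name][1] for name in names]
    out = []
    for _ in range(n):
        name = rng.choices(names, weights=weights, k=1)[0]
        out.append((name, rng.choice(sources[name][0])))
    return out

mix = sample_mixture(sources, 1000)
```

In a real pipeline each component would be a streamed dataset rather than a list of strings, but the proportional-sampling logic is the same.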
SmolLM2
In October 2024 the second generation, SmolLM2, was released, with the same three sizes (135M, 360M, 1.7B) but an improved data mixture and longer training. The 1.7B SmolLM2 model was trained on approximately 11 trillion tokens, a large budget relative to its size, following the principle that small language models benefit significantly from extended training on high-quality data.
Practical use
SmolLM can be used via transformers, llama.cpp, ONNX Runtime and Core ML. Its small footprint enables in-browser execution (via transformers.js) and deployment on mobile CPUs or NPUs, making SmolLM relevant in contexts where privacy, latency or lack of connectivity requires local inference.
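A minimal local-inference sketch with the transformers library might look as follows. The checkpoint name is an assumption for illustration; substitute whichever SmolLM or SmolLM2 checkpoint you intend to run.

```python
# Assumed checkpoint name; replace with the SmolLM variant you want.
CHECKPOINT = "HuggingFaceTB/SmolLM2-135M"

def generate_locally(prompt: str, max_new_tokens: int = 50) -> str:
    """Run CPU text generation with a small SmolLM checkpoint."""
    # Imported lazily so the module can be loaded without torch installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
    model = AutoModelForCausalLM.from_pretrained(CHECKPOINT)  # CPU by default
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate_locally("Gravity is"))
```

At the 135M scale the download is a few hundred megabytes and generation runs acceptably on a laptop CPU, which is the deployment profile the family targets.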
