DeepSeek: the reasoning models challenging Western labs

DeepSeek releases V3 and R1: Mixture-of-Experts models with chain-of-thought reasoning, competitive performance on math and coding benchmarks, open weights and reduced training costs.

Open Source AI · Open Source · DeepSeek · LLM · AI · Reasoning · MoE

A lab from Hangzhou

DeepSeek, founded in 2023 in Hangzhou as a research division of the quantitative fund High-Flyer, enters the AI landscape with an approach different from Western laboratories: high-performance models, open weights, detailed technical documentation and declared training costs a fraction of the competition’s. The publication of DeepSeek-V3 in late December 2024 and DeepSeek-R1 in January 2025 captures the attention of the entire industry.

Mixture-of-Experts: more parameters, less computation

The architecture underpinning DeepSeek-V3 is Mixture-of-Experts (MoE). The model has a high total parameter count — 671 billion — but during each forward pass it activates only a portion of the experts, approximately 37 billion parameters. A routing mechanism decides which experts to activate for each input token, balancing load across experts and specialising them on different content types.
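To make the routing idea concrete, here is a minimal sketch of a top-k Mixture-of-Experts layer in PyTorch. The layer sizes, number of experts and top-k value are illustrative only, not DeepSeek-V3’s actual configuration (which uses fine-grained and shared experts plus its own load-balancing scheme); the point is simply that each token only runs through the experts the router selects.

```python
# Minimal top-k Mixture-of-Experts layer (illustrative sizes, not V3's real config).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Router: one linear layer scoring every expert for every token.
        self.router = nn.Linear(d_model, n_experts)
        # Each expert is a small feed-forward network.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                          # x: (tokens, d_model)
        scores = self.router(x)                    # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)       # normalise over the chosen experts
        out = torch.zeros_like(x)
        # Only the selected experts run for each token: per-token compute scales
        # with the active parameters, not with the total parameter count.
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(16, 512)
print(MoELayer()(tokens).shape)  # torch.Size([16, 512])
```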

The advantage is concrete: performance comparable to a far larger dense model, with a per-token compute cost set by the roughly 37 billion active parameters rather than the 671 billion total. DeepSeek declares a training cost of approximately $5.5 million for the final V3 training run, an order of magnitude lower than estimates for Western frontier models.
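The declared figure can be reproduced as back-of-envelope arithmetic from the numbers in the V3 technical report (taken here as assumptions; check the report for the exact values):

```python
# Back-of-envelope check of the declared V3 training cost.
# Figures assumed from the V3 technical report: total H800 GPU-hours for the
# run and the rental price per GPU-hour used in the cost estimate.
gpu_hours = 2.788e6        # assumed total H800 GPU-hours
price_per_gpu_hour = 2.0   # assumed rental price in USD
cost = gpu_hours * price_per_gpu_hour
print(f"~${cost / 1e6:.2f}M")  # ~$5.58M
```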

R1: chain-of-thought reasoning

DeepSeek-R1 is the reasoning model. Trained with reinforcement learning (RL) to develop chain-of-thought capabilities — explicit step-by-step reasoning — R1 competes with OpenAI’s o1 on mathematics, coding and logical reasoning benchmarks. The model generates a visible intermediate thinking sequence, where it decomposes the problem, evaluates alternative approaches and verifies solution consistency before producing the final answer.
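Because the thinking sequence is emitted as visible text, separating it from the final answer is a matter of simple parsing. The sketch below assumes the `<think>...</think>` convention used by the released R1 checkpoints; the sample completion is invented for illustration.

```python
# Splitting an R1-style completion into the visible reasoning trace and the answer.
import re

completion = (
    "<think>The question asks for 17 * 23. 17 * 20 = 340 and 17 * 3 = 51, "
    "so the product is 391. Check: 391 / 17 = 23. Consistent.</think>"
    "17 * 23 = 391"
)

match = re.match(r"<think>(.*?)</think>(.*)", completion, flags=re.DOTALL)
reasoning, answer = match.group(1).strip(), match.group(2).strip()
print("reasoning:", reasoning)
print("answer:", answer)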

The technical paper documents how reasoning emerges from RL training without explicit supervision on the thinking chain: the model learns to reason because reasoning produces better answers and therefore higher rewards.
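The rewards themselves are largely rule-based rather than learned: the paper describes an accuracy reward on verifiable final answers and a format reward on the thinking structure. The following is a loose sketch in that spirit; the weights and exact checks are illustrative, not the paper’s actual implementation.

```python
# Rule-based reward sketch in the spirit of the R1 recipe (illustrative values).
import re

def reward(completion: str, ground_truth: str) -> float:
    r = 0.0
    m = re.match(r"<think>.*?</think>(.*)", completion, flags=re.DOTALL)
    if m:
        r += 0.5                          # format reward: thinking block is present
        answer = m.group(1).strip()
        if answer == ground_truth.strip():
            r += 1.0                      # accuracy reward: verifiable final answer
    return r

print(reward("<think>340 + 51 = 391</think>391", "391"))  # 1.5
print(reward("391", "391"))                                # 0.0 (no thinking block)
```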

Open weights and impact

Both V3 and R1 are released with open weights under a licence that permits commercial use. DeepSeek also publishes distillations of R1 into smaller models — from 1.5B to 70B parameters — based on Qwen and Llama architectures, making reasoning capabilities accessible on modest hardware.
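The distilled checkpoints are published in standard Hugging Face format, so they can be run locally with the transformers library. The repository name below matches the published distill series but should be verified on the Hub; quantisation or a smaller `max_new_tokens` may be needed on very modest hardware.

```python
# Running a distilled R1 checkpoint locally with Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # verify on the Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "What is 17 * 23? Reason step by step."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```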

The impact goes beyond the model itself. DeepSeek demonstrates that architectural and algorithmic efficiency can partially compensate for limited computational resources, challenging the assumption that the AI frontier is accessible only to those with tens of thousands of latest-generation GPUs.

Link: deepseek.com
