Qwen: Alibaba Cloud's multilingual models

Alibaba Cloud releases Qwen (通义千问): a model family spanning 0.5B to 72B parameters with native multilingual support and specialised variants for code, maths and vision, released under the Apache 2.0 licence.


A Chinese contribution to the open ecosystem

Alibaba Cloud releases Qwen (通义千问, literally “a thousand questions of universal wisdom”), a family of language models covering sizes from 0.5 to 72 billion parameters. In an ecosystem dominated by models of American and European origin, Qwen represents the most significant Chinese contribution to the open language model landscape, released under the Apache 2.0 licence with no usage restrictions.

The Qwen family stands out for the breadth of its offering: not a single model, but a complete ecosystem with specialised variants for different tasks, all trained on a multilingual corpus that includes Chinese, English and source code in multiple programming languages.

Architecture and multilingual support

The base architecture is a decoder-only Transformer with grouped-query attention (GQA), RMSNorm and SwiGLU, choices in line with the current state of the art for open models. The tokenizer is optimised for native multilingual support: it handles Chinese, English and source code efficiently, without the token-count penalty typical of tokenizers trained predominantly on English text.
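To make the tokenizer point concrete, here is a minimal sketch that compares token counts for English, Chinese and code inputs. It assumes the Hugging Face transformers library and uses Qwen/Qwen1.5-7B as an illustrative checkpoint ID (the article does not name a specific checkpoint):

```python
# Minimal sketch: comparing token counts for English, Chinese and code snippets
# with the Qwen tokenizer. Assumes `transformers` is installed; the model ID
# below is an assumption chosen for illustration.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-7B")

samples = {
    "english": "Large language models are trained on vast text corpora.",
    "chinese": "大型语言模型在海量文本语料库上进行训练。",
    "code":    "def fib(n):\n    return n if n < 2 else fib(n - 1) + fib(n - 2)",
}

for name, text in samples.items():
    ids = tokenizer(text)["input_ids"]
    # Fewer tokens per character generally means the tokenizer segments that
    # kind of input more efficiently.
    print(f"{name:8s} {len(ids):3d} tokens for {len(text)} characters")
```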

The context window supports up to 32,768 tokens, with positional extension techniques that allow longer contexts to be handled at inference time. The 72B model achieves performance competitive with Llama 2 70B on English benchmarks, and significantly better results on Chinese benchmarks.
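As a rough sketch of how that window can be respected in practice (again assuming the transformers library and the same illustrative checkpoint ID), the limit can be read from the model configuration and over-long inputs truncated before generation:

```python
# Minimal sketch: checking a prompt against the advertised 32,768-token window.
# The model ID is an assumption for illustration.
from transformers import AutoConfig, AutoTokenizer

model_id = "Qwen/Qwen1.5-7B"
config = AutoConfig.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

print("max_position_embeddings:", config.max_position_embeddings)

long_document = "lorem ipsum dolor sit amet " * 10_000  # placeholder long input
ids = tokenizer(long_document)["input_ids"]
if len(ids) > config.max_position_embeddings:
    # Keep the most recent part of the document; positional-extension tricks
    # would be needed to go beyond the native window.
    ids = ids[-config.max_position_embeddings:]
print("tokens after truncation:", len(ids))
```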

Specialised variants

Qwen-Code is optimised for source code generation and understanding, trained on an additional corpus of public repositories. Qwen-Math specialises in mathematical reasoning, with performance exceeding larger generalist models on dedicated benchmarks. Qwen-VL (Vision-Language) extends the model to image understanding, accepting combined text and image inputs.
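By way of illustration, prompting a specialised variant works the same way as prompting the base model through the standard transformers generation API. The sketch below uses an illustrative Hub ID for the code variant; the exact checkpoint names should be taken from the official Qwen pages:

```python
# Minimal sketch: code completion with the code-specialised variant.
# The Hub model ID is an assumption for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/CodeQwen1.5-7B"  # assumed ID for the code variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "# Python function that parses an ISO 8601 date string\ndef parse_iso8601(s):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```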

Scale and accessibility

The range of sizes — from 0.5B to 72B parameters — covers scenarios from mobile device deployment to inference on GPU clusters. The smallest models (0.5B, 1.8B) are designed for execution on edge hardware with minimal resources, while the largest models compete with the best available open models. All models are distributed through Hugging Face and ModelScope, with weights available in full-precision and quantised formats.
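As a sketch of the accessibility angle, one of the smaller chat checkpoints can be loaded in 4-bit on modest hardware. This assumes the transformers and bitsandbytes libraries and an illustrative model ID:

```python
# Minimal sketch: loading a small Qwen chat checkpoint in 4-bit to fit modest
# hardware. Assumes `bitsandbytes` is installed; the model ID is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "Qwen/Qwen1.5-1.8B-Chat"  # assumed small chat checkpoint
quant = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=quant, device_map="auto"
)

# "Introduce Tongyi Qianwen in one sentence."
messages = [{"role": "user", "content": "用一句话介绍通义千问。"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```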

Link: qwenlm.github.io
