01.AI’s debut
On 6 November 2023, 01.AI — the company founded by Kai-Fu Lee — releases Yi-6B and Yi-34B, the first public versions of its own family of language models. The models are English-Chinese bilingual, with a training data balance designed to make the two languages equally competitive, a characteristic still rare at the time of release for models conceived in a Chinese context but distributed globally.
Yi-34B in particular, despite sitting in the 30-35 billion parameter range, competes with Llama 2 70B on standard English benchmarks, while on Chinese benchmarks (C-Eval, CMMLU) it surpasses several reference models of the period.
Extended context and variants
Shortly after the initial release, 01.AI publishes Yi-34B-200K, a variant with a context window extended to 200,000 tokens. The extension is obtained through continued pretraining on long sequences, with modifications to the RoPE positional embeddings to maintain stability at sequence lengths beyond those seen in the original training. A bilingual model with long context is relevant for document analysis and for summarisation of extended content in both languages.
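The text above does not say which RoPE modification is used. One common approach for long-context variants is to raise the rotary base, which stretches the wavelengths of the positional rotations so that distant positions remain distinguishable. The following is a minimal sketch of that idea only, not 01.AI's actual method: the function names, the head dimension of 128, and both base values are assumptions chosen for illustration.

```python
import math


def rope_inverse_frequencies(head_dim, base):
    """Per-pair rotation frequencies: base^(-2i/d) for i = 0 .. d/2 - 1."""
    return [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]


def longest_wavelength(head_dim, base):
    """Number of positions after which the slowest-rotating pair completes a full turn."""
    return 2 * math.pi / min(rope_inverse_frequencies(head_dim, base))


def rotate_pair(x, y, position, inv_freq):
    """Rotate one (x, y) feature pair by the angle RoPE assigns to `position`."""
    angle = position * inv_freq
    c, s = math.cos(angle), math.sin(angle)
    return (x * c - y * s, x * s + y * c)


# Illustrative values only (assumptions, not taken from the Yi configuration).
HEAD_DIM = 128
ORIGINAL_BASE = 10_000.0
ENLARGED_BASE = 5_000_000.0

for base in (ORIGINAL_BASE, ENLARGED_BASE):
    wavelength = longest_wavelength(HEAD_DIM, base)
    print(f"base={base:>12,.0f}  longest wavelength ~ {wavelength:,.0f} positions")
```

With a larger base, the slowest rotation takes many more positions to wrap around, which is why this kind of change is usually paired with continued pretraining on long sequences, as described above.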
Licence and evolution
The first Yi releases, during 2023, adopt a custom open licence: free for research use, with commercial use permitted subject to registration with 01.AI. This choice, common to other Chinese models of the period, places Yi in an intermediate category between fully permissive licences and those with explicit restrictions.
With Yi-1.5, released in May 2024, 01.AI moves to the Apache 2.0 licence, removing the registration requirement and aligning the family with dominant conventions in the Western open source ecosystem. Yi-1.5 also introduces quality improvements on reasoning and code generation benchmarks, consolidating the family’s position as a robust bilingual option among open-weight models.
Link: 01.ai
