A model for commercial use

On 5 May 2023 MosaicML releases MPT-7B (MosaicML Pretrained Transformer), a 7-billion-parameter model under the Apache 2.0 licence. The licence choice is deliberately aimed at enterprises: unlike the original LLaMA (research-only licence; Llama 2 has not yet been released at this point) or Falcon under its first licence, MPT-7B can be used commercially without royalties, subject only to the standard Apache 2.0 attribution and notice conditions.

A larger successor, MPT-30B, follows in June 2023. It is sized so that it can be deployed on a single 80 GB GPU in 16-bit precision.
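
As a rough check on the single-GPU claim, weight memory can be estimated as parameter count times bytes per parameter. The sketch below assumes exactly 30 billion parameters and decimal gigabytes, and it ignores activations and the KV cache, so it gives a lower bound and not a deployment guarantee. The function name is illustrative.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for the model weights alone, in decimal GB."""
    bytes_per_param = bits_per_param / 8
    return n_params * bytes_per_param / 1e9


# Assumption: 30e9 parameters, taken from the model name.
fp16_gb = weight_memory_gb(30e9, 16)
print(f"16-bit weights: {fp16_gb:.0f} GB")
print(f"Headroom on an 80 GB GPU: {80 - fp16_gb:.0f} GB")
```

Under these assumptions the weights take about 60 GB, which leaves roughly 20 GB for everything else at inference time.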

Architectural innovations

MPT brings together several notable technical choices for an Open Source model:

ALiBi (Attention with Linear Biases) as positional encoding — enables extrapolation to context lengths beyond training without re-training
FlashAttention — IO-aware attention implementation, reducing training and inference time
No bias in linear layers and layer norms — improves training stability
EleutherAI GPT-NeoX 20B tokeniser
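
To make the ALiBi item concrete, here is a minimal pure-Python sketch of the bias it adds to attention scores. It follows the formulation in the ALiBi paper for a power-of-two number of heads and is not MPT's own implementation. The function names are illustrative.

```python
import math


def alibi_slopes(n_heads: int) -> list[float]:
    """Per-head slopes: the geometric sequence 2^(-8/n), 2^(-16/n), ..., 2^(-8).

    Assumption: n_heads is a power of two (the simple case in the ALiBi paper).
    """
    if n_heads <= 0 or n_heads & (n_heads - 1):
        raise ValueError("this sketch only handles power-of-two head counts")
    return [2.0 ** (-8.0 * (h + 1) / n_heads) for h in range(n_heads)]


def alibi_bias(n_heads: int, seq_len: int) -> list[list[list[float]]]:
    """Return bias[h][i][j], added to the score of query i against key j.

    Past keys (j <= i) get a penalty proportional to their distance i - j.
    Future keys get -inf, which acts as the causal mask.
    """
    bias = []
    for slope in alibi_slopes(n_heads):
        head = [
            [slope * (j - i) if j <= i else -math.inf for j in range(seq_len)]
            for i in range(seq_len)
        ]
        bias.append(head)
    return bias


bias = alibi_bias(n_heads=8, seq_len=4)
print(bias[0][3])  # [-1.5, -1.0, -0.5, 0.0]
```

Because the bias depends only on the distance between query and key, the same function can be called with a larger `seq_len` at inference time. No learned positional table has to be resized, which is what makes extrapolation beyond the training length possible.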

MPT-7B is trained on 1 trillion tokens at a published cost of around 200,000 USD, showing that a competitive Open Source model is reachable even with a moderate training budget.
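
The figures in this paragraph translate into a per-token cost with simple division. The sketch below uses only the two numbers quoted above. The compute estimate additionally assumes exactly 7e9 parameters and the common 6 x parameters x tokens rule of thumb for training FLOPs. That rule is an approximation and not a figure published by MosaicML.

```python
TOKENS = 1e12        # training tokens, from the text
COST_USD = 200_000   # approximate published cost, from the text
N_PARAMS = 7e9       # assumption: nominal size taken from the model name

usd_per_million_tokens = COST_USD / (TOKENS / 1e6)
approx_train_flops = 6 * N_PARAMS * TOKENS  # rule-of-thumb estimate only

print(f"Cost per million tokens: {usd_per_million_tokens:.2f} USD")
print(f"Approximate training compute: {approx_train_flops:.1e} FLOPs")
```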

Specialised variants

MosaicML releases several fine-tuned MPT variants:

MPT-7B-Instruct — instruction following
MPT-7B-Chat — assistant-style conversation
MPT-7B-StoryWriter-65K+ — context window extended to 65,000 tokens (trained on books), a practical demonstration of ALiBi’s ability to handle long sequences

The base model and StoryWriter are Apache 2.0. Instruct is CC-BY-SA-3.0 and Chat is CC-BY-NC-SA-4.0 (non-commercial use only), in both cases because of fine-tuning dataset constraints.

Significance

MPT marks a significant milestone for the commercial Open Source ecosystem: a model that MosaicML reports as matching LLaMA-7B in quality, released under a clean Apache 2.0 licence, with architectural choices (ALiBi, FlashAttention) that influence subsequent training efforts.

Link: www.mosaicml.com

MPT: MosaicML's Open Source commercial models