The first decentralised LLM community
EleutherAI was born in 2020 as an informal research collective formed around the stated goal of openly replicating large language models of the kind made famous by GPT-3. EleutherAI’s work was the first structured attempt, outside major industrial labs, to train competitive LLMs and release them publicly with weights, code and data.
GPT-J-6B
On 9 June 2021 GPT-J-6B was released, a model with 6 billion parameters trained using Mesh Transformer JAX, the distributed library developed for the purpose by Ben Wang and Aran Komatsuzaki. Training was performed on The Pile, an 825 GB dataset curated by EleutherAI itself as an open alternative to the proprietary corpora used by OpenAI.
GPT-J was released under the Apache 2.0 licence, with no usage restrictions. At release it was the largest publicly available autoregressive language model with a permissive licence, and for a period it was the primary reference for anyone wanting to use a GPT-3-like model without accessing OpenAI’s APIs.
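As a sanity check on the headline figure, the 6-billion-parameter count can be roughly reconstructed from the model’s published configuration. The layer count, hidden size and vocabulary size below come from the GPT-J model card, not from this article, and the per-layer formula is a standard approximation that ignores biases and layer norms:

```python
def estimate_params(n_layer: int, d_model: int, vocab: int) -> int:
    """Approximate dense-transformer parameter count, ignoring biases
    and layer norms (a sub-1% correction)."""
    attention = 4 * d_model * d_model   # Q, K, V and output projections
    mlp = 8 * d_model * d_model         # two linear maps with 4x expansion
    per_layer = attention + mlp
    # GPT-J does not tie its input embedding and output head,
    # so the vocabulary matrix is counted twice.
    embeddings = 2 * vocab * d_model
    return n_layer * per_layer + embeddings

# Published GPT-J-6B configuration: 28 layers, hidden size 4096, vocab 50400.
gptj = estimate_params(n_layer=28, d_model=4096, vocab=50400)
print(f"GPT-J-6B \u2248 {gptj / 1e9:.2f}B parameters")  # ≈ 6.05B
```

The estimate lands within one percent of the advertised 6 billion, which is about as close as this kind of back-of-the-envelope count gets.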
GPT-NeoX-20B
On 9 February 2022 EleutherAI released GPT-NeoX-20B, a 20 billion parameter model, again under the Apache 2.0 licence and trained on The Pile. Training was carried out with GPT-NeoX, EleutherAI’s library based on Megatron-LM and DeepSpeed, optimised for large-scale multi-GPU training.
GPT-NeoX-20B was, at the time of release, the largest open-weight dense language model in the world with a permissive licence. The publication included weights, training code, configurations and a detailed description of the hardware infrastructure.
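The same back-of-the-envelope arithmetic works for GPT-NeoX-20B. The configuration values below (44 layers, hidden size 6144, padded vocabulary of 50432) come from the model’s published configuration rather than this article, and the formula again ignores biases and layer norms:

```python
def estimate_params(n_layer: int, d_model: int, vocab: int) -> int:
    """Approximate dense-transformer parameter count with untied
    input/output embeddings, as GPT-NeoX-20B uses."""
    per_layer = 12 * d_model * d_model  # 4*d^2 attention + 8*d^2 MLP
    embeddings = 2 * vocab * d_model    # separate embedding and unembedding
    return n_layer * per_layer + embeddings

# Published GPT-NeoX-20B configuration: 44 layers, hidden size 6144, vocab 50432.
neox = estimate_params(n_layer=44, d_model=6144, vocab=50432)
print(f"GPT-NeoX-20B \u2248 {neox / 1e9:.1f}B parameters")  # ≈ 20.6B
```

The result, roughly 20.6 billion, matches the commonly cited size and shows where the “20B” in the name comes from.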
An infrastructural legacy
EleutherAI’s contribution extends beyond individual models. The Pile, the GPT-NeoX library and the methodological commitment to full disclosure set the de facto standard for subsequent open LLM releases, including Pythia (2023) and many community-derived models.
Link: eleuther.ai
