From GPT Index to LlamaIndex
The project was born in November 2022 as GPT Index, created by Jerry Liu to address a specific problem: giving language models access to data that exceeds the size of the context window. In the following months the project was renamed LlamaIndex and positioned as a data framework for LLM applications, with a primary focus on the retrieval-augmented generation (RAG) pattern.
The framework is written in Python (with a companion TypeScript version) and released under the MIT licence. LlamaIndex.ai was later founded as the company behind the project; it develops components and commercial services complementary to the open-source framework.
Core abstractions
LlamaIndex organises the data lifecycle of a RAG application around a set of specialised components. Document readers load data from heterogeneous sources (files, APIs, databases, cloud services) and are distributed through LlamaHub, a registry collecting more than one hundred official and community connectors. Documents are transformed into nodes, atomic units of information carrying their own metadata.
Indices organise nodes for retrieval: the vector store index is the most widely used, but the framework also supports keyword indices, hierarchical tree indices and knowledge graph indices. Query engines combine retrieval with generation, applying strategies such as sub-question decomposition, query rewriting and result re-ranking. Agent workflows extend the model to scenarios where multiple retrieval and reasoning steps are orchestrated across several sources.
Version 0.10 and the refactor
On 14 February 2024 LlamaIndex 0.10 was released, positioned as the foundation for a future 1.0. The release introduces a significant refactor: the codebase is split into llama-index-core, which contains the fundamental abstractions and components, and a constellation of integration packages distributed under the llama-index-* namespace. Integrations with vector stores, LLM providers, readers and other third-party components can thus be installed and versioned independently.
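In practice the split looks like this: a minimal install pulls only the core, and each integration is an independently versioned pip package (llama-index-llms-openai and llama-index-vector-stores-chroma are shown as representative examples).

```shell
# Core abstractions only: no bundled LLM or vector store integrations.
pip install llama-index-core

# Individual integrations, installed and pinned independently.
pip install llama-index-llms-openai
pip install llama-index-vector-stores-chroma
```

The umbrella `llama-index` package still exists as a convenience bundle of core plus a default set of integrations, for users who prefer a single install.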
The change addresses the same need that motivated package separation in other frameworks: reducing the installation footprint, enabling granular releases and separating core stability from the fast evolution of integrations.
Adoption
LlamaIndex is today one of the established frameworks for building production RAG applications, particularly when the primary requirement is managing heterogeneous document collections. It is often used alongside other tools in the LLM ecosystem, typically in the role of a data access layer.
Link: llamaindex.ai
