Haystack: deepset's open source RAG framework

Haystack by deepset evolves from traditional RAG pipelines to a component-based framework. Version 2.0 (11 March 2024) introduces YAML pipelines, components and native agent support.

Open SourceAI Open SourceHaystackdeepsetRAGAgenticLLMAI

Origin and company

Haystack was published in 2019 by deepset, a German company founded by Milos Rusic, Malte Pietsch and Timo Möller. The framework was born to address neural search problems: extractive question answering, semantic document retrieval, search pipelines over large volumes of unstructured data. The licence is Apache 2.0, the main language is Python.

In the first generation — Haystack 1.x — the framework proposed the pipeline concept as an ordered graph of components: reader, retriever, generator, ranker. The approach allows composing modular RAG systems, swapping sparse and dense retrievers, and evaluating pipelines systematically through integrated metrics.

Haystack 2.0

On 11 March 2024 Haystack 2.0 was released, a significant architectural refactor. The new version introduces a component-based model: each operation — embedding, retrieval, prompt building, LLM call, parsing — is a component with declaratively typed inputs and outputs. Pipelines are graphs of components connected through input/output attributes, with native YAML serialisation.

Declarative serialisation allows versioning pipelines as textual artifacts, running them in different environments and generating them programmatically. The component-based model also simplifies extension: a custom component is a Python class with a run method and a set of type annotations.

Agents and integrations

Haystack 2.0 introduces native agent support, modelled as components that encapsulate reasoning loops and tool invocation cycles. The framework maintains a wide range of integrations with vector databases — Elasticsearch, OpenSearch, Weaviate, Pinecone, Qdrant, among others — and with LLM providers (OpenAI, Anthropic, Cohere, open source models via Hugging Face).

Positioning

Compared to other frameworks in the RAG ecosystem, Haystack has a longer history in classical neural search, from which it inherits particular attention to evaluation, metrics and specialised retrieval components. The transition to a component-based architecture with YAML serialisation makes it suitable for contexts where pipelines are configuration artifacts managed as code, with review and deploy independent of the application invoking them.

Link: haystack.deepset.ai

Need support? Under attack? Service Status
Need support? Under attack? Service Status