Origin and reference paper
AutoGen was released in October 2023 as an open source project developed by Microsoft Research in collaboration with Penn State University and the University of Washington. The work is accompanied by the paper “AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation”, which formalises the conversational model underlying the framework. The code is licensed under MIT, while the documentation is released under Creative Commons Attribution 4.0.
The stated goal is to provide a general framework for building LLM applications in which multiple agents cooperate through structured conversations, with a higher level of explicit control compared to single-agent patterns.
Conversational model
AutoGen organises applications around agent types with distinct roles. The AssistantAgent represents an LLM-backed agent, typically configured with a system prompt and a set of tools. The UserProxyAgent acts as a bridge between the user and the system: it can forward human input, execute code in a sandboxed environment, or delegate execution to registered functions. The GroupChat coordinates multiple agents within a shared conversation, applying a policy to select the next speaker at each turn.
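The coordination pattern can be illustrated with a minimal sketch in plain Python. This does not use the AutoGen library itself: the Agent and GroupChat classes, the stand-in reply functions, and the round-robin selection policy are all simplifications for illustration (the real framework also offers, among others, LLM-driven speaker selection).

```python
# Minimal sketch of the GroupChat coordination pattern: a shared message
# history and a policy that picks the next speaker each turn.
# All names here are illustrative, not the actual AutoGen API.

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    name: str
    # Stand-in for an LLM call: maps the shared history to a reply.
    reply_fn: Callable[[list[str]], str]

@dataclass
class GroupChat:
    agents: list[Agent]
    messages: list[str] = field(default_factory=list)

    def next_speaker(self, turn: int) -> Agent:
        # Round-robin policy; other policies would replace this method.
        return self.agents[turn % len(self.agents)]

    def run(self, task: str, max_turns: int = 4) -> list[str]:
        self.messages.append(f"user: {task}")
        for turn in range(max_turns):
            speaker = self.next_speaker(turn)
            reply = speaker.reply_fn(self.messages)
            # Every reply lands in the shared history, visible to all agents.
            self.messages.append(f"{speaker.name}: {reply}")
        return self.messages

planner = Agent("planner", lambda history: "outline the next step")
coder = Agent("coder", lambda history: "implement the step")
chat = GroupChat(agents=[planner, coder])
history = chat.run("build a report", max_turns=4)
```

The shared history is the key design point: agents do not call each other directly but read from and write to a common conversation, which is what makes the speaker-selection policy a separable concern.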
Code execution is a central feature: a UserProxyAgent can receive Python or bash blocks generated by an AssistantAgent and execute them in an isolated environment, typically a Docker container. The execution result is returned to the conversation as a message, enabling write-execute-debug cycles.
The 0.4 refactor
In January 2025 AutoGen 0.4 was released, a significant rewrite introducing an event-driven, asynchronous architecture. The new version separates the agent messaging protocol from conversational logic, allowing agents to run in distinct processes and enabling the composition of distributed systems. The framework now exposes multiple API layers, from a low-level core to high-level interfaces that preserve conceptual compatibility with previous versions.
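The architectural shift can be illustrated with a small asyncio sketch. The Runtime and Message names are hypothetical, not the actual autogen-core API; the point is that agents communicate through an asynchronous message runtime rather than a shared call stack, so sender and recipient can live in different tasks (or, in the real framework, different processes).

```python
# Sketch of event-driven agent messaging: a runtime routes messages to
# per-agent queues, decoupling delivery from the sender's control flow.
# Runtime, Message, and the agent function are illustrative names only.

import asyncio
from dataclasses import dataclass

@dataclass
class Message:
    sender: str
    recipient: str
    content: str

class Runtime:
    def __init__(self) -> None:
        self.queues: dict[str, asyncio.Queue[Message]] = {}

    def register(self, name: str) -> None:
        # Each registered agent gets its own inbox.
        self.queues[name] = asyncio.Queue()

    async def send(self, msg: Message) -> None:
        # Delivery is asynchronous: the recipient consumes the message
        # whenever its own task gets scheduled.
        await self.queues[msg.recipient].put(msg)

async def echo_agent(name: str, runtime: Runtime, log: list[str]) -> None:
    inbox = runtime.queues[name]
    msg = await inbox.get()
    log.append(f"{name} got: {msg.content}")

async def main() -> list[str]:
    runtime = Runtime()
    runtime.register("worker")
    log: list[str] = []
    task = asyncio.create_task(echo_agent("worker", runtime, log))
    await runtime.send(Message("user", "worker", "ping"))
    await task
    return log

log = asyncio.run(main())
```

Routing through queues rather than direct calls is what makes it possible to move an agent into another process: only the transport behind `send` has to change, not the conversational logic.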
Positioning
AutoGen was one of the first frameworks to treat multi-agent conversation as a first-class application pattern, influencing subsequent projects in the same space. The focus on code execution as part of the conversation makes it particularly suitable for research scenarios, automation of technical tasks and LLM-assisted data analysis, though it requires care in configuring and securing the sandboxed execution environment.
