Sakana Fugu: multi-agent orchestration as a single model

On 22 June 2026 Sakana AI unveiled Fugu, an orchestrator model that coordinates a pool of frontier LLMs behind a single API. What it is, how it works (TRINITY and Conductor), the reported benchmarks, the anti vendor lock-in thesis, and why it isn't available in the EU/EEA at launch due to GDPR.

AIGovernanceCompliance Sakana AIFuguMulti-agentOrchestrationLLMVendor lock-inGDPR

What Sakana Fugu is

On 22 June 2026 the Japanese lab Sakana AI unveiled Sakana Fugu, described as “a multi-agent system that behaves like a single model”. It is not a new monolithic LLM: it is a family of orchestrator models. Fugu is itself a language model, trained to interpret the user’s request and to dynamically assemble the scaffold of agents needed to solve it, calling different LLMs from a pool as needed, including recursively instances of itself.

All of this sits behind a single OpenAI-compatible API: integrators send the request to one endpoint, and model selection, delegation, verification and final synthesis happen internally. Two variants: Fugu, low-latency for everyday work (code review, chatbots), and Fugu Ultra, tuned for maximum quality on complex, multi-step problems (AI research, cybersecurity analysis, patent investigations). Distribution is subscription-based, with a pay-as-you-go plan for enterprise workloads.

How it works

The technology rests on two research papers presented at ICLR 2026:

  • TRINITY (An Evolved LLM Coordinator): a lightweight, evolutionarily-obtained coordinator that assigns different LLMs the roles of Thinker, Worker and Verifier across multiple turns.
  • The Conductor (Learning to Orchestrate Agents in Natural Language): trained with reinforcement learning to discover on its own coordination strategies expressed in natural language, instead of relying on hand-written workflows.

Training combines large-scale fine-tuning, evolutionary algorithms and RL. The most interesting trait is recursive orchestration: Fugu can re-read its own output and decide whether to adopt a better coordination strategy, without any retraining.

The numbers (as reported by Sakana)

According to the figures the company published in its technical report, Fugu Ultra leads on four coding benchmarks, on CharXiv Reasoning and on Humanity’s Last Exam. Some of the reported values:

  • SWE-Bench Pro: 73.7 (Ultra), 59.0 (Fugu)
  • LiveCodeBench: 93.2 (Ultra), 92.9 (Fugu)
  • GPQA-Diamond: 95.5 for both variants

Sakana claims Fugu Ultra “stands shoulder-to-shoulder” with leading models such as Fable 5 and Mythos Preview. As always with self-reported benchmarks, these figures should be taken with caution and verified independently.

The bet: no single-vendor lock-in

The stated rationale is explicit. Sakana frames Fugu as a hedge against single-vendor dependency: “recent disruptions in the AI landscape have demonstrated the severe risk of single-vendor dependency”. The company cites, verbatim, the “export controls imposed on Anthropic’s Fable and Mythos models”, noting that access “can shift or disappear overnight”. If one provider restricts access, the orchestrator routes the work to other models in the pool.

It is an argument about architectural resilience: moving value from the single model to the ability to coordinate many. The idea is compelling, but as we’ll see, in Europe it runs into a not-so-minor detail.

The European paradox

Here’s the point for anyone operating in Italy and the EU. At launch, Fugu is not available in the EU/EEA: Sakana states it is still working on GDPR compliance and, in the meantime, does not offer the service in EU/EEA member states (the US and UK are already live).

The reason is structural. Model routing is not exposed “by design”: the user does not know which third-party model processes their request and, consequently, where and by whom the data is handled. For a black-box architecture of this kind the GDPR questions are concrete: controller and processors, non-EU transfers, legal basis, transparency towards the data subject. It is paradoxical that a tool designed precisely to reduce dependency on a single vendor arrives, in Europe, with a geographic access restriction.

Our take

Treating multi-model orchestration as a product is a sound idea, and probably one of the directions of the coming months: on many real tasks the gains come more from how models are combined than from the single model. For an Italian or EU company or public body, though, two practical notes apply today:

  • It cannot be used in the EU, and when it can, transparency over routing will be decisive for any GDPR/DPIA assessment.
  • The anti lock-in thesis is correct, but it can be pursued without relying on a black box too: multi-model orchestration managed in-house, with open or self-hosted models where the data requires it, keeps control over routing and processing.

In short: an important technological signal, worth watching; in Europe, for now, more of an architectural prompt than an adoptable tool.

Sources

Need support? Under attack? Service Status
Need support? Under attack? Service Status