Tutorial: orchestrating a review agent with Claude Code Dynamic Workflows

Hands-on tutorial: build a multi-agent review workflow in Claude Code with the agent(), pipeline() and parallel() primitives, reproducing the find → adversarial verify → synthesise pattern of the bundled /deep-research command. The feature is in research preview.

Premise: what we are building, and the state of the feature

In late May, alongside claude-opus-4-8, Anthropic introduced Dynamic Workflows in Claude Code: JavaScript scripts that Claude writes and a runtime executes to orchestrate many subagents in the background. We covered the architecture in "Claude Opus 4.8 and Dynamic Workflows"; here we get hands-on and build a review workflow that reuses the pattern of the bundled /deep-research command.

One warning before we start, to be taken literally: Dynamic Workflows are in research preview, not generally available. Primitives, activation keywords and runtime limits may change. The code below is a teaching example, not a stable API: it illustrates the pattern, and should not be copied into production without checking it against the version you have installed.

Prerequisites

The feature requires a recent version of Claude Code and is available on paid plans (on the Pro plan it must be turned on from the Dynamic workflows row in /config), as well as via the API and on the Bedrock, Vertex and Foundry clouds. You do not write the script by hand: you have Claude generate it by describing the task, optionally with the activation keyword. When a run does what you wanted, the script can be saved as a reusable command (/workflows, then the s key).

The /deep-research pattern applied to review

/deep-research fans out searches across several angles, cross-checks the sources and subjects each claim to an adversarial majority vote — independent agents that try to refute the claim — returning a cited report with non-surviving claims already filtered out. Three phases: find → adversarial verify → synthesise.

The same scheme transfers cleanly to a code review. Instead of claims to refute we have findings to confirm: for each review dimension (security, correctness, performance, maintainability) one agent finds the issues, and independent adversarial reviewers, which we informally call skeptics here (not an official term), try to tear them down. Only what holds up survives.

The script

Three primitives do the work. agent() runs a single subagent; with the schema option it forces a structured output validated at the tool-call level, with an automatic retry on mismatch. pipeline() flows items through the stages with no barrier, so one dimension can be at the verify stage while another is still at discovery. parallel(), by contrast, is a barrier: it waits for all tasks to complete.
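Before the workflow script itself, the difference between a barrier and a no-barrier pipeline can be made concrete with a small timing model. This is a standalone sketch in plain JavaScript, not the Dynamic Workflows runtime: finishTimes is a hypothetical helper, the durations are invented, and unlimited concurrency is assumed.

```javascript
// Toy timing model, NOT the Dynamic Workflows runtime: given per-item durations
// for two stages, compute when each item finishes. Assumes unlimited concurrency.
function finishTimes(stage1, stage2, { barrier }) {
  if (stage1.length !== stage2.length) {
    throw new Error("stage1 and stage2 must have one duration per item");
  }
  // With a barrier, no item starts stage 2 until the slowest stage-1 item is done.
  const slowestStage1 = Math.max(...stage1);
  return stage1.map((d1, i) => {
    const stage2Start = barrier ? slowestStage1 : d1;
    return stage2Start + stage2[i];
  });
}

// Hypothetical durations in minutes for four review dimensions.
const findMinutes = [2, 9, 4, 3];
const verifyMinutes = [5, 5, 5, 5];

console.log(finishTimes(findMinutes, verifyMinutes, { barrier: false })); // [ 7, 14, 9, 8 ]
console.log(finishTimes(findMinutes, verifyMinutes, { barrier: true }));  // [ 14, 14, 14, 14 ]
```

In this model the slowest dimension finishes at the same time either way; what the no-barrier form buys is that the other dimensions are verified while the slow one is still at discovery.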

// Multi-dimension review: find -> adversarial majority vote -> synthesise.
// `args` comes from the command invocation (e.g. the diff or the paths).
const dimensions = ["security", "correctness", "performance", "maintainability"];

const findingSchema = {
  type: "object",
  properties: {
    file: { type: "string" },
    line: { type: "number" },
    severity: { type: "string", enum: ["low", "medium", "high"] },
    claim: { type: "string" },
  },
  required: ["file", "claim", "severity"],
};

// Stages 1 + 2 as a pipeline: each dimension flows with no barrier.
const reviewed = await pipeline(dimensions, [
  // Stage 1 — find: one agent looks for findings on a single dimension.
  async (dim) => {
    const findings = await agent({
      prompt: `Review the diff for ${dim} issues. Concrete findings only.`,
      context: args,
      schema: { type: "array", items: findingSchema },
    });
    return { dim, findings };
  },
  // Stage 2 — adversarial verify: 3 independent "skeptics" vote on each finding.
  async ({ dim, findings }) => {
    const survived = [];
    for (const f of findings) {
      const votes = await parallel(
        [0, 1, 2].map(() =>
          agent({
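            // Explicit vote semantics: `upheld` is true only if the refutation attempt fails.
            prompt: "Try to refute this finding. Set upheld to true only if it is real and actionable.",
            context: { finding: f, diff: args },
            // `required` keeps an empty object from passing validation and silently counting as a "no".
            schema: {
              type: "object",
              properties: { upheld: { type: "boolean" } },
              required: ["upheld"],
            },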
          })
        )
      );
      const upheld = votes.filter((v) => v.upheld).length;
      if (upheld >= 2) survived.push({ ...f, votes: upheld });
    }
    return { dim, survived };
  },
]);

// Stage 3 — synthesis: a single agent composes the final cited report.
const report = await agent({
  prompt: "Compose a review report from the surviving findings only, grouped by severity.",
  context: reviewed,
});

The steps, briefly. Stage 1 launches one agent per dimension and collects its findings in the validated schema; stage 2 subjects each finding to three adversarial reviewers in parallel() and keeps only those with at least two votes in favour (the threshold is arbitrary, tune it); stage 3 synthesises. The intermediate results live in script variables (reviewed), not in the context window: that is what lets you process a lot of work without saturating the context.
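Since the threshold is the one knob you are most likely to tune, it can help to pull the vote count out into a small pure function that is easy to test outside a workflow run. A minimal sketch: survives is a hypothetical helper, not part of the Dynamic Workflows API, and the votes are hand-written stand-ins for what the skeptic agents would return.

```javascript
// Hypothetical helper, not part of the Dynamic Workflows API: decide whether a
// finding survives given the skeptics' votes and a tunable threshold.
function survives(votes, threshold = 2) {
  if (!Number.isInteger(threshold) || threshold < 1 || threshold > votes.length) {
    throw new RangeError(`threshold must be an integer between 1 and ${votes.length}`);
  }
  // Count strictly-true votes so a missing or malformed `upheld` never counts in favour.
  const upheld = votes.filter((v) => v.upheld === true).length;
  return { upheld, survived: upheld >= threshold };
}

// Hand-written sample votes standing in for three skeptic agents.
const sampleVotes = [{ upheld: true }, { upheld: false }, { upheld: true }];

console.log(survives(sampleVotes));    // { upheld: 2, survived: true }
console.log(survives(sampleVotes, 3)); // { upheld: 2, survived: false }
```

A threshold of 2 out of 3 is a simple majority; requiring 3 out of 3 trades recall for precision, which may suit noisy dimensions such as maintainability.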

Runtime limits to keep in mind

The runtime enforces hard constraints: up to 16 concurrent agents and a maximum of 1,000 agents per run. There is no mid-run user input: for a sign-off between phases, run each phase as its own workflow. And, crucially, the script has no direct filesystem or shell access; the agents do. The script coordinates, the agents act. The four dimensions above, at four findings each and three votes per finding, already come to 53 agents (4 finders, 48 skeptics, 1 synthesiser). With a large diff the count climbs quickly towards the 1,000-agent cap, so it pays to try first on a narrow slice to gauge the token cost.
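The agent count behind that estimate is simple arithmetic, and it is worth running before a large review. A minimal sketch, assuming one finder per dimension, the same number of findings in every dimension and a single synthesis agent; estimateAgents is a hypothetical helper, and the cap is the figure quoted above, which may change while the feature is in preview.

```javascript
// Per-run cap quoted in the text; it may change while the feature is in preview.
const MAX_AGENTS_PER_RUN = 1000;

// Back-of-envelope count for the find -> verify -> synthesise shape.
// Assumes every dimension yields the same number of findings, which real diffs will not.
function estimateAgents({ dimensions, findingsPerDimension, votesPerFinding }) {
  const finders = dimensions;
  const skeptics = dimensions * findingsPerDimension * votesPerFinding;
  const synthesisers = 1;
  const total = finders + skeptics + synthesisers;
  return { finders, skeptics, synthesisers, total, withinCap: total <= MAX_AGENTS_PER_RUN };
}

console.log(estimateAgents({ dimensions: 3, findingsPerDimension: 4, votesPerFinding: 3 }).total); // 40
console.log(estimateAgents({ dimensions: 4, findingsPerDimension: 4, votesPerFinding: 3 }).total); // 53
console.log(estimateAgents({ dimensions: 4, findingsPerDimension: 100, votesPerFinding: 3 }).withinCap); // false
```

The skeptics term dominates: the count grows with the product of findings and votes, so lowering the number of votes per finding or narrowing the diff is the quickest way to stay under the cap.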

What it means in practice, and the governance angle

For anyone weighing adoption, the point is not the syntactic novelty but where the controls end up. A workflow’s subagents always run in acceptEdits mode and inherit the session’s tool allowlist, regardless of the conversation’s permission mode: file edits are auto-approved, while shell commands, web fetches and MCP tools outside the allowlist can still pause the run. The feature can be disabled at several levels: from /config, with disableWorkflows in settings.json, via an environment variable, or centrally through organization-level managed settings.
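The approval rule described above can be written down as a three-way decision, which is handy when reasoning about what a run could do unattended. This is a toy model only, not Claude Code's implementation: subagentDecision, the tool kinds and the allowlist entry format are all hypothetical labels invented for this example.

```javascript
// Toy model of the rule described in the text, NOT Claude Code's implementation.
// Tool kinds and allowlist entries are hypothetical labels chosen for this example.
function subagentDecision(toolCall, allowlist) {
  // Subagents run in acceptEdits mode, so file edits are approved automatically.
  if (toolCall.kind === "edit") return "auto-approved";
  // Everything else is checked against the allowlist inherited from the session.
  if (allowlist.includes(toolCall.name)) return "allowed";
  // Shell commands, web fetches and MCP tools outside the allowlist can pause the run.
  return "may pause the run";
}

const sessionAllowlist = ["shell:npm test", "web:docs.example.com"];

console.log(subagentDecision({ kind: "edit", name: "edit:src/app.js" }, sessionAllowlist));      // auto-approved
console.log(subagentDecision({ kind: "shell", name: "shell:npm test" }, sessionAllowlist));      // allowed
console.log(subagentDecision({ kind: "shell", name: "shell:rm -rf build" }, sessionAllowlist));  // may pause the run
```

Read this way, the allowlist is the control that decides whether a workflow runs end to end or stalls waiting for approval, which is why it deserves a review before a large run.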

These are exactly the control points that matter when a single command spawns dozens of agents: what they can touch, what gets auto-approved, who can shut it all down. It is the thesis we at noze call Secure Governance, and it sits behind Admina, the Open Source AI-governance framework we sponsor: a codified review workflow is repeatable and legible, but it is governing the orchestration — allowlist, approval mode, kill switches — that makes it adoptable in a regulated context.

Links: Orchestrate subagents at scale with dynamic workflows — Claude Code Docs · Introducing dynamic workflows in Claude Code
