One agent is powerful. Multiple agents working together are transformative. A code review agent checks for bugs while a test agent writes tests and a documentation agent updates the docs — all in parallel. Multi-agent systems let you break complex tasks into specialized roles.
This is Article 15 in the Claude AI — From Zero to Power User series. You should have completed Article 14: Building AI Agents before this article.
By the end of this article, you will know how to orchestrate multiple Claude agents using Agent Teams, programmatic patterns, and the Claude Flow framework.
Why Multi-Agent?
A single agent can do a lot. But some tasks are better split across multiple agents:
- Specialization — Each agent focuses on one thing and does it well
- Parallelism — Multiple agents work simultaneously, reducing total time
- Reliability — If one agent fails, others continue
- Quality — Agents with competing hypotheses produce better results through debate
When to Use Multi-Agent
| Scenario | Single Agent | Multi-Agent |
|---|---|---|
| Fix one bug | Good | Overkill |
| Review and test a PR | Possible but slow | Faster (parallel review + test) |
| Full feature: code + tests + docs | Slow, context overflow | Fast (specialized agents) |
| Research task with many sources | Possible | Better (parallel research) |
| Complex debugging | Good | Better (competing hypotheses) |
Agent Teams in Claude Code
Claude Code has a built-in multi-agent feature called Agent Teams. It is experimental and requires Opus 4.6.
Enabling Agent Teams
```bash
# Set the experimental flag
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=true

# Start Claude Code
claude
```
Or add it to your Claude Code settings:
```json
{
  "experimentalAgentTeams": true
}
```
How Agent Teams Work
When enabled, Claude Code can spawn teammate agents to work on subtasks in parallel.
Team Lead — The main Claude Code instance. It:
- Reads your request
- Decides which subtasks need separate agents
- Spawns teammates
- Coordinates results
Teammates — Independent Claude instances. They:
- Get their own context window
- Work independently on their subtask
- Report results back to the team lead
Example: Using Agent Teams
You: "Add pagination to the user list endpoint, write tests, and update the API documentation"
Claude Code (team lead) might:
- Spawn teammate 1: “Add pagination to GET /users endpoint”
- Spawn teammate 2: “Write pagination tests in tests/test_users.py”
- Spawn teammate 3: “Update API docs for pagination in docs/api.md”
- Wait for all teammates to finish
- Integrate and verify the results
Teammates vs Subagents
| Feature | Teammates | Subagents |
|---|---|---|
| Context | Independent (own window) | Shared with parent |
| Parallelism | True parallel execution | Sequential |
| Communication | Through team lead | Direct |
| Best for | Independent tasks | Dependent tasks |
| Model | Same as team lead | Can be different |
Use teammates when tasks are independent. Use subagents (sequential tool calls) when tasks depend on each other.
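This independent-vs-dependent split is easy to express directly in asyncio, without any agent framework. A minimal sketch, where the hypothetical `run_step` coroutine stands in for a real agent call:

```python
import asyncio

async def run_step(name: str, delay: float = 0.01) -> str:
    # Stand-in for an agent call; sleeps to simulate work.
    await asyncio.sleep(delay)
    return f"{name}: done"

async def orchestrate() -> list[str]:
    # Independent subtasks ("teammates"): run them concurrently.
    code, tests, docs = await asyncio.gather(
        run_step("write code"),
        run_step("write tests"),
        run_step("update docs"),
    )
    # Dependent subtask ("subagent"-style): it needs the results above first,
    # so it must run after the gather completes.
    review = await run_step(f"review [{code}, {tests}, {docs}]")
    return [code, tests, docs, review]

results = asyncio.run(orchestrate())
for line in results:
    print(line)
```

The pattern generalizes: anything you can `gather` is teammate territory; anything that consumes another step's output belongs in sequence.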
Building Multi-Agent Systems with the Agent SDK
For programmatic control, use the Agent SDK to build your own multi-agent systems.
Pattern 1: Fan-Out / Fan-In
Multiple agents work in parallel, then results are combined.
Python
```python
import asyncio

from claude_agent_sdk import Agent, AgentConfig
from claude_agent_sdk.tools import FileReadTool, GrepTool


async def fan_out_review(files: list[str]) -> dict:
    """Review multiple files in parallel with specialized agents."""
    # Define specialized agents
    security_agent = Agent(AgentConfig(
        model="claude-sonnet-4-6",
        max_tokens=4096,
        tools=[FileReadTool(), GrepTool()],
        system_prompt="""You are a security reviewer. Focus ONLY on:
- SQL injection, XSS, CSRF vulnerabilities
- Authentication and authorization issues
- Sensitive data exposure
- Input validation gaps
Ignore style, performance, and non-security issues."""
    ))

    performance_agent = Agent(AgentConfig(
        model="claude-sonnet-4-6",
        max_tokens=4096,
        tools=[FileReadTool(), GrepTool()],
        system_prompt="""You are a performance reviewer. Focus ONLY on:
- N+1 queries
- Missing indexes
- Unnecessary loops
- Memory leaks
- Blocking operations in async code
Ignore security, style, and non-performance issues."""
    ))

    logic_agent = Agent(AgentConfig(
        model="claude-sonnet-4-6",
        max_tokens=4096,
        tools=[FileReadTool(), GrepTool()],
        system_prompt="""You are a logic reviewer. Focus ONLY on:
- Bugs and logic errors
- Edge cases not handled
- Off-by-one errors
- Null/undefined handling
- Error handling gaps
Ignore security, performance, and style issues."""
    ))

    file_list = "\n".join(f"- {f}" for f in files)
    prompt = f"Review these files:\n{file_list}"

    # Run all three agents in parallel
    security_task = security_agent.run(prompt)
    performance_task = performance_agent.run(prompt)
    logic_task = logic_agent.run(prompt)

    security_result, performance_result, logic_result = await asyncio.gather(
        security_task, performance_task, logic_task
    )

    return {
        "security": security_result.text,
        "performance": performance_result.text,
        "logic": logic_result.text
    }


# Use it
async def main():
    results = await fan_out_review([
        "src/auth/login.py",
        "src/api/users.py",
        "src/db/queries.py"
    ])
    print("=== Security Review ===")
    print(results["security"])
    print("\n=== Performance Review ===")
    print(results["performance"])
    print("\n=== Logic Review ===")
    print(results["logic"])

asyncio.run(main())
```
TypeScript
```typescript
import { Agent } from "@anthropic-ai/claude-agent-sdk";
import { FileReadTool, GrepTool } from "@anthropic-ai/claude-agent-sdk/tools";

async function fanOutReview(files: string[]) {
  const securityAgent = new Agent({
    model: "claude-sonnet-4-6",
    maxTokens: 4096,
    tools: [new FileReadTool(), new GrepTool()],
    systemPrompt: "You are a security reviewer. Focus ONLY on security issues.",
  });

  const performanceAgent = new Agent({
    model: "claude-sonnet-4-6",
    maxTokens: 4096,
    tools: [new FileReadTool(), new GrepTool()],
    systemPrompt:
      "You are a performance reviewer. Focus ONLY on performance issues.",
  });

  const logicAgent = new Agent({
    model: "claude-sonnet-4-6",
    maxTokens: 4096,
    tools: [new FileReadTool(), new GrepTool()],
    systemPrompt:
      "You are a logic reviewer. Focus ONLY on bugs and logic errors.",
  });

  const fileList = files.map((f) => `- ${f}`).join("\n");
  const prompt = `Review these files:\n${fileList}`;

  const [security, performance, logic] = await Promise.all([
    securityAgent.run(prompt),
    performanceAgent.run(prompt),
    logicAgent.run(prompt),
  ]);

  return {
    security: security.text,
    performance: performance.text,
    logic: logic.text,
  };
}

// Top-level await assumes an ES module context
const results = await fanOutReview([
  "src/auth/login.ts",
  "src/api/users.ts",
]);
console.log("Security:", results.security);
console.log("Performance:", results.performance);
console.log("Logic:", results.logic);
```
Pattern 2: Pipeline
Agents work in sequence, each building on the previous agent’s output.
Python
```python
import asyncio

from claude_agent_sdk import Agent, AgentConfig
from claude_agent_sdk.tools import FileReadTool, FileWriteTool, ShellTool


async def pipeline_feature(feature_description: str):
    """Three agents work in sequence: plan → implement → test."""
    # Agent 1: Planning
    planner = Agent(AgentConfig(
        model="claude-sonnet-4-6",
        max_tokens=4096,
        tools=[FileReadTool()],
        system_prompt="""You are a technical architect. Given a feature request:
1. Read the relevant code files
2. Create a step-by-step implementation plan
3. List the files to create or modify
4. List the test cases needed
Output a clear, structured plan."""
    ))
    plan_result = await planner.run(feature_description)
    print(f"Plan:\n{plan_result.text}\n")

    # Agent 2: Implementation
    implementer = Agent(AgentConfig(
        model="claude-sonnet-4-6",
        max_tokens=8192,
        tools=[FileReadTool(), FileWriteTool()],
        system_prompt="""You are a senior developer. Implement the feature according to the plan provided.
- Follow existing code patterns
- Add proper error handling
- Include type hints and docstrings"""
    ))
    impl_result = await implementer.run(
        f"Implement this feature:\n\n{feature_description}\n\nPlan:\n{plan_result.text}"
    )
    print(f"Implementation:\n{impl_result.text}\n")

    # Agent 3: Testing
    tester = Agent(AgentConfig(
        model="claude-sonnet-4-6",
        max_tokens=8192,
        tools=[FileReadTool(), FileWriteTool(), ShellTool(allowed_commands=["pytest"])],
        system_prompt="""You are a test engineer. Write comprehensive tests for the implemented feature.
- Unit tests for each function
- Edge cases and error scenarios
- Integration tests if needed
- Run the tests and fix any failures"""
    ))
    test_result = await tester.run(
        f"Write and run tests for:\n\n{feature_description}\n\nImplementation summary:\n{impl_result.text}"
    )
    print(f"Testing:\n{test_result.text}\n")

    return {
        "plan": plan_result.text,
        "implementation": impl_result.text,
        "tests": test_result.text
    }


asyncio.run(pipeline_feature(
    "Add a rate limiting middleware that limits each user to 100 requests per minute"
))
```
TypeScript
```typescript
import { Agent } from "@anthropic-ai/claude-agent-sdk";
import {
  FileReadTool,
  FileWriteTool,
  ShellTool,
} from "@anthropic-ai/claude-agent-sdk/tools";

async function pipelineFeature(featureDescription: string) {
  const planner = new Agent({
    model: "claude-sonnet-4-6",
    maxTokens: 4096,
    tools: [new FileReadTool()],
    systemPrompt:
      "You are a technical architect. Create a step-by-step implementation plan.",
  });
  const planResult = await planner.run(featureDescription);
  console.log(`Plan:\n${planResult.text}\n`);

  const implementer = new Agent({
    model: "claude-sonnet-4-6",
    maxTokens: 8192,
    tools: [new FileReadTool(), new FileWriteTool()],
    systemPrompt:
      "You are a senior developer. Implement the feature according to the plan.",
  });
  const implResult = await implementer.run(
    `Implement this feature:\n\n${featureDescription}\n\nPlan:\n${planResult.text}`
  );
  console.log(`Implementation:\n${implResult.text}\n`);

  const tester = new Agent({
    model: "claude-sonnet-4-6",
    maxTokens: 8192,
    tools: [new FileReadTool(), new FileWriteTool(), new ShellTool()],
    systemPrompt: "You are a test engineer. Write and run tests for the feature.",
  });
  const testResult = await tester.run(
    `Write and run tests for:\n\n${featureDescription}\n\nSummary:\n${implResult.text}`
  );
  console.log(`Testing:\n${testResult.text}\n`);

  return {
    plan: planResult.text,
    implementation: implResult.text,
    tests: testResult.text,
  };
}

// Top-level await assumes an ES module context
const result = await pipelineFeature(
  "Add a rate limiting middleware that limits each user to 100 requests per minute"
);
```
Pattern 3: Debate / Consensus
Two or more agents analyze the same problem independently, then a judge agent picks the best answer.
Python
```python
import asyncio

from claude_agent_sdk import Agent, AgentConfig
from claude_agent_sdk.tools import FileReadTool


async def debate_debug(bug_description: str) -> str:
    """Two agents independently diagnose a bug, then a judge picks the best analysis."""
    # Two independent investigators
    agent_a = Agent(AgentConfig(
        model="claude-sonnet-4-6",
        max_tokens=4096,
        tools=[FileReadTool()],
        thinking={"type": "enabled", "budget_tokens": 5000},
        system_prompt="You are Debug Agent A. Investigate the bug and propose a root cause and fix."
    ))
    agent_b = Agent(AgentConfig(
        model="claude-sonnet-4-6",
        max_tokens=4096,
        tools=[FileReadTool()],
        thinking={"type": "enabled", "budget_tokens": 5000},
        system_prompt="You are Debug Agent B. Investigate the bug and propose a root cause and fix."
    ))

    # Run both in parallel
    result_a, result_b = await asyncio.gather(
        agent_a.run(bug_description),
        agent_b.run(bug_description)
    )

    # Judge picks the best analysis
    judge = Agent(AgentConfig(
        model="claude-opus-4-6",  # Use Opus for the judge
        max_tokens=4096,
        thinking={"type": "enabled", "budget_tokens": 3000},
        system_prompt="""You are a senior engineer judging two bug analyses.
Compare both analyses and:
1. Pick the more likely root cause
2. Explain why you picked it
3. Combine the best suggestions from both"""
    ))
    judgment = await judge.run(
        f"""Bug: {bug_description}

Analysis A:
{result_a.text}

Analysis B:
{result_b.text}

Which analysis is more likely correct? Explain your reasoning."""
    )
    return judgment.text


async def main():
    result = await debate_debug(
        "The user search endpoint returns 0 results when searching for names "
        "with special characters like O'Brien"
    )
    print(result)

asyncio.run(main())
```
TypeScript
```typescript
import { Agent } from "@anthropic-ai/claude-agent-sdk";
import { FileReadTool } from "@anthropic-ai/claude-agent-sdk/tools";

async function debateDebug(bugDescription: string): Promise<string> {
  const agentA = new Agent({
    model: "claude-sonnet-4-6",
    maxTokens: 4096,
    tools: [new FileReadTool()],
    thinking: { type: "enabled", budgetTokens: 5000 },
    systemPrompt:
      "You are Debug Agent A. Investigate the bug and propose a root cause and fix.",
  });
  const agentB = new Agent({
    model: "claude-sonnet-4-6",
    maxTokens: 4096,
    tools: [new FileReadTool()],
    thinking: { type: "enabled", budgetTokens: 5000 },
    systemPrompt:
      "You are Debug Agent B. Investigate the bug and propose a root cause and fix.",
  });

  const [resultA, resultB] = await Promise.all([
    agentA.run(bugDescription),
    agentB.run(bugDescription),
  ]);

  const judge = new Agent({
    model: "claude-opus-4-6",
    maxTokens: 4096,
    thinking: { type: "enabled", budgetTokens: 3000 },
    systemPrompt:
      "You are a senior engineer judging two bug analyses. Pick the better one and explain why.",
  });
  const judgment = await judge.run(
    `Bug: ${bugDescription}\n\nAnalysis A:\n${resultA.text}\n\nAnalysis B:\n${resultB.text}\n\nWhich is more likely correct?`
  );
  return judgment.text;
}

// Top-level await assumes an ES module context
const result = await debateDebug(
  "The user search endpoint returns 0 results for names with special characters like O'Brien"
);
console.log(result);
```
The debate pattern is excellent for complex bugs where the root cause is not obvious. Two independent investigations are more likely to find the real issue than one.
Communication Between Agents
In multi-agent systems, agents need to share information. Here are the main approaches:
1. Shared Context (via Orchestrator)
The orchestrator passes results from one agent to another:
```python
result_a = await agent_a.run("Analyze the codebase structure")
result_b = await agent_b.run(f"Based on this analysis:\n{result_a.text}\nWrite tests")
```
2. Shared Files
Agents read and write to a shared workspace:
```python
# Agent A writes a plan
await planner.run("Write the implementation plan to docs/plan.md")

# Agent B reads the plan
await implementer.run("Read docs/plan.md and implement the feature")
```
3. Message Queue (for Complex Systems)
For production multi-agent systems, use a message queue:
```python
import json

import redis

r = redis.Redis()

# Agent A publishes results
async def agent_a_callback(result):
    r.publish("agent_results", json.dumps({
        "agent": "security_review",
        "result": result.text
    }))

# Orchestrator subscribes to results
pubsub = r.pubsub()
pubsub.subscribe("agent_results")
```
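Redis is one option; the same publish-and-collect pattern can be prototyped in-process with `asyncio.Queue` before you introduce a broker. Everything in this sketch (`agent_worker`, the agent names, the stand-in results) is illustrative:

```python
import asyncio
import json

async def agent_worker(name: str, queue: asyncio.Queue) -> None:
    # Each agent publishes its result onto the shared queue as JSON.
    result = f"{name} findings"  # stand-in for real agent output
    await queue.put(json.dumps({"agent": name, "result": result}))

async def orchestrator() -> dict:
    queue: asyncio.Queue = asyncio.Queue()
    agents = ["security_review", "test_review", "logic_review"]
    # Launch all agent workers concurrently.
    await asyncio.gather(*(agent_worker(a, queue) for a in agents))
    # Collect exactly one message per agent.
    collected = {}
    for _ in agents:
        msg = json.loads(await queue.get())
        collected[msg["agent"]] = msg["result"]
    return collected

results = asyncio.run(orchestrator())
print(sorted(results))
```

Swapping the queue for Redis pub/sub later changes the transport, not the orchestration logic.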
Resource Management
Multiple agents running in parallel can be expensive. Monitor and control resources:
Token Budget
```python
class BudgetTracker:
    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens
        self.used_tokens = 0

    def can_proceed(self, estimated_tokens: int) -> bool:
        return self.used_tokens + estimated_tokens <= self.max_tokens

    def record(self, tokens: int):
        self.used_tokens += tokens


# Set a total budget for the multi-agent task
budget = BudgetTracker(max_tokens=500_000)  # ~$7.50 with Sonnet

async def run_with_budget(agent, prompt):
    if not budget.can_proceed(estimated_tokens=10_000):
        raise Exception("Token budget exceeded")
    result = await agent.run(prompt)
    budget.record(result.total_tokens)
    return result
```
Parallel Cost Estimates
| Agents | Task | Parallel Cost (Sonnet 4.6) |
|---|---|---|
| 3 | Code review (fan-out) | $0.10-0.30 |
| 3 | Plan → Implement → Test (pipeline) | $0.20-0.60 |
| 2 + judge | Debug debate | $0.15-0.50 |
| 5 | Full feature (plan + impl + test + docs + review) | $0.50-2.00 |
Multi-agent tasks are more expensive than single-agent tasks but much faster and often higher quality.
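For a rough pre-flight estimate of runs like those in the table, token math is enough. The per-million-token prices below are placeholder assumptions, not authoritative pricing; check the current price list before relying on them:

```python
# Placeholder per-million-token prices in USD -- verify against current pricing.
PRICES = {
    "claude-sonnet-4-6": {"input": 3.00, "output": 15.00},
    "claude-opus-4-6": {"input": 15.00, "output": 75.00},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Rough USD cost for one agent run."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Three fan-out reviewers, each reading ~20k tokens and writing ~2k tokens
total = 3 * estimate_cost("claude-sonnet-4-6", 20_000, 2_000)
print(f"Estimated fan-out cost: ${total:.2f}")
```

Under these assumed prices the example lands at $0.27, within the fan-out range in the table; a helper like this pairs well with the `BudgetTracker` above.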
Claude Flow Framework
Claude Flow is an open-source framework for building multi-agent systems with Claude. It provides pre-built orchestration patterns.
```bash
# Install Claude Flow
npm install claude-flow
```
Example: Research Pipeline
```typescript
import { Flow, Pipeline } from "claude-flow";

const flow = new Flow();

// Define agents
const researcher = flow.agent({
  name: "researcher",
  model: "claude-sonnet-4-6",
  systemPrompt: "You are a research agent. Search the web and summarize findings.",
  tools: ["web_search"],
});

const writer = flow.agent({
  name: "writer",
  model: "claude-sonnet-4-6",
  systemPrompt: "You are a technical writer. Write clear, structured reports.",
});

const reviewer = flow.agent({
  name: "reviewer",
  model: "claude-sonnet-4-6",
  systemPrompt: "You are an editor. Review for accuracy, clarity, and completeness.",
});

// Define the pipeline
const pipeline = new Pipeline([
  { agent: researcher, task: "Research {{topic}}" },
  { agent: writer, task: "Write a report based on: {{previous}}" },
  { agent: reviewer, task: "Review this report: {{previous}}" },
]);

// Run it
const result = await pipeline.run({ topic: "Claude API best practices in 2026" });
console.log(result.final);
```
Claude Flow handles:
- Agent lifecycle management
- Inter-agent communication
- Error recovery
- Token tracking
- Timeout management
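If you orchestrate agents by hand rather than through a framework, per-agent timeouts are straightforward with `asyncio.wait_for`. A generic sketch, with a hypothetical `slow_agent` coroutine standing in for a hung agent call:

```python
import asyncio

async def run_with_timeout(coro, seconds: float, fallback: str = "TIMED OUT"):
    """Await a coroutine, returning a fallback value if it exceeds the deadline."""
    try:
        return await asyncio.wait_for(coro, timeout=seconds)
    except asyncio.TimeoutError:
        # wait_for cancels the underlying task on timeout.
        return fallback

async def slow_agent() -> str:
    await asyncio.sleep(10)  # simulates a hung agent
    return "finished"

result = asyncio.run(run_with_timeout(slow_agent(), seconds=0.05))
print(result)
```

In a fan-out, wrap each `agent.run(prompt)` this way so one stalled teammate cannot block the whole gather.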
Error Handling in Multi-Agent Systems
When one agent fails, you need a strategy:
1. Fail Fast
```python
# If any agent fails, stop everything
results = await asyncio.gather(
    agent_a.run(prompt),
    agent_b.run(prompt),
    agent_c.run(prompt)
)  # Raises on first failure
```
2. Continue on Failure
```python
# Continue even if some agents fail
results = await asyncio.gather(
    agent_a.run(prompt),
    agent_b.run(prompt),
    agent_c.run(prompt),
    return_exceptions=True
)

# Process results, handling failures
for i, result in enumerate(results):
    if isinstance(result, Exception):
        print(f"Agent {i} failed: {result}")
    else:
        print(f"Agent {i}: {result.text}")
```
3. Retry with Fallback
```python
async def run_with_retry(agent, prompt, retries=2):
    for attempt in range(retries):
        try:
            return await agent.run(prompt)
        except Exception:
            if attempt == retries - 1:
                # Fallback to a simpler model
                fallback = Agent(AgentConfig(model="claude-haiku-4-5", ...))
                return await fallback.run(prompt)
```
Real-World Example: PR Review Bot
A complete multi-agent system that reviews pull requests:
```python
import asyncio

from claude_agent_sdk import Agent, AgentConfig
from claude_agent_sdk.tools import FileReadTool, GrepTool


async def review_pr(changed_files: list[str]) -> str:
    """Multi-agent PR review with specialized reviewers."""
    file_list = "\n".join(f"- {f}" for f in changed_files)

    # Parallel review agents
    agents = {
        "security": Agent(AgentConfig(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            tools=[FileReadTool(), GrepTool()],
            system_prompt="Security reviewer. Find vulnerabilities. Return JSON: {issues: [{file, line, severity, description}]}"
        )),
        "tests": Agent(AgentConfig(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            tools=[FileReadTool(), GrepTool()],
            system_prompt="Test coverage reviewer. Check if changes have adequate tests. Suggest missing tests."
        )),
        "code_quality": Agent(AgentConfig(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            tools=[FileReadTool(), GrepTool()],
            system_prompt="Code quality reviewer. Check for bugs, logic errors, and maintainability issues."
        ))
    }

    # Run all reviews in parallel (gather, rather than awaiting one by one)
    names = list(agents)
    outputs = await asyncio.gather(*(
        agents[name].run(f"Review these changed files:\n{file_list}")
        for name in names
    ))
    results = dict(zip(names, outputs))

    # Summarizer combines all reviews
    summarizer = Agent(AgentConfig(
        model="claude-sonnet-4-6",
        max_tokens=4096,
        system_prompt="Combine multiple code reviews into one clear, actionable summary."
    ))
    summary = await summarizer.run(f"""Combine these reviews into one summary:

Security Review:
{results['security'].text}

Test Coverage Review:
{results['tests'].text}

Code Quality Review:
{results['code_quality'].text}

Format: Start with overall verdict (APPROVE, REQUEST_CHANGES, or COMMENT), then list issues by severity.""")
    return summary.text


# Use it
async def main():
    result = await review_pr([
        "src/api/users.py",
        "src/auth/middleware.py",
        "tests/test_users.py"
    ])
    print(result)

asyncio.run(main())
```
Summary
| Pattern | How It Works | Best For |
|---|---|---|
| Fan-out / Fan-in | Parallel specialized agents | Code review, research |
| Pipeline | Sequential agents, each building on previous | Feature development |
| Debate | Independent analysis + judge | Complex debugging |
| Agent Teams | Claude Code built-in (Opus 4.6) | Interactive development |
| Feature | Details |
|---|---|
| Agent Teams | Experimental, requires Opus 4.6 |
| Programmatic | Agent SDK with asyncio/Promise.all |
| Claude Flow | Open-source orchestration framework |
| Cost | 2-5x single agent cost, but faster and higher quality |
Multi-agent systems are the frontier of AI-powered development. Start with simple fan-out patterns and graduate to complex pipelines as your needs grow.
What’s Next?
This completes the core Claude AI tutorial series on agents and multi-agent systems. In the upcoming articles, we will build real projects: a code review bot, a document processing pipeline, and more.
Continue the series: Claude AI Tutorial — Complete Series