An AI agent is not a chatbot. A chatbot answers questions; an agent takes actions. It reads files, writes code, runs commands, queries databases, and makes decisions autonomously. The Claude Agent SDK exposes the same tools that power Claude Code as a programmable library.

This is Article 14 in the Claude AI — From Zero to Power User series. You should have completed Article 8: Tool Use and Article 13: MCP before starting this one.

By the end of this article, you will know how to build agents that plan, act, observe, and repeat — with proper safety controls.


What is an AI Agent?

An AI agent follows a loop:

  1. Observe — Read the current state (files, data, errors)
  2. Plan — Decide what to do next
  3. Act — Execute an action (call a tool, write a file, run a command)
  4. Check — Evaluate the result
  5. Repeat — Continue until the task is done

The key difference from a chatbot: an agent decides what actions to take. You give it a goal, and it figures out the steps.
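
The loop above can be sketched without any SDK at all. The toy below is only an illustration: CounterEnv, observe, plan, act, and run_loop are hypothetical names, and the "planning" is a hard-coded rule rather than a model call. It shows the control flow of observe, plan, act, check, and repeat, with a step limit as a safety stop.

```python
from dataclasses import dataclass, field


@dataclass
class CounterEnv:
    """Toy environment: the 'task' is to raise a counter to a target value."""
    value: int = 0
    target: int = 3
    log: list = field(default_factory=list)


def observe(env: CounterEnv) -> int:
    # Observe: read the current state.
    return env.value


def plan(observed: int, target: int) -> str:
    # Plan: choose the next action from the observation.
    return "increment" if observed < target else "stop"


def act(env: CounterEnv, action: str) -> None:
    # Act: execute the chosen action against the environment.
    if action == "increment":
        env.value += 1
    env.log.append(action)


def run_loop(env: CounterEnv, max_steps: int = 10) -> int:
    """Run observe -> plan -> act -> check until done or max_steps is hit."""
    steps = 0
    while steps < max_steps:
        action = plan(observe(env), env.target)
        if action == "stop":  # Check: the goal is reached, so stop repeating.
            break
        act(env, action)
        steps += 1
    return steps


env = CounterEnv()
print(run_loop(env), env.value)  # prints "3 3"
```

In a real agent the plan step is a model call and the act step is a tool call, but the shape of the loop and the need for a step limit stay the same.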

Agent vs Chatbot

Feature     | Chatbot                  | Agent
------------|--------------------------|------------------------------
Interaction | Question → Answer        | Goal → Autonomous execution
Tools       | Optional                 | Essential
Loops       | Single response          | Multiple tool calls
State       | Stateless (per message)  | Maintains state across steps
Control     | User drives conversation | Agent drives execution

The Claude Agent SDK

The Agent SDK provides:

  • The same tool infrastructure that powers Claude Code
  • Built-in tools for file operations, shell commands, and web search
  • Custom tool definitions
  • Hooks for intercepting and modifying agent behavior
  • Safety controls and permissions

Installation

# Python
pip install claude-agent-sdk

# TypeScript
npm install @anthropic-ai/claude-agent-sdk

Your First Agent

A simple agent that reads a file and answers questions about it.

Python

from claude_agent_sdk import Agent, AgentConfig
from claude_agent_sdk.tools import FileReadTool

# Configure the agent
config = AgentConfig(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    tools=[FileReadTool()],
    system_prompt="You are a code analyst. Read files and answer questions about them."
)

agent = Agent(config)

# Run the agent
async def main():
    result = await agent.run(
        "Read the file app/main.py and tell me what endpoints are defined."
    )
    print(result.text)

import asyncio
asyncio.run(main())

TypeScript

import { Agent, AgentConfig } from "@anthropic-ai/claude-agent-sdk";
import { FileReadTool } from "@anthropic-ai/claude-agent-sdk/tools";

const config: AgentConfig = {
  model: "claude-sonnet-4-6",
  maxTokens: 4096,
  tools: [new FileReadTool()],
  systemPrompt:
    "You are a code analyst. Read files and answer questions about them.",
};

const agent = new Agent(config);

const result = await agent.run(
  "Read the file app/main.py and tell me what endpoints are defined."
);
console.log(result.text);

The agent:

  1. Reads the prompt
  2. Decides to use the FileReadTool to read app/main.py
  3. Analyzes the file contents
  4. Returns a summary of the endpoints

Built-In Tools

The Agent SDK provides several built-in tools:

Tool          | What It Does
--------------|----------------------
FileReadTool  | Read file contents
FileWriteTool | Write or modify files
ShellTool     | Run shell commands
WebSearchTool | Search the web
GlobTool      | Find files by pattern
GrepTool      | Search file contents

Example: Agent with Multiple Built-In Tools

from claude_agent_sdk import Agent, AgentConfig
from claude_agent_sdk.tools import (
    FileReadTool,
    FileWriteTool,
    ShellTool,
    GlobTool,
    GrepTool
)

config = AgentConfig(
    model="claude-sonnet-4-6",
    max_tokens=8192,
    tools=[
        FileReadTool(),
        FileWriteTool(),
        ShellTool(allowed_commands=["pytest", "python", "pip"]),
        GlobTool(),
        GrepTool()
    ],
    system_prompt="""You are a Python developer assistant. You can:
- Read and write files
- Run Python scripts and tests
- Search the codebase

Always run tests after making changes."""
)

agent = Agent(config)

async def main():
    result = await agent.run(
        "Find all Python test files, run them, and fix any failing tests."
    )
    print(result.text)

import asyncio
asyncio.run(main())

This agent:

  1. Uses GlobTool to find **/test_*.py files
  2. Uses ShellTool to run pytest
  3. If tests fail, reads the failing test files
  4. Fixes the code
  5. Runs tests again to verify

Custom Tools

Define your own tools as Python or TypeScript classes that extend Tool. Each tool declares a name, a description, an input schema, and a run method.

Python

from claude_agent_sdk import Agent, AgentConfig
from claude_agent_sdk.tools import Tool
import httpx

class WeatherTool(Tool):
    name = "get_weather"
    description = "Get current weather for a city"
    input_schema = {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"}
        },
        "required": ["city"]
    }

    async def run(self, city: str) -> str:
        async with httpx.AsyncClient() as client:
            response = await client.get(
                "https://api.weatherapi.com/v1/current.json",
                params={"key": "YOUR_API_KEY", "q": city}
            )
            # Surface HTTP errors instead of failing later on a missing key
            response.raise_for_status()
            data = response.json()
            return f"{data['current']['temp_c']}°C, {data['current']['condition']['text']}"

class DatabaseTool(Tool):
    name = "query_database"
    description = "Run a read-only SQL query against the database"
    input_schema = {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "SQL SELECT query"}
        },
        "required": ["query"]
    }

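    async def run(self, query: str) -> str:
        # A prefix check is only a first filter; also connect with a
        # read-only database user so writes are refused by the database itself
        if not query.strip().upper().startswith("SELECT"):
            return "Error: Only SELECT queries are allowed"
        # `db` stands for your own async database client (not defined here)
        result = await db.execute(query)
        return str(result)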

config = AgentConfig(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    tools=[WeatherTool(), DatabaseTool()],
    system_prompt="You are a data analyst with access to weather data and a user database."
)

agent = Agent(config)

TypeScript

import { Agent, AgentConfig, Tool } from "@anthropic-ai/claude-agent-sdk";

class WeatherTool extends Tool {
  name = "get_weather";
  description = "Get current weather for a city";
  inputSchema = {
    type: "object" as const,
    properties: {
      city: { type: "string", description: "City name" },
    },
    required: ["city"],
  };

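  async run({ city }: { city: string }): Promise<string> {
    // Encode the city so spaces and special characters survive in the URL
    const response = await fetch(
      `https://api.weatherapi.com/v1/current.json?key=YOUR_KEY&q=${encodeURIComponent(city)}`
    );
    if (!response.ok) {
      return `Error: weather request failed with status ${response.status}`;
    }
    const data = await response.json();
    return `${data.current.temp_c}°C, ${data.current.condition.text}`;
  }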
}

const config: AgentConfig = {
  model: "claude-sonnet-4-6",
  maxTokens: 4096,
  tools: [new WeatherTool()],
  systemPrompt: "You are a helpful assistant with weather data access.",
};

const agent = new Agent(config);
const result = await agent.run("What is the weather in Berlin and Tokyo?");
console.log(result.text);

The Agent Loop

Under the hood, the agent runs a loop. You can observe each step:

Python

from claude_agent_sdk import Agent, AgentConfig
from claude_agent_sdk.tools import FileReadTool, ShellTool

config = AgentConfig(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    tools=[FileReadTool(), ShellTool()],
)

agent = Agent(config)

async def main():
    async for event in agent.stream("Find and fix the bug in app/utils.py"):
        if event.type == "thinking":
            print(f"[Thinking] {event.text[:100]}...")
        elif event.type == "tool_call":
            print(f"[Tool] {event.tool_name}({event.tool_input})")
        elif event.type == "tool_result":
            print(f"[Result] {event.text[:200]}...")
        elif event.type == "text":
            print(f"[Response] {event.text}")

import asyncio
asyncio.run(main())

Output might look like:

[Thinking] Let me read the file first to understand the code...
[Tool] file_read({"path": "app/utils.py"})
[Result] def calculate_total(items): ...
[Thinking] I see the bug. The loop index is off by one...
[Tool] file_write({"path": "app/utils.py", "content": "..."})
[Result] File written successfully
[Tool] shell({"command": "pytest tests/test_utils.py"})
[Result] 3 passed, 0 failed
[Response] I found and fixed a bug in app/utils.py...
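
If you want to experiment with event handling before wiring up a real agent, a scripted stand-in is enough. The sketch below is an assumption-laden mock: Event, fake_stream, and summarize are hypothetical, and the event fields simply mirror the ones used in the streaming example above. It counts tool calls per tool and keeps the final response.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical event shape, mirroring the fields used above."""
    type: str
    text: str = ""
    tool_name: str = ""


async def fake_stream():
    # Scripted stand-in for a real agent run.
    yield Event("thinking", "Let me read the file first")
    yield Event("tool_call", tool_name="file_read")
    yield Event("tool_result", "def calculate_total(items): ...")
    yield Event("tool_call", tool_name="file_write")
    yield Event("tool_result", "File written successfully")
    yield Event("text", "Fixed the bug.")


async def summarize(stream) -> dict:
    """Count tool calls per tool and capture the final text response."""
    calls: dict = {}
    final = ""
    async for event in stream:
        if event.type == "tool_call":
            calls[event.tool_name] = calls.get(event.tool_name, 0) + 1
        elif event.type == "text":
            final = event.text
    return {"tool_calls": calls, "final": final}


summary = asyncio.run(summarize(fake_stream()))
print(summary)
```

The same summarize function could consume a real event stream, provided the events expose type, text, and tool_name as assumed here.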

Hooks: Intercepting Agent Behavior

Hooks let you intercept and modify agent behavior at key points.

Python

import asyncio

from claude_agent_sdk import Agent, AgentConfig, Hook
from claude_agent_sdk.tools import FileWriteTool, ShellTool

class ApprovalHook(Hook):
    """Ask for human approval before dangerous operations."""

    async def _confirm(self, message: str) -> bool:
        print(f"\n{message}")
        # input() blocks, so run it in a thread to keep the event loop free
        answer = await asyncio.to_thread(input, "Allow? (y/n): ")
        return answer.strip().lower() == "y"

    async def before_tool_call(self, tool_name: str, tool_input: dict) -> bool:
        if tool_name == "shell":
            return await self._confirm(f"Agent wants to run: {tool_input['command']}")
        if tool_name == "file_write":
            return await self._confirm(f"Agent wants to write to: {tool_input['path']}")
        return True  # Allow other tools

class LoggingHook(Hook):
    """Log all tool calls for debugging."""

    async def after_tool_call(self, tool_name: str, tool_input: dict, result: str):
        print(f"[LOG] {tool_name}: input={tool_input}, result_len={len(result)}")

config = AgentConfig(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    tools=[FileWriteTool(), ShellTool()],
    hooks=[ApprovalHook(), LoggingHook()]
)

agent = Agent(config)

Common hook use cases:

  • Approval — Ask human permission before file writes or shell commands
  • Logging — Record all tool calls for audit trails
  • Filtering — Block certain operations (no rm -rf, no write to /etc/)
  • Cost tracking — Count tokens and tool calls per session
  • Rate limiting — Limit tool calls per minute
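
As a sketch of the filtering use case, the decision logic can live in a plain function that a before_tool_call hook would delegate to. Everything here is illustrative: is_allowed, BLOCKED_PROGRAMS, and PROTECTED_PREFIXES are hypothetical names, and the tool names and input keys are assumed to match the earlier hook example.

```python
import shlex

BLOCKED_PROGRAMS = {"rm", "sudo", "curl", "wget"}  # illustrative list
PROTECTED_PREFIXES = ("/etc/",)  # illustrative list


def is_allowed(tool_name: str, tool_input: dict) -> bool:
    """Return False for tool calls that a filtering hook should block."""
    if tool_name == "shell":
        try:
            parts = shlex.split(tool_input.get("command", ""))
        except ValueError:
            return False  # unbalanced quotes: refuse rather than guess
        # Check only the program name (first token), not its arguments.
        return bool(parts) and parts[0] not in BLOCKED_PROGRAMS
    if tool_name == "file_write":
        path = tool_input.get("path", "")
        return not path.startswith(PROTECTED_PREFIXES)
    return True


print(is_allowed("shell", {"command": "pytest -q"}))      # prints True
print(is_allowed("shell", {"command": "rm -rf build"}))   # prints False
print(is_allowed("file_write", {"path": "/etc/hosts"}))   # prints False
```

A blocklist like this is easy to sidestep (for example, bash -c "rm -rf build" passes the first-token check), so treat it as a backstop next to an allowlist, not a replacement for one.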

Agent Permissions

Control what your agent can and cannot do:

config = AgentConfig(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    tools=[FileReadTool(), FileWriteTool(), ShellTool()],
    permissions={
        "file_read": {"allowed_paths": ["./src/", "./tests/"]},
        "file_write": {"allowed_paths": ["./src/", "./tests/"]},
        "shell": {
            "allowed_commands": ["pytest", "python", "pip", "npm", "node"],
            "blocked_commands": ["rm", "sudo", "curl", "wget"]
        }
    }
)

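Always restrict permissions to the minimum your agent needs. A code review agent only needs read access. A test-fixing agent needs read and write access to source and test files, plus the ability to run tests. Keep in mind that allowing a general-purpose interpreter such as python or node lets the agent run arbitrary code, so a blocked_commands list is only a second line of defense.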
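
How a path allowlist is enforced matters: a naive string-prefix comparison accepts paths such as src/../secrets.env. The sketch below shows one way such a check could work, by resolving paths before comparing them. It is not the SDK's implementation; is_path_allowed is a hypothetical helper.

```python
from pathlib import Path


def is_path_allowed(candidate: str, allowed_dirs: list, base: str = ".") -> bool:
    """True if candidate resolves to a location inside one of allowed_dirs."""
    base_path = Path(base).resolve()
    target = (base_path / candidate).resolve()
    for allowed in allowed_dirs:
        root = (base_path / allowed).resolve()
        # Compare resolved paths so "../" segments cannot escape the root.
        if target == root or root in target.parents:
            return True
    return False


allowed = ["./src/", "./tests/"]
print(is_path_allowed("src/app/main.py", allowed))      # prints True
print(is_path_allowed("src/../secrets.env", allowed))   # prints False
print(is_path_allowed("/etc/passwd", allowed))          # prints False
```

Comparing against target.parents rather than using a string prefix also avoids treating a sibling directory such as srcfoo/ as part of src/.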


Error Handling

Agents can fail. Handle errors gracefully:

Python

from claude_agent_sdk import Agent, AgentConfig, AgentError

config = AgentConfig(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    tools=[...],
    max_tool_calls=20,  # Limit total tool calls
    timeout=300,        # 5 minute timeout
)

agent = Agent(config)

async def main():
    try:
        result = await agent.run("Refactor the authentication module")
        print(result.text)
        print(f"Tool calls: {result.tool_call_count}")
        print(f"Tokens used: {result.total_tokens}")

    except AgentError as e:
        if e.type == "max_tool_calls":
            print("Agent hit the tool call limit. Task may be too complex.")
        elif e.type == "timeout":
            print("Agent timed out. Consider increasing the timeout.")
        elif e.type == "rate_limit":
            print("API rate limit hit. Wait and retry.")
        else:
            print(f"Agent error: {e}")

import asyncio
asyncio.run(main())

Best Practices for Error Handling

  1. Set a max_tool_calls limit — Prevents infinite loops
  2. Set a timeout — Prevents agents from running forever
  3. Log all steps — Use hooks to track what the agent does
  4. Handle partial results — If the agent fails mid-task, save what it completed
  5. Retry with context — If the agent fails, retry with error context in the prompt
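
Practice 5 can be sketched as a small wrapper that is independent of any SDK. In this illustration, run_with_retry and flaky_run are hypothetical, and RuntimeError stands in for whatever exception your agent raises; the wrapper appends the previous error to the original prompt before trying again.

```python
import asyncio


async def run_with_retry(run, prompt: str, max_attempts: int = 3) -> str:
    """Call run(prompt); on failure, retry with the error appended to the prompt."""
    current = prompt
    for attempt in range(1, max_attempts + 1):
        try:
            return await run(current)
        except RuntimeError as exc:
            if attempt == max_attempts:
                raise
            # Feed the failure back so the next attempt can avoid repeating it.
            current = f"{prompt}\n\nPrevious attempt failed with: {exc}"
    raise ValueError("max_attempts must be at least 1")


# Hypothetical runner that fails once, then succeeds.
seen_prompts = []


async def flaky_run(prompt: str) -> str:
    seen_prompts.append(prompt)
    if len(seen_prompts) == 1:
        raise RuntimeError("tests failed in test_auth.py")
    return "done"


print(asyncio.run(run_with_retry(flaky_run, "Refactor the authentication module")))
# prints "done"
```

Keeping the original prompt and appending only the latest error stops the prompt from growing with every retry.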

Agent Memory

For longer tasks, agents need to maintain context between steps:

config = AgentConfig(
    model="claude-sonnet-4-6",
    max_tokens=8192,
    tools=[...],
    system_prompt="""You are a code refactoring agent.

Keep a mental note of:
- Files you have read
- Changes you have made
- Tests you have run

After each change, run the relevant tests before proceeding."""
)

For multi-session memory, save the agent’s state to a file:

import json

# Save state after a session
state = {
    "files_modified": result.files_modified,
    "summary": result.text,
    "conversation": result.messages
}

with open(".agent-state.json", "w") as f:
    json.dump(state, f)

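# Load state in the next session
with open(".agent-state.json") as f:
    previous_state = json.load(f)

async def resume():
    # await is only valid inside an async function
    return await agent.run(
        f"Continue the refactoring. Previous progress:\n{previous_state['summary']}"
    )

import asyncio
result = asyncio.run(resume())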
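
Two practical details are worth handling when persisting state: the first session has no state file yet, and a crash during a write can leave a truncated file. A minimal sketch, using only the standard library and hypothetical helper names (load_state, save_state):

```python
import json
import os
import tempfile
from pathlib import Path


def load_state(path: Path) -> dict:
    """Return saved state, or an empty default when no earlier session exists."""
    if not path.exists():
        return {"files_modified": [], "summary": ""}
    return json.loads(path.read_text(encoding="utf-8"))


def save_state(path: Path, state: dict) -> None:
    """Write state to a temp file, then rename, so a crash cannot truncate it."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(state, f)
    os.replace(tmp_name, path)


state_file = Path(tempfile.mkdtemp()) / ".agent-state.json"
print(load_state(state_file))  # first session: the empty default
save_state(state_file, {"files_modified": ["src/auth/login.py"],
                        "summary": "Renamed helpers"})
print(load_state(state_file)["summary"])  # prints "Renamed helpers"
```

Only JSON-serializable values can be stored this way, so conversation objects may need converting to plain dictionaries first.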

Cost Management

Agents can be expensive because they make many API calls. Control costs with:

config = AgentConfig(
    model="claude-sonnet-4-6",      # Use Sonnet, not Opus, for most tasks
    max_tokens=4096,
    max_tool_calls=15,               # Limit iterations
    timeout=120,                     # 2-minute timeout
    thinking={"type": "enabled", "budget_tokens": 3000},  # Limited thinking
)

Rough cost estimates for common agent tasks (Sonnet 4.6). Actual cost depends on how many tokens each step reads and writes:

Task                        | Tool Calls | Approximate Cost
----------------------------|------------|-----------------
Simple file analysis        | 2-3        | $0.01-0.03
Bug fix (single file)       | 5-8        | $0.05-0.15
Multi-file refactoring      | 10-20      | $0.15-0.50
Full feature implementation | 20-50      | $0.50-2.00
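
The arithmetic behind figures like these can be sketched as follows. Input tokens dominate because, without caching, each step resends the whole conversation so far. All numbers below are placeholders chosen for illustration, including the per-million-token prices, which you should replace with current published rates; estimate_cost and cumulative_input_tokens are hypothetical helpers.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_mtok: float, output_price_per_mtok: float) -> float:
    """Cost in dollars, given token counts and prices per million tokens."""
    return (input_tokens * input_price_per_mtok
            + output_tokens * output_price_per_mtok) / 1_000_000


def cumulative_input_tokens(base_context: int, added_per_step: int, steps: int) -> int:
    """Total input tokens when each step resends the whole growing conversation."""
    return sum(base_context + i * added_per_step for i in range(steps))


# Placeholder numbers: an 8-step run that starts at 1,000 tokens of context
# and grows by 500 tokens per step, producing 2,000 output tokens in total.
total_in = cumulative_input_tokens(base_context=1_000, added_per_step=500, steps=8)
print(total_in)  # prints 22000
print(estimate_cost(total_in, 2_000, input_price_per_mtok=3.0, output_price_per_mtok=15.0))
# prints 0.096
```

Because the context is resent at every step, total input tokens grow roughly quadratically with the number of steps, which is why capping max_tool_calls has such a direct effect on cost.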

Safety and Sandboxing

For production agents, run them in isolated environments:

Docker Container

config = AgentConfig(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    tools=[FileReadTool(), FileWriteTool(), ShellTool()],
    sandbox={
        "type": "docker",
        "image": "python:3.12-slim",
        "volumes": {"./workspace": "/app"},
        "network": "none",  # No internet access
        "memory_limit": "512m",
        "cpu_limit": 1
    }
)

Safety Checklist

  1. Sandbox the environment — Docker or VM with limited resources
  2. Restrict file access — Only the working directory
  3. Block network access — Unless the agent needs it
  4. Limit shell commands — Allowlist, not blocklist
  5. Set timeouts — Prevent runaway agents
  6. Log everything — Audit trail for all actions
  7. Human approval for destructive actions — Use hooks
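
Items 1 to 3 of the checklist map onto standard docker run flags. The sketch below only builds the argument list (it does not start a container) and is independent of the SDK's sandbox option; docker_run_args and the workspace path are hypothetical.

```python
def docker_run_args(image: str, workspace: str, command: list,
                    memory: str = "512m", cpus: float = 1.0) -> list:
    """Build a `docker run` argument list for a locked-down, throwaway container."""
    return [
        "docker", "run", "--rm",
        "--network", "none",          # no network access
        "--memory", memory,           # cap RAM
        "--cpus", str(cpus),          # cap CPU
        "-v", f"{workspace}:/app",    # expose only the working directory
        "-w", "/app",
        image, *command,
    ]


args = docker_run_args("python:3.12-slim", "/home/me/workspace", ["pytest", "-q"])
print(" ".join(args))
```

Passing the list to subprocess.run(args, timeout=...) would also cover item 5, since the timeout stops a container that runs too long.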

Real-World Example: Code Review Agent

from claude_agent_sdk import Agent, AgentConfig
from claude_agent_sdk.tools import FileReadTool, GlobTool, GrepTool

config = AgentConfig(
    model="claude-sonnet-4-6",
    max_tokens=8192,
    tools=[FileReadTool(), GlobTool(), GrepTool()],
    thinking={"type": "enabled", "budget_tokens": 5000},
    system_prompt="""You are a code review agent. Review code for:
1. Bugs and logic errors
2. Security vulnerabilities
3. Performance issues
4. Missing error handling

For each issue, provide:
- File and line number
- Severity (critical, warning, info)
- Description
- Suggested fix

Be thorough but do not invent issues."""
)

agent = Agent(config)

async def review_pr(changed_files: list[str]) -> str:
    file_list = "\n".join(f"- {f}" for f in changed_files)
    result = await agent.run(
        f"Review these changed files for issues:\n{file_list}"
    )
    return result.text

# Use it (await must run inside an async function)
async def main():
    review = await review_pr([
        "src/auth/login.py",
        "src/auth/middleware.py",
        "tests/test_auth.py"
    ])
    print(review)

import asyncio
asyncio.run(main())

Summary

Concept        | Details
---------------|-----------------------------------------------------------------------
Agent loop     | Observe → Plan → Act → Check → Repeat
Agent SDK      | claude-agent-sdk (Python), @anthropic-ai/claude-agent-sdk (TypeScript)
Built-in tools | File read/write, shell, web search, glob, grep
Custom tools   | Extend the Tool class with your own functions
Hooks          | Intercept tool calls for approval, logging, filtering
Safety         | Permissions, sandboxing, timeouts, max tool calls
Cost           | Sonnet agent tasks typically cost $0.01-2.00

Agents are the most powerful way to use Claude. They turn Claude from an assistant into an autonomous worker that can complete complex tasks independently.


What’s Next?

In the next article, we will cover Multi-Agent Systems — orchestrating multiple Claude agents that work together on complex tasks.
