An AI agent is not a chatbot. A chatbot answers questions. An agent takes actions. It reads files, writes code, runs commands, queries databases, and makes decisions — autonomously. The Claude Agent SDK gives you the same tools that power Claude Code, but programmable.
This is Article 14 in the Claude AI — From Zero to Power User series. You should have completed Article 8: Tool Use and Article 13: MCP before this article.
By the end of this article, you will know how to build agents that plan, act, observe, and repeat — with proper safety controls.
What is an AI Agent?
An AI agent follows a loop:
- Observe — Read the current state (files, data, errors)
- Plan — Decide what to do next
- Act — Execute an action (call a tool, write a file, run a command)
- Check — Evaluate the result
- Repeat — Continue until the task is done
The key difference from a chatbot: an agent decides what actions to take. You give it a goal, and it figures out the steps.
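The loop above can be sketched without any SDK at all. In this minimal illustration the model is replaced by a hypothetical `scripted_policy` function, and the tools are plain Python callables over an in-memory `FILES` dictionary; every name here is an assumption made for the example, not part of the Agent SDK.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Action:
    tool: str            # name of the tool to call, or "finish"
    argument: str = ""   # tool input, or the final answer when tool == "finish"


@dataclass
class AgentState:
    goal: str
    observations: list[str] = field(default_factory=list)


def scripted_policy(state: AgentState) -> Action:
    """Stand-in for the model: choose the next action from what was observed so far."""
    if not state.observations:
        return Action("read_file", "notes.txt")
    if len(state.observations) == 1:
        return Action("count_words", state.observations[0])
    return Action("finish", f"The file has {state.observations[-1]} words.")


def run_agent(
    goal: str,
    policy: Callable[[AgentState], Action],
    tools: dict[str, Callable[[str], str]],
    max_steps: int = 10,
) -> str:
    state = AgentState(goal)
    for _ in range(max_steps):                         # Repeat, with a hard step limit
        action = policy(state)                         # Plan
        if action.tool == "finish":
            return action.argument
        result = tools[action.tool](action.argument)   # Act
        state.observations.append(result)              # Check: record the result for the next step
    raise RuntimeError("step limit reached before the task finished")


FILES = {"notes.txt": "agents plan act and observe"}
TOOLS = {
    "read_file": lambda path: FILES[path],
    "count_words": lambda text: str(len(text.split())),
}

answer = run_agent("How many words are in notes.txt?", scripted_policy, TOOLS)
print(answer)
```

A real agent swaps `scripted_policy` for a model call, but the control flow stays the same: the caller supplies the goal and the tools, and the loop decides which tool runs next.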
Agent vs Chatbot
| Feature | Chatbot | Agent |
|---|---|---|
| Interaction | Question → Answer | Goal → Autonomous execution |
| Tools | Optional | Essential |
| Loops | Single response | Multiple tool calls |
| State | Stateless (per message) | Maintains state across steps |
| Control | User drives conversation | Agent drives execution |
The Claude Agent SDK
The Agent SDK provides:
- The same tool infrastructure that powers Claude Code
- Built-in tools for file operations, shell commands, and web search
- Custom tool definitions
- Hooks for intercepting and modifying agent behavior
- Safety controls and permissions
Installation
# Python
pip install claude-agent-sdk
# TypeScript
npm install @anthropic-ai/claude-agent-sdk
Your First Agent
A simple agent that reads a file and answers questions about it.
Python
import asyncio
from claude_agent_sdk import Agent, AgentConfig
from claude_agent_sdk.tools import FileReadTool
# Configure the agent
config = AgentConfig(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    tools=[FileReadTool()],
    system_prompt="You are a code analyst. Read files and answer questions about them."
)
agent = Agent(config)
# Run the agent
async def main():
    result = await agent.run(
        "Read the file app/main.py and tell me what endpoints are defined."
    )
    print(result.text)
asyncio.run(main())
TypeScript
import { Agent, AgentConfig } from "@anthropic-ai/claude-agent-sdk";
import { FileReadTool } from "@anthropic-ai/claude-agent-sdk/tools";
const config: AgentConfig = {
  model: "claude-sonnet-4-6",
  maxTokens: 4096,
  tools: [new FileReadTool()],
  systemPrompt:
    "You are a code analyst. Read files and answer questions about them.",
};
const agent = new Agent(config);
const result = await agent.run(
  "Read the file app/main.py and tell me what endpoints are defined."
);
console.log(result.text);
The agent:
- Reads the prompt
- Decides to use the FileReadTool to read app/main.py
- Analyzes the file contents
- Returns a summary of the endpoints
Built-In Tools
The Agent SDK provides several built-in tools:
| Tool | What It Does |
|---|---|
| FileReadTool | Read file contents |
| FileWriteTool | Write or modify files |
| ShellTool | Run shell commands |
| WebSearchTool | Search the web |
| GlobTool | Find files by pattern |
| GrepTool | Search file contents |
Example: Agent with Multiple Built-In Tools
import asyncio
from claude_agent_sdk import Agent, AgentConfig
from claude_agent_sdk.tools import (
    FileReadTool,
    FileWriteTool,
    ShellTool,
    GlobTool,
    GrepTool
)
config = AgentConfig(
    model="claude-sonnet-4-6",
    max_tokens=8192,
    tools=[
        FileReadTool(),
        FileWriteTool(),
        ShellTool(allowed_commands=["pytest", "python", "pip"]),
        GlobTool(),
        GrepTool()
    ],
    system_prompt="""You are a Python developer assistant. You can:
- Read and write files
- Run Python scripts and tests
- Search the codebase
Always run tests after making changes."""
)
agent = Agent(config)
async def main():
    result = await agent.run(
        "Find all Python test files, run them, and fix any failing tests."
    )
    print(result.text)
asyncio.run(main())
This agent:
- Uses GlobTool to find **/test_*.py files
- Uses ShellTool to run pytest
- If tests fail, reads the failing test files
- Fixes the code
- Runs tests again to verify
Custom Tools
Define your own tools as Python or TypeScript functions.
Python
import httpx
from claude_agent_sdk import Agent, AgentConfig
from claude_agent_sdk.tools import Tool
class WeatherTool(Tool):
    name = "get_weather"
    description = "Get current weather for a city"
    input_schema = {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"}
        },
        "required": ["city"]
    }
    async def run(self, city: str) -> str:
        async with httpx.AsyncClient() as client:
            response = await client.get(
                "https://api.weatherapi.com/v1/current.json",
                params={"key": "YOUR_API_KEY", "q": city}
            )
            # Surface HTTP errors instead of parsing an error body
            response.raise_for_status()
            data = response.json()
        return f"{data['current']['temp_c']}°C, {data['current']['condition']['text']}"
class DatabaseTool(Tool):
    name = "query_database"
    description = "Run a read-only SQL query against the database"
    input_schema = {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "SQL SELECT query"}
        },
        "required": ["query"]
    }
    async def run(self, query: str) -> str:
        # Basic guard only; also connect with a read-only database user
        if not query.strip().upper().startswith("SELECT"):
            return "Error: Only SELECT queries are allowed"
        # Execute query against your database
        # (`db` is your application's async database client, defined elsewhere)
        result = await db.execute(query)
        return str(result)
config = AgentConfig(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    tools=[WeatherTool(), DatabaseTool()],
    system_prompt="You are a data analyst with access to weather data and a user database."
)
agent = Agent(config)
TypeScript
import { Agent, AgentConfig, Tool } from "@anthropic-ai/claude-agent-sdk";
class WeatherTool extends Tool {
  name = "get_weather";
  description = "Get current weather for a city";
  inputSchema = {
    type: "object" as const,
    properties: {
      city: { type: "string", description: "City name" },
    },
    required: ["city"],
  };
  async run({ city }: { city: string }): Promise<string> {
    // Encode the city so spaces and special characters are safe in the URL
    const response = await fetch(
      `https://api.weatherapi.com/v1/current.json?key=YOUR_KEY&q=${encodeURIComponent(city)}`
    );
    if (!response.ok) {
      throw new Error(`Weather API returned HTTP ${response.status}`);
    }
    const data = await response.json();
    return `${data.current.temp_c}°C, ${data.current.condition.text}`;
  }
}
const config: AgentConfig = {
  model: "claude-sonnet-4-6",
  maxTokens: 4096,
  tools: [new WeatherTool()],
  systemPrompt: "You are a helpful assistant with weather data access.",
};
const agent = new Agent(config);
const result = await agent.run("What is the weather in Berlin and Tokyo?");
console.log(result.text);
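The `startswith("SELECT")` check in the DatabaseTool example is easy to get around (a trailing `; DROP TABLE ...`) and it also rejects legitimate reads such as `WITH ... SELECT`. A sturdier approach is to let the database enforce read-only access. The sketch below assumes SQLite and uses its `PRAGMA query_only` setting; the `run_read_only` helper and the `users` table are made up for the example, and other databases would use a read-only role instead.

```python
import sqlite3


def run_read_only(conn: sqlite3.Connection, query: str) -> list[tuple]:
    """Run one statement on a connection that SQLite has switched to query-only mode."""
    conn.execute("PRAGMA query_only = ON")  # writes on this connection now fail
    try:
        # execute() accepts a single statement, so "SELECT 1; DROP TABLE users" is refused
        return conn.execute(query).fetchall()
    except (sqlite3.Error, sqlite3.Warning) as exc:
        raise PermissionError(f"query rejected: {exc}") from exc


# Demo database (hypothetical schema)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'Ada')")
conn.commit()

print(run_read_only(conn, "SELECT name FROM users"))

for bad_query in ("DELETE FROM users", "SELECT 1; DROP TABLE users"):
    try:
        run_read_only(conn, bad_query)
    except PermissionError as exc:
        print(exc)
```

Inside a custom tool, the `PermissionError` message can be returned as the tool result so the agent sees why the query was refused.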
The Agent Loop
Under the hood, the agent runs a loop. You can observe each step:
Python
import asyncio
from claude_agent_sdk import Agent, AgentConfig
from claude_agent_sdk.tools import FileReadTool, ShellTool
config = AgentConfig(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    tools=[FileReadTool(), ShellTool()],
)
agent = Agent(config)
async def main():
    async for event in agent.stream("Find and fix the bug in app/utils.py"):
        if event.type == "thinking":
            print(f"[Thinking] {event.text[:100]}...")
        elif event.type == "tool_call":
            print(f"[Tool] {event.tool_name}({event.tool_input})")
        elif event.type == "tool_result":
            print(f"[Result] {event.text[:200]}...")
        elif event.type == "text":
            print(f"[Response] {event.text}")
asyncio.run(main())
Output might look like:
[Thinking] Let me read the file first to understand the code...
[Tool] file_read({"path": "app/utils.py"})
[Result] def calculate_total(items): ...
[Thinking] I see the bug. The loop index is off by one...
[Tool] file_write({"path": "app/utils.py", "content": "..."})
[Result] File written successfully
[Tool] shell({"command": "pytest tests/test_utils.py"})
[Result] 3 passed, 0 failed
[Response] I found and fixed a bug in app/utils.py...
Hooks: Intercepting Agent Behavior
Hooks let you intercept and modify agent behavior at key points.
Python
import asyncio
from claude_agent_sdk import Agent, AgentConfig, Hook
from claude_agent_sdk.tools import FileWriteTool, ShellTool
class ApprovalHook(Hook):
    """Ask for human approval before dangerous operations."""
    async def before_tool_call(self, tool_name: str, tool_input: dict) -> bool:
        if tool_name == "shell":
            print(f"\nAgent wants to run: {tool_input['command']}")
        elif tool_name == "file_write":
            print(f"\nAgent wants to write to: {tool_input['path']}")
        else:
            return True  # Allow other tools
        # input() blocks, so run it in a worker thread to keep the event loop free
        approval = await asyncio.to_thread(input, "Allow? (y/n): ")
        return approval.strip().lower() == "y"
class LoggingHook(Hook):
    """Log all tool calls for debugging."""
    async def after_tool_call(self, tool_name: str, tool_input: dict, result: str):
        print(f"[LOG] {tool_name}: input={tool_input}, result_len={len(result)}")
config = AgentConfig(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    tools=[FileWriteTool(), ShellTool()],
    hooks=[ApprovalHook(), LoggingHook()]
)
agent = Agent(config)
Common hook use cases:
- Approval: Ask human permission before file writes or shell commands
- Logging: Record all tool calls for audit trails
- Filtering: Block certain operations (no rm -rf, no writes to /etc/)
- Cost tracking: Count tokens and tool calls per session
- Rate limiting: Limit tool calls per minute
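The filtering case can be reduced to a plain function that a `before_tool_call` hook would call and return the result of. The sketch below is independent of the SDK: the tool names (`shell`, `file_write`) and input keys (`command`, `path`) are assumed to match the hook example above, and `is_allowed` is a hypothetical helper.

```python
import posixpath
import shlex
from pathlib import PurePosixPath

BLOCKED_WRITE_ROOTS = (PurePosixPath("/etc"),)


def is_recursive_rm(tokens: list[str]) -> bool:
    """True for rm invoked with -r, -R, -rf, -fr, or --recursive."""
    if not tokens or tokens[0] != "rm":
        return False
    for token in tokens[1:]:
        if token == "--recursive":
            return True
        if token.startswith("-") and not token.startswith("--") and "r" in token.lower():
            return True
    return False


def is_allowed(tool_name: str, tool_input: dict) -> bool:
    """Return False for tool calls the filter should block."""
    if tool_name == "shell":
        try:
            tokens = shlex.split(tool_input.get("command", ""))
        except ValueError:
            return False  # unbalanced quotes: refuse rather than guess
        return not is_recursive_rm(tokens)
    if tool_name == "file_write":
        raw = tool_input.get("path", "")
        if raw.startswith("/"):
            raw = "/" + raw.lstrip("/")  # collapse leading slashes such as //etc
        # normpath removes ".." segments so /tmp/../etc/passwd is seen as /etc/passwd
        target = PurePosixPath(posixpath.normpath(raw))
        return not any(target.is_relative_to(root) for root in BLOCKED_WRITE_ROOTS)
    return True


print(is_allowed("shell", {"command": "pytest -q"}))
print(is_allowed("shell", {"command": "rm -rf build"}))
print(is_allowed("file_write", {"path": "/tmp/../etc/passwd"}))
```

A blocklist like this only catches the patterns it names; wrapping the command in `sh -c` or `sudo` slips past it. Treat it as a second line of defense behind an allowlist, not a replacement for one.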
Agent Permissions
Control what your agent can and cannot do:
from claude_agent_sdk import AgentConfig
from claude_agent_sdk.tools import FileReadTool, FileWriteTool, ShellTool
config = AgentConfig(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    tools=[FileReadTool(), FileWriteTool(), ShellTool()],
    permissions={
        "file_read": {"allowed_paths": ["./src/", "./tests/"]},
        "file_write": {"allowed_paths": ["./src/", "./tests/"]},
        "shell": {
            "allowed_commands": ["pytest", "python", "pip", "npm", "node"],
            "blocked_commands": ["rm", "sudo", "curl", "wget"]
        }
    }
)
Always restrict permissions to the minimum your agent needs. A code review agent only needs read access. A test-fixing agent needs read and write access to source and test files, plus the ability to run tests.
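An `allowed_paths` rule only helps if the check resolves the path first; comparing raw strings lets `src/../../etc/passwd` or `./srcfoo/` slip through. The following is a minimal sketch of such a check, not the SDK's own implementation. The `is_path_allowed` helper and the `/project` base directory are assumptions for illustration.

```python
from pathlib import Path


def is_path_allowed(candidate: str, allowed_dirs: list[str], base: Path) -> bool:
    """True if candidate, resolved against base, stays inside one of allowed_dirs."""
    # resolve() collapses ".." segments and follows symlinks for paths that exist
    target = (base / candidate).resolve()
    for allowed in allowed_dirs:
        root = (base / allowed).resolve()
        if target.is_relative_to(root):  # compares path components, not string prefixes
            return True
    return False


BASE = Path("/project")  # hypothetical project root
ALLOWED = ["./src/", "./tests/"]

for path in ("src/app.py", "src/../../etc/passwd", "srcfoo/app.py", "/etc/passwd"):
    print(path, is_path_allowed(path, ALLOWED, BASE))
```

Comparing resolved path components rather than string prefixes is what keeps `srcfoo/` from matching the `src/` rule.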
Error Handling
Agents can fail. Handle errors gracefully:
Python
import asyncio
from claude_agent_sdk import Agent, AgentConfig, AgentError
config = AgentConfig(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    tools=[...],
    max_tool_calls=20,  # Limit total tool calls
    timeout=300,  # 5 minute timeout
)
agent = Agent(config)
async def main():
    try:
        result = await agent.run("Refactor the authentication module")
        print(result.text)
        print(f"Tool calls: {result.tool_call_count}")
        print(f"Tokens used: {result.total_tokens}")
    except AgentError as e:
        if e.type == "max_tool_calls":
            print("Agent hit the tool call limit. Task may be too complex.")
        elif e.type == "timeout":
            print("Agent timed out. Consider increasing the timeout.")
        elif e.type == "rate_limit":
            print("API rate limit hit. Wait and retry.")
        else:
            print(f"Agent error: {e}")
asyncio.run(main())
Best Practices for Error Handling
- Set a max_tool_calls limit — Prevents infinite loops
- Set a timeout — Prevents agents from running forever
- Log all steps — Use hooks to track what the agent does
- Handle partial results — If the agent fails mid-task, save what it completed
- Retry with context — If the agent fails, retry with error context in the prompt
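"Retry with context" can be sketched as a small wrapper around any async run function. The code below does not use the SDK: `run_with_retry` is a hypothetical helper, and `flaky_run` stands in for an agent call that fails once. It catches `Exception` broadly to stay self-contained; in real code you would narrow that to the error type your agent raises.

```python
import asyncio
from typing import Awaitable, Callable


async def run_with_retry(
    run: Callable[[str], Awaitable[str]],
    prompt: str,
    attempts: int = 3,
    base_delay: float = 1.0,
) -> str:
    """Call run(prompt); after a failure, retry with the error text added to the prompt."""
    current_prompt = prompt
    for attempt in range(attempts):
        try:
            return await run(current_prompt)
        except Exception as exc:  # narrow this to your agent's error type
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the last error
            current_prompt = (
                f"{prompt}\n\nA previous attempt failed with: {exc}\n"
                "Avoid repeating that failure."
            )
            await asyncio.sleep(base_delay * 2 ** attempt)  # exponential backoff
    raise ValueError("attempts must be at least 1")


calls: list[str] = []


async def flaky_run(prompt: str) -> str:
    """Stand-in for agent.run(): fails on the first call, succeeds afterwards."""
    calls.append(prompt)
    if len(calls) == 1:
        raise TimeoutError("agent timed out after 120s")
    return "done"


outcome = asyncio.run(
    run_with_retry(flaky_run, "Refactor the authentication module", base_delay=0.01)
)
print(outcome, len(calls))
```

The second prompt carries the first failure message, which gives the agent a chance to pick a different approach instead of repeating the same steps.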
Agent Memory
For longer tasks, agents need to maintain context between steps:
config = AgentConfig(
    model="claude-sonnet-4-6",
    max_tokens=8192,
    tools=[...],
    system_prompt="""You are a code refactoring agent.
Keep a mental note of:
- Files you have read
- Changes you have made
- Tests you have run
After each change, run the relevant tests before proceeding."""
)
For multi-session memory, save the agent’s state to a file:
import json
# Save state after a session (`result` is the value returned by agent.run())
state = {
    "files_modified": result.files_modified,
    "summary": result.text,
    "conversation": result.messages
}
with open(".agent-state.json", "w") as f:
    json.dump(state, f)
# Load state in the next session
with open(".agent-state.json") as f:
    previous_state = json.load(f)
# Run this inside an async function, as in the earlier examples
result = await agent.run(
    f"Continue the refactoring. Previous progress:\n{previous_state['summary']}"
)
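Two details tend to bite when persisting state this way: the first session has no state file yet, and a crash halfway through a write can leave a truncated file behind. The sketch below handles both with an atomic write and a default state. The `save_state` and `load_state` helpers are hypothetical, and the state keys follow the snippet above.

```python
import json
import os
import tempfile
from pathlib import Path


def save_state(path: Path, state: dict) -> None:
    """Write state atomically: dump to a temp file, then rename it over the target."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        # default=str turns values json cannot encode into strings instead of failing
        json.dump(state, f, indent=2, default=str)
    os.replace(tmp_name, path)


def load_state(path: Path) -> dict:
    """Return the saved state, or an empty starting state if none exists yet."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return {"files_modified": [], "summary": ""}


# Demo in a temporary directory so no real project files are touched
state_path = Path(tempfile.mkdtemp()) / ".agent-state.json"
first = load_state(state_path)
save_state(state_path, {"files_modified": ["src/auth.py"], "summary": "Renamed login()"})
second = load_state(state_path)
print(first["summary"] == "", second["summary"])
```

Because `os.replace` swaps the file in one step, a reader sees either the old state or the new one, never a half-written file.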
Cost Management
Agents can be expensive because they make many API calls. Control costs with:
config = AgentConfig(
    model="claude-sonnet-4-6",  # Use Sonnet, not Opus, for most tasks
    max_tokens=4096,
    max_tool_calls=15,  # Limit iterations
    timeout=120,  # 2-minute timeout
    thinking={"type": "enabled", "budget_tokens": 3000},  # Limited thinking
)
Cost estimates for common agent tasks (Sonnet 4.6):
| Task | Tool Calls | Approximate Cost |
|---|---|---|
| Simple file analysis | 2-3 | $0.01-0.03 |
| Bug fix (single file) | 5-8 | $0.05-0.15 |
| Multi-file refactoring | 10-20 | $0.15-0.50 |
| Full feature implementation | 20-50 | $0.50-2.00 |
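Estimates like these follow from simple arithmetic: each tool call triggers another model request, and each request resends the conversation so far, so input tokens grow with every step. The sketch below makes that explicit. All numbers are placeholders chosen for illustration, including the per-million-token prices; check current pricing before relying on them. It also ignores prompt caching, which would lower the input side.

```python
def estimate_cost(
    tool_calls: int,
    base_input_tokens: int,
    tokens_added_per_call: int,
    output_tokens_per_call: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Estimate USD cost when every request resends the growing conversation."""
    requests = tool_calls + 1  # one request per tool call, plus the final answer
    input_tokens = sum(
        base_input_tokens + i * tokens_added_per_call for i in range(requests)
    )
    output_tokens = requests * output_tokens_per_call
    total = input_tokens * input_price_per_mtok + output_tokens * output_price_per_mtok
    return total / 1_000_000


# Hypothetical single-file bug fix: 5 tool calls, placeholder prices per million tokens
cost = estimate_cost(
    tool_calls=5,
    base_input_tokens=2_000,
    tokens_added_per_call=1_500,
    output_tokens_per_call=300,
    input_price_per_mtok=3.0,
    output_price_per_mtok=15.0,
)
print(f"${cost:.4f}")  # 34,500 input + 1,800 output tokens -> $0.1305
```

Doubling the number of tool calls more than doubles the cost, because later requests carry a longer history. That is why `max_tool_calls` works as a cost control as well as a safety limit.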
Safety and Sandboxing
For production agents, run them in isolated environments:
Docker Container
config = AgentConfig(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    tools=[FileReadTool(), FileWriteTool(), ShellTool()],
    sandbox={
        "type": "docker",
        "image": "python:3.12-slim",
        "volumes": {"./workspace": "/app"},
        "network": "none",  # No internet access
        "memory_limit": "512m",
        "cpu_limit": 1
    }
)
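To make the sandbox settings concrete, here is how the same limits map onto standard `docker run` flags. This is a sketch of the equivalent command line, not a description of what the SDK does internally. The `docker_run_args` helper is hypothetical, and the dictionary keys mirror the configuration above.

```python
from pathlib import Path


def docker_run_args(sandbox: dict, command: list[str]) -> list[str]:
    """Build a `docker run` argument list from a sandbox settings dictionary."""
    args = ["docker", "run", "--rm"]  # --rm removes the container when it exits
    args += ["--network", sandbox.get("network", "none")]
    if "memory_limit" in sandbox:
        args += ["--memory", str(sandbox["memory_limit"])]
    if "cpu_limit" in sandbox:
        args += ["--cpus", str(sandbox["cpu_limit"])]
    for host_dir, container_dir in sandbox.get("volumes", {}).items():
        # Bind mounts are most reliable with an absolute host path
        args += ["--volume", f"{Path(host_dir).resolve()}:{container_dir}"]
    args.append(sandbox["image"])
    return args + command


SANDBOX = {
    "image": "python:3.12-slim",
    "volumes": {"./workspace": "/app"},
    "network": "none",
    "memory_limit": "512m",
    "cpu_limit": 1,
}

docker_args = docker_run_args(SANDBOX, ["pytest", "-q"])
print(" ".join(docker_args))
```

Passing the result to `subprocess.run(docker_args)` would start the container. Building the list separately keeps the limits easy to inspect and test without Docker installed.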
Safety Checklist
- Sandbox the environment — Docker or VM with limited resources
- Restrict file access — Only the working directory
- Block network access — Unless the agent needs it
- Limit shell commands — Allowlist, not blocklist
- Set timeouts — Prevent runaway agents
- Log everything — Audit trail for all actions
- Human approval for destructive actions — Use hooks
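Two items on the checklist, the command allowlist and the timeout, fit in a few lines of standard-library Python. The sketch below assumes POSIX-style command strings; `run_allowed` is a hypothetical helper, and the demo adds the current Python interpreter to the allowlist so that it has something harmless to run.

```python
import shlex
import subprocess
import sys


def run_allowed(command: str, allowed: set[str], timeout: float = 30.0) -> str:
    """Run a command only if its program is on the allowlist; stop it after timeout seconds."""
    argv = shlex.split(command)
    if not argv or argv[0] not in allowed:
        raise PermissionError(f"command not on allowlist: {command!r}")
    # A list argv runs without a shell, so ';', '&&' and '|' are never interpreted.
    # subprocess.run raises subprocess.TimeoutExpired if the timeout is exceeded.
    completed = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return completed.stdout


ALLOWED_COMMANDS = {"pytest", "python", "pip", sys.executable}

output = run_allowed(f"{shlex.quote(sys.executable)} -c 'print(2 + 2)'", ALLOWED_COMMANDS)
print(output.strip())

try:
    run_allowed("rm -rf build", ALLOWED_COMMANDS)
except PermissionError as exc:
    print(exc)
```

An allowlist limits which programs start, not what they do once running: an allowed `python` can still execute arbitrary code. That is why the sandbox and file restrictions above matter even with an allowlist in place.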
Real-World Example: Code Review Agent
import asyncio
from claude_agent_sdk import Agent, AgentConfig
from claude_agent_sdk.tools import FileReadTool, GlobTool, GrepTool
config = AgentConfig(
    model="claude-sonnet-4-6",
    max_tokens=8192,
    tools=[FileReadTool(), GlobTool(), GrepTool()],
    thinking={"type": "enabled", "budget_tokens": 5000},
    system_prompt="""You are a code review agent. Review code for:
1. Bugs and logic errors
2. Security vulnerabilities
3. Performance issues
4. Missing error handling
For each issue, provide:
- File and line number
- Severity (critical, warning, info)
- Description
- Suggested fix
Be thorough but do not invent issues."""
)
agent = Agent(config)
async def review_pr(changed_files: list[str]) -> str:
    file_list = "\n".join(f"- {f}" for f in changed_files)
    result = await agent.run(
        f"Review these changed files for issues:\n{file_list}"
    )
    return result.text
# Use it (await needs an async function, so wrap the call in main())
async def main():
    review = await review_pr([
        "src/auth/login.py",
        "src/auth/middleware.py",
        "tests/test_auth.py"
    ])
    print(review)
asyncio.run(main())
Summary
| Concept | Details |
|---|---|
| Agent loop | Observe → Plan → Act → Check → Repeat |
| Agent SDK | claude-agent-sdk (Python), @anthropic-ai/claude-agent-sdk (TypeScript) |
| Built-in tools | File read/write, shell, web search, glob, grep |
| Custom tools | Extend the Tool class with your own functions |
| Hooks | Intercept tool calls for approval, logging, filtering |
| Safety | Permissions, sandboxing, timeouts, max tool calls |
| Cost | Sonnet agent tasks typically cost $0.01-2.00 |
Agents are the most powerful way to use Claude. They turn Claude from an assistant into an autonomous worker that can complete complex tasks independently.
What’s Next?
In the next article, we will cover Multi-Agent Systems — orchestrating multiple Claude agents that work together on complex tasks.