Claude Code
v2.1.88

Agents & Subagents

AgentTool

Agents are the most architecturally interesting part of Claude Code. A single AgentTool call spawns an isolated agent with its own token budget, near-zero-cost cache sharing, and one of 3 isolation levels.

TL;DR — Key Takeaways
  • AgentTool spawns a full Claude instance with its own token budget. The child agent runs the same query() loop independently — including its own tool calls, compaction, and recovery.
  • CacheSafeParams (step 3 of the spawn flow) freezes the system prompt bytes at fork time. If parent and child send identical bytes, the API returns a prompt cache hit, so the shared prompt is billed at the discounted cache-read rate rather than full input price and spawning is nearly free.
  • 3 isolation modes: Local (in-process, shared state), Worktree (git-isolated branch), Remote (CCR server, full isolation). Pick the isolation level that matches the risk of the task.
  • 6 forked background agents run automatically: speculation (pre-executes next steps), extractMemories, promptSuggestion, sessionMemory, compaction, and autoDream (after 24h + 5 sessions).
  • 3 execution modes: Local · Worktree · Remote
  • 0x cache sharing cost: same bytes = free spawn
  • Nesting depth: agents spawning agents

The Agent Spawning Flow

Every subagent follows the same 6-step lifecycle. Understanding it explains why cache sharing is the key innovation.

1. User prompt: task arrives at coordinator
2. AgentTool called: model emits tool_use block
3. CacheSafeParams frozen: system prompt bytes locked for cache hit
4. Subagent spins up: LocalAgentTask / worktree / CCR remote
5. Executes independently: own token budget, separate turn loop
6. Returns XML: <task-notification> with result + usage

Step 3 is the key

CacheSafeParams freezes the system prompt bytes at fork time. If parent and subagent bytes match exactly, the API returns a prompt cache hit — making every forked agent nearly free.
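The six steps above can be sketched as a small synchronous simulation. Everything in it is hypothetical: `SpawnRequest`, `freezeParams`, `runSubagent`, `toNotification`, and the `<result>` / `<usage>` child tags are illustrative names, not Claude Code's actual internals. Only the step order and the `<task-notification>` wrapper come from the text.

```typescript
// Hypothetical sketch of the spawn lifecycle; all names are illustrative.
type IsolationMode = "local" | "worktree" | "remote";

interface SpawnRequest {
  task: string;        // steps 1-2: prompt carried in the tool_use block
  mode: IsolationMode; // step 4: where the subagent runs
}

interface FrozenParams {
  readonly systemPrompt: string;
}

interface SubagentResult {
  result: string;
  tokensUsed: number;
}

// Step 3: lock the system prompt so the child sends the same bytes.
function freezeParams(systemPrompt: string): FrozenParams {
  return Object.freeze({ systemPrompt });
}

// Steps 4-5: stand-in for the child's independent turn loop.
function runSubagent(req: SpawnRequest, params: FrozenParams): SubagentResult {
  if (params.systemPrompt.length === 0) {
    throw new Error("subagent needs the parent's system prompt");
  }
  // A real child would call the model; this sketch only echoes the task.
  return {
    result: `done (${req.mode}): ${req.task}`,
    tokensUsed: req.task.length, // placeholder usage figure
  };
}

// Step 6: wrap result + usage in a <task-notification> element.
// The child tags are an assumption, and no XML escaping is done here.
function toNotification(r: SubagentResult): string {
  return [
    "<task-notification>",
    `  <result>${r.result}</result>`,
    `  <usage>${r.tokensUsed}</usage>`,
    "</task-notification>",
  ].join("\n");
}

function spawn(req: SpawnRequest, parentSystemPrompt: string): string {
  const params = freezeParams(parentSystemPrompt);
  return toNotification(runSubagent(req, params));
}
```

Calling `spawn({ task: "list TODOs", mode: "local" }, parentPrompt)` returns the notification string that the parent would read as the tool result.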

3 Execution Modes

Choose the isolation level that fits the task. Local is the default.

← Cheap · Shared · Fast startup | Isolated · Safe · Slow startup →
  • Local: ~0 tokens spawn cost
  • Worktree: Low spawn cost
  • Remote (CCR): High spawn cost

Local (In-process) · Default · Zero cost
Shares parent AppState directly
Best for
  • Memory extraction
  • Prompt suggestions
  • Auto-compaction
  • Background annotations
No isolation: shares filesystem

Worktree (Git isolation) · Safe · Git branch
Creates isolated git worktree branch
Best for
  • Parallel feature development
  • Risky refactors
  • Experimental changes
  • Code review agents
Own git branch: merges manually

Remote (CCR) · Full isolation · Network
Runs on remote CCR servers
Best for
  • True parallelism at scale
  • Sensitive operations
  • Long-running background work
  • Multi-machine workflows
High startup cost: full cold start

Feature | Local | Worktree | Remote
Filesystem isolation
Git branch isolation
Background execution | opt | opt
Shared AppState
Cache sharing
Startup cost | ~0 tokens | Low | High
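One way to read the mode descriptions above is as a simple decision rule: default to Local, move to Worktree when the task changes the repository in risky ways, and move to Remote when full isolation is required. The sketch below encodes that reading. `TaskTraits` and `chooseMode` are hypothetical names introduced for illustration, and the rule itself is an interpretation of the "Best for" lists, not Claude Code's actual selection logic.

```typescript
// Hypothetical decision rule derived from the "Best for" lists above.
type ExecutionMode = "local" | "worktree" | "remote";

interface TaskTraits {
  modifiesRepo: boolean;       // e.g. risky refactor, experimental change
  needsFullIsolation: boolean; // e.g. sensitive or multi-machine work
}

function chooseMode(traits: TaskTraits): ExecutionMode {
  // Strongest isolation requirement wins.
  if (traits.needsFullIsolation) return "remote";
  if (traits.modifiesRepo) return "worktree";
  // Local is the default: cheapest, but shares state and filesystem.
  return "local";
}
```

Checking the strongest requirement first keeps the rule monotonic: adding a risk trait can only move a task toward more isolation, never less.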

Zero-Cost Cache Sharing

The key innovation that makes 6+ background agents per turn economically viable.

The math
Parent Agent: 10K tokens in system prompt → same bytes → CACHE HIT → Each Subagent: ~0 tokens charged for prompt

Instead of paying full price for 10K tokens × 6 agents = 60K tokens per turn, Claude Code pays full price once and gets 6 cache hits, each billed at a small fraction of the normal input rate. At scale, this is the difference between affordable and prohibitive.
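The arithmetic above can be made concrete with a small helper. This is a sketch under stated assumptions: `subagentPromptCost` is a hypothetical function, and `cacheReadMultiplier` is an assumed parameter for the fraction of the normal input price charged on cached prompt tokens (0 reproduces the "~0" figure above; check current API pricing for the real value).

```typescript
// Hypothetical cost model for the subagents' share of the system prompt,
// expressed in full-price token equivalents.
interface PromptCost {
  uncached: number; // every subagent pays for the whole prompt
  cached: number;   // every subagent pays only the cache-read fraction
}

function subagentPromptCost(
  promptTokens: number,
  agents: number,
  cacheReadMultiplier: number // assumption: 0 = free reads, 1 = no discount
): PromptCost {
  const uncached = promptTokens * agents;
  const cached = Math.round(promptTokens * agents * cacheReadMultiplier);
  return { uncached, cached };
}

// The document's numbers: a 10K-token prompt and 6 background agents.
const example = subagentPromptCost(10000, 6, 0);
console.log(example.uncached, example.cached);
```

With a multiplier of 0 the six subagents add nothing for the prompt. With a small nonzero multiplier the total stays a fraction of the 60K uncached figure, which is the point of the section.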

CacheSafeParams: frozen at fork time
  • systemPrompt: Must match exactly byte-for-byte
  • userContext: Working dir, platform, git state
  • toolSchema: All available tool definitions
  • thinkingConfig: Thinking mode settings

typescript
// CacheSafeParams: frozen at fork time.
// Field types are illustrative placeholders, not the actual source types.
interface CacheSafeParams {
  systemPrompt: string;                  // Must match exactly; any change = cache MISS
  userContext: Record<string, string>;   // Working dir, platform, git
  systemContext: Record<string, string>; // Available tools, MCP servers
  toolSchema: unknown[];                 // All tool definitions
  thinkingConfig: unknown;               // Thinking mode settings
}

// Same bytes → prompt cache HIT → ~0 cost for subagent system prompt
// Changed bytes → cache MISS → pay full 10K+ tokens again
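To illustrate why "same bytes" is such a strict requirement, the sketch below compares two parameter objects by a deterministic serialization. It is only a stand-in: `stableStringify` and `wouldHitCache` are hypothetical helpers, and the real cache decision is made by the API on the request prefix, not by client code like this.

```typescript
// Hypothetical helper: serialize with sorted keys so that two logically
// equal objects always produce the same string.
function stableStringify(value: unknown): string {
  if (value === undefined) return "null";
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(obj[key])}`)
      .join(",");
    return `{${body}}`;
  }
  return JSON.stringify(value);
}

// Stand-in for the API's prefix match: identical serialization = "hit".
function wouldHitCache(parent: unknown, child: unknown): boolean {
  return stableStringify(parent) === stableStringify(child);
}

const parentParams = { systemPrompt: "You are a coding agent.", toolSchema: [{ name: "Read" }] };
const sameChild = { toolSchema: [{ name: "Read" }], systemPrompt: "You are a coding agent." };
const driftedChild = { systemPrompt: "You are a coding agent. ", toolSchema: [{ name: "Read" }] };

console.log(wouldHitCache(parentParams, sameChild));    // key order does not matter here
console.log(wouldHitCache(parentParams, driftedChild)); // one trailing space breaks the match
```

The second comparison shows the failure mode to watch for: a single trailing space in the system prompt is enough to turn a hit into a miss.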

Forked Agents Timeline

These are background agents that users never see, but that make the product feel smarter. Each fires at a specific phase.

PRE-TURN: speculation (fast pre-exec in /tmp)
MAIN TURN: model thinking + tool calls
POST-TURN: extractMemories, promptSuggestion
PERIODIC: sessionMemory, compaction, autoDream (24h+5)

  • speculation (Pre-turn): Pre-executes likely next steps in /tmp overlay
  • extractMemories (Post-turn): Extracts facts to CLAUDE.md after each query
  • promptSuggestion (Post-turn): Generates 3 follow-up prompt ideas
  • sessionMemory (Periodic): Conversation notes across sessions
  • compaction (Threshold): Conversation summary when context hits threshold
  • autoDream (24h + 5 sessions): Memory consolidation
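The timeline above can be modeled as a lookup from agent name to trigger, plus a gate for autoDream. This is a minimal sketch, not the actual scheduler: `Trigger`, `FORKED_AGENTS`, `agentsFor`, and `shouldRunAutoDream` are hypothetical names, and reading "24h + 5 sessions" as "both conditions must hold" is an assumption.

```typescript
// Hypothetical trigger labels taken from the agent list above.
type Trigger = "pre-turn" | "post-turn" | "periodic" | "threshold" | "gated";

const FORKED_AGENTS: Record<string, Trigger> = {
  speculation: "pre-turn",
  extractMemories: "post-turn",
  promptSuggestion: "post-turn",
  sessionMemory: "periodic",
  compaction: "threshold",
  autoDream: "gated",
};

// Names of the agents that fire on a given trigger, in declaration order.
function agentsFor(trigger: Trigger): string[] {
  return Object.keys(FORKED_AGENTS).filter(
    (name) => FORKED_AGENTS[name] === trigger
  );
}

// Assumption: autoDream needs BOTH 24 hours and 5 sessions since its last run.
function shouldRunAutoDream(
  hoursSinceLastRun: number,
  sessionsSinceLastRun: number
): boolean {
  return hoursSinceLastRun >= 24 && sessionsSinceLastRun >= 5;
}

console.log(agentsFor("post-turn"));
console.log(shouldRunAutoDream(30, 2));
```

Keeping the triggers in one table makes it easy to see that only two agents run after every turn; the rest are rate-limited by time, context size, or session count.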

Multi-Agent Coordinator Pattern

The leader/worker pattern Claude Code encodes into its own system prompt. Synthesis stays with the leader; execution fans out.

Coordinator (Research → Synthesize → Direct)
  ↓ spawns via AgentTool
  • Worker A: Explore codebase
  • Worker B: Write tests
  • Worker C: Review changes
  ↑ returns <task-notification> XML
Coordinator synthesizes + directs next task

The Golden Rule

"The coordinator must read worker findings, understand them itself, and write follow-up prompts with concrete file paths and changes. It cannot hand-wave research away. Synthesis stays with the leader, execution fans out."

typescript
// coordinator/coordinatorMode.ts: key rules (~370 lines), summarized as comments
//
// getCoordinatorUserContext(...)
//   → enumerates worker tool access
//   → injects MCP server names
//   → may expose scratchpadDir for cross-worker knowledge
//
// getCoordinatorSystemPrompt()
//   → workers are async, launch independent work in parallel
//   → don't use workers for trivial file/command reporting
//   → don't say "based on your findings"
//   → always synthesize findings before delegating
//   → verification means proving behavior, not just code presence
//   → phases: research → synthesis → implementation → verification
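The leader/worker split described above can be sketched as a parallel fan-out followed by a synthesis step that refuses to delegate without concrete file paths. This is an illustration of the pattern only: `WorkerTask`, `WorkerFinding`, `RunWorker`, `fanOut`, and `synthesize` are hypothetical names, and `RunWorker` stands in for an AgentTool call whose real signature is not given in the text.

```typescript
interface WorkerTask {
  name: string;
  prompt: string;
}

interface WorkerFinding {
  name: string;
  summary: string;
  files: string[]; // concrete paths the worker identified
}

// Hypothetical stand-in for spawning a worker via AgentTool.
type RunWorker = (task: WorkerTask) => Promise<WorkerFinding>;

// Execution fans out: independent workers are launched in parallel.
function fanOut(tasks: WorkerTask[], run: RunWorker): Promise<WorkerFinding[]> {
  return Promise.all(tasks.map((task) => run(task)));
}

// Synthesis stays with the leader: the follow-up prompt must name files.
function synthesize(findings: WorkerFinding[]): string {
  const files: string[] = [];
  for (const finding of findings) {
    for (const path of finding.files) {
      if (files.indexOf(path) === -1) files.push(path);
    }
  }
  if (files.length === 0) {
    throw new Error("no concrete file paths yet; research further before delegating");
  }
  files.sort();
  const notes = findings.map((f) => `- ${f.name}: ${f.summary}`).join("\n");
  return `Findings:\n${notes}\nEdit these files: ${files.join(", ")}`;
}
```

A coordinator would call `fanOut(tasks, run).then(synthesize)` and pass the resulting prompt to the next worker. The thrown error is one way to encode the rule that the leader cannot hand-wave research away.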

Built-in Agent Types

Type | Purpose | Key Capability
general-purpose | Default for complex tasks | Full tool access
Explore | Fast codebase exploration | Search-focused, no write tools
Plan | Architecture planning | Design docs, no code changes
code-review | PR code review | Read-only analysis
simplicity-engineer | Over-engineering review | Complexity analysis
Custom (.claude/agents/) | User-defined agents | YAML/MD with frontmatter

Custom Agents

Drop a .md or .yaml file in .claude/agents/ with frontmatter (name, description, tools, model). The agent becomes available system-wide. Custom agents can restrict tools to a safe subset — e.g., a read-only reviewer.
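As a rough illustration of what such a file might contain, the sketch below renders an agent definition into frontmatter plus a prompt body. The four frontmatter keys come from the text; everything else is an assumption: `AgentDefinition` and `renderAgentFile` are hypothetical helpers, the comma-separated `tools` format and the tool and model values in the example are illustrative and should be checked against the current documentation.

```typescript
// Hypothetical shape for a custom agent; keys mirror the frontmatter fields.
interface AgentDefinition {
  name: string;
  description: string;
  tools: string[]; // restrict to a safe subset, e.g. read-only tools
  model: string;
  prompt: string;  // body text placed after the frontmatter
}

// Render the definition as a Markdown file with a frontmatter header.
function renderAgentFile(def: AgentDefinition): string {
  return [
    "---",
    `name: ${def.name}`,
    `description: ${def.description}`,
    `tools: ${def.tools.join(", ")}`, // assumed comma-separated format
    `model: ${def.model}`,
    "---",
    "",
    def.prompt,
    "",
  ].join("\n");
}

// Example: a read-only reviewer (tool and model names are assumptions).
const reviewer: AgentDefinition = {
  name: "readonly-reviewer",
  description: "Reviews changes without modifying files",
  tools: ["Read", "Grep", "Glob"],
  model: "sonnet",
  prompt: "Review the diff and report issues. Do not edit files.",
};

console.log(renderAgentFile(reviewer));
```

Saving the rendered text under .claude/agents/ would be the step that makes the agent available. Leaving write-capable tools out of `tools` is what makes the reviewer read-only.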

💡

Prompt caching makes subagents nearly free

When you spawn a subagent with the same system prompt as the parent, the shared prompt is served from the prompt cache because the bytes match exactly. Cached prompt tokens are billed at a small fraction of the normal input price, so close to the only full-price input is the unique task description. This is why Claude Code can afford to run 6+ background agents per turn without the cost becoming prohibitive.