Agents & Subagents

AgentTool

The most architecturally interesting part of Claude Code. A single AgentTool call spawns an isolated agent with its own token budget, near-zero-cost cache sharing with the parent, and one of 3 isolation levels.

AgentTool.tsx runAgent.ts coordinator/ remote/

TL;DR — Key Takeaways

AgentTool spawns a full Claude instance with its own token budget. The child agent runs the same query() loop independently, including its own tool calls, compaction, and recovery.
CacheSafeParams (step 3 of the spawn flow) freezes the system prompt bytes at fork time. If parent and child send identical bytes, the API returns a prompt cache hit, so the child's system prompt is billed at the discounted cache-read rate instead of the full input rate. Spawning adds almost no prompt cost.
3 isolation modes: Local (in-process, shared state), Worktree (git-isolated branch), Remote (CCR server, full isolation). Pick the isolation level that matches the risk of the task.
6 forked background agents run automatically: speculation (pre-executes next steps), extractMemories, promptSuggestion, sessionMemory, compaction, and autoDream (after 24h + 5 sessions).

Execution modes: Local · Worktree · Remote
Cache sharing cost: same bytes = free spawn
Nesting depth: ∞ (agents spawning agents)

Jump To

Spawn Flow: Step-by-step: how a subagent is born.
Execution Modes: Local, worktree, and remote isolation models.
Cache Sharing: Why subagents can be so cheap to launch.
Forked Agents: Background agents that fire at each lifecycle phase.
Coordinator: How leader/worker orchestration really works.
Agent Types: Built-in agent types and custom agents.

The Agent Spawning Flow

Every subagent follows the same 6-step lifecycle. Walking through it shows why cache sharing is the key innovation.

AgentTool.tsx runAgent.ts

1. User prompt: task arrives at coordinator
2. AgentTool called: model emits tool_use block
3. CacheSafeParams frozen: system prompt bytes locked for cache hit
4. Subagent spins up: LocalAgentTask / worktree / CCR remote
5. Executes independently: own token budget, separate turn loop
6. Returns XML: <task-notification> with result + usage

Step 3 is the key

CacheSafeParams freezes the system prompt bytes at fork time. If the parent's and the subagent's bytes match exactly, the API returns a prompt cache hit, which makes every forked agent nearly free.
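The freeze in step 3 can be pictured with a minimal sketch. The field names follow the CacheSafeParams listing on this page; the field types and the helpers `freezeForFork` and `buildChildRequest` are hypothetical names chosen for illustration, not the actual contents of runAgent.ts.

```typescript
// Hypothetical shape: field names come from this page, types are assumptions.
interface CacheSafeParams {
  readonly systemPrompt: string;
  readonly userContext: Readonly<Record<string, string>>;
  readonly toolSchema: readonly string[];
}

// Copy and freeze the parameters so later parent-side edits cannot leak
// into the prefix that the child sends.
function freezeForFork(live: CacheSafeParams): CacheSafeParams {
  const copy: CacheSafeParams = {
    systemPrompt: live.systemPrompt,
    userContext: Object.freeze({ ...live.userContext }),
    toolSchema: Object.freeze([...live.toolSchema]),
  };
  return Object.freeze(copy);
}

// The child's request starts with the frozen prefix; only the task differs.
function buildChildRequest(frozen: CacheSafeParams, task: string) {
  return { ...frozen, messages: [{ role: "user" as const, content: task }] };
}

const parent = {
  systemPrompt: "You are a coding agent.",
  userContext: { cwd: "/repo", platform: "linux" },
  toolSchema: ["Read", "Edit", "Bash"],
};
const frozen = freezeForFork(parent);
parent.userContext.cwd = "/elsewhere"; // the parent keeps mutating after the fork
const child = buildChildRequest(frozen, "List the failing tests");
console.log(child.userContext.cwd); // "/repo"
```

The point of the copy is that the child's prefix reflects the parent's state at fork time, so whatever the parent does afterwards cannot change the bytes the child sends.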

3 Execution Modes

Choose the right isolation level for the task. Default is always Local.

runAgent.ts EnterWorktreeTool.ts remote/

Spectrum: Cheap · Shared · Fast startup ← → Isolated · Safe · Slow startup
Spawn cost: Local ~0 tokens · Worktree low · Remote (CCR) high

Local (default, zero cost): in-process
Shares parent AppState directly.
Best for: memory extraction, prompt suggestions, auto-compaction, background annotations.
⚠ No isolation: shares the filesystem.

Worktree (safe, git branch): git isolation
Creates an isolated git worktree branch.
Best for: parallel feature development, risky refactors, experimental changes, code review agents.
⚠ Own git branch: merges are manual.

Remote (full isolation, network): CCR
Runs on remote CCR servers.
Best for: true parallelism at scale, sensitive operations, long-running background work, multi-machine workflows.
⚠ High startup cost: full cold start.

Feature	Local	Worktree	Remote
Filesystem isolation	✗	✓	✓
Git branch isolation	✗	✓	✓
Background execution	optional	optional	✓
Shared AppState	✓	✗	✗
Cache sharing	✓	✓	✗
Startup cost	~0 tokens	Low	High
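The comparison table can be encoded as data, which makes the trade-off explicit when picking a mode. The sketch below transcribes the table rows; the `TaskNeeds` type and the `chooseMode` heuristic are hypothetical and only illustrate one way to pick the cheapest mode that is isolated enough.

```typescript
type IsolationMode = "local" | "worktree" | "remote";

interface ModeTraits {
  filesystemIsolation: boolean;
  sharedAppState: boolean;
  cacheSharing: boolean;
  startupCost: "~0 tokens" | "low" | "high";
}

// Values transcribed from the comparison table above.
const MODE_TRAITS: Record<IsolationMode, ModeTraits> = {
  local: { filesystemIsolation: false, sharedAppState: true, cacheSharing: true, startupCost: "~0 tokens" },
  worktree: { filesystemIsolation: true, sharedAppState: false, cacheSharing: true, startupCost: "low" },
  remote: { filesystemIsolation: true, sharedAppState: false, cacheSharing: false, startupCost: "high" },
};

// Hypothetical task description used only for this sketch.
interface TaskNeeds {
  writesFiles: boolean;
  longRunning: boolean;
}

// Assumed heuristic: pick the cheapest mode that still isolates enough.
function chooseMode(needs: TaskNeeds): IsolationMode {
  if (needs.longRunning) return "remote";
  if (needs.writesFiles) return "worktree";
  return "local"; // the documented default
}

const mode = chooseMode({ writesFiles: true, longRunning: false });
console.log(mode, MODE_TRAITS[mode].startupCost); // worktree low
```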

Zero-Cost Cache Sharing

The key innovation that makes 6+ background agents per turn economically viable.

forkedAgent.ts runAgent.ts

The math

Parent agent: 10K tokens in system prompt
Same bytes → CACHE HIT
Each subagent: prompt tokens served from the cache

Without caching, 6 agents × 10K prompt tokens each = 60K tokens per turn. With caching, Claude Code pays for the prompt once and gets 6 cache hits. At scale, this is the difference between affordable and prohibitive.
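The arithmetic can be written out as a small function. This is a sketch under stated assumptions: `promptTokensBilled` is a hypothetical helper, and `cacheReadFraction` is an assumed parameter for the share of the normal input price charged for cached tokens. A fraction of 0 reproduces the "free" figure used in the text; a non-zero fraction shows how the total changes if cache reads carry a reduced charge.

```typescript
// Returns prompt cost in full-price token equivalents.
// cacheReadFraction is an assumption; null means no caching at all.
function promptTokensBilled(
  promptTokens: number,
  agents: number,
  cacheReadFraction: number | null,
): number {
  if (cacheReadFraction === null) {
    return promptTokens * agents; // every agent pays for the full prompt
  }
  // The prompt is paid for once; each forked agent pays only the cached rate.
  return promptTokens + Math.round(promptTokens * agents * cacheReadFraction);
}

console.log(promptTokensBilled(10_000, 6, null)); // 60000
console.log(promptTokensBilled(10_000, 6, 0)); // 10000
console.log(promptTokensBilled(10_000, 6, 0.1)); // 16000
```

Even with a non-zero cache-read charge, the total stays far below the uncached 60K.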

CacheSafeParams (frozen at fork time)

systemPrompt: must match exactly, byte for byte
userContext: working dir, platform, git state
toolSchema: all available tool definitions
thinkingConfig: thinking mode settings

typescript

// CacheSafeParams: frozen at fork time.
// Field names are from the source; the types are illustrative assumptions.
interface CacheSafeParams {
  systemPrompt: string;                  // Must match exactly; any change = cache MISS
  userContext: Record<string, string>;   // Working dir, platform, git
  systemContext: Record<string, string>; // Available tools, MCP servers
  toolSchema: unknown[];                 // All tool definitions
  thinkingConfig: unknown;               // Thinking mode settings
}

// Same bytes → prompt cache HIT → ~0 cost for subagent system prompt
// Changed bytes → cache MISS → pay full 10K+ tokens again
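To see why "same bytes" is a strict requirement, the sketch below compares two serialized prefixes byte by byte. It does not model how the API computes its cache keys; `prefixBytes` and `sameBytes` are hypothetical helpers that only show how a small change, such as an injected timestamp, produces a different byte sequence.

```typescript
// Hypothetical helper: serialize the cache-relevant prefix in a fixed key
// order so that equal values always produce equal bytes.
function prefixBytes(p: { systemPrompt: string; tools: string[] }): Uint8Array {
  const stable = JSON.stringify({ systemPrompt: p.systemPrompt, tools: p.tools });
  return new TextEncoder().encode(stable);
}

function sameBytes(a: Uint8Array, b: Uint8Array): boolean {
  return a.length === b.length && a.every((byte, i) => byte === b[i]);
}

const parentPrefix = { systemPrompt: "You are a coding agent.", tools: ["Read", "Edit"] };
const forkPrefix = { ...parentPrefix };
// A single injected timestamp is enough to change the bytes.
const driftedPrefix = {
  ...parentPrefix,
  systemPrompt: parentPrefix.systemPrompt + " Now: 12:00",
};

console.log(sameBytes(prefixBytes(parentPrefix), prefixBytes(forkPrefix))); // true
console.log(sameBytes(prefixBytes(parentPrefix), prefixBytes(driftedPrefix))); // false
```

This is the practical reason to freeze the parameters at fork time: anything dynamic in the prompt risks turning a cache hit into a miss.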

Forked Agents Timeline

Background agents that users never see but that make the product feel smarter. Each fires at a specific phase.

extractMemories.ts sessionMemory.ts speculation.ts

PRE-TURN: speculation (fast pre-exec in /tmp)
MAIN TURN: model thinking + tool calls
POST-TURN: extractMemories, promptSuggestion
PERIODIC: sessionMemory, compaction, autoDream (24h+5)

speculation (pre-turn): Pre-executes likely next steps in a /tmp overlay
extractMemories (post-turn): Extracts facts to CLAUDE.md after each query
promptSuggestion (post-turn): Generates 3 follow-up prompt ideas
sessionMemory (periodic): Keeps conversation notes across sessions
compaction (threshold): Summarizes the conversation when context hits a threshold
autoDream (24h + 5 sessions): Consolidates memory after 24h + 5 sessions
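The timeline amounts to a lookup from lifecycle phase to agents, plus a gate for autoDream. The sketch below is illustrative only: the agent names and phases come from the timeline above, while `agentsForPhase`, `shouldRunAutoDream`, and the reading of "24h + 5 sessions" as "at least 24 hours and at least 5 sessions since the last run" are assumptions.

```typescript
type Phase = "pre-turn" | "post-turn" | "periodic" | "threshold";

// Agent names and phases are taken from the timeline above.
const FORKED_AGENTS: Record<string, Phase> = {
  speculation: "pre-turn",
  extractMemories: "post-turn",
  promptSuggestion: "post-turn",
  sessionMemory: "periodic",
  compaction: "threshold",
};

// List the agents that fire at a given phase, in declaration order.
function agentsForPhase(phase: Phase): string[] {
  return Object.keys(FORKED_AGENTS).filter((name) => FORKED_AGENTS[name] === phase);
}

// Assumed gate: both conditions must hold before autoDream runs.
function shouldRunAutoDream(hoursSinceLastRun: number, sessionsSinceLastRun: number): boolean {
  return hoursSinceLastRun >= 24 && sessionsSinceLastRun >= 5;
}

console.log(agentsForPhase("post-turn")); // ["extractMemories", "promptSuggestion"]
console.log(shouldRunAutoDream(30, 3)); // false
console.log(shouldRunAutoDream(24, 5)); // true
```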

Multi-Agent Coordinator Pattern

The leader/worker pattern Claude Code encodes into its own system prompt. Synthesis stays with the leader; execution fans out.

coordinator/coordinatorMode.ts spawnMultiAgent.ts

Coordinator (Research → Synthesize → Direct)
  ↓ spawns via AgentTool
Worker A: Explore codebase
Worker B: Write tests
Worker C: Review changes
  ↓ each returns <task-notification> XML
Coordinator synthesizes + directs next task

The Golden Rule

"The coordinator must read worker findings, understand them itself, and write follow-up prompts with concrete file paths and changes. It cannot hand-wave research away. Synthesis stays with the leader, execution fans out."

typescript

// coordinator/coordinatorMode.ts: key rules (~370 lines)
//
// getCoordinatorUserContext(...)
//   → enumerates worker tool access
//   → injects MCP server names
//   → may expose scratchpadDir for cross-worker knowledge
//
// getCoordinatorSystemPrompt()
//   → workers are async, launch independent work in parallel
//   → don't use workers for trivial file/command reporting
//   → don't say "based on your findings"
//   → always synthesize findings before delegating
//   → verification means proving behavior, not just code presence
//   → phases: research → synthesis → implementation → verification
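The fan-out and synthesis steps can be sketched as follows. Everything here is hypothetical: `runWorker` is a stand-in for an AgentTool call, and `WorkerResult` is an invented shape, since the actual `<task-notification>` payload is not shown on this page. The sketch only illustrates the control flow of parallel workers followed by leader-side synthesis.

```typescript
// Hypothetical worker result; the real <task-notification> payload is not shown here.
interface WorkerResult {
  worker: string;
  findings: string;
  tokensUsed: number;
}

// Stand-in for an AgentTool call: a real worker would run its own query loop.
async function runWorker(worker: string, prompt: string): Promise<WorkerResult> {
  return { worker, findings: `done: ${prompt}`, tokensUsed: prompt.length };
}

async function coordinate(tasks: Record<string, string>): Promise<string> {
  // Research phase: independent workers launch in parallel.
  const results = await Promise.all(
    Object.entries(tasks).map(([worker, prompt]) => runWorker(worker, prompt)),
  );
  // Synthesis phase: the leader reads every finding itself before directing more work.
  return results.map((r) => `${r.worker} -> ${r.findings}`).join("\n");
}

coordinate({
  "Worker A": "Explore codebase",
  "Worker B": "Write tests",
  "Worker C": "Review changes",
}).then((summary) => console.log(summary));
```

`Promise.all` preserves input order, so the synthesized summary lists workers in the order they were declared even if they finish at different times.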

Built-in Agent Types

loadAgentsDir.ts skills/

Type	Purpose	Key Capability
`general-purpose`	Default for complex tasks	Full tool access
`Explore`	Fast codebase exploration	Search-focused, no write tools
`Plan`	Architecture planning	Design docs, no code changes
`code-review`	PR code review	Read-only analysis
`simplicity-engineer`	Over-engineering review	Complexity analysis
Custom (`.claude/agents/`)	User-defined agents	YAML/MD with frontmatter

Custom Agents

Drop a .md or .yaml file in .claude/agents/ with frontmatter (name, description, tools, model). The agent becomes available throughout the project. Custom agents can restrict tools to a safe subset, for example a read-only reviewer.
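As an illustration of the file layout, the sketch below embeds a sample agent file and reads its frontmatter. The four keys are the ones named above; the values, the tool names, and the `parseAgentFile` helper are made up for this example, and the reader handles flat `key: value` pairs only rather than full YAML. It is not how loadAgentsDir.ts parses these files.

```typescript
// Example agent file. Keys are the ones named in the text; values are illustrative.
const agentFile = [
  "---",
  "name: readonly-reviewer",
  "description: Reviews diffs without modifying files",
  "tools: Read, Grep, Glob",
  "model: inherit",
  "---",
  "You review code and report issues. Never edit files.",
].join("\n");

// Minimal frontmatter reader for flat `key: value` pairs (not a full YAML parser).
function parseAgentFile(text: string): { meta: Record<string, string>; prompt: string } {
  const match = /^---\n([\s\S]*?)\n---\n?([\s\S]*)$/.exec(text);
  if (!match) throw new Error("missing frontmatter");
  const meta: Record<string, string> = {};
  for (const line of match[1].split("\n")) {
    const idx = line.indexOf(":");
    if (idx > 0) meta[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return { meta, prompt: match[2].trim() };
}

const agent = parseAgentFile(agentFile);
const allowedTools = agent.meta.tools.split(",").map((t) => t.trim());
console.log(allowedTools); // ["Read", "Grep", "Glob"]
```

Listing only read-oriented tools in `tools` is what makes this sample a read-only reviewer.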

💡

Prompt caching makes subagents nearly free

When you spawn a subagent with the same system prompt as the parent, the bytes match exactly and trigger an automatic prompt cache hit, so that prompt is billed at the discounted cache-read rate rather than the full input price. You pay full price only for the unique task description. This is why Claude Code can afford to run 6+ background agents per turn without the cost becoming prohibitive.

Services: The compaction, MCP, memory extraction, and speculation services that power background agents.
Context & Memory: How the system prompt is assembled and why its exact byte content matters so much for cache sharing.
Permissions: How subagents inherit and are restricted by permission modes from the parent session.
Services: MCP (470KB!), 4 compaction strategies, LSP integration, analytics pipeline, and the extractMemories background agent.