Context & Memory
How Claude Code builds its system prompt, loads CLAUDE.md files, manages auto-memory, and provides context to the model at every turn.
- The system prompt is assembled from 5 ordered layers every turn: Default instructions → User context (git, CWD, shell) → System context (MCP capabilities) → Memory (CLAUDE.md) → Skills. Only Layer 1 is fully cacheable.
- CLAUDE.md is loaded from 6 locations in priority order: parent dirs → project root → ~/.claude/ → subdir-specific. Each file can import others with @filename. The files are real at runtime — not embedded.
- The memory system stores facts in ~/.claude/projects/{hash}/memory.jsonl — extracted by a background agent after each query. These persist across sessions and are injected at turn start.
- AppState is a DeepImmutable Zustand-like store with 4 families: Runtime (connection), Agent Orchestration (tasks), Background Intelligence (speculation), and UX Subsystems (companion, bagel).
Key Insight
The system prompt for Claude Code is ~100K tokens before you type a single character. That's the core instructions, all 43 tool schemas, your CLAUDE.md files, git state, and memory — assembled on every turn.
System Prompt Assembly
This section explains what actually goes into the model prompt on every turn.
Layer 1 (Default instructions): Base Claude instructions, tool definitions from all registered tools, MCP server instructions, agent definitions, and model-specific variants (Opus: full; Sonnet: abbreviated if >50 tools).
Layer 2 (User context): Working directory, platform, shell, git status (branch plus recent commits, truncated to 2000 chars), additional-directory permissions, and project metadata. Prepended via prependUserContext().
Layer 3 (System context): Available resources, MCP server capabilities, and coordinator context (in multi-agent mode). Appended via appendSystemContext().
Layer 4 (Memory): If auto-memory is enabled, injects the memory mechanics prompt with instructions for reading from and writing to the memory directory.
// Final assembly in QueryEngine.ts:
const systemPrompt = asSystemPrompt([
...(customPrompt ? [customPrompt] : defaultSystemPrompt),
...(memoryMechanicsPrompt ? [memoryMechanicsPrompt] : []),
...(appendSystemPrompt ? [appendSystemPrompt] : []),
])

The more subtle implementation detail is that the prompt is assembled as ordered sections, not one giant string literal. constants/prompts.ts defines a hard SYSTEM_PROMPT_DYNAMIC_BOUNDARY marker so the prefix before that line can use global prompt caching while the tail can safely include user-, repo-, and session-specific data.
Prompt Boundary & Cache Topology
Read this if you want the performance story, not just the content story, of prompt construction.
This explains why Claude Code spends so much effort on prompt section ordering. The cacheable prefix is intended to stay byte-stable across turns and even across sessions, while the dynamic tail can change with user context, git state, memory, hooks, output style, and connected MCP servers.
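A minimal sketch of what such a boundary enables: split the ordered sections at the marker into a cacheable prefix and a dynamic tail. The `splitAtBoundary` helper and the string-array section shape are assumptions for illustration, not the real API.

```typescript
const SYSTEM_PROMPT_DYNAMIC_BOUNDARY = "__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__";

interface PromptSplit {
  cacheablePrefix: string[]; // byte-stable across turns -> global cache scope
  dynamicTail: string[];     // user-, repo-, session-specific content
}

function splitAtBoundary(sections: string[]): PromptSplit {
  const i = sections.indexOf(SYSTEM_PROMPT_DYNAMIC_BOUNDARY);
  if (i === -1) {
    // No boundary found: treat everything as dynamic to stay cache-safe.
    return { cacheablePrefix: [], dynamicTail: sections };
  }
  return {
    cacheablePrefix: sections.slice(0, i),
    dynamicTail: sections.slice(i + 1), // drop the marker itself
  };
}

const { cacheablePrefix, dynamicTail } = splitAtBoundary([
  "core instructions",
  "tool schemas",
  SYSTEM_PROMPT_DYNAMIC_BOUNDARY,
  "git status",
  "CLAUDE.md contents",
]);
```

The invariant in the comments below follows directly: reordering a section across the marker, or editing anything in the prefix, changes the cacheable bytes and forfeits reuse.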
// constants/prompts.ts
SYSTEM_PROMPT_DYNAMIC_BOUNDARY = "__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__"
// invariant:
// - everything before boundary can use scope: global
// - everything after boundary may contain user/session-specific content
// consequences:
// - section order matters for cache hits
// - "small prompt edits" can break expensive cache reuse
// - prompt architecture is part of runtime performance design

CLAUDE.md Loading
Project instructions are layered from several locations; this section shows the order.
CLAUDE.md files provide project-specific instructions. They're loaded from multiple locations and concatenated into the user context:
// Each CLAUDE.md section labeled with source path:
"Contents of /Users/user/.claude/CLAUDE.md (user's private global):"
"Contents of /project/CLAUDE.md (project instructions, checked in):"
// Injected as <system-reminder> tags in conversation context
// Available to model at every turn

Auto-Memory System (memdir/)
Use this section to understand what Claude Code tries to remember across sessions and where it stores it.
Persistent project-scoped memory at ~/.claude/projects/<path>/memory/. Four types, each serving a different purpose.
user/profile.md, preferences.md: Your role & preferences — so Claude doesn't keep asking who you are.
feedback/testing_policy.md: Corrections & validated rules — so Claude doesn't repeat mistakes you've corrected.
project/current_sprint.md: Active work context — current sprint, open tasks, recent decisions.
reference/slack_channels.md: External system pointers — so Claude can reference team resources without being told.
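Rendering one such fact into a file body can be sketched as follows; the `renderMemoryFact` helper is an assumption, but the frontmatter-plus-fact layout follows the memory file format shown below.

```typescript
type MemoryType = "user" | "feedback" | "project" | "reference";

interface MemoryFact {
  type: MemoryType;
  fact: string;
  why: string;
  howToApply: string;
}

// Render a fact in the frontmatter + body format used by memory files.
function renderMemoryFact(m: MemoryFact): string {
  return [
    "---",
    `type: ${m.type}`,
    "---",
    `${m.fact} **Why:** ${m.why} **How to apply:** ${m.howToApply}`,
  ].join("\n");
}
```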
~/.claude/projects/<path>/memory/
MEMORY.md # Index — always loaded, 200 lines max
user/profile.md # Role, preferences
feedback/ # Corrections & validated rules
project/ # Active sprint / open tasks
reference/ # External system pointers
// Memory file format (frontmatter):
---
type: user | feedback | project | reference
---
{{fact}} **Why:** ... **How to apply:** ...

Memory Extraction Pipeline
// extractMemories.ts — runs after each query loop
1. Triggered by handleStopHooks at end of query
2. Runs as forked agent (lightweight, cache-sharing)
3. initExtractMemories() — one-time closure initialization
4. Detects memory writes by main agent (skips range)
5. Extracts anything model missed
6. Two prompt modes:
- buildExtractAutoOnlyPrompt() — auto-memory only
- buildExtractCombinedPrompt() — private + team memory
7. Saves to auto-memory directory with proper typing
// Session Memory (separate service):
- Runs periodically via background forked agent
- Different cadence: init threshold vs. update threshold
- Templates loaded from prompts directory
- Feature-gated (tengu_passport_quail)

What matters conceptually is that memory extraction lives on the stop-hook side of the architecture, not inside the main sampling loop. That keeps the user-visible turn fast while still letting Claude Code do post-turn bookkeeping with cheap forked agents.
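That control flow can be sketched as a query loop that fires registered stop hooks without awaiting them. The `onStop`/`runQueryLoop` names and the in-memory `extracted` array are stand-ins for handleStopHooks and the forked extraction agent, not the real wiring.

```typescript
// Post-turn bookkeeping runs outside the main sampling loop:
// the user-visible turn completes first, then background forks do cleanup.
type StopHook = () => Promise<void>;

const stopHooks: StopHook[] = [];

function onStop(hook: StopHook): void {
  stopHooks.push(hook);
}

async function runQueryLoop(turn: () => Promise<string>): Promise<string> {
  const reply = await turn(); // user-visible work
  for (const hook of stopHooks) {
    void hook();              // fire-and-forget background fork
  }
  return reply;               // returned without waiting for hooks
}

// Register memory extraction as a stop hook (assumed wiring).
const extracted: string[] = [];
onStop(async () => {
  extracted.push("fact from last turn"); // stand-in for extractMemories()
});
```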
Cache-Safe Fork Context
A newer pattern in the real codebase is that the stop-hook stage snapshots the exact context needed for cheap background forks. saveCacheSafeParams(createCacheSafeParams(...)) captures the system prompt, contexts, and tool schema so later subagents can reuse prompt caching instead of rebuilding from scratch.
// query/stopHooks.ts
const stopHookContext = {
messages, systemPrompt, userContext, systemContext, toolUseContext
}
saveCacheSafeParams(createCacheSafeParams(stopHookContext))
// downstream consumers:
// - promptSuggestion
// - autoDream
// - extractMemories
// - /btw / side-question style forked queries

AppState Store
This is the runtime map behind the terminal UI, remote bridge, tasks, and speculation.
Central application state using Zustand-like pattern with DeepImmutable for type safety:
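A minimal sketch of the DeepImmutable idea: a mapped type that marks every nested field readonly, paired with a Zustand-like functional setter. The toy store shape here is an assumption; the real field families are listed below.

```typescript
// Recursively mark every property readonly at the type level.
type DeepImmutable<T> = {
  readonly [K in keyof T]: T[K] extends object ? DeepImmutable<T[K]> : T[K];
};

interface AppState {
  thinkingEnabled: boolean;
  tasks: { id: string; done: boolean }[];
}

let state: DeepImmutable<AppState> = {
  thinkingEnabled: false,
  tasks: [],
};

// Functional update: produce a new state object, never mutate in place.
function setAppState(
  update: (prev: DeepImmutable<AppState>) => DeepImmutable<AppState>
): void {
  state = update(state);
  // (the real store would notify subscribers / re-render the REPL UI here)
}

setAppState((prev) => ({ ...prev, thinkingEnabled: true }));
```

The payoff is that the compiler rejects in-place mutation anywhere in the codebase, so every change must flow through the setter that triggers re-renders.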
// AppStateStore.ts — key state fields:
{
// Settings & Models
settings, mainLoopModel, mainLoopModelForSession
// Permissions
toolPermissionContext: {
mode: 'default' | 'auto' | 'plan' | 'bypass'
additionalWorkingDirectories: Map
alwaysAllowRules, alwaysDenyRules, alwaysAskRules
}
// MCP Integration
mcp: { clients, tools, commands, resources }
// Plugin System
plugins: { enabled, disabled, installationStatus }
// Agent / prompt suggestion state
thinkingEnabled, promptSuggestionEnabled, speculation
// Bridge / remote control
remoteConnectionStatus
replBridgeEnabled
replBridgeConnected
replBridgeSessionActive
replBridgeSessionUrl
// Task orchestration + agent routing
tasks
agentNameRegistry
coordinatorTaskIndex
viewingAgentTaskId
// UI + side systems
expandedView, footerSelection
companionReaction
bagelActive
tungstenActiveSession
}
// Mutation: setAppState(prev => ({ ...prev, field: newValue }))
// Triggers UI re-render in REPL mode

The deeper lesson is that AppState is not only UI state. It is a runtime coordination surface that carries task orchestration, bridge connectivity, speculation lifecycle, plugin reload state, notifications, companion reactions, and tool permission context. The terminal UI is effectively reading from the same operational control plane the agent loop mutates.
Message Types
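The split between API-compatible and synthetic types implies a filter before every API call; a minimal sketch of that (the discriminant field name and the type strings are assumptions mapped from the list below):

```typescript
type ApiMessageType = "user" | "assistant" | "system";
type SyntheticMessageType =
  | "system_local_command" // local command output
  | "tombstone"            // orphaned partial message
  | "compact_boundary"     // session summary boundary
  | "tool_use_summary";    // generated summary of a tool batch

interface Message {
  type: ApiMessageType | SyntheticMessageType;
  content: string;
}

const API_TYPES: ReadonlySet<string> = new Set(["user", "assistant", "system"]);

// Drop synthetic messages before sending the transcript to the API.
function stripSynthetic(messages: Message[]): Message[] {
  return messages.filter((m) => API_TYPES.has(m.type));
}
```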
// API-compatible types:
UserMessage → user input, tool results, text content
AssistantMessage → model response, thinking blocks, tool_use blocks
SystemMessage → various metadata subtypes
// Synthetic types (stripped before API call):
SystemLocalCommandMessage → local command output
TombstoneMessage → marks orphaned partial messages
CompactBoundaryMessage → marks session summary boundaries
ToolUseSummaryMessage → generated summary of tool batch
// Session persistence:
~/.claude/sessions/[sessionId]/transcript.jsonl
// Records every turn for crash recovery via /resume

Memory Types
memory/user/: Role, preferences, working style. Persists across all projects.
memory/feedback/: Corrections and validations. 'Always test before writing code' patterns.
memory/project/: Ongoing sprint context, architecture decisions, current task state.
memory/reference/: External pointers: Slack channels, API endpoints, team contacts.
MEMORY.md Index
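The index is always loaded but capped at 200 lines (per the directory layout above). A sketch of that truncated read, with the `loadMemoryIndex` helper name assumed:

```typescript
import * as fs from "node:fs";

// MEMORY.md is always injected, but truncated to keep the prompt cheap.
const MEMORY_INDEX_MAX_LINES = 200;

function loadMemoryIndex(indexPath: string): string {
  if (!fs.existsSync(indexPath)) return ""; // no index yet: inject nothing
  return fs
    .readFileSync(indexPath, "utf8")
    .split("\n")
    .slice(0, MEMORY_INDEX_MAX_LINES)
    .join("\n");
}
```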
- QueryEngine assembles the system prompt on every turn — this page explains what it builds.
- extractMemories.ts lives in services/ and runs as a stop-hook after each query.
- Auto-Dream and the memory consolidation system — Claude literally dreams during downtime.