Context & Memory
How Claude Code builds its system prompt, loads CLAUDE.md files, manages auto-memory, and provides context to the model at every turn.
- The system prompt is assembled from 5 ordered layers every turn: Default instructions → User context (git, CWD, shell) → System context (MCP capabilities) → Memory (CLAUDE.md) → Skills. Only Layer 1 is fully cacheable.
- CLAUDE.md is loaded from 6 locations in priority order: parent dirs → project root → ~/.claude/ → subdir-specific. Each file can import others with @filename. The files are real at runtime — not embedded.
- The memory system stores facts in ~/.claude/projects/{hash}/memory.jsonl — extracted by a background agent after each query. These persist across sessions and are injected at turn start.
- AppState is a DeepImmutable Zustand-like store with 4 families: Runtime (connection), Agent Orchestration (tasks), Background Intelligence (speculation), and UX Subsystems (companion, bagel).
Key Insight
The system prompt for Claude Code is ~100K tokens before you type a single character. That's the core instructions, all 43 tool schemas, your CLAUDE.md files, git state, and memory — assembled on every turn.
System Prompt Assembly
This section explains what actually goes into the model prompt on every turn.
Layer 1 (Default instructions): Base Claude instructions, tool definitions from all registered tools, MCP server instructions, agent definitions, and model-specific variants (Opus: full; Sonnet: abbreviated if >50 tools).
Layer 2 (User context): Working directory, platform, shell, git status (branch plus recent commits, truncated to 2000 chars), additional-directory permissions, and project metadata. Prepended via prependUserContext().
Layer 3 (System context): Available resources, MCP server capabilities, and coordinator context (in multi-agent mode). Appended via appendSystemContext().
Layer 4 (Memory): If auto-memory is enabled, injects the memory mechanics prompt with instructions for reading from and writing to the memory directory.
// Final assembly in QueryEngine.ts:
const systemPrompt = asSystemPrompt([
...(customPrompt ? [customPrompt] : defaultSystemPrompt),
...(memoryMechanicsPrompt ? [memoryMechanicsPrompt] : []),
...(appendSystemPrompt ? [appendSystemPrompt] : []),
])

The more subtle implementation detail is that the prompt is assembled as ordered sections, not one giant string literal. constants/prompts.ts defines a hard SYSTEM_PROMPT_DYNAMIC_BOUNDARY marker so the prefix before that line can use global prompt caching while the tail can safely include user-, repo-, and session-specific data.
Prompt Boundary & Cache Topology
Read this if you want the performance story, not just the content story, of prompt construction.
This explains why Claude Code spends so much effort on prompt section ordering. The cacheable prefix is intended to stay byte-stable across turns and even across sessions, while the dynamic tail can change with user context, git state, memory, hooks, output style, and connected MCP servers.
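A minimal sketch of what such a boundary enables: split the ordered sections at the marker into a cacheable prefix and a dynamic tail. The `splitAtBoundary` helper and the string-array section shape are assumptions for illustration, not the real API.

```typescript
const SYSTEM_PROMPT_DYNAMIC_BOUNDARY = "__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__";

interface PromptSplit {
  cacheablePrefix: string[]; // byte-stable across turns -> global cache scope
  dynamicTail: string[];     // user-, repo-, session-specific content
}

function splitAtBoundary(sections: string[]): PromptSplit {
  const i = sections.indexOf(SYSTEM_PROMPT_DYNAMIC_BOUNDARY);
  if (i === -1) {
    // No boundary found: treat everything as dynamic to stay cache-safe.
    return { cacheablePrefix: [], dynamicTail: sections };
  }
  return {
    cacheablePrefix: sections.slice(0, i),
    dynamicTail: sections.slice(i + 1), // drop the marker itself
  };
}

const { cacheablePrefix, dynamicTail } = splitAtBoundary([
  "core instructions",
  "tool schemas",
  SYSTEM_PROMPT_DYNAMIC_BOUNDARY,
  "git status",
  "CLAUDE.md contents",
]);
```

The invariant in the comments below follows directly: reordering a section across the marker, or editing anything in the prefix, changes the cacheable bytes and forfeits reuse.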
// constants/prompts.ts
SYSTEM_PROMPT_DYNAMIC_BOUNDARY = "__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__"
// invariant:
// - everything before boundary can use scope: global
// - everything after boundary may contain user/session-specific content
// consequences:
// - section order matters for cache hits
// - "small prompt edits" can break expensive cache reuse
// - prompt architecture is part of runtime performance design

CLAUDE.md Loading
Project instructions are layered from several locations; this section shows the order.
CLAUDE.md files provide project-specific instructions. They're loaded from multiple locations and concatenated into the user context:
// Each CLAUDE.md section labeled with source path:
"Contents of /Users/user/.claude/CLAUDE.md (user's private global):"
"Contents of /project/CLAUDE.md (project instructions, checked in):"
// Injected as <system-reminder> tags in conversation context
// Available to model at every turn

Auto-Memory System (memdir/)
Use this section to understand what Claude Code tries to remember across sessions and where it stores it.
Persistent project-scoped memory at ~/.claude/projects/<path>/memory/. Four types, each serving a different purpose.
user/profile.md, preferences.md: Your role & preferences — so Claude doesn't keep asking who you are.
feedback/testing_policy.md: Corrections & validated rules — so Claude doesn't repeat mistakes you've corrected.
project/current_sprint.md: Active work context — current sprint, open tasks, recent decisions.
reference/slack_channels.md: External system pointers — so Claude can reference team resources without being told.
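Rendering one such fact into a file body can be sketched as follows; the `renderMemoryFact` helper is an assumption, but the frontmatter-plus-fact layout follows the memory file format shown below.

```typescript
type MemoryType = "user" | "feedback" | "project" | "reference";

interface MemoryFact {
  type: MemoryType;
  fact: string;
  why: string;
  howToApply: string;
}

// Render a fact in the frontmatter + body format used by memory files.
function renderMemoryFact(m: MemoryFact): string {
  return [
    "---",
    `type: ${m.type}`,
    "---",
    `${m.fact} **Why:** ${m.why} **How to apply:** ${m.howToApply}`,
  ].join("\n");
}
```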
~/.claude/projects/<path>/memory/
MEMORY.md # Index — always loaded, 200 lines max
user/profile.md # Role, preferences
feedback/ # Corrections & validated rules
project/ # Active sprint / open tasks
reference/ # External system pointers
// Memory file format (frontmatter):
---
type: user | feedback | project | reference
---
{{fact}} **Why:** ... **How to apply:** ...

Memory Extraction Pipeline
// extractMemories.ts — runs after each query loop
1. Triggered by handleStopHooks at end of query
2. Runs as forked agent (lightweight, cache-sharing)
3. initExtractMemories() — one-time closure initialization
4. Detects memory writes by main agent (skips range)
5. Extracts anything model missed
6. Two prompt modes:
- buildExtractAutoOnlyPrompt() — auto-memory only
- buildExtractCombinedPrompt() — private + team memory
7. Saves to auto-memory directory with proper typing
// Session Memory (separate service):
- Runs periodically via background forked agent
- Different cadence: init threshold vs. update threshold
- Templates loaded from prompts directory
- Feature-gated (tengu_passport_quail)

What matters conceptually is that memory extraction lives on the stop-hook side of the architecture, not inside the main sampling loop. That keeps the user-visible turn fast while still letting Claude Code do post-turn bookkeeping with cheap forked agents.
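That control flow can be sketched as a query loop that fires registered stop hooks without awaiting them. The `onStop`/`runQueryLoop` names and the in-memory `extracted` array are stand-ins for handleStopHooks and the forked extraction agent, not the real wiring.

```typescript
// Post-turn bookkeeping runs outside the main sampling loop:
// the user-visible turn completes first, then background forks do cleanup.
type StopHook = () => Promise<void>;

const stopHooks: StopHook[] = [];

function onStop(hook: StopHook): void {
  stopHooks.push(hook);
}

async function runQueryLoop(turn: () => Promise<string>): Promise<string> {
  const reply = await turn(); // user-visible work
  for (const hook of stopHooks) {
    void hook();              // fire-and-forget background fork
  }
  return reply;               // returned without waiting for hooks
}

// Register memory extraction as a stop hook (assumed wiring).
const extracted: string[] = [];
onStop(async () => {
  extracted.push("fact from last turn"); // stand-in for extractMemories()
});
```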
Cache-Safe Fork Context
A newer pattern in the real codebase is that the stop-hook stage snapshots the exact context needed for cheap background forks. saveCacheSafeParams(createCacheSafeParams(...)) captures the system prompt, contexts, and tool schema so later subagents can reuse prompt caching instead of rebuilding from scratch.
// query/stopHooks.ts
const stopHookContext = {
messages, systemPrompt, userContext, systemContext, toolUseContext
}
saveCacheSafeParams(createCacheSafeParams(stopHookContext))
// downstream consumers:
// - promptSuggestion
// - autoDream
// - extractMemories
// - /btw / side-question style forked queries

AppState Store
This is the runtime map behind the terminal UI, remote bridge, tasks, and speculation.
Central application state using Zustand-like pattern with DeepImmutable for type safety:
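A minimal sketch of the DeepImmutable idea: a mapped type that marks every nested field readonly, paired with a Zustand-like functional setter. The toy store shape here is an assumption; the real field families are listed below.

```typescript
// Recursively mark every property readonly at the type level.
type DeepImmutable<T> = {
  readonly [K in keyof T]: T[K] extends object ? DeepImmutable<T[K]> : T[K];
};

interface AppState {
  thinkingEnabled: boolean;
  tasks: { id: string; done: boolean }[];
}

let state: DeepImmutable<AppState> = {
  thinkingEnabled: false,
  tasks: [],
};

// Functional update: produce a new state object, never mutate in place.
function setAppState(
  update: (prev: DeepImmutable<AppState>) => DeepImmutable<AppState>
): void {
  state = update(state);
  // (the real store would notify subscribers / re-render the REPL UI here)
}

setAppState((prev) => ({ ...prev, thinkingEnabled: true }));
```

The payoff is that the compiler rejects in-place mutation anywhere in the codebase, so every change must flow through the setter that triggers re-renders.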
// AppStateStore.ts — key state fields:
{
// Settings & Models
settings, mainLoopModel, mainLoopModelForSession
// Permissions
toolPermissionContext: {
mode: 'default' | 'auto' | 'plan' | 'bypass'
additionalWorkingDirectories: Map
alwaysAllowRules, alwaysDenyRules, alwaysAskRules
}
// MCP Integration
mcp: { clients, tools, commands, resources }
// Plugin System
plugins: { enabled, disabled, installationStatus }
// Agent / prompt suggestion state
thinkingEnabled, promptSuggestionEnabled, speculation
// Bridge / remote control
remoteConnectionStatus
replBridgeEnabled
replBridgeConnected
replBridgeSessionActive
replBridgeSessionUrl
// Task orchestration + agent routing
tasks
agentNameRegistry
coordinatorTaskIndex
viewingAgentTaskId
// UI + side systems
expandedView, footerSelection
companionReaction
bagelActive
tungstenActiveSession
}
// Mutation: setAppState(prev => ({ ...prev, field: newValue }))
// Triggers UI re-render in REPL mode

The deeper lesson is that AppState is not only UI state. It is a runtime coordination surface that carries task orchestration, bridge connectivity, speculation lifecycle, plugin reload state, notifications, companion reactions, and tool permission context. The terminal UI is effectively reading from the same operational control plane the agent loop mutates.
Message Types
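The split between API-compatible and synthetic types implies a filter before every API call; a minimal sketch of that (the discriminant field name and the type strings are assumptions mapped from the list below):

```typescript
type ApiMessageType = "user" | "assistant" | "system";
type SyntheticMessageType =
  | "system_local_command" // local command output
  | "tombstone"            // orphaned partial message
  | "compact_boundary"     // session summary boundary
  | "tool_use_summary";    // generated summary of a tool batch

interface Message {
  type: ApiMessageType | SyntheticMessageType;
  content: string;
}

const API_TYPES: ReadonlySet<string> = new Set(["user", "assistant", "system"]);

// Drop synthetic messages before sending the transcript to the API.
function stripSynthetic(messages: Message[]): Message[] {
  return messages.filter((m) => API_TYPES.has(m.type));
}
```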
// API-compatible types:
UserMessage → user input, tool results, text content
AssistantMessage → model response, thinking blocks, tool_use blocks
SystemMessage → various metadata subtypes
// Synthetic types (stripped before API call):
SystemLocalCommandMessage → local command output
TombstoneMessage → marks orphaned partial messages
CompactBoundaryMessage → marks session summary boundaries
ToolUseSummaryMessage → generated summary of tool batch
// Session persistence:
~/.claude/sessions/[sessionId]/transcript.jsonl
// Records every turn for crash recovery via /resume

Memory Types
memory/user/: Role, preferences, working style. Persists across all projects.
memory/feedback/: Corrections and validations. 'Always test before writing code' patterns.
memory/project/: Ongoing sprint context, architecture decisions, current task state.
memory/reference/: External pointers: Slack channels, API endpoints, team contacts.
MEMORY.md Index
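The index is always loaded but capped at 200 lines (per the directory layout above). A sketch of that truncated read, with the `loadMemoryIndex` helper name assumed:

```typescript
import * as fs from "node:fs";

// MEMORY.md is always injected, but truncated to keep the prompt cheap.
const MEMORY_INDEX_MAX_LINES = 200;

function loadMemoryIndex(indexPath: string): string {
  if (!fs.existsSync(indexPath)) return ""; // no index yet: inject nothing
  return fs
    .readFileSync(indexPath, "utf8")
    .split("\n")
    .slice(0, MEMORY_INDEX_MAX_LINES)
    .join("\n");
}
```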
- QueryEngine assembles the system prompt on every turn — this page explains what it builds.
- extractMemories.ts lives in services/ and runs as a stop-hook after each query.
- Auto-Dream and the memory consolidation system — Claude literally dreams during downtime.