Services
20+ services
Claude Code's service layer handles compaction, MCP integration, LSP, analytics, memory extraction, and more — running in parallel with the main query loop.
- MCP is the largest single service at 470KB across 25 files — bigger than BashTool. External tool integration is a first-class architecture concern, not an afterthought.
- Compaction has 4 escalating strategies (Microcompact → Snipping → Autocompact → Collapse) that activate from lightest to heaviest. Without them, sessions would hit the token wall every few hours.
- The API layer handles streaming, retries, beta-flag management, and prompt cache control — all transparently. Most developers never touch it directly, but it fires on every single turn.
- Bridge/Remote (35KB, 20 files) is a second top-level execution path alongside the local REPL — remote sessions dispatch work through the same QueryEngine/query() core.
Service Overview
Start here to see which responsibilities live outside the main query loop.
- Compaction: 4-level context window management
- MCP: external tool integration (4 transports)
- LSP: Language Server Protocol integration
- Analytics: Datadog + GrowthBook pipeline
- Memory: auto-extraction + session memory
- API: streaming client, retries, betas, prompt caching
- Tools: StreamingToolExecutor + orchestration
- Plugins: install + marketplace
- Tokenization: multi-provider token counting
- Bridge/Remote: remote sessions, work dispatch, reconnect logic
Compaction System
This section explains how Claude Code keeps long sessions alive instead of hitting a hard wall.
Multi-level context window management keeps conversations within token limits. Four strategies with increasing aggressiveness:
- Microcompact: single-turn inline compression. Uses cached tool results. No extra API call.
- Snipping: removes oldest messages below a threshold. Less aggressive than autocompact.
- Autocompact: full conversation summary via a forked agent. Replaces old messages. Circuit breaker: max 3 failures.
- Collapse: incremental context reduction. Builds a collapse store separately; projected at read time (non-destructive).
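The escalation order and the budget arithmetic can be sketched in TypeScript. Everything here is hypothetical except the strategy order and the numbers, which come from this section; `pickNextStrategy`, `effectiveWindow`, and `autocompactThreshold` are invented names.

```typescript
// Sketch of the compaction ladder and token budget math (names invented).

type CompactionStrategy = "microcompact" | "snipping" | "autocompact" | "collapse";

// Lightest to heaviest, as described above.
const STRATEGY_ORDER: CompactionStrategy[] = [
  "microcompact", // inline compression, no extra API call
  "snipping",     // drop oldest messages below threshold
  "autocompact",  // full summary via forked agent
  "collapse",     // incremental, non-destructive at read time
];

function effectiveWindow(
  modelContext: number,   // e.g. 200K for opus
  maxOutputTokens: number, // e.g. 16K
  summaryReserve: number,  // 20K reserved for the summary
): number {
  return modelContext - maxOutputTokens - summaryReserve;
}

function autocompactThreshold(modelContext: number): number {
  const buffer = 13_000; // safety buffer below the effective window
  return effectiveWindow(modelContext, 16_000, 20_000) - buffer;
}

// Escalate one level at a time; null means every strategy is exhausted.
function pickNextStrategy(current: CompactionStrategy | null): CompactionStrategy | null {
  const idx = current === null ? 0 : STRATEGY_ORDER.indexOf(current) + 1;
  return STRATEGY_ORDER[idx] ?? null;
}
```

With a 200K model context this reproduces the ~164K effective window and a threshold a further 13K below it.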
// Token budget calculation:
effective_window = model_context (e.g., 200K for opus)
- max_output_tokens (e.g., 16K)
- reserved_for_summary (20K)
= ~164K effective
autocompact_threshold = effective_window - 13K buffer

MCP (Model Context Protocol)
Read this when you want to understand how Claude Code turns external MCP servers into first-class tools.
At 470KB across 25 files, MCP is the largest service in the codebase. It lets Claude Code integrate external tools from any MCP-compatible server.
// How MCP tools work:
1. MCP server exposes tools via JSON schema
2. mcpClient.ts patches MCPTool definition at runtime:
- Sets real tool name (e.g., "mcp_weather_get_current")
- Injects actual input/output schemas
- Wires up call() to invoke MCP server RPC
3. MCPTool uses passthrough schema (z.object({}).passthrough())
4. No validation at Claude Code layer — MCP server is responsible
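The patching step above can be sketched as follows. This is a hypothetical reduction: `McpServer`, `patchMcpTool`, and the mock RPC are invented, and where the real `mcpClient.ts` installs a zod passthrough schema, this sketch simply forwards the raw input untouched.

```typescript
// Sketch of runtime tool patching: wrap an MCP server's tool in a Claude Code
// tool object whose call() forwards straight to the server RPC (names invented).

interface McpServer {
  name: string;
  callTool(tool: string, input: Record<string, unknown>): Promise<unknown>;
}

interface PatchedTool {
  name: string; // e.g. "mcp_weather_get_current"
  call(input: Record<string, unknown>): Promise<unknown>;
}

function patchMcpTool(server: McpServer, toolName: string): PatchedTool {
  return {
    // Real tool name is derived from server + tool, per the article's example.
    name: `mcp_${server.name}_${toolName}`,
    // No validation at this layer: input passes through untouched and the
    // MCP server is responsible for rejecting bad payloads.
    call: (input) => server.callTool(toolName, input),
  };
}
```

The passthrough design means a misbehaving server fails at the server boundary, not inside Claude Code's tool loop.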
// Key files:
client.ts — 119KB — Protocol client orchestrator
config.ts — 51KB — Settings, env vars, server validation
auth.ts — 88KB — OAuth flow, token management
elicitationHandler.ts — User prompts during tool calls

LSP (Language Server Protocol)
// LSP provides IDE-like features:
- Diagnostics (errors, warnings)
- Hover information
- Go-to-definition
- Code completions
// Architecture:
LSPServerManager (singleton)
└─ LSPServerInstance[] (per language/framework)
└─ LSPClient (protocol implementation)
└─ LSPDiagnosticRegistry (collects diagnostics)
// Lifecycle:
initializeLspServerManager() → Async init with generation counter
getLspServerManager() → Get active manager (undefined if not ready)
getInitializationStatus() → not-started | pending | success | failed
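The generation counter mentioned in the lifecycle can be sketched as below. All names are invented; only the lifecycle shape (async init, `not-started | pending | success | failed`, undefined-until-ready accessor) comes from this section. The point of the counter is that a stale initialization that finishes late must not overwrite a newer one.

```typescript
// Sketch of generation-counted async initialization (names invented).

type InitStatus = "not-started" | "pending" | "success" | "failed";

let generation = 0;
let manager: { generation: number } | undefined;
let status: InitStatus = "not-started";

async function initializeManager(init: () => Promise<void>): Promise<void> {
  const myGeneration = ++generation;
  status = "pending";
  try {
    await init();
    // A newer initialization started while we awaited: drop our result.
    if (myGeneration !== generation) return;
    manager = { generation: myGeneration };
    status = "success";
  } catch {
    if (myGeneration === generation) status = "failed";
  }
}

const getManager = () => manager;                 // undefined if not ready
const getInitializationStatus = () => status;
```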
// Integration with tools:
FileEditTool → Notifies LSP of file changes → Triggers diagnostics
FileWriteTool → Same notification path
LSPTool → Direct query interface for the model

Analytics Pipeline
AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS

This is a real TypeScript type. The name itself is the enforcement mechanism — every analytics event must be typed with it, forcing the developer to consciously confirm they aren't accidentally logging user code or file paths. It's the most creative use of a type name for PII safety we've ever seen.
// Event pipeline with queue-until-sink pattern:
logEvent(name, metadata) → Sync event logging
logEventAsync(name, metadata) → Async event logging
attachAnalyticsSink() → Register backend (Datadog, 1P)
// The safety type — you must use this for every analytics event:
type AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS = { ... }
// → Enforces: no file content, no user code in analytics
// PII handling:
_PROTO_* keys → PII-tagged columns (Anthropic 1P only)
stripProtoFields() → Removes PII before Datadog fanout
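A minimal sketch of the stripping step, assuming only what the article states: `_PROTO_*` keys mark PII-tagged columns that may reach the first-party sink but must be removed before the Datadog fanout. The metadata type here is an invented simplification.

```typescript
// Sketch of PII stripping before third-party fanout: drop every _PROTO_* key.

type EventMetadata = Record<string, string | number | boolean>;

function stripProtoFields(metadata: EventMetadata): EventMetadata {
  return Object.fromEntries(
    Object.entries(metadata).filter(([key]) => !key.startsWith("_PROTO_")),
  );
}
```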
// GrowthBook integration (feature gates):
checkStatsigFeatureGate_CACHED_MAY_BE_STALE()
// → Cached gate values prevent blocking on init
// → User attributes: ID, session, platform, org, subscription
// → A/B experiment tracking with variation IDs

API Layer Is a Policy Engine
This is the deeper service section: the API layer decides caching, retries, betas, and stream normalization.
One of the easiest things to underestimate in Claude Code is the API client. claude.ts is not a thin transport wrapper: it decides which beta headers to send, how to split cacheable vs dynamic system prompt sections, how to retry recoverable failures, when to record quota/cost state, and how to normalize streamed content back into the internal message model.
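The failure-path half of that policy can be sketched as a retry wrapper. The article names `withRetry(...)` and the 529 case but not the signature, so everything below is an assumption: a hypothetical `ApiError` and a retry loop that retries overload/server errors and gives up on everything else.

```typescript
// Sketch of a withRetry-style helper (signature assumed, not the real one).

class ApiError extends Error {
  status: number;
  constructor(status: number) {
    super(`API error ${status}`);
    this.status = status;
  }
}

async function withRetry<T>(fn: () => Promise<T>, maxAttempts = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Retry overloaded (529) and other 5xx; aborts and client errors rethrow.
      const retryable = err instanceof ApiError && err.status >= 500;
      if (!retryable || attempt === maxAttempts) throw err;
    }
  }
  throw lastError; // unreachable; satisfies the return type
}
```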
// services/api/claude.ts
build request:
→ normalizeMessagesForAPI(...)
→ splitSysPromptPrefix(...) for prompt caching
→ choose beta headers (fast mode, effort, structured outputs, tool search)
→ attach attribution + client request IDs
stream response:
→ normalizeContentFromAPI(...)
→ ensureToolResultPairing(...)
→ capture usage deltas + request fingerprints
→ update quota/cost/session activity
failure path:
→ withRetry(...)
→ distinguish abort / timeout / 529 / fallback-triggered cases
→ emit assistant-visible API error messages

Bridge & Remote Execution
Use this section to understand how Claude Code can operate as remote capacity, not only as a local CLI loop.
The newer repo has a substantial bridge/remote layer that the older analysis pages barely mentioned. bridgeMain.ts is effectively a miniature control plane: it polls for work, spawns or reconnects sessions, heartbeats active jobs, refreshes ingress tokens, manages worktrees, and tears sessions down safely.
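The control-plane shape can be reduced to a sketch. Only `runBridgeLoop`, `heartbeatWork`, and the poll/spawn/heartbeat responsibilities come from the article; the interfaces, the single-heartbeat simplification, and the stop predicate are all invented for illustration (the real loop also refreshes tokens, manages worktrees, and reconnects).

```typescript
// Sketch of the bridge polling loop (heavily simplified, names assumed).

interface WorkItem { id: string }

interface BridgeApi {
  pollWork(): Promise<WorkItem | null>;
  heartbeatWork(id: string): Promise<void>;
}

async function runBridgeLoop(
  api: BridgeApi,
  runSession: (work: WorkItem) => Promise<void>,
  shouldStop: () => boolean,
): Promise<void> {
  while (!shouldStop()) {
    const work = await api.pollWork();
    if (work) {
      // The real loop heartbeats periodically while the session runs.
      await api.heartbeatWork(work.id);
      await runSession(work);
    }
  }
}
```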
// bridge/bridgeMain.ts
runBridgeLoop(config, environmentId, secret, api, spawner, logger, signal)
→ poll bridge API for work
→ spawn local session or reconnect existing session
→ send heartbeatWork() for active jobs
→ refresh ingress/JWT tokens
→ create/remove agent worktrees
→ wake capacity when sessions finish
→ stop or reconnect timed-out sessions
// related files:
sessionRunner.ts // child session spawning
workSecret.ts // SDK / worker registration secrets
bridgeApi.ts // typed bridge API client
remote/*.ts // session manager + websocket transport

Speculation & Prompt Suggestions
This explains the hidden background work Claude Code performs to make the next step feel faster.
Another service family worth studying is PromptSuggestion. It is no longer just a UI nicety: speculation.ts creates copy-on-write overlays under /tmp, forks a cheap background agent using cache-safe params, pre-executes likely next steps, and can copy successful writes back into the main working directory.
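The overlay mechanics can be sketched as path arithmetic. The `<pid>/<id>` layout comes from the article, but the elided middle of the real `/tmp/...` path is unknown, so the `claude` segment below is a placeholder; `copyOverlayToMain` here only re-roots written paths back under the main cwd, whereas the real service copies file contents.

```typescript
// Sketch of the copy-on-write overlay layout and copy-back mapping
// (path segment "claude" is a placeholder for the elided real path).
import * as path from "node:path";

function getOverlayPath(id: string, tmpRoot = "/tmp", pid = process.pid): string {
  return path.join(tmpRoot, "claude", "speculation", String(pid), id);
}

// Map each overlay-written path back to its location in the main working dir.
function copyOverlayToMain(
  overlayPath: string,
  writtenPaths: string[],
  cwd: string,
): Array<[overlayFile: string, mainFile: string]> {
  return writtenPaths.map((p) => {
    const rel = path.relative(overlayPath, p);
    return [p, path.join(cwd, rel)];
  });
}
```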
// services/PromptSuggestion/speculation.ts
getOverlayPath(id) → /tmp/.../speculation/<pid>/<id>
prepareMessagesForInjection(messages)
runForkedAgent(cacheSafeParams, ...)
copyOverlayToMain(overlayPath, writtenPaths, cwd)
guards:
- stop at write tools outside the overlay rules
- stop on denied tools or non-read-only bash
- cap to 20 turns / 100 messages
- log speculation outcome + time saved

Tool Orchestration Service
There are really two orchestration layers. toolOrchestration.ts handles already-buffered tool blocks in ordered batches; StreamingToolExecutor handles the earlier phase where tool_use blocks are still arriving over the wire and must be launched optimistically without breaking ordering guarantees.
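The batching rule in `runTools()` can be sketched as a partition function. Only the rule itself comes from the article (read-only tools run up to 10 in parallel, writes run serially); the types and function names are invented.

```typescript
// Sketch of batch partitioning: group consecutive read-only tools into
// concurrent batches (capped at 10); each write tool is its own serial batch.

interface ToolUse {
  name: string;
  readOnly: boolean;
}

const MAX_PARALLEL_READS = 10;

function partitionIntoBatches(tools: ToolUse[]): ToolUse[][] {
  const batches: ToolUse[][] = [];
  for (const tool of tools) {
    const last = batches[batches.length - 1];
    const canJoin =
      last !== undefined &&
      tool.readOnly &&
      last.every((t) => t.readOnly) &&
      last.length < MAX_PARALLEL_READS;
    if (canJoin) last.push(tool);
    else batches.push([tool]);
  }
  return batches;
}
```

Each read-only batch can then be dispatched with `Promise.all`, while write batches preserve ordering by awaiting one at a time.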
// services/tools/ — Two key files:
// 1. toolOrchestration.ts (189 lines) — runTools() generator
// Batch partitioning and serial/concurrent execution
// Read-only batch → up to 10 parallel
// Write batch → serial with context modifiers
// 2. StreamingToolExecutor.ts (226 lines)
// Concurrent execution while model streams
// addTool() → enqueue as tool_use blocks arrive
// processQueue() → respect concurrency constraints
// getCompletedResults() → yield finished results
// discard() → cleanup on streaming fallback
// siblingAbortController → kill sibling subprocesses on bash error

Memory extraction runs after every single query
Services like extractMemories and compaction run as forked background agents — understand how spawning and cache sharing work.
The memory system is powered by the extractMemories and autoDream services explained on this page.