Claude Code v2.1.88

Services

20+ services

Claude Code's service layer handles compaction, MCP integration, LSP, analytics, memory extraction, and more — running in parallel with the main query loop.

TL;DR — Key Takeaways
  • MCP is the largest single service at 470KB across 25 files — bigger than BashTool. External tool integration is a first-class architecture concern, not an afterthought.
  • Compaction has 4 escalating strategies (Microcompact → Snipping → Autocompact → Collapse) that activate from lightest to heaviest. Without them, sessions would hit the token wall every few hours.
  • The API layer handles streaming, retries, beta-flag management, and prompt cache control — all transparently. Most developers never touch it directly, but it fires on every single turn.
  • Bridge/Remote (35KB, 20 files) is a second top-level execution path alongside the local REPL — remote sessions dispatch work through the same QueryEngine/query() core.

Service Overview

Start here to see which responsibilities live outside the main query loop.

  • Compaction (~15K, 13 files): 4-level context window management
  • MCP (470KB, 25 files): external tool integration (4 transports)
  • LSP (~5K, 6 files): Language Server Protocol
  • Analytics (~8K, 6 files): Datadog + GrowthBook pipeline
  • Memory (~6K, 5 files): auto-extraction + session memory
  • API (~45K, 8 files): streaming client, retries, betas, prompt caching
  • Tools (~1K, 2 files): StreamingToolExecutor + orchestration
  • Plugins (~10K, 8 files): plugin install + marketplace
  • Tokens (~2K, 1 file): multi-provider token counting
  • Bridge/Remote (~35K, 20 files): remote sessions, work dispatch, reconnect logic

Compaction System

This section explains how Claude Code keeps long sessions alive instead of hitting a hard wall.

Multi-level context window management keeps conversations within token limits. Four strategies with increasing aggressiveness:

  • Microcompact (every API call): single-turn inline compression. Uses cached tool results. No extra API call.
  • History Snipping (feature-gated threshold): removes the oldest messages below a threshold. Less aggressive than autocompact.
  • Autocompact (token-threshold trigger): full conversation summary via a forked agent. Replaces old messages. Circuit breaker: max 3 failures.
  • Context Collapse (experimental): incremental context reduction. Builds a collapse store separately, projected at read time (non-destructive).

typescript
// Token budget calculation:
effective_window = model_context (e.g., 200K for opus)
  - max_output_tokens (e.g., 16K)
  - reserved_for_summary (20K)
  = ~164K effective

autocompact_threshold = effective_window - 13K buffer
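
The budget arithmetic above can be sketched as a small helper. The constants mirror the example numbers in the comment (200K context, 16K max output, 20K summary reserve, 13K buffer); they are illustrative defaults, not verified constants from the source:

```typescript
// Sketch of the token-budget math described above. Values are the
// illustrative defaults from the comment, not confirmed constants.
interface BudgetConfig {
  modelContext: number;       // e.g. 200_000 for opus
  maxOutputTokens: number;    // e.g. 16_000
  reservedForSummary: number; // e.g. 20_000
  autocompactBuffer: number;  // e.g. 13_000
}

function effectiveWindow(c: BudgetConfig): number {
  return c.modelContext - c.maxOutputTokens - c.reservedForSummary;
}

function autocompactThreshold(c: BudgetConfig): number {
  return effectiveWindow(c) - c.autocompactBuffer;
}

const opus: BudgetConfig = {
  modelContext: 200_000,
  maxOutputTokens: 16_000,
  reservedForSummary: 20_000,
  autocompactBuffer: 13_000,
};

console.log(effectiveWindow(opus));      // 164000
console.log(autocompactThreshold(opus)); // 151000
```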

MCP (Model Context Protocol)

Read this when you want to understand how Claude Code turns external MCP servers into first-class tools.

The MCP service is the largest in the codebase at 470KB across 25 files. It enables Claude Code to integrate external tools from any MCP-compatible server.

  • stdio: local process communication. Latency: lowest (direct pipe). Use: local tools, shell commands, filesystem access.
  • SSE: Server-Sent Events (HTTP streaming). Latency: low (persistent HTTP stream). Use: remote servers with streaming responses.
  • HTTP: standard HTTP requests. Latency: medium (per-request round trip). Use: REST APIs, web services, stateless tools.
  • WebSocket: full-duplex communication. Latency: low (persistent, bidirectional). Use: real-time tools, interactive services.
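
As a rough sketch, the four transports can be modeled as a registry plus a selection rule. The type names, profile fields, and pickTransport heuristic here are hypothetical, not the actual MCP client API:

```typescript
// Hypothetical registry of the four transports described above.
type McpTransport = "stdio" | "sse" | "http" | "websocket";

interface TransportProfile {
  latency: "lowest" | "low" | "medium";
  persistent: boolean;
  bidirectional: boolean;
}

const TRANSPORTS: Record<McpTransport, TransportProfile> = {
  stdio:     { latency: "lowest", persistent: true,  bidirectional: true },
  sse:       { latency: "low",    persistent: true,  bidirectional: false },
  http:      { latency: "medium", persistent: false, bidirectional: false },
  websocket: { latency: "low",    persistent: true,  bidirectional: true },
};

// A server config might pick a transport roughly like this (assumed rule):
function pickTransport(cfg: { command?: string; url?: string }): McpTransport {
  if (cfg.command) return "stdio";                   // local process: direct pipe
  if (cfg.url?.startsWith("ws")) return "websocket"; // ws:// or wss:// endpoint
  return "http";                                     // default remote fallback
}
```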
typescript
// How MCP tools work:
1. MCP server exposes tools via JSON schema
2. mcpClient.ts patches MCPTool definition at runtime:
   - Sets real tool name (e.g., "mcp_weather_get_current")
   - Injects actual input/output schemas
   - Wires up call() to invoke MCP server RPC
3. MCPTool uses passthrough schema (z.object({}).passthrough())
4. No validation at Claude Code layer — MCP server is responsible

// Key files:
client.ts    — 119KB — Protocol client orchestrator
config.ts    — 51KB  — Settings, env vars, server validation
auth.ts      — 88KB  — OAuth flow, token management
elicitationHandler.ts — User prompts during tool calls
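
The runtime-patching idea in steps 2–4 can be sketched without the real client. The McpToolDef shape and the rpc stand-in are assumptions; only the naming convention and the no-validation passthrough come from the text above:

```typescript
// Sketch of the "passthrough" pattern: the tool shell accepts any input
// shape and forwards it unvalidated, because the MCP server owns
// validation. The rpc function is a stand-in for the real protocol client.
interface McpToolDef {
  name: string;
  call: (input: Record<string, unknown>) => Promise<unknown>;
}

function patchMcpTool(
  serverName: string,
  toolName: string,
  rpc: (method: string, params: unknown) => Promise<unknown>,
): McpToolDef {
  return {
    // Real tool name is composed at runtime, e.g. "mcp_weather_get_current".
    name: `mcp_${serverName}_${toolName}`,
    // No client-side validation: input passes straight through to the server.
    call: (input) => rpc("tools/call", { name: toolName, arguments: input }),
  };
}
```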

LSP (Language Server Protocol)

FileEditTool → LSP integration flow:

  1. FileEditTool saves: file written to disk
  2. Notifies LSP: didChange/didSave event sent
  3. Triggers diagnostics: errors & warnings surfaced to the model
typescript
// LSP provides IDE-like features:
- Diagnostics (errors, warnings)
- Hover information
- Go-to-definition
- Code completions

// Architecture:
LSPServerManager (singleton)
  └─ LSPServerInstance[] (per language/framework)
       └─ LSPClient (protocol implementation)
            └─ LSPDiagnosticRegistry (collects diagnostics)

// Lifecycle:
initializeLspServerManager()  → Async init with generation counter
getLspServerManager()          → Get active manager (undefined if not ready)
getInitializationStatus()      → not-started | pending | success | failed

// Integration with tools:
FileEditTool → Notifies LSP of file changes → Triggers diagnostics
FileWriteTool → Same notification path
LSPTool → Direct query interface for the model
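
A minimal sketch of the generation-counter lifecycle implied above: a stale async initialization must not overwrite a newer one. The function names mirror the listing, but the bodies are assumptions:

```typescript
// Sketch of generation-counter init: only the most recent call to
// initializeLspServerManager may publish its result.
type InitStatus = "not-started" | "pending" | "success" | "failed";

let generation = 0;
let manager: { generation: number } | undefined;
let status: InitStatus = "not-started";

async function initializeLspServerManager(): Promise<void> {
  const myGen = ++generation; // claim a generation for this init attempt
  status = "pending";
  try {
    const created = { generation: myGen }; // stand-in for real server startup
    if (myGen === generation) {
      // Still the newest attempt: publish the manager.
      manager = created;
      status = "success";
    }
    // Otherwise a newer init superseded us; discard silently.
  } catch {
    if (myGen === generation) status = "failed";
  }
}

const getLspServerManager = () => manager; // undefined until ready
const getInitializationStatus = () => status;
```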

Analytics Pipeline

Best TypeScript type name in the codebase
AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS

This is a real TypeScript type. The name itself is the enforcement mechanism — every analytics event must be typed with this, forcing the developer to consciously confirm they aren't accidentally logging user code or file paths. It's the most creative use of a type name for PII safety we've ever seen.

typescript
// Event pipeline with queue-until-sink pattern:
logEvent(name, metadata)        → Sync event logging
logEventAsync(name, metadata)   → Async event logging
attachAnalyticsSink()           → Register backend (Datadog, 1P)

// The safety type — you must use this for every analytics event:
type AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS = { ... }
// → Enforces: no file content, no user code in analytics

// PII handling:
_PROTO_* keys → PII-tagged columns (Anthropic 1P only)
stripProtoFields() → Removes PII before Datadog fanout

// GrowthBook integration (feature gates):
checkStatsigFeatureGate_CACHED_MAY_BE_STALE()
// → Cached gate values prevent blocking on init
// → User attributes: ID, session, platform, org, subscription
// → A/B experiment tracking with variation IDs
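
The _PROTO_ stripping rule can be sketched directly. The key prefix comes from the text above; the function body and metadata shape are assumptions:

```typescript
// Sketch of the PII fanout rule: keys prefixed with _PROTO_ are allowed
// only in Anthropic's first-party sink and must be removed before events
// fan out to Datadog.
type AnalyticsMetadata = Record<string, string | number | boolean>;

function stripProtoFields(metadata: AnalyticsMetadata): AnalyticsMetadata {
  return Object.fromEntries(
    Object.entries(metadata).filter(([key]) => !key.startsWith("_PROTO_")),
  );
}

const event: AnalyticsMetadata = {
  tool_name: "Bash",
  duration_ms: 412,
  _PROTO_org_id: "org_123", // PII-tagged: first-party only
};

console.log(stripProtoFields(event)); // { tool_name: "Bash", duration_ms: 412 }
```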

API Layer Is a Policy Engine

This is the deeper service section: the API layer decides caching, retries, betas, and stream normalization.

One of the easiest things to underestimate in Claude Code is the API client. claude.ts is not a thin transport wrapper: it decides which beta headers to send, how to split cacheable vs dynamic system prompt sections, how to retry recoverable failures, when to record quota/cost state, and how to normalize streamed content back into the internal message model.

typescript
// services/api/claude.ts
build request:
  → normalizeMessagesForAPI(...)
  → splitSysPromptPrefix(...) for prompt caching
  → choose beta headers (fast mode, effort, structured outputs, tool search)
  → attach attribution + client request IDs

stream response:
  → normalizeContentFromAPI(...)
  → ensureToolResultPairing(...)
  → capture usage deltas + request fingerprints
  → update quota/cost/session activity

failure path:
  → withRetry(...)
  → distinguish abort / timeout / 529 / fallback-triggered cases
  → emit assistant-visible API error messages
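
A minimal sketch of the retry leg, assuming a conventional exponential-backoff policy. The signature, retryable status list, and ApiError type are illustrative, not the real implementation:

```typescript
// Sketch of a withRetry helper in the spirit of the failure path above:
// retry recoverable statuses (e.g. 529 overloaded) with backoff,
// rethrow everything else immediately.
class ApiError extends Error {
  constructor(public status: number) {
    super(`API error ${status}`);
  }
}

const RETRYABLE = new Set([429, 500, 529]); // assumed policy

async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const recoverable = err instanceof ApiError && RETRYABLE.has(err.status);
      if (!recoverable || attempt >= maxAttempts) throw err;
      // Exponential backoff between attempts: base, 2x, 4x, ...
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** (attempt - 1)));
    }
  }
}
```

Usage would look like `withRetry(() => client.streamRequest(req))`, with abort and timeout cases classified before the retry decision.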

Bridge & Remote Execution

Use this section to understand how Claude Code can operate as remote capacity, not only as a local CLI loop.

The newer repo has a substantial bridge/remote layer that the older analysis pages barely mentioned. bridgeMain.ts is effectively a miniature control plane: it polls for work, spawns or reconnects sessions, heartbeats active jobs, refreshes ingress tokens, manages worktrees, and tears sessions down safely.

typescript
// bridge/bridgeMain.ts
runBridgeLoop(config, environmentId, secret, api, spawner, logger, signal)
  → poll bridge API for work
  → spawn local session or reconnect existing session
  → send heartbeatWork() for active jobs
  → refresh ingress/JWT tokens
  → create/remove agent worktrees
  → wake capacity when sessions finish
  → stop or reconnect timed-out sessions

// related files:
sessionRunner.ts        // child session spawning
workSecret.ts           // SDK / worker registration secrets
bridgeApi.ts            // typed bridge API client
remote/*.ts             // session manager + websocket transport
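
One iteration of that control plane might look like the following. All types and the single-step decomposition are stand-ins for the real runBridgeLoop, which takes far more collaborators:

```typescript
// Simplified sketch of one bridge-loop iteration: heartbeat running
// jobs, then poll for new work and spawn a session for it.
interface WorkItem { id: string }

interface BridgeApi {
  pollWork(): Promise<WorkItem | null>;
  heartbeatWork(id: string): Promise<void>;
}

async function runBridgeLoopOnce(
  api: BridgeApi,
  spawn: (work: WorkItem) => Promise<void>,
  active: Set<string>,
): Promise<void> {
  // 1. Heartbeat jobs that are already running so they are not reclaimed.
  for (const id of active) await api.heartbeatWork(id);
  // 2. Pick up new work, if any, and spawn a session for it.
  const work = await api.pollWork();
  if (work) {
    active.add(work.id);
    await spawn(work);
  }
}
```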

Speculation & Prompt Suggestions

This explains the hidden background work Claude Code performs to make the next step feel faster.

Another service family worth studying is PromptSuggestion. It is no longer just a UI nicety: speculation.ts creates copy-on-write overlays under /tmp, forks a cheap background agent using cache-safe params, pre-executes likely next steps, and can copy successful writes back into the main working directory.

typescript
// services/PromptSuggestion/speculation.ts
getOverlayPath(id) → /tmp/.../speculation/<pid>/<id>
prepareMessagesForInjection(messages)
runForkedAgent(cacheSafeParams, ...)
copyOverlayToMain(overlayPath, writtenPaths, cwd)

guards:
- stop at write tools outside the overlay rules
- stop on denied tools or non-read-only bash
- cap to 20 turns / 100 messages
- log speculation outcome + time saved
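
The guard list can be sketched as a single predicate. The tool names, the 20-turn/100-message caps, and the overlay rule come from the text; the shapes and the naive read-only classifier are assumptions:

```typescript
// Sketch of the speculation guards: stop on out-of-overlay writes,
// non-read-only bash, or exceeding the turn/message caps.
interface SpeculationState { turns: number; messages: number }
interface ToolCall { name: string; command?: string; path?: string }

const MAX_TURNS = 20;
const MAX_MESSAGES = 100;

// Naive stand-in for a real read-only command classifier.
const isReadOnly = (cmd: string) => /^(ls|cat|git status|grep)\b/.test(cmd);

function shouldStopSpeculation(
  state: SpeculationState,
  call: ToolCall,
  overlayRoot: string,
): boolean {
  if (state.turns >= MAX_TURNS || state.messages >= MAX_MESSAGES) return true;
  // Writes must stay inside the copy-on-write overlay.
  if (call.name === "FileWriteTool" && !call.path?.startsWith(overlayRoot)) {
    return true;
  }
  // Only read-only bash commands may run speculatively.
  if (call.name === "BashTool" && !isReadOnly(call.command ?? "")) return true;
  return false;
}
```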

There are really two orchestration layers. toolOrchestration.ts handles already-buffered tool blocks in ordered batches; StreamingToolExecutor handles the earlier phase where tool_use blocks are still arriving over the wire and must be launched optimistically without breaking ordering guarantees.

typescript
// services/tools/ — Two key files:

// 1. toolOrchestration.ts (189 lines) — runTools() generator
//    Batch partitioning and serial/concurrent execution
//    Read-only batch → up to 10 parallel
//    Write batch → serial with context modifiers

// 2. StreamingToolExecutor.ts (226 lines)
//    Concurrent execution while model streams
//    addTool() → enqueue as tool_use blocks arrive
//    processQueue() → respect concurrency constraints
//    getCompletedResults() → yield finished results
//    discard() → cleanup on streaming fallback
//    siblingAbortController → kill sibling subprocesses on bash error
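
The batch-partitioning rule in toolOrchestration.ts can be sketched as follows. The readOnly flag and the cap of 10 come from the text; everything else is illustrative:

```typescript
// Sketch of batch partitioning: consecutive read-only tools share a
// batch (capped at 10, run in parallel); any write tool gets its own
// serial batch, preserving order.
interface ToolUse { id: string; readOnly: boolean }

const MAX_PARALLEL = 10;

function partitionIntoBatches(tools: ToolUse[]): ToolUse[][] {
  const batches: ToolUse[][] = [];
  for (const tool of tools) {
    const last = batches[batches.length - 1];
    const canJoin =
      last !== undefined &&
      tool.readOnly &&
      last.every((t) => t.readOnly) && // only read-only batches grow
      last.length < MAX_PARALLEL;
    if (canJoin) last.push(tool);
    else batches.push([tool]); // writes (and overflow) start a new batch
  }
  return batches;
}
```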

Memory extraction runs after every single query

After EVERY query, Claude runs extractMemories() in the background as a forked agent. After 24 hours + 5 sessions, autoDream() fires — a deeper memory consolidation pass. These agents are invisible to the user but silently make Claude smarter about your codebase over time.