Claude Code
v2.1.88

Services Module

110 files · ~80K lines

The Service Layer — 8 background services that power everything. API client, MCP runtime (470KB!), context compaction, LSP diagnostics, memory extraction, analytics, tool orchestration, and token management.

Service Catalogue — 8 Services

Each service is an independent subsystem with a clear input/output contract.

services/api/
30+ files · 3500+ lines
  • Builds Anthropic API requests with full system prompt, tools, and message history
  • Handles streaming token events: text_delta, tool_use start/delta/stop
  • Token budget management: enforces context limits before sending
Key fact: claude.ts is the largest single file in the entire codebase
utils/messages → api → LLM responses
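The streaming behaviour described in this card can be sketched as a fold over events. This is a minimal illustration, not the code in claude.ts: the `StreamEvent` union, its field names, and the `accumulate` function are hypothetical simplifications of the event kinds named above, and real API payloads carry more fields.

```typescript
// Hypothetical, simplified event shapes for the event kinds named in the card.
type StreamEvent =
  | { type: "text_delta"; text: string }
  | { type: "tool_use_start"; id: string; name: string }
  | { type: "tool_use_delta"; id: string; partialJson: string }
  | { type: "tool_use_stop"; id: string };

interface ToolUse {
  id: string;
  name: string;
  input: unknown;
}

interface Accumulated {
  text: string;
  toolUses: ToolUse[];
}

// Fold a sequence of events into the final text plus the completed tool calls.
function accumulate(events: StreamEvent[]): Accumulated {
  let text = "";
  const pending = new Map<string, { name: string; json: string }>();
  const toolUses: ToolUse[] = [];

  for (const ev of events) {
    switch (ev.type) {
      case "text_delta":
        text += ev.text;
        break;
      case "tool_use_start":
        pending.set(ev.id, { name: ev.name, json: "" });
        break;
      case "tool_use_delta": {
        const entry = pending.get(ev.id);
        if (entry) entry.json += ev.partialJson;
        break;
      }
      case "tool_use_stop": {
        const entry = pending.get(ev.id);
        if (!entry) break;
        pending.delete(ev.id);
        // The input JSON is only parseable once the block is complete.
        const input: unknown = entry.json === "" ? {} : JSON.parse(entry.json);
        toolUses.push({ id: ev.id, name: entry.name, input });
        break;
      }
    }
  }
  return { text, toolUses };
}

const streamed = accumulate([
  { type: "text_delta", text: "Reading " },
  { type: "text_delta", text: "file." },
  { type: "tool_use_start", id: "t1", name: "Read" },
  { type: "tool_use_delta", id: "t1", partialJson: '{"path":' },
  { type: "tool_use_delta", id: "t1", partialJson: '"a.ts"}' },
  { type: "tool_use_stop", id: "t1" },
]);
```

The point of the sketch is that a tool call only becomes usable at its stop event, which is the moment a streaming executor can pick it up.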
services/mcp/
20+ files · 470KB
  • MCP protocol client supporting 4 transport types: stdio, SSE, HTTP, WebSocket
  • Fetches tool definitions from connected servers at startup
  • Patches MCP tools directly into Claude's tool namespace at runtime
Key fact: at 470KB, this is the single largest service, bigger than most npm packages.
config (server list) → mcp → MCP tools
services/compact/
13 files · ~15KB
  • 4-strategy escalating compression pipeline, triggered when the context approaches its limit
  • Strategies escalate from simple truncation to full AI summarization of history
  • Tracks compression ratios and chooses the least aggressive strategy that fits
Key fact: Strategy 4 (Full Context Replacement) is the nuclear option: it replaces everything except the last N turns with a single summary
services/api (for summarization) → compact → compressed messages
services/lsp/
6 files · ~8KB
  • Language Server Protocol integration for IDE-quality diagnostics
  • Runs linters and type checkers as background processes connected via LSP
  • Surfaces errors inline in tool results so Claude can fix them immediately
Key fact: Claude can see TypeScript errors before running the code, through the same data flow as your IDE
project root path → lsp → LSP diagnostics
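To illustrate how diagnostics might be surfaced inline in a tool result, here is a small sketch. The diagnostic fields follow the LSP specification (zero-based positions, severity 1 to 4); the `formatDiagnostics` helper, its output format, and the choice to keep only errors and warnings are assumptions made for illustration.

```typescript
// Field names follow the LSP Diagnostic structure; positions are zero-based.
interface LspPosition {
  line: number;
  character: number;
}
interface LspRange {
  start: LspPosition;
  end: LspPosition;
}
interface LspDiagnostic {
  range: LspRange;
  severity?: 1 | 2 | 3 | 4; // 1 Error, 2 Warning, 3 Information, 4 Hint
  code?: string | number;
  message: string;
}

const SEVERITY_LABELS = { 1: "error", 2: "warning", 3: "info", 4: "hint" } as const;

// Hypothetical helper: render diagnostics as text to append to a tool result.
// Assumption: a missing severity is treated as an error, and hints/info are dropped.
function formatDiagnostics(file: string, diagnostics: LspDiagnostic[]): string {
  return diagnostics
    .filter((d) => (d.severity ?? 1) <= 2)
    .map((d) => {
      const label = SEVERITY_LABELS[d.severity ?? 1];
      const { line, character } = d.range.start;
      const code = d.code !== undefined ? ` [${d.code}]` : "";
      // Convert to one-based line and column, as editors display them.
      return `${file}:${line + 1}:${character + 1} ${label}${code}: ${d.message}`;
    })
    .join("\n");
}

const diagnosticsText = formatDiagnostics("src/app.ts", [
  {
    range: { start: { line: 2, character: 6 }, end: { line: 2, character: 11 } },
    severity: 1,
    code: 2322,
    message: "Type 'string' is not assignable to type 'number'.",
  },
  {
    range: { start: { line: 0, character: 0 }, end: { line: 0, character: 6 } },
    severity: 4,
    message: "Unused import.",
  },
]);
```

Here `diagnosticsText` holds a single line for the type error; the hint is filtered out so the model only sees problems it is likely to need to fix.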
services/extractMemories/
~5 files · ~15KB
  • Background agent that runs after every conversation turn
  • Sends recent messages to Claude: 'what facts are worth remembering?'
  • Stores extracted memories as YAML in ~/.claude/memories/ for future sessions
Key fact: Uses Claude to build Claude's long-term memory: recursive self-improvement of context
services/api (Claude call) → extractMemories → ~/.claude/memories/*.yaml
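A rough sketch of the two ends of this flow, prompt construction and YAML serialization, is shown here. Everything in it is an assumption: the source only quotes the question put to the model, so the prompt layout, the `MemoryEntry` fields, and the YAML schema are hypothetical. The model call and the file write are left out.

```typescript
// Hypothetical memory record; the real on-disk schema is not described in the source.
interface MemoryEntry {
  fact: string;
  createdAt: string;
}

// Hypothetical prompt layout built around the question quoted in the card.
function buildExtractionPrompt(recent: { role: string; content: string }[]): string {
  const transcript = recent.map((m) => `${m.role}: ${m.content}`).join("\n");
  return `What facts are worth remembering?\n\n${transcript}`;
}

// JSON strings are valid YAML double-quoted scalars, so JSON.stringify handles escaping.
function toYaml(memories: MemoryEntry[]): string {
  return memories
    .map(
      (m) =>
        `- fact: ${JSON.stringify(m.fact)}\n  createdAt: ${JSON.stringify(m.createdAt)}\n`,
    )
    .join("");
}

const extractionPrompt = buildExtractionPrompt([{ role: "user", content: "Use tabs." }]);
const memoryYaml = toYaml([
  { fact: 'Project uses "pnpm", not npm', createdAt: "2025-01-01T00:00:00Z" },
]);
```

Serializing through `JSON.stringify` avoids a YAML dependency for flat records; a nested schema would justify a proper YAML library.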
services/analytics/
~8 files · ~10KB
  • Async event pipeline: usage events are queued and never block the main loop
  • Drains to Datadog metrics and a first-party analytics endpoint
  • Includes a PII safety type, SanitizedEventProperties, that marks data as clean
Key fact: the SanitizedEventProperties type is the PII safety check: data is treated as safe when it carries that type
nothing (fire and forget) → analytics → metrics/events
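One way to get a type that "marks clean data" in TypeScript is a branded type. The sketch below assumes that technique; only the name SanitizedEventProperties comes from the source, while the brand symbol, the allowlist in `sanitize`, and the `EventQueue` class are hypothetical.

```typescript
// Branded type: a plain object literal cannot be passed where the brand is required.
declare const sanitizedBrand: unique symbol;
type SanitizedEventProperties = Record<string, string | number | boolean> & {
  readonly [sanitizedBrand]: true;
};

// Hypothetical allowlist; sanitize() is the only place that produces the brand.
const ALLOWED_KEYS = new Set(["tool", "durationMs", "success"]);

function sanitize(raw: Record<string, unknown>): SanitizedEventProperties {
  const clean: Record<string, string | number | boolean> = {};
  for (const key of Object.keys(raw)) {
    if (!ALLOWED_KEYS.has(key)) continue;
    const value = raw[key];
    if (typeof value === "string" || typeof value === "number" || typeof value === "boolean") {
      clean[key] = value;
    }
  }
  return clean as SanitizedEventProperties;
}

interface AnalyticsEvent {
  name: string;
  properties: SanitizedEventProperties;
}
type Sink = (batch: AnalyticsEvent[]) => void;

class EventQueue {
  private pending: AnalyticsEvent[] = [];
  private readonly sink: Sink;

  constructor(sink: Sink) {
    this.sink = sink;
  }

  // Enqueue only: no I/O happens on the caller's path.
  track(name: string, properties: SanitizedEventProperties): void {
    this.pending.push({ name, properties });
  }

  // In a real system this would run from a timer or idle hook.
  flush(): number {
    const batch = this.pending;
    this.pending = [];
    if (batch.length > 0) this.sink(batch);
    return batch.length;
  }
}

const sent: AnalyticsEvent[] = [];
const queue = new EventQueue((batch) => {
  sent.push(...batch);
});
queue.track(
  "tool_used",
  sanitize({ tool: "Read", durationMs: 12, filePath: "/home/alice/notes.txt" }),
);
const deliveredBeforeFlush = sent.length;
const deliveredByFlush = queue.flush();
```

The brand is a compile-time marker only. It records that a value passed through `sanitize`; the actual safety still depends on what the sanitizing function allows.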
services/tools/
~10 files · ~20KB
  • StreamingToolExecutor: queues tool_use blocks as they arrive from the stream
  • Concurrency guard: manages parallel vs sequential tool execution
  • Routes each tool call through validate → checkPermissions → invoke → renderResult
Key fact: StreamingToolExecutor starts executing tool calls before the full response is received, giving parallelism at stream time
services/mcp (for MCP tools) → tools → tool results
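The validate → checkPermissions → invoke → renderResult route can be sketched as a single function over a tool interface. The `Tool` interface, the `ToolOutcome` result type, and the `wordCount` example tool are hypothetical; the sketch is synchronous for brevity, whereas real tools would almost certainly be asynchronous.

```typescript
// Hypothetical tool shape mirroring the four stages named in the card.
interface Tool<I, O> {
  name: string;
  validate(input: unknown): I; // throws on malformed input
  checkPermissions(input: I): boolean; // false means the call is denied
  invoke(input: I): O;
  renderResult(output: O): string;
}

type ToolOutcome =
  | { ok: true; content: string }
  | { ok: false; stage: "validate" | "checkPermissions" | "invoke"; error: string };

// Run the stages in order and report which stage stopped the call, if any.
function runTool<I, O>(tool: Tool<I, O>, rawInput: unknown): ToolOutcome {
  let input: I;
  try {
    input = tool.validate(rawInput);
  } catch (err) {
    return { ok: false, stage: "validate", error: String(err) };
  }
  if (!tool.checkPermissions(input)) {
    return { ok: false, stage: "checkPermissions", error: `${tool.name}: permission denied` };
  }
  try {
    return { ok: true, content: tool.renderResult(tool.invoke(input)) };
  } catch (err) {
    return { ok: false, stage: "invoke", error: String(err) };
  }
}

// Illustrative tool, not one of the real built-ins.
const wordCount: Tool<{ text: string }, number> = {
  name: "WordCount",
  validate(input) {
    if (typeof input !== "object" || input === null) throw new Error("input must be an object");
    const text = (input as { text?: unknown }).text;
    if (typeof text !== "string") throw new Error("text must be a string");
    return { text };
  },
  checkPermissions: (input) => input.text.length <= 1000,
  invoke: (input) => input.text.split(/\s+/).filter(Boolean).length,
  renderResult: (count) => `${count} words`,
};

const accepted = runTool(wordCount, { text: "count these words" });
const rejected = runTool(wordCount, { text: 42 });
```

Returning the failing stage, instead of throwing, lets the caller turn a denied or malformed call into a tool result the model can read and react to.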
services/tokens/
~5 files · ~8KB
  • Tracks token consumption across the session for reporting and budget enforcement
  • Projects remaining token budget for the current context window
  • Feeds into services/compact to decide when compaction should trigger
Key fact: Tokens are counted before sending, not after: the budget projection runs before every API call
utils/messages (for counting) → tokens → token counts + budgets
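A pre-send budget projection might look like the following sketch. The four-characters-per-token estimate, the 80% compaction threshold, and the names `estimateTokens` and `projectBudget` are all assumptions; a real implementation would use an actual tokenizer and its own thresholds.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Assumption: ~4 characters per token is a rough heuristic, not a real tokenizer.
function estimateTokens(messages: ChatMessage[]): number {
  const chars = messages.reduce((sum, m) => sum + m.content.length, 0);
  return Math.ceil(chars / 4);
}

interface BudgetProjection {
  used: number;
  remaining: number;
  shouldCompact: boolean;
}

// Hypothetical projection: reserve room for the reply, compact past a threshold.
function projectBudget(
  messages: ChatMessage[],
  contextWindow: number,
  reservedForOutput: number,
  compactThreshold = 0.8,
): BudgetProjection {
  const used = estimateTokens(messages);
  const available = contextWindow - reservedForOutput;
  return {
    used,
    remaining: Math.max(0, available - used),
    shouldCompact: used >= available * compactThreshold,
  };
}

const turns: ChatMessage[] = [
  { role: "user", content: "x".repeat(240) },
  { role: "assistant", content: "y".repeat(160) },
];
// 400 characters, so about 100 tokens against 150 available: no compaction yet.
const projectionBefore = projectBudget(turns, 200, 50);

turns.push({ role: "user", content: "z".repeat(100) });
// 500 characters, so about 125 tokens: past the 120-token threshold.
const projectionAfter = projectBudget(turns, 200, 50);
```

Running the projection before the request is what allows compaction to happen ahead of a call that would otherwise be rejected for exceeding the context window.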

API Call Lifecycle — 7 Steps

A single API call touches multiple services in sequence.

1. QueryEngine calls query()
The main query loop requests a new API call. It passes the full message history and system prompt.
2. services/api/claude.ts: buildRequest()
Constructs the Anthropic API request. Injects tools, handles model selection, applies token budget limits.
3. Anthropic API: streams tokens
Response arrives as a stream of events: text_delta, tool_use start/delta/stop. Each event is processed immediately.
4. StreamingToolExecutor: queues tool_use
As tool_use blocks complete, they are enqueued. The executor decides execution order and concurrency.
5. services/tools/: runs tools
Each queued tool runs through the Tools module: validate → checkPermissions → invoke → renderResult.
6. services/analytics/: async drain
Usage events are queued and drained asynchronously. Never blocks the main loop.
7. services/extractMemories/: post-turn
After each turn completes, a background agent mines the conversation for facts worth persisting across sessions.
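Step 4 says the executor decides execution order and concurrency. One plausible policy, sketched below, groups consecutive calls that are safe to run together into a parallel batch and runs everything else alone, in arrival order. The `concurrencySafe` flag, the `planBatches` function, and the policy itself are assumptions; the source does not describe the actual rule.

```typescript
interface QueuedToolUse {
  id: string;
  tool: string;
  concurrencySafe: boolean; // assumption: e.g. read-only tools may run in parallel
}

// Group consecutive concurrency-safe calls into one parallel batch;
// every other call gets a batch of its own, preserving arrival order.
function planBatches(queued: QueuedToolUse[]): QueuedToolUse[][] {
  const batches: QueuedToolUse[][] = [];
  let parallel: QueuedToolUse[] = [];
  for (const call of queued) {
    if (call.concurrencySafe) {
      parallel.push(call);
      continue;
    }
    if (parallel.length > 0) {
      batches.push(parallel);
      parallel = [];
    }
    batches.push([call]);
  }
  if (parallel.length > 0) batches.push(parallel);
  return batches;
}

const batches = planBatches([
  { id: "a", tool: "Read", concurrencySafe: true },
  { id: "b", tool: "Read", concurrencySafe: true },
  { id: "c", tool: "Edit", concurrencySafe: false },
  { id: "d", tool: "Read", concurrencySafe: true },
]);
const batchLayout = batches.map((b) => b.map((c) => c.id).join("+")).join(" | ");
```

With this input the two reads run together, the edit runs alone, and the final read waits for the edit, so a read never observes a half-applied write.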

Compaction — 4 Escalating Strategies

When context gets large, compaction triggers. Each strategy is more aggressive than the last. Strategy 4 is the nuclear option.

1. Window Trim (aggressiveness: 20%)

Drop the oldest messages until the history fits the context budget. Simple FIFO drop, no summarization yet.

2. Tool Result Truncation (aggressiveness: 40%)

Truncate large tool outputs (file contents, bash output) to their first N tokens. The tool call itself is preserved.

3. Conversation Summarization (aggressiveness: 65%)

Send older conversation turns to Claude for summarization. The compressed summary replaces the raw messages.

4. Full Context Replacement (aggressiveness: 90%)

Replace everything except the last N turns with a single summary. Nuclear option, used only when critically close to the limit.

Strategy 4 = Context Collapse (Nuclear Option): replaces everything except the last N turns with a single AI-generated summary. Information is lost, but the session continues.
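The escalation can be sketched as a list of strategies tried in order until one produces a history that fits. This is an illustration under several assumptions: tokens are estimated at four characters each, each strategy is applied to the original history independently, strategies 3 and 4 differ only in how many recent messages they keep, and `summarize` is an injected stub standing in for a model call. All names are hypothetical.

```typescript
type Role = "user" | "assistant" | "tool";
interface Msg {
  role: Role;
  content: string;
}
interface Strategy {
  name: string;
  apply(history: Msg[], budget: number): Msg[];
}

// Assumption: ~4 characters per token stands in for a real tokenizer.
function countTokens(history: Msg[]): number {
  return history.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);
}

// Strategy 1: FIFO drop of the oldest messages, never going below keepLast.
const windowTrim = (keepLast: number): Strategy => ({
  name: "window-trim",
  apply(history, budget) {
    const kept = [...history];
    while (kept.length > keepLast && countTokens(kept) > budget) kept.shift();
    return kept;
  },
});

// Strategy 2: shorten large tool outputs but keep the tool message itself.
const truncateToolResults = (maxChars: number): Strategy => ({
  name: "tool-result-truncation",
  apply: (history) =>
    history.map((m) =>
      m.role === "tool" && m.content.length > maxChars
        ? { ...m, content: m.content.slice(0, maxChars) + " [truncated]" }
        : m,
    ),
});

// Strategies 3 and 4: replace all but the last keepLast messages with a summary.
const summarizeOlder = (
  name: string,
  keepLast: number,
  summarize: (older: Msg[]) => string,
): Strategy => ({
  name,
  apply(history) {
    if (history.length <= keepLast) return history;
    const older = history.slice(0, history.length - keepLast);
    const recent = history.slice(history.length - keepLast);
    return [{ role: "assistant", content: summarize(older) }, ...recent];
  },
});

// Try strategies from least to most aggressive; stop at the first that fits.
function compact(history: Msg[], budget: number, strategies: Strategy[]) {
  if (countTokens(history) <= budget) return { strategy: "none", history };
  for (const s of strategies) {
    const candidate = s.apply(history, budget);
    if (countTokens(candidate) <= budget) return { strategy: s.name, history: candidate };
  }
  return { strategy: "exhausted", history };
}

const stubSummary = (older: Msg[]): string => `Summary of ${older.length} messages`;
const pipeline: Strategy[] = [
  windowTrim(3),
  truncateToolResults(80),
  summarizeOlder("conversation-summarization", 2, stubSummary),
  summarizeOlder("full-context-replacement", 1, stubSummary),
];
const demoHistory: Msg[] = [
  { role: "user", content: "u".repeat(40) },
  { role: "tool", content: "t".repeat(400) },
  { role: "assistant", content: "a".repeat(40) },
  { role: "user", content: "q".repeat(40) },
];
const mild = compact(demoHistory, 60, pipeline);
const severe = compact(demoHistory, 25, pipeline);
```

With a budget of 60 estimated tokens the pipeline stops at tool result truncation; with a budget of 25 only full context replacement fits, which is the case where the most information is discarded.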

MCP — 4 Transport Types

470KB of MCP client code — all behind one unified interface. Different transports for different deployment contexts.

⚙️ stdio
Local subprocesses (npx servers)
~1ms (local pipes, no network)

Spawns a local subprocess. Communication over stdin/stdout JSON-RPC. Most common for local tools.

📡 SSE
Remote HTTP servers
~50-200ms (network)

HTTP Server-Sent Events. The MCP server runs as an HTTP endpoint. Claude receives a stream of events.

🌐 HTTP
Stateless REST APIs
~50-200ms (network)

Plain HTTP REST. Each tool call is a POST request. Stateless — no persistent connection required.

🔄 WebSocket
Bidirectional streaming
~10-50ms (persistent)

Full duplex WebSocket. The server can push updates to Claude mid-execution.
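A unified interface over the four transports could start from a discriminated union of server configurations plus transport-independent JSON-RPC messages, as in this sketch. The `ServerConfig` shape, `describeEndpoint`, and the request builders are hypothetical, as are the example server command and tool name; the `tools/list` and `tools/call` method names and the JSON-RPC 2.0 envelope come from the MCP specification.

```typescript
// Hypothetical config union: one variant per transport family named above.
type ServerConfig =
  | { transport: "stdio"; command: string; args: string[] }
  | { transport: "sse" | "http" | "websocket"; url: string };

interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

// Describe how a client would reach the server, exhaustively over the union.
function describeEndpoint(config: ServerConfig): string {
  switch (config.transport) {
    case "stdio":
      return `spawn: ${[config.command, ...config.args].join(" ")}`;
    case "sse":
    case "http":
    case "websocket":
      return `${config.transport}: ${config.url}`;
  }
}

// The same messages are sent whichever transport carries them.
let nextId = 1;

function listToolsRequest(): JsonRpcRequest {
  return { jsonrpc: "2.0", id: nextId++, method: "tools/list" };
}

function callToolRequest(toolName: string, args: Record<string, unknown>): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Example values are illustrative; "some-mcp-server" is not a real package.
const endpoint = describeEndpoint({
  transport: "stdio",
  command: "npx",
  args: ["-y", "some-mcp-server"],
});
const listReq = listToolsRequest();
const callReq = callToolRequest("read_file", { path: "README.md" });
```

Keeping message construction separate from delivery is what lets the rest of the client ignore which transport a given server uses.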

Patch-at-runtime pattern: MCP tool definitions are fetched from connected servers at startup and patched directly into the Claude tool namespace. Claude sees MCP tools as if they were native built-in tools.
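The patching step can be pictured as merging fetched definitions into one registry that already holds the built-in tools. In this sketch the `patchTools` function, the `RegisteredTool` shape, and the `mcp__<server>__<tool>` naming scheme are assumptions used to avoid name collisions; the `name`, `description`, and `inputSchema` fields match the MCP tool definition.

```typescript
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: object;
}

interface RegisteredTool extends ToolDefinition {
  source: "builtin" | "mcp";
  server?: string;
}

// Assumption: MCP tools are namespaced as mcp__<server>__<tool> to avoid collisions.
function patchTools(
  registry: Map<string, RegisteredTool>,
  server: string,
  definitions: ToolDefinition[],
): string[] {
  const added: string[] = [];
  for (const def of definitions) {
    const qualified = `mcp__${server}__${def.name}`;
    if (registry.has(qualified)) continue; // first registration wins
    registry.set(qualified, { ...def, name: qualified, source: "mcp", server });
    added.push(qualified);
  }
  return added;
}

const registry = new Map<string, RegisteredTool>([
  [
    "Read",
    { name: "Read", description: "Read a file", inputSchema: { type: "object" }, source: "builtin" },
  ],
]);

// Example server and tool names are illustrative.
const fetched: ToolDefinition[] = [
  { name: "create_issue", description: "Create an issue", inputSchema: { type: "object" } },
];
const firstPatch = patchTools(registry, "github", fetched);
const secondPatch = patchTools(registry, "github", fetched);
```

After patching, built-in and MCP tools sit in the same map, so request building can list them uniformly, which is the sense in which the model sees them as native tools.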