Claude Code: Query Loop

query.ts

The core agentic execution cycle — how messages flow from user input through the API to tool execution and back. The main loop lives in query.ts (~1700 lines).

Loop State Machine

type LoopState = {
  messages: Message[]
  toolUseContext: ToolUseContext
  autoCompactTracking?: AutoCompactTrackingState
  maxOutputTokensRecoveryCount: number   // max 3 retries
  hasAttemptedReactiveCompact: boolean
  maxOutputTokensOverride?: number       // escalate 8K → 64K
  pendingToolUseSummary?: Promise<ToolUseSummaryMessage | null>
  stopHookActive?: boolean
  turnCount: number
  transition?: Continue                  // why previous iteration continued
}
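As a sketch of how this state might be initialized at the start of a query (the defaults, the stub types, and the `initialLoopState` name are assumptions, not taken from query.ts):

```typescript
// Hypothetical sketch: initializing the loop state before iteration 1.
// Message and ToolUseContext are stubbed here; the real definitions
// live elsewhere in the codebase.
type Message = { role: string; content: string };
type ToolUseContext = { abortSignal?: unknown };

type LoopState = {
  messages: Message[];
  toolUseContext: ToolUseContext;
  maxOutputTokensRecoveryCount: number;  // max 3 retries
  hasAttemptedReactiveCompact: boolean;
  maxOutputTokensOverride?: number;      // escalate 8K -> 64K
  turnCount: number;
};

function initialLoopState(
  messages: Message[],
  toolUseContext: ToolUseContext,
): LoopState {
  return {
    messages,
    toolUseContext,
    maxOutputTokensRecoveryCount: 0,     // no recoveries attempted yet
    hasAttemptedReactiveCompact: false,
    turnCount: 0,
  };
}
```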

Loop Iteration Flow

1. Context Projection: Extract messages after the compact boundary. Apply the per-message tool result budget (content replacement), history snipping, microcompact (single-turn compression), and context collapse.

2. Auto-Compaction Pre-Check: If context exceeds the threshold (model context - max output - 13K buffer), trigger an async autocompact, replace messages with the post-compact version, and track compaction info for analytics.

3. API Call with Streaming: Call queryModelWithStreaming() with messages, system prompt, tools, and thinking config. Stream back text blocks, tool_use blocks, and thinking blocks. The StreamingToolExecutor starts executing tools as they arrive, reducing latency by parallelizing tool execution with continued model streaming.

4. Error Recovery: Multiple recovery strategies: (1) Collapse Drain: drain staged context collapses. (2) Reactive Compact: full summary if collapse is insufficient. (3) Max Output Escalation: 8K → 64K, one-shot. (4) Multi-turn: inject a 'continue' message, max 3 attempts.

5. Tool Execution: Partition tool calls by concurrency safety. Read-only tools run in parallel (up to 10); write tools run serially, with context modifiers applied between batches. Results are yielded as messages.

6. Attachment Processing: Memory prefetch results, skill discovery, and queued command attachments (task notifications) are all appended to messages before the next API call.

7. Continuation Decision: If there is no tool use, check for natural completion. Run stop hooks for conditional continuation. Check the token budget. Return a terminal state or continue the loop.
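The auto-compaction pre-check in step 2 reduces to a simple arithmetic test. A minimal sketch, assuming only the formula stated above (the function and constant names are invented for illustration):

```typescript
// Sketch of the auto-compaction pre-check from step 2.
// Threshold = model context window - max output tokens - 13K safety buffer.
const COMPACT_SAFETY_BUFFER = 13_000; // buffer size from the step above; name is hypothetical

function shouldAutoCompact(
  usedTokens: number,
  contextWindow: number,
  maxOutputTokens: number,
): boolean {
  const threshold = contextWindow - maxOutputTokens - COMPACT_SAFETY_BUFFER;
  return usedTokens > threshold;
}

// Example: a 200K-context model reserving 8K for output compacts
// once usage crosses 200_000 - 8_000 - 13_000 = 179_000 tokens.
```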

Streaming Tool Execution

The StreamingToolExecutor is a key innovation — tools start executing while the model is still generating tokens. This significantly reduces end-to-end latency.

// StreamingToolExecutor.ts (226 lines)

class StreamingToolExecutor {
  // Queue management
  addTool(block, assistantMessage)    // Enqueue when tool_use block arrives
  processQueue()                      // Start tools respecting concurrency
  getCompletedResults()               // Yield finished results immediately

  // Concurrency enforcement
  // Non-concurrent tools: wait for exclusive access
  // Concurrent-safe tools: run in parallel with other safe tools

  // Fallback handling
  discard()                           // Discard pending on streaming fallback
  // Generates synthetic error results for in-flight tools
}
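A simplified, hypothetical re-implementation of the queueing discipline makes the concurrency rule concrete. The real class streams results incrementally while the model is still generating; this sketch shows only the safe/exclusive split, and every name below apart from addTool is invented:

```typescript
// Simplified sketch of StreamingToolExecutor's concurrency rule:
// concurrent-safe tools run in parallel, others get exclusive access.
type ToolCall = {
  name: string;
  isConcurrencySafe: boolean;
  run: () => Promise<string>;
};

class MiniStreamingExecutor {
  private queue: ToolCall[] = [];
  readonly results: string[] = [];

  addTool(call: ToolCall): void {
    this.queue.push(call); // enqueue as tool_use blocks arrive from the stream
  }

  // Drain the queue: batch adjacent safe tools, run unsafe tools alone.
  async drain(): Promise<string[]> {
    while (this.queue.length > 0) {
      if (this.queue[0].isConcurrencySafe) {
        // Take the run of safe tools at the head and execute them in parallel.
        const batch: ToolCall[] = [];
        while (this.queue.length > 0 && this.queue[0].isConcurrencySafe) {
          batch.push(this.queue.shift()!);
        }
        this.results.push(...(await Promise.all(batch.map((t) => t.run()))));
      } else {
        // Exclusive tool: finish it alone before touching the queue again.
        this.results.push(await this.queue.shift()!.run());
      }
    }
    return this.results;
  }
}
```

Promise.all preserves input order, so results from a parallel batch still come back in the order the tool_use blocks arrived.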

Error Recovery Cascade

When things go wrong, the loop tries four recovery strategies in order, each more aggressive than the last:

1. Collapse Drain: drain staged context collapses.
2. Reactive Compact: full conversation summary.
3. Token Escalation: 8K → 64K, one-shot.
4. Multi-turn: inject 'continue', max 3 attempts.
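Assuming each strategy reports whether it recovered the error, the ordering might be sketched like this (the strategy names come from the list above; the types, the cap constant, and the runner are hypothetical):

```typescript
// Hypothetical sketch of the ordered recovery cascade.
// The first strategy that succeeds wins; only multi-turn retries,
// capped at 3 attempts, per step 4 above.
type RecoveryStrategy = {
  name: string;
  attempt: () => boolean; // true if the error was recovered
};

const MAX_MULTI_TURN_ATTEMPTS = 3;

function runRecoveryCascade(strategies: RecoveryStrategy[]): string | null {
  for (const strategy of strategies) {
    const attempts = strategy.name === 'multi-turn' ? MAX_MULTI_TURN_ATTEMPTS : 1;
    for (let i = 0; i < attempts; i++) {
      if (strategy.attempt()) return strategy.name; // recovered; stop escalating
    }
  }
  return null; // every strategy failed: the loop exits with a terminal reason
}
```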

Loop Exit Conditions

completed               Natural end of response
prompt_too_long         Unrecoverable context overflow
max_output_tokens       Output limit exhausted after recovery
aborted_streaming       User interrupt during model call
aborted_tools           User interrupt during tool execution
stop_hook_prevented     Hook rejected continuation
blocking_limit          Hard context limit hit
token_budget_completed  Token budget exhausted
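The exit reasons form a natural discriminated union. A sketch, assuming these string literals are the actual reason values (the type name and helper are invented):

```typescript
// Sketch: the eight exit reasons above as a union type, plus a
// hypothetical helper separating user interrupts from other exits.
type LoopExitReason =
  | 'completed'
  | 'prompt_too_long'
  | 'max_output_tokens'
  | 'aborted_streaming'
  | 'aborted_tools'
  | 'stop_hook_prevented'
  | 'blocking_limit'
  | 'token_budget_completed';

function isUserInterrupt(reason: LoopExitReason): boolean {
  // Only the two aborted_* states are user-initiated.
  return reason === 'aborted_streaming' || reason === 'aborted_tools';
}
```

Modeling the reason as a union rather than a free-form string lets the compiler check that every caller handles all eight terminal states.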

Message Flow Example

User: "write a hello.py file"
    ↓
QueryEngine.submitMessage(prompt)
    ↓
fetchSystemPromptParts() → [default prompt + 50 tools]
    ↓
processUserInput() → [user message + attachments]
    ↓
yield buildSystemInitMessage()
    ↓
query() loop iteration 1:
  ─ prepend user context (cwd, platform, git status)
  ─ call queryModelWithStreaming()
  ─ stream: "I'll create a Python file..."
  ─ stream: tool_use { name: "Write", input: { file_path, content } }
      ├─ addTool() to StreamingToolExecutor
      └─ model continues streaming...
  ─ tool completes → tool_result message
  ─ yield tool_result
    ↓
  ─ getAttachmentMessages() → file change notification
  ─ yield attachment message
    ↓
  ─ needsFollowUp = false (no more tool calls)
  ─ stop hooks pass
  ─ return { reason: 'completed' }
    ↓
Session ends, messages persisted to transcript.jsonl