claude-code-ultimate-guide/guide/architecture.md
Florian BRUNIAUX 18ea240e12 docs: add MCP Apps (SEP-1865) documentation
Integrate comprehensive documentation for MCP Apps, the first official
MCP extension enabling interactive UI delivery.

Changes:
- guide/architecture.md (656): New section "MCP Extensions: Apps"
  - Technical architecture (primitives, SDK, security)
  - Platform support (Claude Desktop, VS Code, ChatGPT, Goose)
  - Example implementations (9 production tools at launch)
  - Developer workflow and SDK usage
  - ~150 lines of technical documentation

- guide/ultimate-guide.md (6509): New section "MCP Evolution: Apps"
  - User context and use cases
  - Available interactive tools (Asana, Slack, Figma, etc.)
  - Platform support matrix
  - Hybrid workflow examples
  - ~90 lines of user-facing documentation

- guide/ultimate-guide.md (7522): Table update
  - Added "Interactive UI" row to Plugin vs. MCP Server comparison
  - Clarified MCP Apps = "What Claude can show"

- machine-readable/reference.yaml: 8 new entries
  - mcp_apps_architecture, mcp_apps_evolution
  - Links to spec, SDK, blog posts
  - CLI relevance note (indirect)

- docs/resource-evaluations/mcp-apps-announcement.md: New evaluation
  - Score: 4/5 (High Value - Integrate within 1 week)
  - Fact-checked with Perplexity searches
  - Technical review by agent

Resource evaluated:
- https://blog.modelcontextprotocol.io/posts/2026-01-26-mcp-apps/
- https://claude.com/blog/interactive-tools-in-claude

Total documentation: ~240 lines across 3 files
Score: 4/5 (High Value)
CLI relevance: Indirect (ecosystem understanding, MCP server dev, hybrid workflows)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-27 08:14:49 +01:00

1183 lines
59 KiB
Markdown

# How Claude Code Works: Architecture & Internals
> A technical deep-dive into Claude Code's internal mechanisms, based on official Anthropic documentation and verified community analysis.
**Author**: Florian BRUNIAUX | Contributions from Claude (Anthropic)
**Reading time**: ~25 minutes (full) | ~5 minutes (TL;DR only)
**Last verified**: January 2026 (Claude Code 3.3.x)
---
## Source Transparency
This document combines three tiers of sources:
| Tier | Description | Confidence | Example |
|------|-------------|------------|---------|
| **Tier 1** | Official Anthropic documentation | 100% | anthropic.com/engineering/* |
| **Tier 2** | Verified reverse-engineering | 70-90% | PromptLayer analysis, code.claude.com behavior |
| **Tier 3** | Community inference | 40-70% | Observed but not officially confirmed |
Each claim is marked with its confidence level. **Always prefer official documentation** when available.
---
## TL;DR - 5 Bullet Summary
1. **Simple Loop**: Claude Code runs a `while(tool_call)` loop — no DAGs, no classifiers, no RAG. The model decides everything.
2. **Eight Core Tools**: Bash (universal adapter), Read, Edit, Write, Grep, Glob, Task (sub-agents), TodoWrite. That's the entire arsenal.
**Search Strategy Evolution**: Early Claude Code versions experimented with RAG using Voyage embeddings for semantic code search. Anthropic switched to grep-based (ripgrep) agentic search after internal benchmarks showed superior performance with lower operational complexity — no index sync required, no security liabilities from external embedding providers. This "Search, Don't Index" philosophy trades latency/tokens for simplicity/security. Community plugins (ast-grep for AST patterns) and MCP servers (Serena for symbols, grepai for RAG) available for specialized needs.
*Source*: [Latent Space podcast](https://www.latent.space/p/claude-code) (May 2025), ast-grep documentation
3. **200K Token Budget**: Context window shared between system prompt, history, tool results, and response buffer. Auto-compacts at ~75-92% capacity.
4. **Sub-agents = Isolation**: The `Task` tool spawns sub-agents with their own context. They cannot spawn more sub-agents (depth=1). Only their summary returns.
5. **Philosophy**: "Less scaffolding, more model" — trust Claude's reasoning instead of building complex orchestration systems around it.
---
## Visual Overview
Before diving into the technical details, this diagram by Mohamed Ali Ben Salem captures the essential architecture:
![Claude Code Architecture Overview](./images/claude-code-architecture-overview.jpeg)
*Source: [Mohamed Ali Ben Salem on LinkedIn](https://www.linkedin.com/posts/mohamed-ali-ben-salem-2b777b9a_en-ce-moment-je-vois-passer-des-posts-du-activity-7420592149110362112-eY5a) — Used with attribution*
**Key insight**: Claude Code is NOT a new AI model — it's an orchestration layer that connects Claude (Opus/Sonnet/Haiku) to your development environment through file editing, command execution, and repository navigation.
---
## Table of Contents
- [Visual Overview](#visual-overview)
1. [The Master Loop](#1-the-master-loop)
2. [The Tool Arsenal](#2-the-tool-arsenal)
3. [Context Management Internals](#3-context-management-internals)
4. [Sub-Agent Architecture](#4-sub-agent-architecture)
5. [Permission & Security Model](#5-permission--security-model)
6. [MCP Integration](#6-mcp-integration)
7. [The Edit Tool: How It Actually Works](#7-the-edit-tool-how-it-actually-works)
8. [Session Persistence](#8-session-persistence)
9. [Philosophy: Less Scaffolding, More Model](#9-philosophy-less-scaffolding-more-model)
10. [Claude Code vs Alternatives](#10-claude-code-vs-alternatives)
11. [Sources & References](#11-sources--references)
12. [Appendix: What We Don't Know](#12-appendix-what-we-dont-know)
---
## 1. The Master Loop
**Confidence**: 100% (Tier 1 - Official)
**Source**: [Anthropic Engineering Blog](https://www.anthropic.com/engineering/claude-code-best-practices)
At its core, Claude Code is remarkably simple:
```
┌─────────────────────────────────────────────────────────────┐
│ CLAUDE CODE MASTER LOOP │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ │
│ │ Your Prompt │ │
│ └──────┬───────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ CLAUDE REASONS │ │
│ │ (No classifier, no routing layer) │ │
│ │ │ │
│ └────────────────────────┬─────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────┐ │
│ │ Tool Call? │ │
│ └───────┬────────┘ │
│ │ │
│ YES │ NO │
│ ┌─────────────────┴─────────────────┐ │
│ │ │ │
│ ▼ ▼ │
│ ┌────────────┐ ┌────────────┐ │
│ │ Execute │ │ Text │ │
│ │ Tool │ │ Response │ │
│ │ │ │ (DONE) │ │
│ └─────┬──────┘ └────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ Feed Result │ │
│ │ to Claude │──────────────────┐ │
│ └─────────────┘ │ │
│ │ │
│ ▼ │
│ ┌────────────────┐ │
│ │ LOOP BACK │ │
│ │ (Next turn) │ │
│ └────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
```
### What This Means
The entire architecture is a simple `while` loop:
```
while (claude_response.has_tool_call):
result = execute_tool(tool_call)
claude_response = send_to_claude(result)
return claude_response.text
```
**There is no:**
- Intent classifier
- Task router
- RAG/embedding pipeline
- DAG orchestrator
- Planner/executor split
The model itself decides when to call tools, which tools to call, and when it's done. This is the "agentic loop" pattern described in Anthropic's engineering blog.
### Why This Design?
1. **Simplicity**: Fewer components = fewer failure modes
2. **Model-driven**: Claude's reasoning is better than hand-coded heuristics
3. **Flexibility**: No rigid pipeline constraining what Claude can do
4. **Debuggability**: Easy to understand what happened and why
---
## 2. The Tool Arsenal
**Confidence**: 100% (Tier 1 - Official)
**Source**: [code.claude.com/docs](https://code.claude.com/docs/en/setup)
Claude Code has exactly 8 core tools:
| Tool | Purpose | Key Behavior | Token Cost |
|------|---------|--------------|------------|
| `Bash` | Execute shell commands | Universal adapter, most powerful | Low (command) + Variable (output) |
| `Read` | Read file contents | Max 2000 lines, handles truncation | High for large files |
| `Edit` | Modify existing files | Diff-based, requires exact match | Medium |
| `Write` | Create/overwrite files | Must read first if file exists | Medium |
| `Grep` | Search file contents | Ripgrep-based (regex), replaced RAG/embedding approach. For structural code search (AST-based), see ast-grep plugin. Trade-off: Grep (fast, simple) vs ast-grep (precise, setup required) vs Serena MCP (semantic, symbol-aware) | Low |
| `Glob` | Find files by pattern | Path matching, sorted by mtime | Low |
| `Task` | Spawn sub-agents | Isolated context, depth=1 limit | High (new context) |
| `TodoWrite` | Track progress | Structured task management | Low |
### The Bash Universal Adapter
**Key insight**: Bash is Claude's swiss-army knife. It can:
- Run any CLI tool (git, npm, docker, curl...)
- Execute scripts
- Chain commands with pipes
- Access system state
The model has been trained on massive amounts of shell data, making it highly effective at using Bash as a universal adapter when specialized tools aren't enough.
### Tool Selection Logic
Claude decides which tool to use based on the task. There's no hardcoded routing:
```
┌─────────────────────────────────────────────────────┐
│ TOOL SELECTION (Model-Driven) │
├─────────────────────────────────────────────────────┤
│ │
│ "Read auth.ts" → Read tool │
│ "Find all test files" → Glob tool │
│ "Search for TODO" → Grep tool │
│ "Run npm test" → Bash tool │
│ "Explore the codebase" → Task tool (sub-agent) │
│ "Track my progress" → TodoWrite tool │
│ │
│ The model learns these patterns during training, │
│ not from explicit rules. │
│ │
└─────────────────────────────────────────────────────┘
```
### Extended Tool Ecosystem
Beyond the 8 core tools, Claude Code can leverage:
**MCP Servers** (Model Context Protocol):
- **Serena**: Symbol-aware code navigation + session memory
- **grepai**: Semantic search + call graph analysis (Ollama-based)
- **Context7**: Official library documentation lookup
- **Sequential**: Structured multi-step reasoning
- **Playwright**: Browser automation and E2E testing
**Community Plugins**:
- **ast-grep**: AST-based structural code search (explicit invocation)
### Search Tool Selection Matrix
Claude Code offers multiple ways to search code, each with specific strengths:
| Search Need | Native Tool | MCP/Plugin Alternative | When to Escalate |
|-------------|-------------|----------------------|------------------|
| Exact text | `Grep` (ripgrep) | - | Never (fastest) |
| Function name | `Grep` | Serena `find_symbol` | Multi-file refactoring |
| By meaning | - | grepai `search` | Don't know exact text |
| Call graph | - | grepai `trace_callers` | Dependency analysis |
| Structural pattern | - | ast-grep | Large migrations (>50k lines) |
| File structure | - | Serena `get_symbols_overview` | Need symbol context |
**Performance Comparison**:
| Tool | Speed | Setup | Use Case |
|------|-------|-------|----------|
| Grep (ripgrep) | ⚡ ~20ms | ✅ None | 90% of searches |
| Serena | ⚡ ~100ms | ⚠️ MCP | Refactoring, symbols |
| grepai | 🐢 ~500ms | ⚠️ Ollama + MCP | Semantic, call graph |
| ast-grep | 🕐 ~200ms | ⚠️ Plugin | AST patterns, migrations |
**Decision principle**: Start with Grep (fastest), escalate to specialized tools only when needed.
> **📖 Deep Dive**: See [Search Tools Mastery](../workflows/search-tools-mastery.md) for comprehensive workflows combining all search tools.
---
## 3. Context Management Internals
**Confidence**: 80% (Tier 2 - Partially Official)
**Sources**:
- [platform.claude.com/docs](https://platform.claude.com/docs/en/build-with-claude/context-windows) (Tier 1)
- Observed behavior (Tier 2)
Claude Code operates within a fixed context window (200K tokens for Claude 3.5 Sonnet, varies by model).
### Context Budget Breakdown
```
┌─────────────────────────────────────────────────────────────┐
│ CONTEXT BUDGET (~200K tokens) │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ System Prompt (~5-15K) │ │
│ │ • Tool definitions │ │
│ │ • Safety instructions │ │
│ │ • Behavioral guidelines │ │
│ │ • See detailed breakdown below ↓ │ │
│ ├──────────────────────────────────────────────────────┤ │
│ │ CLAUDE.md Files (~1-10K) │ │
│ │ • Global ~/.claude/CLAUDE.md │ │
│ │ • Project /CLAUDE.md │ │
│ │ • Local /.claude/CLAUDE.md │ │
│ ├──────────────────────────────────────────────────────┤ │
│ │ Conversation History (variable) │ │
│ │ • Your prompts │ │
│ │ • Claude's responses │ │
│ │ • Tool call records │ │
│ ├──────────────────────────────────────────────────────┤ │
│ │ Tool Results (variable) │ │
│ │ • File contents from Read │ │
│ │ • Command outputs from Bash │ │
│ │ • Search results from Grep │ │
│ ├──────────────────────────────────────────────────────┤ │
│ │ Reserved for Response (~40-45K) │ │
│ │ • Claude's thinking │ │
│ │ • Generated code/text │ │
│ └──────────────────────────────────────────────────────┘ │
│ │
│ USABLE = Total - System - Reserved ≈ 140-150K tokens │
│ │
└─────────────────────────────────────────────────────────────┘
```
### System Prompt Contents
**Confidence**: 100% (Tier 1 - Official Anthropic Documentation)
**Sources**:
- [Anthropic System Prompts Release Notes](https://platform.claude.com/docs/en/release-notes/system-prompts)
- [Anthropic Engineering: Claude Code Best Practices](https://www.anthropic.com/engineering/claude-code-best-practices)
Claude system prompts (~5-15K tokens) are **publicly published** by Anthropic as part of their transparency commitment. These prompts define:
**Core Components**:
- **Tool definitions**: Bash, Read, Edit, Write, Grep, Glob, Task, TodoWrite
- **Safety instructions**: Content policies, refusal patterns (see [Security Hardening](./security-hardening.md))
- **Behavioral guidelines**: Task-first approach, MVP-first, no over-engineering
- **Context instructions**: How to gather and use project context
**Important Distinctions**:
- **Claude.ai/Mobile**: Published prompts available publicly
- **Anthropic API**: Different default instructions, configurable by developers
- **Claude Code CLI**: Agentic coding assistant with context-gathering behavior
**Community Analysis** (for deeper understanding):
- **Simon Willison's Claude 4 Analysis** (May 2025): [Deep-dive into thinking blocks, search rules, safety guardrails](https://simonwillison.net/2025/May/25/claude-4-system-prompt/)
- **PromptHub Technical Breakdown** (June 2025): [Detailed analysis of prompt engineering patterns](https://www.prompthub.us/blog/an-analysis-of-the-claude-4-system-prompt)
**Cross-reference**: For security implications, see [Section 5: Permission & Security Model](#5-permission--security-model)
**Note**: Claude Code system prompts may differ from Claude.ai/mobile versions. The above sources cover the Claude family; Code-specific prompts are integrated into the CLI tool's behavior.
---
### Auto-Compaction
**Confidence**: 50% (Tier 3 - Conflicting reports)
When context usage exceeds a threshold, Claude Code automatically summarizes older conversation turns:
| Source | Reported Threshold |
|--------|-------------------|
| PromptLayer analysis | 92% |
| Community observations | 75-80% |
| User-triggered `/compact` | Anytime |
**What happens during compaction:**
1. Older conversation turns are summarized
2. Tool results are condensed
3. Recent context is preserved in full
4. The model receives a "context was compacted" signal
**User control**: Use `/compact` to manually trigger summarization before hitting limits.
### Context Preservation Strategies
| Strategy | When to Use | How |
|----------|-------------|-----|
| Sub-agents | Exploratory tasks | `Task` tool for isolated search |
| Manual compact | Proactive cleanup | `/compact` command |
| Clear session | Fresh start needed | `/clear` command |
| Specific reads | Know what you need | Read exact files, not directories |
| CLAUDE.md | Persistent context | Store conventions in memory files |
### Session Degradation Limits
**Confidence**: 70% (Tier 2 - Practitioner studies, arXiv research)
Claude Code's effectiveness degrades predictably under certain conditions:
| Condition | Observed Threshold | Symptom |
|-----------|-------------------|---------|
| Conversation turns | **15-25 turns** | Loses track of earlier constraints |
| Token accumulation | **80-100K tokens** | Ignores requirements stated early in session |
| Problem scope | **>5 files simultaneously** | Inconsistent changes, missed files |
**Success rates by scope** (from practitioner studies):
| Scope | Success Rate | Example |
|-------|--------------|---------|
| 1-3 files | ~85% | Fix bug in single module |
| 4-7 files | ~60% | Refactor feature across components |
| 8+ files | ~40% | Codebase-wide changes |
**Mitigation strategies**:
1. **Checkpoint prompts**: "Before continuing, recap the current requirements and constraints."
2. **Session resets**: Start fresh for new tasks (`/clear`)
3. **Scope tightly**: Break large tasks into focused sub-tasks
4. **Use sub-agents**: Delegate exploration to `Task` tool to preserve main context
---
## 4. Sub-Agent Architecture
**Confidence**: 100% (Tier 1 - Documented behavior)
**Source**: [code.claude.com/docs](https://code.claude.com/docs/en/setup) + System prompt (visible in tool definitions)
The `Task` tool spawns sub-agents for parallel or isolated work.
### Isolation Model
```
┌─────────────────────────────────────────────────────────────┐
│ MAIN AGENT │
│ │
│ ┌───────────────────────────────────────────────────────┐ │
│ │ Context: Full conversation + all file reads │ │
│ │ │ │
│ │ Task("Explore authentication patterns") │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌─────────────────────────────────────────────────┐ │ │
│ │ │ SUB-AGENT (Spawned) │ │ │
│ │ │ │ │ │
│ │ │ • Own fresh context window │ │ │
│ │ │ • Receives: task description only │ │ │
│ │ │ • Has access to: same tools (except Task) │ │ │
│ │ │ • CANNOT spawn sub-sub-agents (depth = 1) │ │ │
│ │ │ • Returns: summary text only │ │ │
│ │ │ │ │ │
│ │ └─────────────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ Result: "Found 3 auth patterns: JWT in..." │ │
│ │ (Only this text enters main context) │ │
│ │ │ │
│ └───────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
```
### Why Depth = 1?
Limiting sub-agents to one level prevents:
1. **Recursive explosion**: Agent-ception would consume infinite resources
2. **Context pollution**: Each level would accumulate context
3. **Debugging nightmares**: Tracking multi-level agent chains is hard
4. **Unpredictable costs**: Nested agents = unpredictable token usage
### Sub-Agent Types
Claude Code offers specialized sub-agent types via the `subagent_type` parameter:
| Type | Purpose | Tools Available |
|------|---------|-----------------|
| `Explore` | Codebase exploration | All read-only tools |
| `Plan` | Architecture planning | All except Edit/Write |
| `Bash` | Command execution | Bash only |
| `general-purpose` | Complex multi-step | All tools |
### When to Use Sub-Agents
| Use Case | Why Sub-Agent Helps |
|----------|---------------------|
| Searching large codebases | Keeps main context clean |
| Parallel exploration | Multiple searches simultaneously |
| Risky exploration | Errors don't pollute main context |
| Specialized analysis | Different "mindset" for different tasks |
---
## 5. Permission & Security Model
**Confidence**: 100% (Tier 1 - Official)
**Sources**:
- [code.claude.com/docs/en/hooks](https://code.claude.com/docs/en/hooks)
- [code.claude.com/docs/en/sandboxing](https://code.claude.com/docs/en/sandboxing)
Claude Code has a layered security model:
```
┌─────────────────────────────────────────────────────────────┐
│ PERMISSION LAYERS │
├─────────────────────────────────────────────────────────────┤
│ │
│ Layer 1: INTERACTIVE PROMPTS │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Claude wants to run: rm -rf node_modules │ │
│ │ [Allow once] [Allow always] [Deny] [Edit command] │ │
│ └────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ Layer 2: ALLOW/DENY RULES (settings.json) │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ { │ │
│ │ "permissions": { │ │
│ │ "allow": ["Bash(npm:*)", "Read(**)"], │ │
│ │ "deny": ["Bash(rm -rf *)"] │ │
│ │ } │ │
│ │ } │ │
│ └────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ Layer 3: HOOKS (Pre/Post execution) │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ PreToolUse: Validate before execution │ │
│ │ PostToolUse: Audit after execution │ │
│ │ PermissionRequest: Override permission prompts │ │
│ └────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ Layer 4: SANDBOX MODE (Optional isolation) │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Filesystem isolation + Network restrictions │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
```
### Dangerous Pattern Detection
**Confidence**: 80% (Tier 2 - Observed but not exhaustive)
Claude Code appears to flag certain patterns for extra scrutiny:
| Pattern | Risk | Behavior |
|---------|------|----------|
| `rm -rf` | Destructive deletion | Always prompts |
| `sudo` | Privilege escalation | Always prompts |
| `curl \| sh` | Remote code execution | Always prompts |
| `chmod 777` | Insecure permissions | Always prompts |
| `git push --force` | History destruction | Always prompts |
| `DROP TABLE` | Data destruction | Always prompts |
This is not a complete blocklist — patterns are likely detected through model training rather than explicit rules.
### Hooks System
Hooks allow programmatic control over Claude's actions:
```json
{
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [{
"type": "command",
"command": "/path/to/validate-command.sh"
}]
}
],
"PostToolUse": [
{
"matcher": "*",
"hooks": [{
"type": "command",
"command": "/path/to/audit-log.sh"
}]
}
]
}
}
```
**Hook capabilities:**
| Capability | Supported | How |
|------------|-----------|-----|
| Block execution | Yes | Exit code != 0 |
| Modify parameters | Yes | Return modified JSON |
| Log actions | Yes | Write to file in hook |
| Async processing | No | Hooks are synchronous |
**Hook JSON payload** (passed via stdin):
```json
{
"event": "PreToolUse",
"tool": "Bash",
"params": {
"command": "npm install lodash"
},
"sessionId": "abc123",
"cwd": "/path/to/project"
}
```
**Cross-reference**: See [Section 7 - Hooks](./ultimate-guide.md#7-hooks) in the main guide for complete examples.
---
## 6. MCP Integration
**Confidence**: 100% (Tier 1 - Official)
**Source**: [code.claude.com/docs/en/mcp](https://code.claude.com/docs/en/mcp)
MCP (Model Context Protocol) servers extend Claude Code with additional tools.
### MCP Architecture Overview
> **💡 Visual Guide**: The following diagram illustrates how MCP creates a secure control layer between LLMs and real systems. The LLM layer has **no direct data access** - the MCP Server enforces security policies before tools can interact with databases, APIs, or files.
![MCP Architecture - 7-Layer Security Model](./images/mcp-architecture-diagram.svg)
*Figure 1: MCP Architecture showing separation between thinking (LLM), control (MCP Server), and execution (Tools). Design inspired by [Dinesh Kumar's LinkedIn visualization](https://www.linkedin.com/posts/dinesh-kumar-6b0528b4_model-context-protocol-mcp-why-it-came-activity-7419969525795782656-VoFh), recreated under Apache-2.0 license.*
**Key security boundaries**:
- **Yellow layer (LLM)**: Reasoning only - **No Data Access**
- **Orange layer (MCP Server)**: Security control point (policies, validation, logs)
- **Grey layer (Real Systems)**: Protected data - **Hidden From AI**
### How MCP Works (Technical Details)
```
┌─────────────────────────────────────────────────────────────┐
│ MCP INTEGRATION │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ CLAUDE CODE │ │
│ │ │ │
│ │ Native Tools MCP Tools │ │
│ │ ┌─────────┐ ┌─────────────────────────┐ │ │
│ │ │ Bash │ │ mcp__serena__* │ │ │
│ │ │ Read │ │ mcp__context7__* │ │ │
│ │ │ Edit │ │ mcp__playwright__* │ │ │
│ │ │ ... │ │ mcp__custom__* │ │ │
│ │ └─────────┘ └───────────┬─────────────┘ │ │
│ │ │ │ │
│ └──────────────────────────────────┼──────────────────┘ │
│ │ │
│ JSON-RPC 2.0 │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ MCP SERVER │ │
│ │ │ │
│ │ stdio/HTTP transport │ │
│ │ Tool definitions (JSON Schema) │ │
│ │ Tool implementations │ │
│ │ │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
```
### Key MCP Facts
| Aspect | Behavior |
|--------|----------|
| Protocol | JSON-RPC 2.0 over stdio or HTTP |
| Tool naming | `mcp__<server>__<tool>` convention |
| Context sharing | Only via tool parameters and return values |
| Lifecycle | Server starts on first use, stays alive during session |
| Permissions | Same system as native tools |
### What MCP Cannot Do
| Limitation | Explanation |
|------------|-------------|
| Access conversation history | Only sees tool params, not full context |
| Maintain state across calls | Each call is independent (unless server implements caching) |
| Modify Claude's system prompt | Tools only, no prompt injection |
| Bypass permissions | Same security layer as native tools |
**Cross-reference**: See [Section 8.6 - MCP Security](./ultimate-guide.md#86-mcp-security) for security considerations.
### MCP Extensions: Apps (SEP-1865)
**Status**: Stable (January 26, 2026)
**Spec**: [SEP-1865 on GitHub](https://github.com/modelcontextprotocol/ext-apps)
**Co-authored by**: OpenAI, Anthropic, MCP-UI creators
#### What Are MCP Apps?
MCP Apps is the **first official extension** to the Model Context Protocol, enabling MCP servers to deliver **interactive user interfaces** alongside traditional tool responses.
**The problem solved**: Traditional text-based responses create friction for workflows requiring exploration. Each interaction (sort, filter, drill-down) demands a new prompt cycle. MCP Apps eliminates this "context gap" by rendering interactive UIs directly in the conversation.
#### Technical Architecture
**Two core primitives**:
1. **Tools with UI metadata**:
```json
{
"name": "query_database",
"description": "Query customer database",
"_meta": {
"ui": {
"resourceUri": "ui://dashboard/customers"
}
}
}
```
2. **UI Resources** (`ui://` scheme):
- Server-side HTML/JavaScript bundles
- Rendered in sandboxed iframes by host
- Bidirectional JSON-RPC communication via `postMessage`
**Communication flow**:
```
┌─────────────────────────────────────────────────────────┐
│ MCP APPS ARCHITECTURE │
├─────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ MCP Client │◄───────►│ MCP Server │ │
│ │ (Claude/IDE) │ JSON-RPC│ (Your App) │ │
│ └──────┬───────┘ └──────────────┘ │
│ │ │
│ │ Fetches ui:// resource │
│ ▼ │
│ ┌─────────────────────────────────────────┐ │
│ │ Sandboxed Iframe (UI Render) │ │
│ │ ┌───────────────────────────────────┐ │ │
│ │ │ HTML/JS Bundle from Server │ │ │
│ │ │ - Interactive dashboard │ │ │
│ │ │ - Forms with validation │ │ │
│ │ │ - Real-time data visualization │ │ │
│ │ └───────────────────────────────────┘ │ │
│ │ │ │
│ │ postMessage ◄─────► JSON-RPC │ │
│ └─────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────┘
```
#### Security Model
**Multi-layered protection**:
| Layer | Protection |
|-------|------------|
| **Iframe sandbox** | Restricted permissions (no direct system access) |
| **Pre-declared templates** | Hosts review HTML/JS before rendering |
| **Auditable messaging** | All UI-to-host communication via JSON-RPC logs |
| **User consent** | Optional requirement for UI-initiated tool calls |
| **Content blocking** | Hosts can reject suspicious resources pre-render |
→ **Cross-reference**: See [Section 8.6 - MCP Security](./ultimate-guide.md#86-mcp-security) for broader MCP security considerations.
#### SDK: @modelcontextprotocol/ext-apps
**Installation**:
```bash
npm install @modelcontextprotocol/ext-apps
```
**Core API** (framework-agnostic):
```typescript
import { App } from '@modelcontextprotocol/ext-apps';
const app = new App();
// 1. Establish communication with host
await app.connect();
// 2. Receive tool results from host
app.ontoolresult = (result) => {
// Update UI with tool execution results
updateDashboard(result.data);
};
// 3. Call server tools from UI
await app.callServerTool('fetch_analytics', {
timeRange: '7d',
metrics: ['users', 'revenue']
});
// 4. Update model context asynchronously
await app.updateModelContext({
selectedFilters: ['region:EU', 'status:active']
});
// Additional capabilities:
app.logDebug('User action', { filter: 'applied' });
app.openBrowserLink('https://docs.example.com');
app.sendFollowUpMessage('Applied filters: EU, Active');
```
**Standard communication**: All features operate over `postMessage` (no framework lock-in).
#### Platform Support
| Platform | MCP Apps Support | Notes |
|----------|------------------|-------|
| **Claude Desktop** | ✅ Available now | claude.ai/directory (Pro/Max/Team/Enterprise) |
| **Claude Cowork** | 🔄 Coming soon | Agentic workflow integration planned |
| **VS Code** | ✅ Insiders build | [Official blog post](https://code.visualstudio.com/blogs/2026/01/26/mcp-apps-support) |
| **ChatGPT** | 🔄 Rolling out | Week of Jan 26, 2026 |
| **Goose** | ✅ Available now | Open-source CLI with UI support |
| **Claude Code CLI** | ❌ N/A | Terminal text-only (no iframe rendering) |
#### Relevance for Claude Code Users
**Direct usage**: None (CLI is text-only, cannot render iframes)
**Indirect benefits**:
1. **Ecosystem understanding**: MCP Apps represents the future of agentic workflows
2. **MCP server development**: If building custom MCP servers, Apps is now a design option
3. **Hybrid workflows**:
- Use Claude Desktop to explore data with Apps (dashboards, visualizations)
- Switch to Claude Code CLI for implementation (scripting, automation)
4. **Context for configuration**: MCP servers may advertise UI capabilities in metadata
#### Example Implementations
**Official example servers** (in [`ext-apps` repository](https://github.com/modelcontextprotocol/ext-apps)):
- **threejs-server**: 3D visualization and manipulation
- **map-server**: Interactive geographic data exploration
- **pdf-server**: Document viewing with inline highlights
- **system-monitor-server**: Real-time metrics dashboards
- **sheet-music-server**: Music notation rendering
**Production adoption** (January 2026):
| Tool | Provider | Capabilities |
|------|----------|--------------|
| Asana | Atlassian | Project timelines, task boards |
| Slack | Salesforce | Message drafting with formatting preview |
| Figma | Figma | Flowcharts, Gantt charts in FigJam |
| Amplitude | Amplitude | Analytics charts with interactive filtering |
| Box | Box | File search, document previews |
| Canva | Canva | Presentation design with real-time customization |
| Clay | Clay | Company research, contact discovery |
| Hex | Hex | Data analysis with interactive queries |
| monday.com | monday.com | Work management boards |
**Coming soon**: Salesforce (Agentforce 360)
#### Relationship to Prior Work
MCP Apps standardizes patterns pioneered by:
- **MCP-UI**: Early UI extension for MCP (community project)
- **OpenAI Apps SDK**: Parallel effort for interactive tools
Both frameworks continue to be supported. MCP Apps provides a **unified specification** (SEP-1865) co-authored by maintainers from both ecosystems plus Anthropic and OpenAI.
**Migration path**: Straightforward for existing MCP-UI and Apps SDK implementations.
#### When to Use MCP Apps
**Decision tree for MCP server developers**:
```
Building a custom MCP server?
├─ Users need to SELECT from 50+ options? → MCP Apps (dropdown, multi-select UI)
├─ Users need to VISUALIZE data patterns? → MCP Apps (charts, maps, graphs)
├─ Users need MULTI-STEP workflows with conditional logic? → MCP Apps (wizard forms)
├─ Users need REAL-TIME updates? → MCP Apps (live dashboards)
└─ Simple data retrieval or actions only? → Traditional MCP tools (sufficient)
```
**Trade-off**: UI complexity and implementation effort vs. user experience improvement.
#### Resources
- **Specification**: [SEP-1865 on GitHub](https://github.com/modelcontextprotocol/ext-apps)
- **SDK**: [`@modelcontextprotocol/ext-apps` (npm)](https://www.npmjs.com/package/@modelcontextprotocol/ext-apps)
- **Example servers**: [modelcontextprotocol/ext-apps repository](https://github.com/modelcontextprotocol/ext-apps)
- **Blog post (MCP)**: [MCP Apps announcement](https://blog.modelcontextprotocol.io/posts/2026-01-26-mcp-apps/)
- **Blog post (Claude)**: [Interactive tools in Claude](https://claude.com/blog/interactive-tools-in-claude)
- **VS Code**: [MCP Apps support announcement](https://code.visualstudio.com/blogs/2026/01/26/mcp-apps-support)
---
### MCP Tool Search (Lazy Loading)
**Confidence**: 100% (Tier 1 - Official)
**Source**: [anthropic.com/engineering/advanced-tool-use](https://www.anthropic.com/engineering/advanced-tool-use)
Since v2.1.7 (January 2026), Claude Code uses **lazy loading** for MCP tool definitions instead of preloading all tools into context. This is powered by Anthropic's [Advanced Tool Use](https://www.anthropic.com/engineering/advanced-tool-use) API feature.
**The problem solved:**
- MCP tool definitions consume significant context (e.g., GitHub MCP alone: ~46K tokens for 93 tools)
- Developer Scott Spence documented 66,000+ tokens consumed before typing a single prompt
- This "context pollution" limited practical MCP adoption
**How Tool Search works:**
```
┌─────────────────────────────────────────────────────────────┐
│ MCP TOOL SEARCH FLOW │
├─────────────────────────────────────────────────────────────┤
│ │
│ WITHOUT Tool Search (eager loading): │
│ ┌──────────────────────────────────────────────────────┐ │
│ │All 100+ tool definitions loaded upfront (~55K tokens)│ │
│ └──────────────────────────────────────────────────────┘ │
│ │
│ WITH Tool Search (lazy loading): │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Step 1: Only search tool loaded (~500 tokens) │ │
│ │ Step 2: Claude determines needed capability │ │
│ │ Step 3: Tool Search finds matching tools (regex/BM25)│ │
│ │ Step 4: Only matched tools loaded (~600 tokens each) │ │
│ │ Step 5: Tool invoked normally │ │
│ └──────────────────────────────────────────────────────┘ │
│ │
│ Result: 55K tokens → ~8.7K tokens (85% reduction) │
│ │
└─────────────────────────────────────────────────────────────┘
```
**Measured improvements** (Anthropic benchmarks):
| Metric | Before | After | Improvement |
|--------|--------|-------|-------------|
| Token overhead (5-server setup) | ~55K | ~8.7K | **85% reduction** |
| Opus 4 tool selection accuracy | 49% | 74% | +25 points |
| Opus 4.5 tool selection accuracy | 79.5% | 88.1% | +8.6 points |
**Configuration** (v2.1.9+):
```bash
# Environment variable
ENABLE_TOOL_SEARCH=auto # Default (10% context threshold)
ENABLE_TOOL_SEARCH=auto:5 # Aggressive (5% threshold)
ENABLE_TOOL_SEARCH=auto:20 # Conservative (20% threshold)
ENABLE_TOOL_SEARCH=true # Always enabled
ENABLE_TOOL_SEARCH=false # Disabled (eager loading)
```
| Threshold | Recommended for |
|-----------|-----------------|
| `auto:20` | Lightweight setups (5-10 tools) |
| `auto:10` | Balanced default (20-50 tools) |
| `auto:5` | Power users (100+ tools) |
→ As Simon Willison noted: "Context pollution is why I rarely used MCP. Now that it's solved, there's no reason not to hook up dozens or even hundreds of MCPs to Claude Code." — [X/Twitter, January 14, 2026](https://twitter.com/simonw)
---
## 7. The Edit Tool: How It Actually Works
**Confidence**: 90% (Tier 2 - Verified through behavior)
**Sources**:
- Observed behavior
- [github.com/cline/cline/issues/2909](https://github.com/cline/cline/issues/2909) (similar implementation)
The Edit tool is more sophisticated than it appears.
### Edit Algorithm
```
┌─────────────────────────────────────────────────────────────┐
│ EDIT TOOL FLOW │
├─────────────────────────────────────────────────────────────┤
│ │
│ Input: old_string, new_string, file_path │
│ │
│ ┌──────────────────────────────────────┐ │
│ │ Step 1: EXACT MATCH │ │
│ │ Search for literal old_string │ │
│ └────────────────┬─────────────────────┘ │
│ │ │
│ Found? ────┴──── Not found? │
│ │ │ │
│ ▼ ▼ │
│ ┌──────────┐ ┌──────────────────┐ │
│ │ REPLACE │ │ Step 2: FUZZY │ │
│ │ (done) │ │ MATCH │ │
│ └──────────┘ └────────┬─────────┘ │
│ │ │
│ Found? ────┴──── Not found? │
│ │ │ │
│ ▼ ▼ │
│ ┌──────────┐ ┌──────────────┐ │
│ │ REPLACE │ │ ERROR │ │
│ │ + WARN │ │ (mismatch) │ │
│ └──────────┘ └──────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
```
### Fuzzy Matching Details
When exact match fails, the Edit tool attempts:
1. **Whitespace normalization**: Ignore trailing spaces, normalize indentation
2. **Line ending normalization**: Handle CRLF vs LF differences
3. **Context expansion**: Use surrounding lines to locate the right spot
If fuzzy matching also fails, the tool returns an error asking Claude to verify the old_string.
### Edit Validation
Before applying changes, the Edit tool:
| Check | Purpose |
|-------|---------|
| File exists | Prevent creating files via Edit |
| old_string found | Ensure we're editing the right place |
| Single match | old_string must be unique (or use `replace_all`) |
| New content differs | Prevent no-op edits |
### When Edit Fails
| Error | Cause | Claude's Response |
|-------|-------|-------------------|
| "old_string not found" | Content changed since last read | Re-reads file, tries again |
| "Multiple matches" | old_string isn't unique | Uses more context lines |
| "File not found" | Wrong path | Searches for correct path |
---
## 8. Session Persistence
**Confidence**: 100% (Tier 1 - Official)
**Source**: [code.claude.com/docs](https://code.claude.com/docs/en/setup)
Sessions can be resumed across terminal sessions.
### Resume Mechanisms
| Command | Behavior |
|---------|----------|
| `claude --continue` / `claude -c` | Resume most recent session |
| `claude --resume <id>` / `claude -r <id>` | Resume specific session by ID |
### What Gets Persisted
| Persisted | Not Persisted |
|-----------|---------------|
| Conversation history | Live tool state |
| Tool call results | Pending operations |
| Session ID | File locks |
| Working directory context | Environment variables |
### Storage Format
**Confidence**: 50% (Tier 3 - Inferred)
Sessions appear to be stored as JSON/JSONL files in `~/.claude/` but:
- Format is not publicly documented
- Not intended as a stable API
- May change between versions
**Do not rely on session file format** for external tooling.
---
## 9. Philosophy: Less Scaffolding, More Model
**Confidence**: 100% (Tier 1 - Official)
**Source**: Daniela Amodei (Anthropic CEO) - Public statements
The core philosophy behind Claude Code:
> "Do more with less. Smart architecture choices, better training efficiency, and focused problem-solving can compete with raw scale."
### What This Means in Practice
| Traditional Approach | Claude Code Approach |
|---------------------|---------------------|
| Intent classifier → Router → Specialist | Single model decides everything |
| RAG with embeddings | Grep + Glob (regex search) |
| DAG task orchestration | Simple while loop |
| Tool-specific planners | Model-driven tool selection |
| Complex state machines | Conversation as state |
| Prompt engineering frameworks | Trust the model |
### Why It Works
1. **Model capability**: Claude 3.5+ is capable enough to handle routing decisions
2. **Reduced latency**: Fewer components = faster response
3. **Simpler debugging**: When something fails, there's one place to look
4. **Better generalization**: No hand-coded rules to break on edge cases
### The Trade-offs
| Advantage | Disadvantage |
|-----------|--------------|
| Simplicity | Less fine-grained control |
| Flexibility | Harder to enforce strict behaviors |
| Fewer bugs | Model errors affect everything |
| Fast iteration | Requires good model quality |
---
## 10. Claude Code vs Alternatives
**Confidence**: 70% (Tier 3 - Based on public information)
**Sources**: Various 2024-2025 comparisons, official documentation
| Dimension | Claude Code | GitHub Copilot Workspace | Cursor | Amazon Q Developer |
|-----------|-------------|-------------------------|--------|-------------------|
| **Architecture** | while(tool) loop | Cloud-based planning | Event-driven + cloud | AWS-integrated agents |
| **Execution** | Local terminal | Cloud sandbox | Local + cloud | Cloud/local hybrid |
| **Model** | Claude (single) | GPT-4 variants | Multiple (adaptive) | Amazon Titan + others |
| **Context** | ~200K tokens | Limited | Limited | Varies |
| **Transparency** | High (visible reasoning) | Medium | Medium | Low |
| **Customization** | CLAUDE.md + hooks | Limited | Plugins | AWS integration |
| **MCP Support** | Native | No | Some servers | No |
| **Pricing** | Pro/Max tiers | GitHub subscription | Per-seat | AWS-integrated |
### When to Choose Claude Code
| Scenario | Claude Code | Alternative |
|----------|-------------|-------------|
| Deep codebase exploration | Excellent | Good |
| Terminal-native workflow | Excellent | Limited |
| Custom automation (hooks) | Excellent | Limited |
| Team standardization | Good (CLAUDE.md) | Varies |
| IDE integration | Limited (VS Code ext) | Cursor/Copilot better |
| Enterprise compliance | Via Anthropic enterprise | Varies |
---
## 11. Sources & References
### Tier 1 - Official Anthropic
| Source | URL | Topics |
|--------|-----|--------|
| Engineering Blog | anthropic.com/engineering/claude-code-best-practices | Master loop, philosophy |
| Setup Docs | code.claude.com/docs/en/setup | Tools, commands |
| Context Windows | platform.claude.com/docs/en/build-with-claude/context-windows | Token limits |
| Hooks Reference | code.claude.com/docs/en/hooks | Hook system |
| Hooks Guide | code.claude.com/docs/en/hooks-guide | Hook examples |
| MCP Docs | code.claude.com/docs/en/mcp | MCP integration |
| Sandboxing | code.claude.com/docs/en/sandboxing | Security model |
### Tier 2 - Verified Analysis
| Source | URL | Topics |
|--------|-----|--------|
| PromptLayer Analysis | blog.promptlayer.com/claude-code-behind-the-scenes-of-the-master-agent-loop/ | Internal architecture |
| Steve Kinney Course | stevekinney.com/courses/ai-development/claude-code-* | Permissions, sessions |
### Tier 3 - Community Resources
| Source | Topics |
|--------|--------|
| GitHub Issues (anthropics/claude-code) | Edge cases, bugs, feature discussions |
| Reddit r/ClaudeAI | User experiences, workarounds |
| YouTube tutorials | Visual walkthroughs |
---
## 12. Appendix: What We Don't Know
Transparency about gaps in our understanding:
### Unknown or Unconfirmed
| Topic | What We Don't Know | Confidence in Current Understanding |
|-------|-------------------|-------------------------------------|
| **Exact compaction threshold** | Is it 75%? 85%? 92%? Varies by model? | 40% |
| **System prompt contents** | Full text not public, varies by model version | 30% |
| **Token counting method** | Exact tokenizer, overhead for tool schemas | 50% |
| **Model fallback** | Does Claude Code fallback if a model fails? | 20% |
| **Internal caching** | Is there result caching between sessions? | 20% |
| **Rate limiting logic** | How rate limits are applied per-tool | 40% |
### Explicitly Undocumented
These are intentionally not documented by Anthropic:
- Session file format (internal implementation detail)
- System prompt variations between models
- Internal component names/architecture
- Token usage breakdown per component
- Exact permission evaluation order
### How to Stay Updated
1. **Official changelog**: Watch anthropic.com/changelog
2. **GitHub releases**: github.com/anthropics/claude-code/releases
3. **Community Discord**: Various Claude-focused servers
4. **This guide**: Updated periodically based on verified information
---
## Contributing
Found an error? Have verified new information? Contributions welcome:
1. **For official facts**: Cite the Anthropic source
2. **For observations**: Describe how you verified the behavior
3. **For corrections**: Explain what's wrong and why
---
**Last updated**: January 2026
**Claude Code version**: 3.3.x
**Document version**: 1.0.0