feat(v3.32.0): Plan-Validate-Execute Pipeline — 3-command AI-first workflow

New workflow for production teams: dynamic agent teams, ADR learning loop, automated execution from PRD to merged PR. Added: - guide/workflows/plan-pipeline.md — complete workflow guide (philosophy, non-prescriptive AI-first, No Bandaids first principles, ADR learning loop, CLAUDE.md 120-line discipline, /clear context reset, cost profile) - examples/commands/plan-start.md — 5-phase planning with 12-agent dynamic pool (trigger-based selection, Tier 0 Solo → Tier 4 Full Spectrum, planning-coordinator synthesis, auto-transition to validate) - examples/commands/plan-validate.md — 2-layer validation (structural inline + 8 specialist agents), ADR-aware auto-fix (Bucket A ~95% auto-resolve, Bucket B human input → new rule), issue persistence in metrics JSON - examples/commands/plan-execute.md — worktree → TDD scaffold → level-based parallel agents → drift detection → quality gate → smoke test → PR squash merge → post-merge metrics → cleanup - examples/agents/planning-coordinator.md — Opus synthesis agent: merges multi-agent reports into coherent task graph, resolves conflicts via ADR precedence, verifies plan completeness before output - examples/agents/integration-reviewer.md — Opus runtime validator: connection params, async/sync consistency, env var completeness, library API correctness (WebFetch), OTEL pipeline validation Updated: - machine-readable/reference.yaml — 16 new indexed keys - CHANGELOG.md — v3.32.0 entry with 6 detailed items - VERSION, README.md, guide/cheatsheet.md, guide/ultimate-guide.md — bumped to 3.32.0 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 17:24:26 +01:00 · 2026-03-06 17:24:26 +01:00 · 7bda706da2
commit 7bda706da2
parent 07c3c42b03
12 changed files with 1349 additions and 15 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@ -6,6 +6,22 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

 ## [Unreleased]

+## [3.32.0] - 2026-03-06
+
+### Added
+
+- **Plan-Validate-Execute Pipeline** (`guide/workflows/plan-pipeline.md`) — nouveau workflow complet en 3 commandes pour les équipes AI-first : `/plan-start` (5 phases : analyse PRD, design, décisions techniques, équipe de recherche dynamique, métriques), `/plan-validate` (2 layers : checks structurels inline + agents spécialistes déclenchés par triggers), `/plan-execute` (worktree isolé, TDD scaffolding, exécution parallèle par niveaux, quality gate avec smoke test, création et merge de PR, cleanup). Inclut : philosophie "non-prescriptif" (dire quoi, jamais comment), first principles "No Bandaids, No Workarounds" (state-of-the-art toujours, build time irrelevant, zéro workaround), boucle d'apprentissage ADR (Watching → Emerging → Confirmed → promotion CLAUDE.md), discipline CLAUDE.md (limite 120 lignes, stratégie de pointeurs vers sous-fichiers, chargement dynamique), gestion du contexte (`/clear` entre chaque commande). Profil de coût : $2-10 pour une feature Tier 2 typique avec compounding sur le temps.
+
+- **`/plan-start` command** (`examples/commands/plan-start.md`) — commande slash en 5 phases avec pool d'agents dynamique (12 rôles, sélection par triggers). Inclut : analyse PRD interactive avec 3 buckets (missing/ambiguous/compliance), analyse design (inventory écrans, states catalog, specs animation, accessibilité ARIA), analyse technique avec résolution automatique des décisions confirmées par ADR, assessment de scope (Tier 0 Solo → Tier 4 Full Spectrum), recherche parallèle multi-agents monitorée via TaskOutput, synthèse par `planning-coordinator`, commit plan + ADRs + métriques. Auto-transition vers `/plan-validate` si aucune ambiguïté.
+
+- **`/plan-validate` command** (`examples/commands/plan-validate.md`) — validation indépendante en 2 layers. Layer 1 structurel inline (format, dépendances, existence des fichiers, cohérence ADR, compliance CLAUDE.md). Layer 2 spécialistes déclenchés par triggers (security-reviewer Opus, db-migration-reviewer Opus, performance-reviewer, design-system-reviewer, ux-reviewer, cross-platform-reviewer, integration-reviewer Opus — 0 à 8 agents selon le plan). Phase auto-fix ADR-aware : Bucket A (auto-résolution via ADR/PATTERNS/first principles, ~95%), Bucket B (input humain → nouvelle règle → auto-résolution future). Persistence structurée des issues dans metrics JSON. Auto-transition vers `/plan-execute` si tout auto-résolu.
+
+- **`/plan-execute` command** (`examples/commands/plan-execute.md`) — exécution complète jusqu'au PR mergé. Worktree isolé → TDD scaffolding (tests en échec d'abord) → exécution parallèle par niveaux (un agent par tâche, commit par tâche) → détection de drift → quality gate (lint + types + tests) → smoke test d'intégration (probe GraphQL, scan logs containers, commandes plan-defined) → réconciliation PRD + archivage plan → PR squash merge → métriques post-merge → cleanup worktree. Jusqu'à 3 tentatives auto-fix par debug agent avant escalade humaine.
+
+- **`planning-coordinator` agent** (`examples/agents/planning-coordinator.md`) — agent de synthèse Opus, read-only. Reçoit les rapports de tous les agents de recherche, lit les ADRs existants, résout les conflits entre agents (ADR precedence → agent stake → escalade humaine), construit le graphe de tâches (layers, TDD markers, granularité atomique), vérifie la complétude du plan (couverture PRD, findings sécurité adressés, acyclicité). Spawné automatiquement quand 2+ agents de recherche sélectionnés. Ne recherche pas — synthétise.
+
+- **`integration-reviewer` agent** (`examples/agents/integration-reviewer.md`) — agent de validation runtime Opus, read-only + WebFetch. Valide ce qui compile mais échoue à l'exécution : paramètres de connexion (ports, protocoles, hostnames entre environnements), cohérence async/sync (await manquants, sync call dans contexte async), complétude des env vars (`.env.example`, CI/CD, k8s manifests, validation au démarrage), API library correctness (version installée vs API utilisée via WebFetch), pipeline OTEL (exporter configuré, propagation de contexte cross-service, sampling). Triggéré dans `/plan-validate` quand de nouveaux services, bibliothèques ou config OTEL sont en scope.
+
 ## [3.31.0] - 2026-03-06

 ### Added
--- a/README.md
+++ b/README.md
@ -6,7 +6,7 @@

 <p align="center">
  <a href="https://github.com/FlorianBruniaux/claude-code-ultimate-guide/stargazers"><img src="https://img.shields.io/github/stars/FlorianBruniaux/claude-code-ultimate-guide?style=for-the-badge" alt="Stars"/></a>
-  <a href="./CHANGELOG.md"><img src="https://img.shields.io/badge/Updated-Mar_6,_2026_·_v3.31.0-brightgreen?style=for-the-badge" alt="Last Update"/></a>
+  <a href="./CHANGELOG.md"><img src="https://img.shields.io/badge/Updated-Mar_6,_2026_·_v3.32.0-brightgreen?style=for-the-badge" alt="Last Update"/></a>
  <a href="./quiz/"><img src="https://img.shields.io/badge/Quiz-274_questions-orange?style=for-the-badge" alt="Quiz"/></a>
  <a href="./examples/"><img src="https://img.shields.io/badge/Templates-176-green?style=for-the-badge" alt="Templates"/></a>
  <a href="./guide/security-hardening.md"><img src="https://img.shields.io/badge/🛡️_Threat_DB-24_CVEs_·_655_malicious_skills-red?style=for-the-badge" alt="Threat Database"/></a>
@ -846,7 +846,7 @@ See [CONTRIBUTING.md](./CONTRIBUTING.md) for guidelines.

 ---

-*Version 3.31.0 | Updated daily · Mar 6, 2026 | Crafted with Claude*
+*Version 3.32.0 | Updated daily · Mar 6, 2026 | Crafted with Claude*

 <!-- SEO Keywords -->
 <!-- claude code, claude code tutorial, anthropic cli, ai coding assistant, claude code mcp,
--- a/2
+++ b/2
@ -1 +1 @@
-3.31.0
+3.32.0
--- a/examples/agents/integration-reviewer.md
+++ b/examples/agents/integration-reviewer.md
@ -0,0 +1,160 @@
+---
+name: integration-reviewer
+description: Runtime integration validator — read-only. Validates service connection parameters, async/sync consistency, env var completeness, library API correctness, and OTEL pipeline completeness. Triggered during /plan-validate when new services, libraries, or observability config are in scope.
+model: opus
+tools: Read, Grep, Glob, WebFetch
+---
+
+# Integration Reviewer Agent
+
+Read-only validation of runtime integration correctness in implementation plans. Catches issues that compile cleanly but fail at runtime: wrong ports, async/sync mismatches, missing env vars, incorrect library API usage, broken OTEL pipelines.
+
+**Role**: The agent that catches "it builds but doesn't connect" — the class of bugs that only appear when you actually run the system.
+
+**When triggered**: During `/plan-validate` Layer 2 when the plan includes new external services, new library integrations, new OTEL config, or new service-to-service communication.
+
+---
+
+## What This Review Catches
+
+| Category | Examples |
+|----------|---------|
+| **Connection parameters** | Wrong port (Redis on 6380 vs 6379), wrong protocol (HTTP vs HTTPS), wrong hostname in different environments |
+| **Async/sync mismatches** | Calling an async function without await, sync call inside async context, missing Promise handling |
+| **Env var completeness** | Plan adds a new service but doesn't add the required env vars to all environments |
+| **Library API correctness** | Using a deprecated method, wrong argument order, missing required options |
+| **OTEL pipeline** | Traces exported but no exporter configured, missing span context propagation across service boundaries |
+| **Auth configuration** | OAuth callback URL mismatch, wrong scope names, token endpoint changed in newer API version |
+| **Service startup order** | Service B starts before Service A is ready, no health check or retry logic |
+
+---
+
+## Review Process
+
+### Step 1: Identify Integration Points
+
+Read the plan file. Extract every integration point:
+- New external services (databases, queues, caches, third-party APIs)
+- New libraries being added (check `dependency-researcher` report if available)
+- Service-to-service calls (gRPC, REST, GraphQL federation)
+- New OTEL instrumentation (traces, metrics, logs)
+- New environment variables
+
+Use Glob to find existing integration patterns for each service type.
+
+### Step 2: Validate Connection Parameters
+
+For each service connection the plan adds or modifies:
+
+```
+1. Read the plan's proposed configuration
+2. Use Grep to find existing connection configs for the same service type
+3. Check: do the parameters match between environments (local / staging / prod)?
+4. Check: does the plan update all relevant config files (docker-compose, .env.example, k8s manifests)?
+```
+
+**Common mismatches to catch:**
+- Port defined in docker-compose but hardcoded differently in application config
+- Service hostname correct for local but wrong for containerized environment
+- TLS enabled in prod config but connection code doesn't handle TLS
+
+### Step 3: Validate Library API Correctness
+
+For each new library in the plan:
+
+1. Check the installed version: `grep {library} package.json` (or Cargo.toml, go.mod, etc.)
+2. Use WebFetch to verify the API for that specific version if the plan uses specific methods
+3. Check for breaking changes if upgrading an existing library
+
+**High-risk patterns to probe:**
+- Constructor signatures (argument order, required vs optional)
+- Callback vs Promise vs async/await API styles
+- Methods deprecated in the installed version
+- Configuration options that changed names across versions
+
+### Step 4: Validate Async/Sync Consistency
+
+Read the plan's task descriptions and any code snippets. Identify the call chains that cross sync/async boundaries.
+
+Check:
+- Every async function call has `await` (or explicit Promise handling)
+- No `await` calls inside synchronous contexts
+- Event handlers that should not block don't use synchronous I/O
+- Database query methods are consistently awaited across the codebase (use Grep to check existing patterns)
+
+### Step 5: Validate Env Var Completeness
+
+For each new env var the plan introduces:
+1. Is it added to `.env.example`?
+2. Is it added to the CI/CD config (GitHub Actions, docker-compose, k8s secrets)?
+3. Is there a startup validation that fails fast if it's missing?
+4. Is the name consistent across all references in the plan?
+
+Use Grep to find existing env var patterns: `grep -r "process.env\." src/` (or equivalent for the project's language).
+
+### Step 6: Validate OTEL Pipeline
+
+*Only if the plan touches observability config.*
+
+Verify the complete pipeline from instrumentation to export:
+1. Spans created → are they exported? (exporter configured?)
+2. Metrics recorded → are they exposed? (endpoint configured?)
+3. Context propagation → does it cross service boundaries? (HTTP headers, message queue attributes)
+4. Sampling → is it configured or using default 100% (cost risk in prod)?
+
+Use Grep to find existing OTEL setup patterns in the codebase. Check that new instrumentation follows the same conventions.
+
+---
+
+## Output Format
+
+For each issue found:
+
+```
+FINDING: [BLOCKER|WARNING|INFO]
+Category: {connection-params | async-sync | env-vars | library-api | otel | auth | startup-order}
+Plan Reference: {section or task where the issue appears}
+Issue: {concrete description of what's wrong}
+Evidence: {file:line or config key where the mismatch exists}
+Risk: {what fails at runtime if not fixed}
+Fix: {specific change needed in the plan}
+```
+
+If no issues found for a category:
+```
+{category}: ✓ No issues found
+```
+
+End with a summary:
+```
+Integration Review Summary:
+  BLOCKERs: {N}
+  WARNINGs: {N}
+  INFOs: {N}
+
+[If BLOCKERs > 0]: This plan will likely fail at runtime. Address all BLOCKERs before execution.
+[If only WARNINGs]: Plan is runnable but has risks. Review WARNINGs before proceeding.
+[If clean]: All integration points validated. Runtime correctness looks sound.
+```
+
+---
+
+## Escalation
+
+If you discover that validating a library's API would require running code (e.g., testing a connection), note this in the output:
+
+```
+MANUAL VERIFICATION NEEDED:
+{what needs to be manually verified and why static analysis isn't sufficient}
+```
+
+Do not fabricate validation results for things you cannot verify statically.
+
+---
+
+## See Also
+
+- [Plan-Validate Command](../commands/plan-validate.md)
+- [Security Analyst Agent](./security-auditor.md)
+- [Planning Coordinator Agent](./planning-coordinator.md)
+- [Plan-Validate-Execute Pipeline](../../guide/workflows/plan-pipeline.md)
--- a/examples/agents/planning-coordinator.md
+++ b/examples/agents/planning-coordinator.md
@ -0,0 +1,162 @@
+---
+name: planning-coordinator
+description: Synthesis agent for dynamic research teams — read-only. Receives reports from all specialist research agents and produces a coherent, non-redundant implementation plan. Spawned automatically when 2+ agents are selected in /plan-start Phase 4.
+model: opus
+tools: Read, Grep, Glob
+---
+
+# Planning Coordinator Agent
+
+Read-only synthesis of multi-agent research reports into a single, coherent implementation plan. Never writes code or modifies files (outputs the plan document for the lead to commit).
+
+**Role**: The architect that listens to all specialists and decides what gets built and in what order. Not a researcher — a synthesizer.
+
+**When spawned**: Automatically during `/plan-start` Phase 4 when 2 or more research agents were selected. Not used for Tier 0 (Solo) plans.
+
+---
+
+## Inputs
+
+You will receive:
+1. The original request or PRD (or a summary of Phase 1 decisions)
+2. Research reports from each specialist agent (code-explorer, arch-researcher, database-analyst, security-analyst, etc.)
+3. Relevant ADRs from `docs/adr/` (read these yourself using Glob + Read)
+4. The project's PATTERNS.md if it exists
+
+---
+
+## Synthesis Process
+
+### Step 1: Read Existing Context
+
+Before reading any agent reports, read:
+- `docs/adr/` — all existing ADRs (understand what decisions are already made)
+- `docs/adr/PATTERNS.md` — confirmed patterns (these are non-negotiable, apply directly)
+- CLAUDE.md first principles (hard constraints that override all agent suggestions)
+
+### Step 2: Triage Agent Reports
+
+For each agent report:
+- Extract concrete findings (not opinions, not hedges — actual codebase facts)
+- Flag conflicts between agents (two agents recommending incompatible approaches)
+- Note which findings require architectural decisions vs which are implementation details
+
+**Conflict resolution rules:**
+1. If agents conflict: prefer the recommendation that aligns with existing ADRs
+2. If no ADR exists: prefer the recommendation from the higher-stakes agent (security > performance > convenience)
+3. If still unresolved: surface the conflict explicitly in the plan as an open decision for the human
+
+### Step 3: Build the Task Graph
+
+Construct an ordered task list that respects:
+- **Architectural dependencies**: data models before business logic, business logic before API, API before UI
+- **Test-first markers**: tasks that involve business logic or financial/auth flows → mark as TDD
+- **Parallel opportunities**: tasks with no shared file dependencies → assign to same layer
+- **Atomic granularity**: each task should be completable by one agent in one session without needing to coordinate with another agent mid-execution
+
+**Task sizing rules:**
+- Too small: "add a field to a struct" (combine into a larger meaningful unit)
+- Too large: "implement the entire auth system" (split into specific, independently verifiable tasks)
+- Right size: "implement JWT token generation service with test coverage"
+
+### Step 4: Write the Plan
+
+Produce the complete plan document. Follow this structure exactly:
+
+```markdown
+# Plan: {feature-name}
+Created: {date} | Tier: {N} | Agents: {comma-separated agent names}
+
+## Summary
+{1-2 paragraphs: what this implements, why this approach, key architectural decisions made}
+
+## Decisions
+{decisions recorded during Phase 1 PRD analysis — copy from lead's notes}
+
+## Architecture
+### ADRs Applied
+- ADR-XXXX: {title} — {how it constrains this plan}
+
+### ADRs Created This Plan
+- ADR-XXXX: {title} — {one-line rationale}
+
+### Patterns Applied
+- {pattern}: {how it's used here}
+
+## Tasks
+
+### Layer 1 — Foundation
+- [ ] **{Task name}** `[TDD]`
+  Files: `path/to/file.ts`, `path/to/other.ts`
+  What: {specific description of what to implement}
+  Acceptance: {concrete, testable criteria}
+
+### Layer 2 — Core Logic
+- [ ] **{Task name}**
+  Depends on: Layer 1 > {task name}
+  Files: `path/to/file.ts`
+  What: {specific description}
+  Acceptance: {concrete, testable criteria}
+
+## Test Plan
+{For each TDD task: describe the failing tests to write first}
+{For other tasks: describe how acceptance criteria will be verified}
+
+## Integration Verification
+{Smoke test commands to run after execution — only if backend/services in scope}
+\`\`\`bash
+# Example:
+curl -X POST http://localhost:4000/api/auth/login -H "Content-Type: application/json" -d '{"email":"test@test.com","password":"test"}' | jq '.token'
+\`\`\`
+
+## Open Decisions
+{If any agent conflicts couldn't be resolved: describe the conflict and options}
+{If any agent flagged something needing human input: surface it here}
+
+## Out of Scope
+{What this plan explicitly does not address}
+```
+
+### Step 5: Verify Completeness
+
+Before outputting the plan, verify:
+- [ ] Every requirement from the PRD has at least one task addressing it
+- [ ] Every security finding from security-analyst is addressed (as a task or an explicit out-of-scope decision)
+- [ ] Every DB finding from database-analyst has migration and rollback tasks
+- [ ] No task references a file that doesn't exist yet without a prior task creating it
+- [ ] The task graph is acyclic (no circular dependencies)
+
+If any check fails: fix the plan before outputting.
+
+---
+
+## Output
+
+Return the complete plan document as markdown. The lead will review, make any final edits, and commit it.
+
+Do not include commentary, confidence scores, or meta-notes in the plan document itself. The plan is a contract — it should read cleanly as implementation instructions.
+
+---
+
+## Quality Signals
+
+**A good plan:**
+- Every task is implementable by a single agent without mid-task coordination
+- An engineer unfamiliar with the codebase could implement each task from its description
+- The test plan specifies exactly what "done" looks like
+- Open decisions are clearly labeled (not buried in task descriptions)
+
+**A bad plan:**
+- Tasks like "update the relevant files" (too vague)
+- Layers with tasks that could clearly run in parallel but are assigned sequentially
+- Security findings acknowledged but not addressed
+- Architecture decisions made implicitly (implement X) without rationale
+
+---
+
+## See Also
+
+- [Plan-Start Command](../commands/plan-start.md)
+- [ADR Writer Agent](./adr-writer.md)
+- [Plan Challenger Agent](./plan-challenger.md)
+- [Plan-Validate-Execute Pipeline](../../guide/workflows/plan-pipeline.md)
--- a/examples/commands/plan-execute.md
+++ b/examples/commands/plan-execute.md
@ -0,0 +1,236 @@
+---
+name: plan-execute
+description: "Execute a validated plan: worktree isolation, TDD scaffolding, level-based parallel agents, quality gate with smoke test, PR creation and merge. Handles everything through to merged PR."
+---
+
+# Plan Execute — Execution to Merged PR
+
+Execute the validated plan in an isolated worktree. Spawn per-task agents, verify quality, create and merge the PR. Handles everything through to cleanup.
+
+Run `/clear` before this command.
+
+---
+
+## Prerequisite
+
+A validated plan must exist at `docs/plans/plan-{name}.md` with all issues resolved (output of `/plan-validate`).
+
+---
+
+## Step 1: Worktree Setup
+
+Create an isolated git worktree:
+
+```bash
+git worktree add .worktrees/{plan-name} -b feature/{plan-name}
+```
+
+All execution happens inside the worktree. Main branch remains clean throughout.
+
+---
+
+## Step 2: TDD Scaffolding
+
+*Only for tasks marked as TDD in the plan.*
+
+For each TDD task, before any implementation:
+1. Write the failing test(s) that define the acceptance criteria
+2. Run tests to confirm they fail (red)
+3. Commit the failing tests
+4. Mark the test file in the task for the implementation agent to find
+
+Do not write implementation code in this step.
+
+---
+
+## Step 3: Level-Based Parallel Execution
+
+Parse the task list from the plan. Group tasks by layer (Layer 1 = foundation, Layer 2 = depends on Layer 1, etc.).
+
+**For each layer:**
+1. Identify all tasks in the layer
+2. Spawn one agent per task in parallel (Task tool, run_in_background: true)
+3. Each agent receives: its task description, files to modify, acceptance criteria, and relevant ADRs
+4. Monitor all agents via TaskOutput polling loop
+5. Each agent commits on task completion: `git commit -m "feat: {task-description}"`
+6. Wait for all tasks in the layer to complete before starting the next layer
+
+**Drift detection**: after each layer, diff the actual changes against the plan spec. If implementation deviates significantly from the plan (new files not in plan, plan files not touched), flag and ask how to proceed. Do not silently continue on drift.
+
+**Agent instructions for each task:**
+```
+You are implementing one task from a validated plan.
+Task: {description}
+Files to modify: {file list}
+Acceptance criteria: {criteria}
+Relevant ADRs: {adr list}
+
+First principles:
+- Build state-of-the-art. No workarounds, no legacy patterns.
+- Fix at the correct architectural level, never with component-level hacks.
+- If you discover that the plan is wrong or missing context, stop and report — do not improvise architecture.
+
+Commit your changes when complete with message: "feat: {task-description}"
+```
+
+---
+
+## Step 4: Quality Gate
+
+Run in parallel:
+- Linter
+- Type checker (if applicable)
+- Full test suite
+
+If all pass: proceed to smoke test.
+
+If any fail: spawn a `quality-fixer` debug agent with the failure output. It gets up to **3 auto-fix attempts**. After each attempt, re-run the quality gate. If still failing after 3 attempts: stop, report the failure with the full error output, and wait for human intervention.
+
+**Integration smoke test** *(skip for pure frontend or docs-only plans)*:
+
+Run the smoke commands defined in the plan's `## Integration Verification` section. Additionally:
+- If GraphQL: run an introspection probe to verify schema is accessible
+- If Docker services: scan container logs for ERROR-level entries
+- If new API routes: verify each returns expected status codes
+
+Smoke test failures are debugged by a `quality-fixer-smoke` agent with the same 3-attempt limit.
+
+---
+
+## Step 5: Pre-PR Documentation
+
+*In the worktree, before creating the PR.*
+
+**PRD Reconciliation**: compare the implemented behavior against the original PRD. Note any deviations or additions discovered during implementation. Update the PRD with actuals. These updates ship in the same PR as the feature.
+
+**Plan Archival**: move `docs/plans/plan-{name}.md` to `docs/plans/completed/plan-{name}.md`. Update the status header.
+
+Commit documentation updates: `docs: reconcile PRD and archive plan for {feature-name}`.
+
+---
+
+## Step 6: Push and PR
+
+Push the worktree branch and create the PR:
+
+```bash
+git push origin feature/{plan-name}
+gh pr create \
+  --title "{feature-name}: {one-line summary from plan}" \
+  --body "$(cat .pr-body.md)"
+```
+
+PR body template:
+```markdown
+## Summary
+{plan summary paragraph}
+
+## Changes
+{auto-generated from task list: bullet per task with files affected}
+
+## ADRs
+{list of ADRs created during this plan}
+
+## Test Plan
+{from plan test plan section}
+
+## Smoke Test Results
+{output from integration verification}
+```
+
+Merge using squash:
+```bash
+gh pr merge --squash --delete-branch
+```
+
+---
+
+## Step 7: Post-Merge Metrics
+
+Switch back to develop/main. Update `docs/plans/metrics/{name}.json` with execution data:
+- Task count and per-layer breakdown
+- TDD task count
+- Diff stats (files changed, lines added/removed)
+- Quality gate results (pass/fail, fix attempts)
+- Smoke test results
+- Drift score (0-1, how closely implementation matched plan)
+- PR data (number, merge commit, timestamp)
+
+Commit metrics update.
+
+---
+
+## Step 8: Worktree Cleanup
+
+```bash
+git worktree remove .worktrees/{plan-name}
+```
+
+---
+
+## Usage
+
+```
+/plan-execute
+```
+
+Picks up the most recent validated plan. Or specify:
+
+```
+/plan-execute plan-user-authentication
+```
+
+## Output
+
+```
+Setting up worktree: .worktrees/user-authentication
+Branch: feature/user-authentication
+
+TDD scaffolding: 2 tasks marked TDD
+  ✓ Written failing tests for: auth-token-validation
+  ✓ Written failing tests for: refresh-token-rotation
+  Committed: "test: failing tests for auth pipeline (TDD)"
+
+Executing Layer 1 (3 tasks, parallel)...
+  [agent-1] Implementing: JWT token generation service
+  [agent-2] Implementing: User session model
+  [agent-3] Implementing: Auth middleware
+  ✓ Layer 1 complete. 3 commits.
+
+Drift check: Layer 1... ✓ No drift detected.
+
+Executing Layer 2 (2 tasks, parallel)...
+  [agent-4] Implementing: Login endpoint
+  [agent-5] Implementing: Refresh endpoint
+  ✓ Layer 2 complete. 2 commits.
+
+Quality gate...
+  ✓ Lint passed
+  ✓ Type check passed
+  ✓ Tests: 47 passed, 0 failed
+
+Smoke test...
+  ✓ GraphQL introspection: OK
+  ✓ POST /api/auth/login: 200
+  ✓ POST /api/auth/refresh: 200
+
+Pre-PR docs...
+  ✓ PRD reconciled (1 minor deviation noted)
+  ✓ Plan archived to docs/plans/completed/
+
+PR created: #142 "user-authentication: JWT auth with refresh token rotation"
+PR merged (squash). Branch deleted.
+
+Metrics committed. Worktree cleaned.
+✅ Feature complete.
+```
+
+## When to Use
+
+After `/plan-validate` confirms all issues are resolved. Never skip validation — executing an unvalidated plan skips the independent review that catches ~18 issues on average.
+
+## See Also
+
+- [Plan-Validate-Execute Pipeline](../../guide/workflows/plan-pipeline.md)
+- [Git Worktree Command](./git-worktree.md)
+- [TDD with Claude](../guide/workflows/tdd-with-claude.md)
--- a/examples/commands/plan-start.md
+++ b/examples/commands/plan-start.md
@ -0,0 +1,180 @@
+---
+name: plan-start
+description: "5-phase planning command: PRD analysis, design review, technical decisions, dynamic research team, metrics. Produces a complete implementation plan + ADRs before any code is written."
+---
+
+# Plan Start — 5-Phase Planning
+
+Analyze the request and produce a complete implementation plan through structured phases. No code is written. Every significant decision is recorded. Run `/clear` after this command before running `/plan-validate`.
+
+---
+
+## Phase 1: PRD & Design Analysis
+
+### Step 1.1 — PRD Analysis
+
+*Skip if no PRD exists (refactor, infra change, bug fix).*
+
+Read all PRD files and `docs/INFORMATION_ARCHITECTURE.md` if present. Scan the codebase to understand current implementation status.
+
+Surface findings in 3 buckets:
+
+**Missing requirements** — acceptance criteria that are absent or incomplete
+**Ambiguous requirements** — items with multiple valid interpretations
+**Compliance concerns** — security, data privacy, API contract implications
+
+For each finding: present options with concrete pros/cons. Discuss with user. Record every decision in the plan file under a `## Decisions` section before moving on. Do not proceed past unresolved ambiguities.
+
+### Step 1.2 — Design Analysis
+
+*Skip if no UI changes are in scope.*
+
+Read: `DESIGN_SYSTEM.md`, existing UX ADRs, CLAUDE.md UX rules.
+
+Produce specs for:
+- **Screen inventory**: new/modified screens, route placement, component reuse audit
+- **State catalog**: empty, loading, populated, error, and partial states for every interactive element
+- **Interaction specs**: user flows (happy path + alternates), focus/keyboard behavior
+- **Animation specs**: map each interaction to existing keyframes or specify new ones, include `prefers-reduced-motion` fallbacks
+- **Responsive behavior**: breakpoints, web/mobile divergence decisions
+- **Accessibility**: WAI-ARIA pattern selection, live regions, error visibility
+
+Create Design ADRs for significant UX decisions (choice of interaction pattern, new animation convention, platform divergence). Record minor layout choices directly in the plan file.
+
+---
+
+## Phase 2: Technical Analysis
+
+Spawn 1-2 Explore agents for targeted codebase research. Run them in the background via Task tool.
+
+While agents run, check:
+- Existing ADRs in `docs/adr/` — if 3+ ADRs confirm a decision → auto-resolve without asking
+- PATTERNS.md — apply confirmed patterns directly
+
+When agents return: present architecture decisions with 2-3 options each, concrete pros/cons, and a recommendation. Ask for user input on each unresolved decision.
+
+For each significant decision:
+1. Create `docs/adr/ADR-XXXX.md` using standard Nygard format (Context / Decision / Status / Consequences)
+2. Update `docs/adr/PATTERNS.md` with the new observation
+
+---
+
+## Phase 3: Scope Assessment
+
+Apply trigger rules to determine which research agents are needed. Present the proposed team with justification for each inclusion.
+
+**Research agent pool:**
+
+| Agent | Trigger | Model |
+|-------|---------|-------|
+| `code-explorer` | Always | Sonnet |
+| `arch-researcher` | Changes touch 2+ architectural layers | Sonnet |
+| `database-analyst` | Any DB schema change | Sonnet |
+| `security-analyst` | Auth, payments, PII, RBAC, rate limiting | Opus |
+| `test-analyzer` | Non-trivial feature (not just a bug fix) | Sonnet |
+| `cross-platform-specialist` | Web + mobile parity required | Sonnet |
+| `native-app-specialist` | Tasks touch mobile/native UI package | Sonnet |
+| `design-system-researcher` | UI changes in scope | Sonnet |
+| `dependency-researcher` | New packages being added | Sonnet |
+| `devops-specialist` | Docker, env vars, CI/CD changes | Sonnet |
+| `integration-researcher` | New services, libraries, OTEL config | Opus |
+| `planning-coordinator` | Always, when 2+ agents selected | Opus |
+
+**Tier labels** (descriptive, not prescriptive):
+- Tier 0 (0 agents): Solo — inline research, no spawning
+- Tier 1 (1-3 agents): Focused
+- Tier 2 (4-6 agents): Standard
+- Tier 3 (7-9 agents): Comprehensive
+- Tier 4 (10+ agents): Full Spectrum
+
+Tell the user: "I recommend a **[Tier N - Label]** team: [agent list with one-line justification each]. Want to add or remove any agents?"
+
+Wait for approval before Phase 4.
+
+---
+
+## Phase 4: Research & Plan Creation
+
+**Tier 0**: Conduct inline research. Write plan directly without spawning agents.
+
+**Tier 1+**: Spawn approved agents in parallel using Task tool (run_in_background: true). For each agent, provide:
+- Its specific research scope
+- The relevant files/areas to investigate
+- The questions it needs to answer
+
+Monitor agents via TaskOutput polling loop. Report progress: "3/6 agents complete..."
+
+When all agents return: if `planning-coordinator` was spawned, send it all agent reports and have it synthesize the final plan. Otherwise, synthesize directly.
+
+**Plan file structure** (`docs/plans/plan-{name}.md`):
+
+```markdown
+# Plan: {feature-name}
+Created: {date} | Branch: {branch-name} | Tier: {N}
+
+## Summary
+One paragraph: what this implements and why.
+
+## Decisions
+Decisions recorded during Phase 1 (PRD analysis).
+
+## Architecture
+ADRs created, patterns applied, architectural choices made.
+
+## Tasks
+Ordered task list with layers (1 = foundation, 2 = depends on 1, etc.)
+
+### Layer 1
+- [ ] Task A — description, files affected, acceptance criteria
+- [ ] Task B — description, files affected, acceptance criteria
+
+### Layer 2
+- [ ] Task C — depends on A — description, files affected, acceptance criteria
+
+## Test Plan
+How each task will be verified. TDD tasks marked explicitly.
+
+## Integration Verification
+Smoke test commands to run post-execution (if backend/services in scope).
+
+## Out of Scope
+What this plan explicitly does not address.
+```
+
+Commit: plan file + ADR files + agent report manifests.
+
+---
+
+## Phase 5: Finalize Metrics
+
+Record timestamps, phase durations, agent counts, and cost estimates in `docs/plans/metrics/{name}.json`. Commit.
+
+---
+
+## Auto-Transition
+
+If Phase 1 produced no unresolved ambiguities and Phase 2 produced no unresolved decisions: auto-start `/plan-validate` without asking.
+
+If any human discussion occurred: ask "Ready to validate this plan?" before proceeding.
+
+---
+
+## Usage
+
+```
+/plan-start
+```
+
+Provide the feature description or point to a PRD file when prompted. The command handles the rest interactively.
+
+## When to Use
+
+Use for any non-trivial feature: anything touching more than 2 files, involving architecture decisions, or where a planning mistake would be expensive to undo.
+
+For simple changes (typos, trivial refactors): use `/plan` mode instead.
+
+## See Also
+
+- [Plan-Validate-Execute Pipeline](../../guide/workflows/plan-pipeline.md)
+- [Planning Coordinator Agent](../agents/planning-coordinator.md)
+- [ADR Writer Agent](../agents/adr-writer.md)
--- a/examples/commands/plan-validate.md
+++ b/examples/commands/plan-validate.md
@ -0,0 +1,187 @@
+---
+name: plan-validate
+description: "2-layer plan validation: instant structural checks + trigger-based specialist agents. Auto-fixes issues using ADRs and first principles. Every issue must be resolved before execution."
+---
+
+# Plan Validate — 2-Layer Validation
+
+Independently validate the plan produced by `/plan-start`. No code is written. Run `/clear` after this command before running `/plan-execute`.
+
+Validation is separate from planning by design: validators that didn't write the plan are not anchored to its assumptions.
+
+---
+
+## Prerequisite
+
+A committed plan file must exist at `docs/plans/plan-{name}.md`. If multiple plans exist, list them and ask the user which to validate.
+
+---
+
+## Layer 1: Structural Validation
+
+Run immediately, no agents required. Check the plan document for:
+
+**Format & Completeness**
+- [ ] All required sections present (Summary, Decisions, Architecture, Tasks, Test Plan, Out of Scope)
+- [ ] Each task has: description, files affected, acceptance criteria, layer assignment
+
+**Dependency Chain**
+- [ ] No circular dependencies between tasks
+- [ ] Tasks in higher layers only depend on tasks in lower layers
+- [ ] All stated dependencies exist in the plan
+
+**File Existence**
+- [ ] Every file listed for modification actually exists in the codebase (use Glob)
+- [ ] New files are in appropriate directories per project conventions
+
+**ADR Consistency**
+- [ ] Plan decisions align with ADRs created during `/plan-start`
+- [ ] No contradiction with existing ADRs in `docs/adr/`
+
+**CLAUDE.md Compliance**
+- [ ] Plan respects all hard rules in CLAUDE.md
+- [ ] No first principles violations (no workarounds, no backward-compat shims)
+
+**Test Coverage**
+- [ ] Every new function/component has a corresponding test task
+- [ ] TDD-marked tasks have failing test written before implementation task
+
+Record all Layer 1 issues with severity (BLOCKER / WARNING / INFO) before proceeding to Layer 2.
+
+---
+
+## Layer 2: Specialist Review
+
+Select agents by applying trigger rules to the plan content. No user input needed — triggers are objective.
+
+**Validation agent pool:**
+
+| Agent | Trigger | Model |
+|-------|---------|-------|
+| `security-reviewer` | Auth, payments, PII, RBAC, new public APIs | Opus |
+| `db-migration-reviewer` | New tables, columns, indexes, or migration files | Opus |
+| `performance-reviewer` | New queries, resolvers, routes, or added dependencies | Sonnet |
+| `design-system-reviewer` | New UI components or visual styling changes | Sonnet |
+| `ux-reviewer` | New pages, forms, modals, or interaction patterns | Sonnet |
+| `cross-platform-reviewer` | Changes touching both web and mobile, or shared packages | Sonnet |
+| `native-app-reviewer` | Mobile screens, native UI package changes | Sonnet |
+| `integration-reviewer` | New external services, libraries, or OTEL config | Opus |
+
+Spawn triggered agents in parallel (Task tool, run_in_background: true). Each agent receives: the plan file, relevant ADRs, and targeted questions based on its domain.
+
+Monitor via TaskOutput polling loop. Report progress to user.
+
+Each agent must return structured findings:
+```
+FINDING: [BLOCKER|WARNING|INFO]
+Location: [plan section or file reference]
+Issue: [concrete description]
+Risk: [what breaks if this isn't addressed]
+Suggestion: [specific fix or alternative]
+```
+
+---
+
+## Auto-Fix Phase
+
+Merge Layer 1 structural issues + Layer 2 specialist findings into a single issue list. Every issue must be resolved. No skipping.
+
+**Triage each issue:**
+
+**Bucket A — Auto-resolve:**
+- Issue matches an existing ADR decision → cite ADR, mark resolved
+- Issue matches a confirmed pattern in PATTERNS.md → cite pattern, mark resolved
+- Issue resolvable from first principles in CLAUDE.md → apply rule, mark resolved
+
+**Bucket B — Needs human input:**
+- Novel architectural question not covered by existing decisions
+- Conflicting ADRs with no clear precedent
+- Blocker with no obvious resolution
+
+For Bucket B items: present the issue, explain why it can't be auto-resolved, propose options, wait for decision. Record the decision in the plan's `## Decisions` section and create a new ADR if it's architecturally significant.
+
+**Apply all fixes in one batch** once all issues are triaged. Update the plan file. Commit the updated plan.
+
+---
+
+## Issue Persistence
+
+Record every issue in `docs/plans/metrics/{name}.json` under `validation.issues`:
+
+```json
+{
+  "id": "S-001",
+  "layer": 1,
+  "severity": "WARNING",
+  "category": "test-coverage",
+  "description": "No test task for the new webhook handler",
+  "reporting_agent": "structural",
+  "triage": "A",
+  "resolution_source": "first-principles",
+  "resolution": "Added test task in Layer 2 of the plan"
+}
+```
+
+This data feeds `/plan-metrics` for pattern analysis over time.
+
+---
+
+## Auto-Transition
+
+If all issues are auto-resolved (Bucket A only): auto-start `/plan-execute` without asking.
+
+If any human input was required (Bucket B): ask "All issues resolved. Ready to execute?" before proceeding.
+
+---
+
+## Usage
+
+```
+/plan-validate
+```
+
+Picks up the most recent uncommitted plan automatically. Or specify:
+
+```
+/plan-validate plan-user-authentication
+```
+
+## Output
+
+```
+Layer 1: Structural validation...
+  ✓ Format complete
+  ✓ Dependencies valid
+  ⚠ WARNING S-001: Missing test task for webhook handler
+  ✓ CLAUDE.md compliant
+
+Layer 2: Triggering specialist agents...
+  → security-reviewer (auth changes detected) [Opus]
+  → db-migration-reviewer (new users table) [Opus]
+  → performance-reviewer (new query in /api/users) [Sonnet]
+  Monitoring... 1/3 complete... 2/3 complete... done.
+
+  BLOCKER B-001 [security-reviewer]: JWT expiry not validated on refresh endpoint
+  WARNING B-002 [db-migration-reviewer]: Migration lacks rollback strategy
+
+Auto-fix phase:
+  S-001 → auto-resolved (first principles: test coverage rule)
+  B-001 → NEEDS INPUT (no existing ADR for JWT refresh strategy)
+  B-002 → auto-resolved (ADR-0003: migration rollback pattern)
+
+[User input requested for B-001]
+Decision recorded. ADR-0011 created.
+
+All 3 issues resolved. Plan updated.
+→ Auto-starting /plan-execute
+```
+
+## When to Use
+
+Always — before any `/plan-execute` call. The cost of validation ($0.20-3.00) is negligible against the cost of discovering issues mid-execution.
+
+## See Also
+
+- [Plan-Validate-Execute Pipeline](../../guide/workflows/plan-pipeline.md)
+- [Integration Reviewer Agent](../agents/integration-reviewer.md)
+- [Plan Challenger Agent](../agents/plan-challenger.md)
--- a/guide/cheatsheet.md
+++ b/guide/cheatsheet.md
@ -12,7 +12,7 @@ tags: [cheatsheet, reference]

 **Written with**: Claude (Anthropic)

-**Version**: 3.31.0 | **Last Updated**: February 2026
+**Version**: 3.32.0 | **Last Updated**: February 2026

 ---

@ -614,4 +614,4 @@ where.exe claude; claude doctor; claude mcp list

 **Author**: Florian BRUNIAUX | [@Méthode Aristote](https://methode-aristote.fr) | Written with Claude

-*Last updated: February 2026 | Version 3.31.0*
+*Last updated: February 2026 | Version 3.32.0*
--- a/guide/ultimate-guide.md
+++ b/guide/ultimate-guide.md
@ -16,7 +16,7 @@ tags: [guide, reference, workflows, agents, hooks, mcp, security]

 **Last updated**: January 2026

-**Version**: 3.31.0
+**Version**: 3.32.0

 ---

@ -4968,7 +4968,7 @@ The `.claude/` folder is your project's Claude Code directory for memory, settin
 | Personal preferences | `CLAUDE.md` | ❌ Gitignore |
 | Personal permissions | `settings.local.json` | ❌ Gitignore |

-### 3.31.0 Version Control & Backup
+### 3.32.0 Version Control & Backup

 **Problem**: Without version control, losing your Claude Code configuration means hours of manual reconfiguration across agents, skills, hooks, and MCP servers.

@ -22721,4 +22721,4 @@ We'll evaluate and add it to this section if it meets quality criteria.

 **Contributions**: Issues and PRs welcome.

-**Last updated**: January 2026 | **Version**: 3.31.0
+**Last updated**: January 2026 | **Version**: 3.32.0
--- a/guide/workflows/plan-pipeline.md
+++ b/guide/workflows/plan-pipeline.md
@ -0,0 +1,371 @@
+---
+title: "Plan-Validate-Execute Pipeline"
+description: "Production-grade 3-command workflow with dynamic agent teams, ADR learning loop, and automated execution from PRD to merged PR"
+tags: [workflow, agents, architecture, advanced]
+---
+
+# Plan-Validate-Execute Pipeline
+
+> **Confidence**: Tier 2 — Battle-tested by production teams shipping AI-first products at scale. Extends native `/plan` mode with structured agent orchestration and institutional memory.
+
+A complete development workflow in 3 commands: plan with a dynamic research team, validate with independent specialist reviewers, execute with parallel agents. Each run improves the next through an ADR learning loop that progressively reduces human interruptions.
+
+**Reading time**: ~25 min
+**Prerequisites**: Sub-agents, Task tool, worktrees, basic ADR concepts
+**Related**: [Plan-Driven Development](./plan-driven.md), [Agent Teams](./agent-teams.md), [Spec-First](./spec-first.md)
+
+---
+
+## Table of Contents
+
+1. [TL;DR](#tldr)
+2. [Philosophy](#philosophy)
+3. [The Three Commands](#the-three-commands)
+4. [Dynamic Agent Pool](#dynamic-agent-pool)
+5. [ADR Learning Loop](#adr-learning-loop)
+6. [CLAUDE.md Discipline](#claudemd-discipline)
+7. [Context Management](#context-management)
+8. [When to Use](#when-to-use)
+9. [Cost Profile](#cost-profile)
+10. [See Also](#see-also)
+
+---
+
+## TL;DR
+
+```
+/plan-start    → 5-phase planning: PRD analysis + dynamic research team + ADRs
+/plan-validate → 2-layer review: structural checks + trigger-based specialist agents
+/plan-execute  → worktree + TDD + parallel execution + PR + merge + cleanup
+```
+
+**What makes this different from `/plan` mode**:
+- Research is done by specialized agents in parallel, not one agent sequentially
+- Validation is independent from planning (no confirmation bias)
+- Every significant decision generates an ADR that auto-resolves future decisions
+- Execution spawns per-task agents in a git worktree, commits per task, handles everything through to merged PR
+
+**Run `/clear` between each command** to reset context and avoid compacting overhead.
+
+---
+
+## Philosophy
+
+### Non-Prescriptive AI-First
+
+Tell Claude **what** to achieve, never **how** to implement it. The moment you prescribe implementation details, you're using your knowledge as a ceiling instead of Claude's as a floor.
+
+A good opening prompt for a new project:
+```
+How should I use you most effectively to build this platform?
+```
+
+Let Claude propose the architecture. Your job is to validate decisions, not dictate them.
+
+### No Bandaids, No Workarounds
+
+Hard rule for every agent in the pipeline:
+
+> We build state-of-the-art software. Always choose the best-in-class architecture, the most robust pattern, and the industry-standard approach.
+
+Build time and effort are irrelevant to architectural decisions. Never factor implementation complexity into option assessments. The right solution is always the best solution.
+
+**Enforcement checklist** (apply before implementing):
+- [ ] Am I using a backward-compatibility flag, shim, or legacy mode?
+- [ ] Would a new project following current official docs do it this way?
+- [ ] Am I porting old patterns instead of learning the new ones?
+- [ ] Am I patching one component when the fix belongs at the system level?
+
+If any answer is yes: stop, fix at the correct level.
+
+### Why Independent Validation?
+
+Validators that didn't write the plan are not anchored to its assumptions. Research shows multi-agent review with adversarial framing catches significantly more issues than self-review. The average plan produces ~18 issues when challenged by an independent team — ~95% auto-resolve from existing ADRs and first principles.
+
+---
+
+## The Three Commands
+
+### `/plan-start` — 5-Phase Planning
+
+**Phase 1: PRD & Design Analysis** *(interactive, no agents)*
+
+Read the PRD and surface issues in 3 buckets before any agent work:
+- Missing requirements (unclear acceptance criteria, unspecified edge cases)
+- Ambiguous requirements (multiple valid interpretations)
+- Compliance concerns (security, data privacy, API contracts)
+
+Present options with pros/cons, record decisions. Skip for non-PRD work (refactors, infra, bug fixes).
+
+If UI changes are in scope, extend to design analysis: screen inventory, state catalog (empty/loading/populated/error), interaction specs, animation patterns, accessibility (ARIA), design token updates.
+
+**Phase 2: Technical Analysis** *(1-2 Explore agents + interactive)*
+
+Check existing ADRs and PATTERNS.md first. If 3+ confirmed ADRs match the decision → auto-resolve without asking. Otherwise:
+- Spawn Explore agents for targeted codebase research
+- Present architecture decisions with options and recommendations
+- Create ADR documents for significant decisions (see [ADR Learning Loop](#adr-learning-loop))
+- Update PATTERNS.md
+
+**Phase 3: Scope Assessment** *(automatic + user approval)*
+
+Apply trigger rules against the agent pool (see [Dynamic Agent Pool](#dynamic-agent-pool)). Present the proposed team with justification for each agent. User can add or remove agents before research starts.
+
+| Tier | Agent Count | Label |
+|------|-------------|-------|
+| 0 | 0 | Solo (inline research) |
+| 1 | 1-3 | Focused |
+| 2 | 4-6 | Standard |
+| 3 | 7-9 | Comprehensive |
+| 4 | 10+ | Full Spectrum |
+
+**Phase 4: Research & Plan Creation** *(dynamic team)*
+
+- Tier 0: inline research, no agents
+- Tier 1+: spawn approved agents in parallel (background), lead monitors via TaskOutput loop
+- `planning-coordinator` (Opus) synthesizes all reports into final plan
+- Commit plan file, ADRs, and creation artifacts
+
+Output: `docs/plans/plan-{name}.md` + `docs/adr/ADR-XXXX.md` + `docs/plans/metrics/{name}.json`
+
+**Auto-transition**: no unresolved ambiguity → auto-start `/plan-validate`
+
+---
+
+### `/plan-validate` — 2-Layer Validation
+
+**Layer 1: Structural** *(inline, instant)*
+
+Mechanical checks that don't need agents:
+- Plan format and completeness (all required sections present)
+- Task ordering and dependency chain (no circular deps)
+- File existence checks (files listed for modification actually exist)
+- ADR consistency (plan aligns with its ADRs)
+- CLAUDE.md rule compliance
+
+**Layer 2: Specialist Review** *(trigger-based, 0-8 agents)*
+
+| Agent | Trigger | Model |
+|-------|---------|-------|
+| `security-reviewer` | Auth, payments, PII, RBAC, new APIs | Opus |
+| `db-migration-reviewer` | New tables, columns, indexes, migrations | Opus |
+| `performance-reviewer` | New resolvers, queries, routes, new deps | Sonnet |
+| `design-system-reviewer` | New UI components, visual styling | Sonnet |
+| `ux-reviewer` | New pages, forms, interactions | Sonnet |
+| `cross-platform-reviewer` | Web + mobile, or shared packages | Sonnet |
+| `native-app-reviewer` | Mobile screens, native UI packages | Sonnet |
+| `integration-reviewer` | New services, libraries, OTEL config | Opus |
+
+No agents selected automatically for trivial plans. A payments feature might trigger 4+.
+
+**Auto-Fix Phase**
+
+Every issue must be resolved — no skipping. Triage:
+1. Issues matching existing ADR decisions → auto-resolve
+2. Issues matching confirmed PATTERNS.md entries → auto-resolve
+3. Issues resolvable from first principles → auto-resolve
+4. Residual → human decision → new rule → auto-resolved next time
+
+**Auto-transition**: all issues resolved → auto-start `/plan-execute`
+
+---
+
+### `/plan-execute` — Execution to Merged PR
+
+Single command handles everything:
+
+1. **Worktree creation** — isolated branch from current branch
+2. **TDD scaffolding** — write failing tests first for TDD-marked tasks
+3. **Level-based parallel execution** — detect independent tasks, spawn per-task agents, commit per task
+4. **Drift detection** — flag if implementation diverges from plan
+5. **Quality gate** — parallel tests + integration smoke test (GraphQL probe, container log scan, plan-defined smoke commands)
+6. **Pre-PR docs update** — PRD reconciliation + plan archival (in worktree)
+7. **PR creation and merge** — squash merge, clean commit message
+8. **Post-merge metrics** — execution data committed to metrics file
+9. **Worktree cleanup** — remove branch and worktree
+
+If quality gate fails: up to 3 auto-fix attempts by dedicated debug agents. Still failing → notify human.
+
+---
+
+## Dynamic Agent Pool
+
+Agents are **not** hardcoded in CLAUDE.md. They are defined at invocation time — description, trigger criteria, and model selection embedded in the plan phase where they're spawned. This keeps CLAUDE.md lightweight while giving each agent full context for its role.
+
+### Research Pool (`/plan-start`)
+
+| Agent | Trigger | Model |
+|-------|---------|-------|
+| `code-explorer` | Always | Sonnet |
+| `arch-researcher` | Multi-layer changes (2+ layers) | Sonnet |
+| `database-analyst` | Any DB schema changes | Sonnet |
+| `security-analyst` | Auth, payments, PII, RBAC | Opus |
+| `test-analyzer` | Non-trivial feature | Sonnet |
+| `cross-platform-specialist` | Mobile parity needed | Sonnet |
+| `native-app-specialist` | Tasks touch mobile/UI | Sonnet |
+| `design-system-researcher` | UI changes in scope | Sonnet |
+| `dependency-researcher` | New packages being added | Sonnet |
+| `devops-specialist` | Docker, env vars, CI/CD | Sonnet |
+| `integration-researcher` | New services, libraries, OTEL | Opus |
+| `planning-coordinator` | Always (when 2+ agents) | Opus |
+
+**Key design choices**:
+- Opus only for high-stakes roles (security, integration, coordination)
+- Sonnet for standard research (good quality, lower cost)
+- `planning-coordinator` only spawned when 2+ agents are selected — it synthesizes, it doesn't research
+
+### Validation Pool (`/plan-validate`)
+
+See Layer 2 table above. These are different agents from the research pool — validators are not biased by the creation process.
+
+---
+
+## ADR Learning Loop
+
+Every significant architectural decision generates an ADR. Over time, ADRs compound into institutional memory that reduces human interruptions.
+
+### What Triggers an ADR
+
+**Always create an ADR for:**
+- Choice between multiple valid interaction patterns (overlay vs page, drawer vs modal)
+- New animation or transition patterns not in existing conventions
+- Platform divergence decisions (web vs mobile behavior)
+- Loading state strategy that introduces a new pattern
+- Auth strategy, DB schema approach, service boundary decisions
+- New dependency selections with architectural implications
+
+**Do NOT create an ADR for:**
+- Decisions dictated by an approved source or existing convention
+- Minor layout choices within established patterns
+- Obvious state catalog entries (standard empty/loading/error states)
+
+### Maturity Levels
+
+```
+1 ADR  → Watching   — tracked, not yet prescriptive
+2 ADRs → Emerging   — presented as recommended default with precedent context
+3+ ADRs → Confirmed — auto-resolved during planning (no human input needed)
+          → Candidate for promotion to CLAUDE.md as hard rule
+```
+
+### The Loop
+
+```
+/plan-start Phase 2     →  ADR created
+                               ↓
+                        PATTERNS.md updated
+                               ↓
+/adr-review (periodic)  →  Detect patterns across ADRs
+                               ↓
+                        Propose CLAUDE.md promotions
+                               ↓
+Future /plan-start      ←  Confirmed rules auto-resolve decisions
+```
+
+Run `/adr-review` every 10-15 plans to batch-analyze patterns and propose CLAUDE.md additions.
+
+**The compounding effect**: a project with 20 plans auto-resolves ~80% of architecture decisions. Human input focuses on genuinely novel decisions only.
+
+---
+
+## CLAUDE.md Discipline
+
+### Hard Limit: 120 Lines
+
+Every line in CLAUDE.md costs context on every request. A 300-line CLAUDE.md is overhead that runs before every single prompt. Set and enforce a hard limit.
+
+**What earns a line in CLAUDE.md:**
+- First principles (hard rules that override agent preferences)
+- Confirmed ADR patterns (3+ occurrences)
+- Project-specific conventions that agents cannot infer from the codebase
+- Pointers to sub-files
+
+**What should NOT be in CLAUDE.md:**
+- Full content of design systems, env configs, architecture docs
+- Rules that apply to <10% of tasks
+- Explanations and rationale (write those in ADRs)
+
+### Pointer Strategy
+
+Instead of loading all context into CLAUDE.md, use pointers:
+
+```markdown
+## Context Files (load only when relevant)
+- @docs/DESIGN_SYSTEM.md    — when UI changes are in scope
+- @docs/ARCHITECTURE.md     — when service boundaries are touched
+- @docs/ENV_CONFIG.md       — when Docker or env vars are modified
+- @docs/ADR_PATTERNS.md     — during planning phases
+```
+
+Agents load only what their task requires. A backend task never loads the design system. Context stays clean.
+
+### Regular Trimming
+
+Review CLAUDE.md every 10-15 plans alongside `/adr-review`. Promote confirmed patterns, remove rules that have become obvious through codebase conventions, trim anything that hasn't been referenced.
+
+---
+
+## Context Management
+
+### `/clear` Between Steps
+
+Run `/clear` between `/plan-start`, `/plan-validate`, and `/plan-execute`. Each command is self-contained — the plan file on disk is the handoff artifact, not in-memory context.
+
+Without `/clear`: context accumulates across all phases, compacting triggers earlier, agents inherit irrelevant context from previous phases, and token costs increase significantly.
+
+### Why This Works
+
+Each command reads its inputs from disk (plan files, ADRs, codebase). There's no state that needs to live in the context window between steps. The discipline of clearing between steps is what makes the pipeline scale to large projects without hitting context limits mid-execution.
+
+---
+
+## When to Use
+
+### ✅ Use This Pipeline When
+
+- Feature requires multiple files and layers (API + DB + UI)
+- Security-sensitive changes (auth, payments, PII)
+- Complex DB migrations
+- New external service integrations
+- Anything where a planning mistake would be expensive to undo
+- Team projects where decision history matters
+
+### ❌ Don't Use When
+
+- Typo fix, trivial refactor (use standard `/plan` mode)
+- Exploratory prototyping where requirements are unknown
+- Hotfix under time pressure (use dual-instance planning instead)
+- Changes touching ≤2 files with no architectural decisions
+
+### ⚡ Tier 0 Shortcut
+
+For small but non-trivial changes, still run the pipeline but the system will detect Tier 0 scope and skip agent spawning — research happens inline, validation is Layer 1 only, execution is single-agent. Same commands, lower overhead.
+
+---
+
+## Cost Profile
+
+| Phase | Cost Driver | Approximate Range |
+|-------|-------------|-------------------|
+| `/plan-start` Tier 0 | Inline research only | $0.10-0.30 |
+| `/plan-start` Tier 1-2 | 2-6 Sonnet agents | $0.50-2.00 |
+| `/plan-start` Tier 3 | 7+ agents + Opus coordinator | $2.00-8.00 |
+| `/plan-validate` | 0-8 agents | $0.20-3.00 |
+| `/plan-execute` | Per-task agents + quality gate | $0.50-5.00 |
+| **Typical feature (Tier 2)** | Full pipeline | **$2-10** |
+
+Cost compounds as ADR coverage grows: fewer agents needed, fewer validation issues, faster execution.
+
+Use `/plan-metrics` periodically to review historical cost trends and calibrate estimates.
+
+---
+
+## See Also
+
+- [Plan-Driven Development](./plan-driven.md) — native `/plan` mode, lighter alternative
+- [Dual-Instance Planning](./dual-instance-planning.md) — simpler 2-instance pattern
+- [Agent Teams](./agent-teams.md) — native parallel coordination (experimental)
+- [Task Management](./task-management.md) — Tasks API for cross-session coordination
+- [Spec-First Development](./spec-first.md) — CLAUDE.md as specification contract
+- [ADR Writer Agent](../../examples/agents/adr-writer.md) — standalone ADR generation
+- [Plan Challenger Agent](../../examples/agents/plan-challenger.md) — adversarial plan review
--- a/machine-readable/reference.yaml
+++ b/machine-readable/reference.yaml
@ -3,7 +3,7 @@
 # Source: guide/ultimate-guide.md
 # Purpose: Condensed index for LLMs to quickly answer user questions about Claude Code

-version: "3.31.0"
+version: "3.32.0"
 updated: "2026-03-03"

 # ════════════════════════════════════════════════════════════════
@ -685,6 +685,28 @@ deep_dive:
  dual_instance_pattern: "Vertical separation (planner vs implementer) - orthogonal to Boris horizontal scaling"
  dual_instance_cost: "$100-200/month (vs $500-1K Boris pattern)"
  dual_instance_audience: "Solo devs, spec-heavy work, quality > speed"
+  # Plan-Validate-Execute Pipeline (Mar 2026)
+  plan_pipeline_workflow: "guide/workflows/plan-pipeline.md"
+  plan_pipeline_philosophy: "non-prescriptive AI-first + No Bandaids first principles + ADR learning loop"
+  plan_pipeline_commands: "/plan-start → /plan-validate → /plan-execute"
+  plan_pipeline_agent_pool: "12 research agents (trigger-based) + 8 validation agents (trigger-based)"
+  plan_pipeline_adr_loop: "Watching (1) → Emerging (2) → Confirmed (3+) → CLAUDE.md promotion"
+  plan_pipeline_claude_md_limit: "120 lines hard limit + pointer strategy for sub-files"
+  plan_pipeline_context_reset: "/clear between each command (plan, validate, execute)"
+  plan_pipeline_cost: "$2-10 typical Tier 2 feature, compounds down as ADR coverage grows"
+  plan_start_command: "examples/commands/plan-start.md"
+  plan_start_phases: "5 phases: PRD analysis, design analysis, technical decisions, dynamic team, metrics"
+  plan_start_tiers: "Tier 0 Solo (0 agents) → Tier 4 Full Spectrum (10+ agents)"
+  plan_validate_command: "examples/commands/plan-validate.md"
+  plan_validate_layer1: "structural checks inline: format, deps, file existence, ADR consistency, CLAUDE.md"
+  plan_validate_layer2: "0-8 specialist agents triggered by plan content (security Opus, db Opus, integration Opus)"
+  plan_validate_autofix: "Bucket A auto-resolve (ADR/PATTERNS/first-principles ~95%) + Bucket B human input"
+  plan_execute_command: "examples/commands/plan-execute.md"
+  plan_execute_flow: "worktree → TDD scaffold → parallel level execution → drift detection → quality gate → smoke test → PR merge → cleanup"
+  planning_coordinator_agent: "examples/agents/planning-coordinator.md"
+  planning_coordinator_role: "Opus synthesis agent: merges multi-agent reports into coherent task graph, resolves conflicts via ADR precedence"
+  integration_reviewer_agent: "examples/agents/integration-reviewer.md"
+  integration_reviewer_role: "Opus runtime validator: connection params, async/sync, env vars, library API correctness, OTEL pipeline"
  # Boris Tane Pattern (Annotation Cycle, Feb 2026)
  annotation_cycle_pattern: "guide/workflows/plan-driven.md#the-annotation-cycle"
  custom_markdown_plans: "guide/workflows/plan-driven.md#why-custom-plans-over-plan"
@ -1462,7 +1484,7 @@ ecosystem:
      - "Cross-links modified → Update all 4 repos"
    history:
      - date: "2026-01-20"
-        event: "Code Landing sync v3.31.0, 66 templates, cross-links"
+        event: "Code Landing sync v3.32.0, 66 templates, cross-links"
        commit: "5b5ce62"
      - date: "2026-01-20"
        event: "Cowork Landing fix (paths, README, UI badges)"
@ -1474,7 +1496,7 @@ ecosystem:
 onboarding_matrix_meta:
  version: "2.0.0"
  last_updated: "2026-02-05"
-  aligned_with_guide: "3.31.0"
+  aligned_with_guide: "3.32.0"
  changelog:
    - version: "2.0.0"
      date: "2026-02-05"
@ -1502,7 +1524,7 @@ onboarding_matrix:
      core: [rules, sandbox_native_guide, commands]
      time_budget: "5 min"
      topics_max: 3
-      note: "SECURITY FIRST - sandbox before commands (v3.31.0 critical fix)"
+      note: "SECURITY FIRST - sandbox before commands (v3.32.0 critical fix)"

    beginner_15min:
      core: [rules, sandbox_native_guide, workflow, essential_commands]
@ -1587,7 +1609,7 @@ onboarding_matrix:
        - default: agent_validation_checklist
      time_budget: "60 min"
      topics_max: 6
-      note: "Dual-instance pattern for quality workflows (v3.31.0)"
+      note: "Dual-instance pattern for quality workflows (v3.32.0)"

  learn_security:
    intermediate_30min:
@ -1598,7 +1620,7 @@ onboarding_matrix:
        - default: permission_modes
      time_budget: "30 min"
      topics_max: 4
-      note: "NEW goal (v3.31.0) - Security-focused learning path"
+      note: "NEW goal (v3.32.0) - Security-focused learning path"

    power_60min:
      core: [sandbox_native_guide, mcp_secrets_management, security_hardening]
@ -1623,7 +1645,7 @@ onboarding_matrix:
      core: [rules, sandbox_native_guide, workflow, essential_commands, context_management, plan_mode]
      time_budget: "60 min"
      topics_max: 6
-      note: "Security foundation + core workflow (v3.31.0 sandbox added)"
+      note: "Security foundation + core workflow (v3.32.0 sandbox added)"

    intermediate_120min:
      core: [plan_mode, agents, skills, config_hierarchy, git_mcp_guide, hooks, mcp_servers]
 @ -1 +1 @@
 .31.0
 .32.0