claude-code-ultimate-guide/guide/methodologies.md
Florian BRUNIAUX c5fad9f092 docs: add Context Engineering (Thoughtworks) + corporate marketplaces footnotes
- Add Context Engineering framework reference (Thoughtworks Tech Radar Vol 33)
- Add emerging corporate AI marketplaces concept (Hugo 2026)
- Document evaluation in docs/resource-evaluations/hugo-ai-impact-2026.md
- Score: 2/5 (marginal) - minimal integration via footnotes only

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-06 16:09:02 +01:00


Development Methodologies Reference

Confidence: Tier 2 — Validated by multiple production reports and official documentation.

Last updated: January 2026

This is a quick reference for 15 structured development methodologies that have emerged for AI-assisted development in 2025-2026. For hands-on practical workflows, see workflows/.


Table of Contents

  1. Decision Tree
  2. The 15 Methodologies
  3. SDD Tools Reference
  4. Writing Effective Specs
  5. Combination Patterns
  6. Sources

Decision Tree: What Do You Need?

┌─ "I want quality code" ────────────→ workflows/tdd-with-claude.md
│
├─ "I want to spec before code" ─────→ workflows/spec-first.md
│
├─ "I need to plan architecture" ────→ workflows/plan-driven.md
│
├─ "I'm iterating on something" ─────→ workflows/iterative-refinement.md
│
└─ "I need methodology theory" ──────→ Continue reading below

The 15 Methodologies

Organized in a 6-tier pyramid from strategic orchestration down to optimization techniques.

Tier 1: Strategic Orchestration

| Name | What | Best For | Claude Fit |
|------|------|----------|------------|
| BMAD | Multi-agent governance with constitution as guardrail | Enterprise 10+ teams, long-term projects | Niche but powerful |
| GSD | Meta-prompting 6-phase workflow with fresh contexts per task | Solo devs, Claude Code CLI | Similar to patterns in guide |

BMAD (Breakthrough Method for Agile AI-Driven Development) inverts the traditional paradigm: documentation becomes the source of truth, not code. Uses specialized agents (Analyst, PM, Architect, Developer, QA) orchestrated with strict governance.

  • Key concept: Constitution.md as strategic guardrail
  • When to use: Complex enterprise projects needing governance
  • When to avoid: Small teams, MVPs, rapid prototyping

GSD (Get Shit Done) addresses context rot through a systematic 6-phase workflow (Initialize → Discuss → Plan → Execute → Verify → Complete) with a fresh 200k-token context per task. Its core concepts (multi-agent orchestration, fresh context management) overlap significantly with existing patterns such as Ralph Loop, Gas Town, and BMAD. See the resource evaluation for a detailed comparison.

Emerging: Ralph Inferno implements autonomous multi-persona workflows (Analyst→PM→UX→Architect→Business) with VM-based execution and self-correcting E2E loops. Experimental but interesting for "vibe coding at scale".


Foundational Discipline: Plan-First Workflow

"Once the plan is good, the code is good." — Boris Cherny, creator of Claude Code

Not just a feature (/plan command) — a systematic discipline.

Context Engineering: Thoughtworks calls this broader approach "Context Engineering" in its Technology Radar (Nov 2025) [1]: the systematic design of the information provided to LLMs during inference. It names three core techniques: context setup (minimal system prompts, few-shot examples), context management for long-horizon tasks (summarization, external memories, sub-agent architectures), and dynamic information retrieval (JIT context loading). Related patterns in Claude Code: AGENTS.md, MCP Context7, Plan Mode.

The Mental Model:

Planning isn't optional for complex tasks. It's the difference between:

  • 8 iterations of "try → fix → retry → fix again"
  • 1 iteration of "plan → validate → execute cleanly"

When to plan first:

| Task Complexity | Plan First? | Why |
|-----------------|-------------|-----|
| >3 files modified | Yes | Cross-file dependencies need architecture |
| >50 lines changed | Yes | Enough complexity for mistakes |
| Architectural changes | Yes | Impact analysis required |
| Unfamiliar codebase | Yes | Need exploration before action |
| Typo/obvious fix | No | Planning overhead > task time |
| Single-line change | No | Just do it |
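
The triggers in the table above can be encoded as a small check, for example in a pre-task script. This is a minimal sketch: the `TaskProfile` fields and the `should_plan_first` name are illustrative assumptions, and only the thresholds come from the table.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    """Rough description of a pending change (field names are hypothetical)."""
    files_modified: int = 1
    lines_changed: int = 1
    architectural_change: bool = False
    unfamiliar_codebase: bool = False


def should_plan_first(task: TaskProfile) -> bool:
    """Return True when any plan-first trigger from the table applies."""
    return (
        task.files_modified > 3          # cross-file dependencies
        or task.lines_changed > 50       # enough complexity for mistakes
        or task.architectural_change     # impact analysis required
        or task.unfamiliar_codebase      # exploration needed before action
    )


typo_fix = TaskProfile(files_modified=1, lines_changed=1)
migration = TaskProfile(files_modified=6, lines_changed=240, architectural_change=True)
print(should_plan_first(typo_fix))   # False
print(should_plan_first(migration))  # True
```

Any single trigger is enough to recommend planning, which matches the table's row-by-row reading.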

How plan-first works:

  1. Exploration phase (/plan mode):

    • Claude reads files, explores architecture
    • No edits allowed → forces thinking before action
    • Proposes approach with trade-offs
  2. Validation phase (you review):

    • Plan exposes assumptions and gaps
    • Easier to correct direction now than after 100 lines are written
    • Plan becomes contract for execution
  3. Execution phase (/execute):

    • Plan → code becomes mechanical translation
    • Fewer surprises, cleaner implementation
    • Faster overall despite "slower" start

Boris Cherny workflow:

"I run many sessions, start in plan mode, then switch into execution once the plan looks right. The signature upgrade is verification—giving Claude a way to test and confirm its own output."

Benefits over "just start coding":

  • Fewer correction iterations: Plan catches issues before they become code
  • Better architecture: Forced to think about structure first
  • Clearer communication: Plan is shared understanding with team/Claude
  • Reduced cost: One clean iteration < multiple messy iterations (even if plan phase costs tokens)

Integration with CLAUDE.md:

Document your team's plan-first triggers:

## Planning Policy
- ALWAYS plan first: API changes, database migrations, new features
- OPTIONAL planning: Bug fixes <10 lines, test additions
- NEVER skip: Changes affecting >2 modules

See also: Plan Mode documentation for /plan command usage.


Tier 2: Specification & Architecture

| Name | What | Best For | Claude Fit |
|------|------|----------|------------|
| SDD | Specs before code | APIs, contracts | Core pattern |
| Doc-Driven | Docs = source of truth | Cross-team alignment | CLAUDE.md native |
| Req-Driven | Rich artifact context (20+ artifacts) | Complex requirements | Heavy setup |
| DDD | Domain language first | Business logic | Design-time |

SDD (Spec-Driven Development) — Specifications BEFORE code. One well-structured iteration equals 8 unstructured ones. CLAUDE.md IS your spec file.

Doc-Driven Development — Living documentation versioned in git becomes the single source of truth. Changes to specs trigger implementation.

Requirements-Driven Development — Uses CLAUDE.md as comprehensive implementation guide with 20+ structured artifacts.

DDD (Domain-Driven Design) — Aligns software with business language through:

  • Ubiquitous Language: Shared vocabulary in code
  • Bounded Contexts: Isolated domain boundaries
  • Domain Distillation: Core vs Support vs Generic domains

Tier 3: Behavior & Acceptance

| Name | What | Best For | Claude Fit |
|------|------|----------|------------|
| BDD | Given-When-Then scenarios | Stakeholder collaboration | Tests & specs |
| ATDD | Acceptance criteria first | Compliance, regulated | Process-heavy |
| CDD | API contracts as interface | Microservices | OpenAPI native |

BDD (Behavior-Driven Development) — Beyond testing: a collaboration process.

  1. Discovery: Involve devs and business experts
  2. Formulation: Write Given-When-Then examples
  3. Automation: Convert to executable tests (Gherkin/Cucumber)

Feature: Order Management
  Scenario: Cannot buy without stock
    Given product with 0 stock
    When customer attempts purchase
    Then system refuses with error message
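
To show how the scenario above might map onto an executable test, here is a minimal sketch in plain Python, without a Gherkin runner. The `Shop` class, `OutOfStockError`, and the product name are hypothetical stand-ins for the real system under test.

```python
class OutOfStockError(Exception):
    """Raised when a purchase is attempted on a product with no stock."""


class Shop:
    """Tiny in-memory stand-in for the order system (hypothetical)."""

    def __init__(self, stock):
        self.stock = dict(stock)

    def purchase(self, product):
        if self.stock.get(product, 0) <= 0:
            raise OutOfStockError(f"{product} is out of stock")
        self.stock[product] -= 1


def test_cannot_buy_without_stock():
    # Given product with 0 stock
    shop = Shop({"widget": 0})

    # When customer attempts purchase
    error = None
    try:
        shop.purchase("widget")
    except OutOfStockError as exc:
        error = exc

    # Then system refuses with error message
    assert error is not None
    assert "out of stock" in str(error)
    assert shop.stock["widget"] == 0


test_cannot_buy_without_stock()
```

Keeping the Given/When/Then lines as comments preserves the link between the business-readable scenario and the test body.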

ATDD (Acceptance Test-Driven Development) — Acceptance criteria defined BEFORE coding, collaboratively ("Three Amigos": Business, Dev, Test).

CDD (Contract-Driven Development) — API contracts (OpenAPI specs) as executable interface between teams. Patterns: Contract as Test, Contract as Stub.


Tier 4: Feature Delivery

| Name | What | Best For | Claude Fit |
|------|------|----------|------------|
| FDD | Feature-by-feature delivery | Large teams 10+ | Structure |
| Context Eng. | Context as first-class design | Long sessions | Fundamental |

FDD (Feature-Driven Development) — Five processes:

  1. Develop Overall Model
  2. Build Features List
  3. Plan by Feature
  4. Design by Feature
  5. Build by Feature

Strict iteration: 2 weeks max per feature.

Context Engineering — Treat context as design element:

  • Progressive Disclosure: Let agent discover incrementally
  • Memory Management: Conversation vs persistent memory
  • Dynamic Refresh: Rewrite TODO list before response

Tier 5: Implementation

| Name | What | Best For | Claude Fit |
|------|------|----------|------------|
| TDD | Red-Green-Refactor | Quality code | Core workflow |
| Eval-Driven | Evals for LLM outputs | AI products | Agents |
| Multi-Agent | Orchestrate sub-agents | Complex tasks | Task tool |

TDD (Test-Driven Development) — The classic cycle:

  1. Red: Write failing test
  2. Green: Minimal code to pass
  3. Refactor: Clean up, tests stay green

With Claude: Be explicit. "Write FAILING tests that don't exist yet."
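
The three steps can be illustrated with one test run against three successive implementations. This is a sketch only: `slugify` is a made-up example function, and the `run` helper simply reports whether the test currently passes.

```python
def check_slugify(slugify):
    """The test is written first and stays unchanged through all three steps."""
    assert slugify("Hello World") == "hello-world"
    assert slugify("  Trim me ") == "trim-me"


def run(test, implementation) -> str:
    """Report 'green' if the test passes against the implementation, else 'red'."""
    try:
        test(implementation)
        return "green"
    except (AssertionError, NotImplementedError):
        return "red"


def slugify_stub(text):
    # Red: the function does not exist yet, so the test must fail.
    raise NotImplementedError


def slugify_minimal(text):
    # Green: the smallest code that makes the test pass.
    return "-".join(text.lower().split())


def slugify_refactored(text):
    # Refactor: same behavior, clearer steps; the test must stay green.
    words = text.strip().lower().split()
    return "-".join(words)


print(run(check_slugify, slugify_stub))        # red
print(run(check_slugify, slugify_minimal))     # green
print(run(check_slugify, slugify_refactored))  # green
```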

Verification Loops — A formalized pattern for autonomous iteration (broader than TDD):

Core principle: Give Claude a mechanism to verify its own output.

Code generated → Verification tool → Feedback loop → Improvement

Why it works (Boris Cherny): "An agent that can 'see' what it has done produces better results."

Verification mechanisms by domain:

| Domain | Verification Tool | What Claude "Sees" |
|--------|-------------------|--------------------|
| Frontend | Browser preview (live reload) | Visual rendering, layout, interactions |
| Backend | Tests (unit/integration) | Pass/fail status, error messages |
| Types | TypeScript compiler | Type errors, incompatibilities |
| Style | Linters (ESLint, Prettier) | Style violations, formatting issues |
| Performance | Profilers, benchmarks | Execution time, memory usage |
| Accessibility | axe-core, screen readers | WCAG violations, navigation issues |
| Security | Static analyzers (Semgrep) | Vulnerability patterns |
| UX | User testing, recordings | Usability problems, confusion points |

TDD as canonical example:

  1. Claude writes tests for the feature
  2. Claude iterates code until tests pass
  3. Continue until explicit completion criteria met

Official guidance: "Tell Claude to keep going until all tests pass. It will usually take a few iterations." (Source: Anthropic Best Practices)

Implementation patterns:

  • Hooks: PostToolUse hook runs verification after each edit
  • Browser extension: Claude in Chrome sees rendered output
  • Test watchers: Jest/Vitest watch mode provides instant feedback
  • CI/CD gates: GitHub Actions runs full validation suite
  • Multi-Claude verification: One Claude codes, another reviews

Anti-pattern: Blind iteration without feedback. Without a verification mechanism, Claude cannot converge toward a correct solution; it can only guess.
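
The "code generated → verification tool → feedback loop → improvement" cycle can be sketched as a small driver. Everything here is an assumption made for illustration: `generate` stands in for a model call, `verify_add` for a real tool such as a test runner, and `fake_generate` is a scripted stub so the example runs on its own.

```python
from typing import Callable, Optional, Tuple


def verification_loop(
    generate: Callable[[Optional[str]], str],
    verify: Callable[[str], Optional[str]],
    max_iterations: int = 5,
) -> Tuple[str, int]:
    """Regenerate until verify() returns no feedback; return (candidate, attempts)."""
    feedback = None
    for attempt in range(1, max_iterations + 1):
        candidate = generate(feedback)   # the generator sees the last failure
        feedback = verify(candidate)     # None means the check passed
        if feedback is None:
            return candidate, attempt
    raise RuntimeError(f"not verified after {max_iterations} attempts: {feedback}")


def verify_add(source: str) -> Optional[str]:
    """Toy verifier: run the candidate and check add(2, 3) == 5."""
    namespace = {}
    try:
        exec(source, namespace)
        result = namespace["add"](2, 3)
    except Exception as exc:
        return f"error: {exc!r}"
    return None if result == 5 else f"add(2, 3) returned {result}, expected 5"


def fake_generate(feedback: Optional[str]) -> str:
    """Scripted stand-in for a model: buggy first, fixed once it gets feedback."""
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


code, attempts = verification_loop(fake_generate, verify_add)
print(attempts)  # 2
```

The iteration cap is a design choice of this sketch: it keeps a generator that never converges from looping forever.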

Eval-Driven Development — TDD for LLMs. Test agent behaviors via evals:

  • Code-based: output == golden_answer
  • LLM-based: Another Claude evaluates
  • Human grading: Reference, slow

Eval Harness — The infrastructure that runs evaluations end-to-end: providing instructions and tools, running tasks concurrently, recording steps, grading outputs, and aggregating results.
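
A code-based grader and a very small harness might look like the sketch below. It covers only the concurrent run, exact-match grading, and aggregation steps. `EvalCase`, `EvalResult`, `run_evals`, and `toy_agent` are hypothetical names, and a real harness would also supply tools and record intermediate steps.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    prompt: str
    golden_answer: str


@dataclass
class EvalResult:
    prompt: str
    output: str
    passed: bool


def grade_exact(output: str, golden_answer: str) -> bool:
    """Code-based grading: output == golden_answer, ignoring outer whitespace."""
    return output.strip() == golden_answer.strip()


def run_evals(agent: Callable[[str], str], cases: List[EvalCase],
              max_workers: int = 4) -> List[EvalResult]:
    """Run every case concurrently, record the output, and grade it."""
    def run_one(case: EvalCase) -> EvalResult:
        output = agent(case.prompt)
        return EvalResult(case.prompt, output, grade_exact(output, case.golden_answer))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, cases))  # map() preserves input order


def pass_rate(results: List[EvalResult]) -> float:
    """Aggregate graded results into a single score."""
    return sum(r.passed for r in results) / len(results) if results else 0.0


def toy_agent(prompt: str) -> str:
    """Stand-in for a real agent: upper-cases the prompt."""
    return prompt.upper()


cases = [EvalCase("abc", "ABC"), EvalCase("Hi", "HI"), EvalCase("x", "y")]
results = run_evals(toy_agent, cases)
print([r.passed for r in results])  # [True, True, False]
```

An LLM-based grader could be swapped in by replacing `grade_exact` with a function that asks another model for a verdict.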

See Anthropic's comprehensive guide: Demystifying Evals for AI Agents

Multi-Agent Orchestration — From single assistant to orchestrated team:

Meta-Agent (Orchestrator)
├── Analyst (requirements)
├── Architect (design)
├── Developer (code)
└── Reviewer (validation)
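
As a rough sketch of the tree above, an orchestrator can pass each role's artifact to the next role and keep all of them for inspection. The sequential hand-off and the stub agents are assumptions made for illustration; a real setup would call sub-agents and might run some roles in parallel.

```python
from typing import Callable, Dict, List, Tuple

Agent = Callable[[str], str]  # takes the previous artifact, returns its own


def orchestrate(task: str, pipeline: List[Tuple[str, Agent]]) -> Dict[str, str]:
    """Run each role in order, feeding it the previous role's output."""
    artifacts: Dict[str, str] = {}
    current = task
    for role, agent in pipeline:
        current = agent(current)
        artifacts[role] = current   # keep every intermediate artifact
    return artifacts


# Stub agents that only label their input, so the hand-off order is visible.
pipeline = [
    ("Analyst", lambda text: f"requirements({text})"),
    ("Architect", lambda text: f"design({text})"),
    ("Developer", lambda text: f"code({text})"),
    ("Reviewer", lambda text: f"review({text})"),
]

artifacts = orchestrate("add login", pipeline)
print(artifacts["Reviewer"])  # review(code(design(requirements(add login))))
```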

Tier 6: Optimization

| Name | What | Best For | Claude Fit |
|------|------|----------|------------|
| Iterative Loops | Autonomous refinement | Optimization | Core |
| Fresh Context | Reset per task, state in files | Long autonomous sessions | Power users |
| Prompt Engineering | Technique foundation | Everything | Prerequisite |

Iterative Refinement Loops — Autonomous convergence:

  1. Execute prompt
  2. Observe result
  3. If result ≠ "DONE" → refine and repeat

Prompt Engineering — Foundations for ALL Claude usage:

  • Zero-Shot Chain of Thought: "Think step by step"
  • Few-Shot Learning: 2-3 examples of expected pattern
  • Structured Prompts: XML tags for organization
  • Position Matters: For long docs, place question at end

Fresh Context Pattern (Ralph Loop) — Solves context rot by spawning fresh agent instances per task. State persists in git + progress files, not chat history. Ideal for long autonomous sessions (migrations, overnight runs). See Ultimate Guide - Fresh Context Pattern for implementation.
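
The core idea, state on disk instead of in chat history, can be sketched as follows. This is an illustration under assumptions: the JSON progress file, the function names, and the `run_task` callback (standing in for spawning a fresh agent instance) are all hypothetical, and the git side of the pattern is left out.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, List, Tuple


def load_done(progress_file: Path) -> List[str]:
    """Read the list of completed tasks from disk (empty if no file yet)."""
    return json.loads(progress_file.read_text()) if progress_file.exists() else []


def run_with_fresh_contexts(tasks: List[str], progress_file: Path,
                            run_task: Callable[[str, List[str]], None]) -> List[str]:
    """Process tasks one at a time, reloading state from the file each round."""
    while True:
        done = load_done(progress_file)   # nothing is carried over in memory
        remaining = [t for t in tasks if t not in done]
        if not remaining:
            return done
        task = remaining[0]
        run_task(task, list(done))        # fresh instance: task + persisted state only
        progress_file.write_text(json.dumps(done + [task]))


def demo() -> List[Tuple[str, List[str]]]:
    """Resume a run in which the first task was already completed."""
    calls: List[Tuple[str, List[str]]] = []
    with tempfile.TemporaryDirectory() as tmp:
        progress = Path(tmp) / "progress.json"
        progress.write_text(json.dumps(["migrate users"]))
        run_with_fresh_contexts(
            ["migrate users", "migrate orders", "migrate invoices"],
            progress,
            lambda task, done: calls.append((task, done)),
        )
    return calls


print(demo())
```

Because each round rereads the file, an interrupted run resumes from the last recorded task rather than from the start.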


SDD Tools Reference

Three tools have emerged to formalize Spec-Driven Development:

| Tool | Use Case | Official Docs | Claude Integration |
|------|----------|---------------|--------------------|
| Spec Kit | Greenfield, governance | github.blog/spec-kit | /speckit.constitution, /speckit.specify, /speckit.plan |
| OpenSpec | Brownfield, changes | github.com/Fission-AI/OpenSpec | /openspec:proposal, /openspec:apply, /openspec:archive |
| Specmatic | API contract testing | specmatic.io | MCP agent available |

Spec Kit (Greenfield)

5-phase workflow:

  1. Constitution: /speckit.constitution → guardrails
  2. Specify: /speckit.specify → requirements
  3. Plan: /speckit.plan → architecture
  4. Tasks: /speckit.tasks → decomposition
  5. Implement: /speckit.implement → code

OpenSpec (Brownfield)

Two-folder architecture:

openspec/
├── specs/      ← Current truth (stable)
└── changes/    ← Proposals (temporary)

Workflow: Proposal → Review → Apply → Archive

Specmatic (API Contracts)

  • Contract as Test: Auto-generates 1000s of tests from OpenAPI spec
  • Contract as Stub: Mock server for parallel development
  • Backward Compatibility: Detects breaking changes
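
To make the backward-compatibility idea concrete, here is a deliberately simplified sketch. It is not Specmatic's implementation or API. It assumes a contract reduced to a flat mapping of response field names to type names, far simpler than a real OpenAPI document, and it treats removed fields and changed types as breaking.

```python
from typing import Dict, List


def breaking_changes(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """List changes that would break a consumer written against the old contract."""
    problems = []
    for field, old_type in old.items():
        if field not in new:
            problems.append(f"removed field: {field}")
        elif new[field] != old_type:
            problems.append(f"type changed: {field} ({old_type} -> {new[field]})")
    return problems   # fields that exist only in `new` are treated as non-breaking


v1 = {"id": "integer", "name": "string", "email": "string"}
v2 = {"id": "string", "name": "string", "created_at": "string"}

print(breaking_changes(v1, v2))
# ['type changed: id (integer -> string)', 'removed field: email']
```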

Writing Effective Specs

Based on analysis of 2,500+ agent configuration files. Source: Addy Osmani

The Six Essential Components

| Component | What to Include | Example |
|-----------|-----------------|---------|
| Commands | Executable with flags | npm test -- --coverage |
| Testing | Framework, coverage, locations | vitest, 80%, tests/ |
| Project structure | Explicit directories | src/, lib/, tests/ |
| Code style | One example > paragraphs | Show a real function |
| Git workflow | Branch, commit, PR format | feat/name, conventional commits |
| Boundaries | Permission tiers | See below |

Permission Tiers

| Tier | Symbol | Use For |
|------|--------|---------|
| Always do | ✅ | Safe actions, no approval (lint, format) |
| Ask first | ⚠️ | High-impact changes (delete, publish) |
| Never do | 🚫 | Hard stops (commit secrets, force push main) |
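
The three tiers above could be represented as a lookup that an agent wrapper consults before acting. This is a sketch with assumed names: the action strings mirror the table's examples, and defaulting unknown actions to "ask first" is a choice made here, not something the source prescribes.

```python
from enum import Enum


class Tier(Enum):
    ALWAYS = "always do"
    ASK = "ask first"
    NEVER = "never do"


# Example policy taken from the table's examples; extend per project.
POLICY = {
    "lint": Tier.ALWAYS,
    "format": Tier.ALWAYS,
    "delete": Tier.ASK,
    "publish": Tier.ASK,
    "commit secrets": Tier.NEVER,
    "force push main": Tier.NEVER,
}


def tier_for(action: str) -> Tier:
    """Look up an action; unknown actions fall back to 'ask first' (assumption)."""
    return POLICY.get(action.lower(), Tier.ASK)


print(tier_for("lint").value)             # always do
print(tier_for("force push main").value)  # never do
print(tier_for("rename branch").value)    # ask first
```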

Curse of Instructions

⚠️ Research shows more instructions = worse adherence to each one.

Solution: Feed only relevant spec sections per task, not the entire document.
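
One way to feed only the relevant sections is to split the spec on its headings and select by topic. This is a minimal sketch under stated assumptions: sections start with `## ` headings, and relevance is judged by naive word overlap between the task and the heading. The function names and sample spec are hypothetical.

```python
from typing import Dict


def split_sections(spec: str) -> Dict[str, str]:
    """Split a Markdown spec into {heading: body} on '## ' headings."""
    sections: Dict[str, list] = {}
    title = None
    for line in spec.splitlines():
        if line.startswith("## "):
            title = line[3:].strip()
            sections[title] = []
        elif title is not None:
            sections[title].append(line)
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}


def relevant_sections(spec: str, task: str) -> Dict[str, str]:
    """Keep sections whose heading shares at least one word with the task."""
    task_words = set(task.lower().split())
    return {
        name: body
        for name, body in split_sections(spec).items()
        if set(name.lower().split()) & task_words
    }


SPEC = """## Testing
Use vitest; keep coverage at 80%.

## Git workflow
Branch as feat/name; use conventional commits.

## Code style
Prefer small pure functions.
"""

print(list(relevant_sections(SPEC, "fix testing setup")))  # ['Testing']
```

A production version would likely need a better relevance signal than heading words, such as tags per section or routing by file path.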

Monolithic vs Modular Specs

| Project Size | Approach |
|--------------|----------|
| Small (<10 files) | Single spec file |
| Medium (10-50 files) | Sectioned spec, feed per task |
| Large (50+ files) | Sub-agent routing by domain |

Combination Patterns

Recommended stacks by situation:

| Situation | Recommended Stack | Notes |
|-----------|-------------------|-------|
| Solo MVP | SDD + TDD | Minimal overhead, quality focus |
| Team 5-10, greenfield | Spec Kit + TDD + BDD | Governance + quality + collaboration |
| Microservices | CDD + Specmatic | Contract-first, parallel dev |
| Existing SaaS (100+ features) | OpenSpec + BDD | Change tracking, no spec drift |
| Enterprise 10+ | BMAD + Spec Kit + Specmatic | Full governance + contracts |
| LLM-native product | Eval-Driven + Multi-Agent | Self-improving systems |

Quick Reference Table

| Methodology | Level | Primary Focus | Team Size | Learning Curve |
|-------------|-------|---------------|-----------|----------------|
| BMAD | Orchestration | Governance | 10+ | High |
| SDD | Specification | Contracts | Any | Medium |
| Doc-Driven | Specification | Alignment | Any | Low |
| Req-Driven | Specification | Context | 5+ | Medium |
| DDD | Specification | Domain | 5+ | Very High |
| BDD | Behavior | Collaboration | 5+ | Medium |
| ATDD | Behavior | Compliance | 5+ | Medium |
| CDD | Behavior | APIs | 5+ | Medium |
| FDD | Delivery | Features | 10+ | Medium |
| Context Eng. | Delivery | AI sessions | Any | Low |
| TDD | Implementation | Quality | Any | Low |
| Eval-Driven | Implementation | AI outputs | Any | Medium |
| Multi-Agent | Implementation | Complexity | Any | Medium |
| Iterative | Optimization | Refinement | Any | Low |
| Prompt Eng. | Optimization | Foundation | Any | Very Low |

Sources


  1. Thoughtworks Technology Radar Vol 33, Nov 2025 (PDF). See also: Macro trends blog post.