marketing-shibata50/claude-code-ultimate-guide

Florian BRUNIAUX 6d847d24de docs: add Profile-Based Module Assembly pattern (Section 3.5)

2026-02-20 15:04:29 +01:00

Development Methodologies Reference

Confidence: Tier 2 — Validated by multiple production reports and official documentation.

Last updated: February 2026

This is a quick reference for 15 structured development methodologies that have emerged for AI-assisted development in 2025-2026. For hands-on practical workflows, see workflows/.

Decision Tree
The 15 Methodologies
SDD Tools Reference
Writing Effective Specs
Combination Patterns
Sources

Decision Tree: What Do You Need?

┌─ "I want quality code" ────────────→ workflows/tdd-with-claude.md
│
├─ "I want to spec before code" ─────→ workflows/spec-first.md
│
├─ "I need to plan architecture" ────→ workflows/plan-driven.md
│
├─ "I'm iterating on something" ─────→ workflows/iterative-refinement.md
│
└─ "I need methodology theory" ──────→ Continue reading below

The 15 Methodologies

Organized in a 6-tier pyramid from strategic orchestration down to optimization techniques.

Tier 1: Strategic Orchestration

Name	What	Best For	Claude Fit
BMAD	Multi-agent governance with constitution as guardrail	Enterprise 10+ teams, long-term projects	⭐⭐ Niche but powerful
GSD	Meta-prompting 6-phase workflow with fresh contexts per task	Solo devs, Claude Code CLI	⭐⭐ Similar to patterns in guide

BMAD (Breakthrough Method for Agile AI-Driven Development) inverts the traditional paradigm: documentation becomes the source of truth, not code. Uses specialized agents (Analyst, PM, Architect, Developer, QA) orchestrated with strict governance. Note: BMAD's role-based agent naming reflects their methodology; see §9.17 Agent Anti-Patterns for scope-focused alternatives.

Key concept: Constitution.md as strategic guardrail
When to use: Complex enterprise projects needing governance
When to avoid: Small teams, MVPs, rapid prototyping

GSD (Get Shit Done) addresses context rot through a systematic 6-phase workflow (Initialize → Discuss → Plan → Execute → Verify → Complete) with fresh 200k-token contexts per task. Its core concepts (multi-agent orchestration, fresh context management) overlap significantly with existing patterns like Ralph Loop, Gas Town, and BMAD. See resource evaluation for detailed comparison.

Emerging: Ralph Inferno implements autonomous multi-persona workflows (Analyst→PM→UX→Architect→Business) with VM-based execution and self-correcting E2E loops. Experimental but interesting for "vibe coding at scale".

Foundational Discipline: Plan-First Workflow

"Once the plan is good, the code is good." — Boris Cherny, creator of Claude Code

Not just a feature (/plan command) — a systematic discipline.

Context Engineering: Thoughtworks designates this broader approach "Context Engineering" in their Technology Radar (Nov 2025)¹ — the systematic design of information provided to LLMs during inference. Three core techniques: context setup (minimal system prompts, few-shot examples), context management for long-horizon tasks (summarization, external memories, sub-agent architectures), and dynamic information retrieval (JIT context loading). Related patterns in Claude Code: AGENTS.md, MCP Context7, Plan Mode.

The Mental Model:

Planning isn't optional for complex tasks. It's the difference between:

❌ 8 iterations of "try → fix → retry → fix again"
✅ 1 iteration of "plan → validate → execute cleanly"

When to plan first:

Task Complexity	Plan First?	Why
>3 files modified	✅ Yes	Cross-file dependencies need architecture
>50 lines changed	✅ Yes	Enough complexity for mistakes
Architectural changes	✅ Yes	Impact analysis required
Unfamiliar codebase	✅ Yes	Need exploration before action
Typo/obvious fix	❌ No	Planning overhead > task time
Single-line change	❌ No	Just do it
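The triggers in the table above can be encoded as a small check. This is a minimal sketch, not part of Claude Code: the `Task` fields and the `should_plan_first` name are hypothetical, and the thresholds are copied from the table.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a pending change."""
    files_modified: int = 1
    lines_changed: int = 1
    architectural: bool = False
    unfamiliar_codebase: bool = False


def should_plan_first(task: Task) -> bool:
    """Return True when any plan-first trigger from the table applies."""
    return (
        task.files_modified > 3        # cross-file dependencies
        or task.lines_changed > 50     # enough complexity for mistakes
        or task.architectural          # impact analysis required
        or task.unfamiliar_codebase    # exploration needed before action
    )
```

A typo fix (`Task()`) returns `False`, while `Task(files_modified=4)` returns `True`.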

How plan-first works:

Exploration phase (/plan mode):
- Claude reads files, explores architecture
- No edits allowed → forces thinking before action
- Proposes approach with trade-offs
Validation phase (you review):
- Plan exposes assumptions and gaps
- Easier to correct direction now vs after 100 lines written
- Plan becomes contract for execution
Execution phase (/execute):
- Plan → code becomes mechanical translation
- Fewer surprises, cleaner implementation
- Faster overall despite "slower" start

Boris Cherny workflow:

"I run many sessions, start in plan mode, then switch into execution once the plan looks right. The signature upgrade is verification—giving Claude a way to test and confirm its own output."

Benefits over "just start coding":

Fewer correction iterations: Plan catches issues before they become code
Better architecture: Forced to think about structure first
Clearer communication: Plan is shared understanding with team/Claude
Reduced cost: One clean iteration < multiple messy iterations (even if plan phase costs tokens)

Integration with CLAUDE.md:

Document your team's plan-first triggers:

## Planning Policy
- ALWAYS plan first: API changes, database migrations, new features
- OPTIONAL planning: Bug fixes <10 lines, test additions
- NEVER skip: Changes affecting >2 modules

See also: Plan Mode documentation for /plan command usage.

Tier 2: Specification & Architecture

Name	What	Best For	Claude Fit
SDD	Specs before code	APIs, contracts	⭐⭐⭐ Core pattern
Doc-Driven	Docs = source of truth	Cross-team alignment	⭐⭐⭐ CLAUDE.md native
Req-Driven	Rich artifact context (20+ artifacts)	Complex requirements	⭐⭐ Heavy setup
DDD	Domain language first	Business logic	⭐⭐ Design-time

SDD (Spec-Driven Development) — Specifications BEFORE code. One well-structured iteration equals 8 unstructured ones. CLAUDE.md IS your spec file.

Doc-Driven Development — Living documentation versioned in git becomes the single source of truth. Changes to specs trigger implementation.

Requirements-Driven Development — Uses CLAUDE.md as comprehensive implementation guide with 20+ structured artifacts.

DDD (Domain-Driven Design) — Aligns software with business language through:

Ubiquitous Language: Shared vocabulary in code
Bounded Contexts: Isolated domain boundaries
Domain Distillation: Core vs Support vs Generic domains

Tier 3: Behavior & Acceptance

Name	What	Best For	Claude Fit
BDD	Given-When-Then scenarios	Stakeholder collaboration	⭐⭐⭐ Tests & specs
ATDD	Acceptance criteria first	Compliance, regulated	⭐⭐ Process-heavy
CDD	API contracts as interface	Microservices	⭐⭐⭐ OpenAPI native

BDD (Behavior-Driven Development) — Beyond testing: a collaboration process.

Discovery: Involve devs and business experts
Formulation: Write Given-When-Then examples
Automation: Convert to executable tests (Gherkin/Cucumber)

Feature: Order Management
  Scenario: Cannot buy without stock
    Given product with 0 stock
    When customer attempts purchase
    Then system refuses with error message
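The Automation step can be illustrated without any BDD framework. The sketch below maps each Given/When/Then line of the scenario above onto plain Python; `Product`, `OutOfStockError`, and the error text are hypothetical names chosen for illustration, and a real project would more likely bind the steps through a Gherkin runner.

```python
class OutOfStockError(Exception):
    """Raised when a purchase is attempted with no stock left."""


class Product:
    def __init__(self, stock: int) -> None:
        self.stock = stock

    def purchase(self) -> None:
        if self.stock <= 0:
            raise OutOfStockError("Product is out of stock")
        self.stock -= 1


def scenario_cannot_buy_without_stock() -> str:
    # Given product with 0 stock
    product = Product(stock=0)
    # When customer attempts purchase
    try:
        product.purchase()
    except OutOfStockError as error:
        # Then system refuses with error message
        return str(error)
    raise AssertionError("purchase should have been refused")
```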

ATDD (Acceptance Test-Driven Development) — Acceptance criteria defined BEFORE coding, collaboratively ("Three Amigos": Business, Dev, Test).

In agentic development, ATDD is particularly effective because agents need unambiguous success conditions. The flow maps cleanly to agent tasks:

Define acceptance criteria in Gherkin (human-readable, machine-executable)
Agent writes failing tests based on scenarios (not implementation)
Agent implements until tests pass

Feature: Password Reset
  Scenario: User resets via email
    Given a registered user with email "user@example.com"
    When they request a password reset
    Then they receive a reset email within 60 seconds
    And the reset link expires after 24 hours

This Gherkin scenario is the contract between intent and implementation. The agent cannot misinterpret scope because done is defined before a line of code is written.

Applied to agents: Pass the Gherkin file to Claude Code before implementing. "Write failing tests for this feature file, then implement until they pass." The scenario writer role (human or agent) forces explicit scope before execution starts.
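As a rough illustration of "done is defined before code", the two measurable criteria in the Password Reset scenario can be written as executable checks. Everything here is an assumption made for the sketch: the `ResetRequest` model, the function names, and the choice to start the 24-hour window at request time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

EMAIL_DEADLINE = timedelta(seconds=60)  # "within 60 seconds"
RESET_LINK_TTL = timedelta(hours=24)    # "expires after 24 hours"


@dataclass
class ResetRequest:
    email: str
    requested_at: datetime
    email_sent_at: datetime

    def email_sent_in_time(self) -> bool:
        return self.email_sent_at - self.requested_at <= EMAIL_DEADLINE

    def link_is_valid(self, now: datetime) -> bool:
        # Assumption: the 24-hour window starts when the reset is requested.
        return now - self.requested_at < RESET_LINK_TTL


def acceptance_password_reset(request: ResetRequest) -> list[str]:
    """Return the list of violated acceptance criteria (empty means accepted)."""
    failures = []
    if not request.email_sent_in_time():
        failures.append("reset email not sent within 60 seconds")
    if request.link_is_valid(request.requested_at + RESET_LINK_TTL):
        failures.append("reset link still valid after 24 hours")
    return failures
```

An empty list means the implementation satisfies the scenario; any entry names the criterion that failed.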

CDD (Contract-Driven Development) — API contracts (OpenAPI specs) as executable interface between teams. Patterns: Contract as Test, Contract as Stub.
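The "Contract as Test" idea can be sketched with a deliberately tiny contract format. This is not OpenAPI or Specmatic: `ORDER_CONTRACT` and `contract_violations` are hypothetical, and the contract only maps field names to expected Python types.

```python
# Hypothetical contract: required field name -> expected Python type.
ORDER_CONTRACT = {"id": int, "status": str, "total": float}


def contract_violations(payload: dict, contract: dict) -> list[str]:
    """Compare a response payload against the contract and list mismatches."""
    violations = []
    for field, expected_type in contract.items():
        if field not in payload:
            violations.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            actual = type(payload[field]).__name__
            violations.append(
                f"{field}: expected {expected_type.__name__}, got {actual}"
            )
    return violations
```

Both provider and consumer teams can run the same check, which is what lets them develop in parallel against the contract rather than against each other's code.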

Tier 4: Feature Delivery

Name	What	Best For	Claude Fit
FDD	Feature-by-feature delivery	Large teams 10+	⭐⭐ Structure
Context Eng.	Context as first-class design	Long sessions	⭐⭐⭐ Fundamental

FDD (Feature-Driven Development) — Five processes:

Develop Overall Model
Build Features List
Plan by Feature
Design by Feature
Build by Feature

Strict iteration: 2 weeks max per feature.

Context Engineering — Treat context as design element:

Progressive Disclosure: Let agent discover incrementally
Memory Management: Conversation vs persistent memory
Dynamic Refresh: Rewrite TODO list before response

Tier 5: Implementation

Name	What	Best For	Claude Fit
TDD	Red-Green-Refactor	Quality code	⭐⭐⭐ Core workflow
Eval-Driven	Evals for LLM outputs	AI products	⭐⭐⭐ Agents
Multi-Agent	Orchestrate sub-agents	Complex tasks	⭐⭐⭐ Task tool

TDD (Test-Driven Development) — The classic cycle:

Red: Write failing test
Green: Minimal code to pass
Refactor: Clean up, tests stay green

With Claude: Be explicit. "Write FAILING tests that don't exist yet."

Verification Loops — A formalized pattern for autonomous iteration (broader than TDD):

Core principle: Give Claude a mechanism to verify its own output.
Code generated → Verification tool → Feedback loop → Improvement
Why it works (Boris Cherny): "An agent that can 'see' what it has done produces better results."

Verification mechanisms by domain:

Domain	Verification Tool	What Claude "Sees"
Frontend	Browser preview (live reload)	Visual rendering, layout, interactions
Backend	Tests (unit/integration)	Pass/fail status, error messages
Types	TypeScript compiler	Type errors, incompatibilities
Style	Linters (ESLint, Prettier)	Style violations, formatting issues
Performance	Profilers, benchmarks	Execution time, memory usage
Accessibility	axe-core, screen readers	WCAG violations, navigation issues
Security	Static analyzers (Semgrep)	Vulnerability patterns
UX	User testing, recordings	Usability problems, confusion points

TDD as canonical example:

Claude writes tests for the feature

Claude iterates code until tests pass

Continue until explicit completion criteria met

Official guidance: "Tell Claude to keep going until all tests pass. It will usually take a few iterations." — Anthropic Best Practices

Implementation patterns:

Hooks: PostToolUse hook runs verification after each edit

Browser extension: Claude in Chrome sees rendered output

Test watchers: Jest/Vitest watch mode provides instant feedback

CI/CD gates: GitHub Actions runs full validation suite

Multi-Claude verification: One Claude codes, another reviews

Anti-pattern: Blind iteration without feedback. Without a verification mechanism, Claude can't converge on a correct solution; it guesses.
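The generate → verify → feedback cycle described above can be sketched as a generic loop. This is an illustrative outline under stated assumptions: `generate` stands in for a call to the agent, `verify` for any tool from the table (tests, compiler, linter), and both names and signatures are hypothetical.

```python
from typing import Callable, Optional, Tuple


def verification_loop(
    generate: Callable[[Optional[str]], str],
    verify: Callable[[str], Tuple[bool, str]],
    max_iterations: int = 5,
) -> Tuple[Optional[str], int]:
    """Iterate until the verifier passes or the iteration budget runs out.

    generate(feedback) produces a candidate; feedback is None on the first call.
    verify(candidate) returns (passed, feedback_message).
    """
    feedback: Optional[str] = None
    for iteration in range(1, max_iterations + 1):
        candidate = generate(feedback)
        passed, feedback = verify(candidate)
        if passed:
            return candidate, iteration
    # Budget exhausted without a passing candidate.
    return None, max_iterations
```

The iteration cap is a design choice in this sketch: it gives the loop an explicit completion criterion even when the verifier never passes.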

Eval-Driven Development — TDD for LLMs. Test agent behaviors via evals:

Code-based: output == golden_answer
LLM-based: Another Claude evaluates
Human grading: The reference standard, but slow

Eval Harness — The infrastructure that runs evaluations end-to-end: providing instructions and tools, running tasks concurrently, recording steps, grading outputs, and aggregating results.
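A toy harness makes those responsibilities concrete: run cases concurrently, grade with a code-based check, and aggregate. This is a minimal sketch, assuming the agent is any callable from prompt to output; `run_evals` and `exact_match` are hypothetical names, and real harnesses also record intermediate steps and tool calls.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def exact_match(output: str, golden: str) -> bool:
    """Code-based grader: output must equal the golden answer (ignoring outer whitespace)."""
    return output.strip() == golden.strip()


def run_evals(
    agent: Callable[[str], str],
    cases: List[Tuple[str, str]],
    max_workers: int = 4,
) -> dict:
    """Run (prompt, golden) cases concurrently, grade each, and aggregate."""

    def run_case(case: Tuple[str, str]) -> dict:
        prompt, golden = case
        output = agent(prompt)
        return {"prompt": prompt, "output": output, "passed": exact_match(output, golden)}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run_case, cases))  # map preserves case order

    passed = sum(1 for result in results if result["passed"])
    pass_rate = passed / len(results) if results else 0.0
    return {"results": results, "pass_rate": pass_rate}
```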

See Anthropic's comprehensive guide: Demystifying Evals for AI Agents

Multi-Agent Orchestration — From single assistant to orchestrated team:

Meta-Agent (Orchestrator)
├── Analyst (requirements)
├── Architect (design)
├── Developer (code)
└── Reviewer (validation)
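One simple way to read the diagram above is as a pipeline in which the orchestrator hands each role the previous role's output. The sketch below assumes that sequential reading and uses plain callables as stand-ins for sub-agents; `orchestrate` and the role functions are hypothetical, and this is not how Claude Code's Task tool is implemented.

```python
from typing import Callable, Dict, List, Tuple

Agent = Callable[[str], str]  # stand-in for a sub-agent: text in, text out


def orchestrate(request: str, pipeline: List[Tuple[str, Agent]]) -> Dict[str, str]:
    """Pass the request through each role in order and keep every artifact."""
    artifacts: Dict[str, str] = {}
    current = request
    for role, agent in pipeline:
        current = agent(current)   # each role works on the previous output
        artifacts[role] = current  # keep an audit trail per role
    return artifacts
```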

ADR-Driven Development

Pattern: Write plain English ADRs → Feed to implement-adr skill → Execute natively

Architecture Decision Records (ADRs) combined with Claude Code skills create a workflow where architectural decisions drive implementation directly.

Workflow Steps:

Document decision in ADR format (context, decision, consequences)
Create implementation skill (generic or implement-adr specialized)
Feed ADR as prompt to skill with clear acceptance criteria
Claude executes based on architectural guidance in ADR

Example ADR Template:

# ADR-001: Database Migration Strategy

## Context
Legacy MySQL schema needs migration to PostgreSQL for better JSON support.

## Decision
Use incremental dual-write pattern with feature flags.

## Consequences
- Positive: Zero-downtime migration
- Negative: Temporary code complexity during transition
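Before feeding an ADR to a skill, it can help to confirm that the three sections named in the workflow are present. The following is a sketch under the assumption that ADRs follow the heading layout of the template above; `parse_adr` and `missing_sections` are hypothetical helpers, not part of any skill.

```python
def parse_adr(text: str) -> dict:
    """Split an ADR into its title and its '## ' sections (keys lower-cased)."""
    adr = {"title": "", "sections": {}}
    current = None
    for line in text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip().lower()
            adr["sections"][current] = []
        elif line.startswith("# "):
            adr["title"] = line[2:].strip()
        elif current is not None and line.strip():
            adr["sections"][current].append(line.strip())
    return adr


def missing_sections(adr: dict, required=("context", "decision", "consequences")) -> list:
    """Return required sections that are absent or empty."""
    return [name for name in required if not adr["sections"].get(name)]
```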

Implementation Workflow:

# 1. Write ADR (plain English)
vim docs/adr/001-database-migration.md

# 2. Feed to implementation skill
/implement-adr docs/adr/001-database-migration.md

# 3. Claude executes based on ADR guidance
# → Creates migration scripts
# → Updates ORM configuration
# → Adds feature flags
# → Implements dual-write logic

Benefits:

✅ Documentation-driven: Architecture and code stay synchronized
✅ Native execution: No external frameworks needed
✅ Traceable decisions: Clear audit trail from decision to implementation
✅ Team alignment: ADRs communicate intent to both humans and AI

Source: Gur Sannikov embedded engineering workflow

Tier 6: Optimization

Name	What	Best For	Claude Fit
Iterative Loops	Autonomous refinement	Optimization	⭐⭐⭐ Core
Fresh Context	Reset per task, state in files	Long autonomous sessions	⭐⭐⭐ Power users
Prompt Engineering	Technique foundation	Everything	⭐⭐⭐ Prerequisite

Iterative Refinement Loops — Autonomous convergence:

Execute prompt
Observe result
If result ≠ "DONE" → refine and repeat

Prompt Engineering — Foundations for ALL Claude usage:

Zero-Shot Chain of Thought: "Think step by step"
Few-Shot Learning: 2-3 examples of expected pattern
Structured Prompts: XML tags for organization
Position Matters: For long docs, place question at end
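The four techniques can be combined in one prompt builder. This is an illustrative sketch only: the tag names (`document`, `examples`, `question`) and the `build_prompt` function are assumptions chosen for the example, not a required format.

```python
def build_prompt(document: str, examples: list[tuple[str, str]], question: str) -> str:
    """Assemble a structured prompt: long document first, question last."""
    parts = [f"<document>\n{document}\n</document>"]
    if examples:
        # Few-shot: a handful of input/output pairs showing the expected pattern.
        rendered = "\n".join(
            f"<example>\n<input>{inp}</input>\n<output>{out}</output>\n</example>"
            for inp, out in examples
        )
        parts.append(f"<examples>\n{rendered}\n</examples>")
    # Position matters: the question goes after the long content.
    # Zero-shot chain of thought: ask for step-by-step reasoning.
    parts.append(f"<question>\n{question}\nThink step by step.\n</question>")
    return "\n\n".join(parts)
```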

Fresh Context Pattern (Ralph Loop) — Solves context rot by spawning fresh agent instances per task. State persists in git + progress files, not chat history. Ideal for long autonomous sessions (migrations, overnight runs). See Ultimate Guide - Fresh Context Pattern for implementation.
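The core of the pattern, state in files rather than chat history, can be sketched as a resumable loop. This is a minimal outline under stated assumptions: `run_task` is a hypothetical callable that would spawn a fresh agent instance per task, and the JSON progress file format is invented for the example.

```python
import json
from pathlib import Path
from typing import Callable, List


def run_fresh_context_loop(
    tasks: List[str],
    run_task: Callable[[str], None],
    progress_file: Path,
) -> List[str]:
    """Run each pending task once, persisting completed tasks after every step."""
    done: List[str] = (
        json.loads(progress_file.read_text()) if progress_file.exists() else []
    )
    for task in tasks:
        if task in done:
            continue  # already completed in an earlier run
        run_task(task)  # hypothetical: fresh agent instance, no chat history
        done.append(task)
        progress_file.write_text(json.dumps(done))  # state survives restarts
    return done
```

Because progress is written after every task, an interrupted overnight run can be restarted and will skip what is already complete.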

SDD Tools Reference

Four tools have emerged to formalize Spec-Driven Development:

Tool	Use Case	Official Docs	Claude Integration
Spec Kit	Greenfield, governance	github.blog/spec-kit	`/speckit.constitution`, `/speckit.specify`, `/speckit.plan`
OpenSpec	Brownfield, changes	github.com/Fission-AI/OpenSpec	`/openspec:proposal`, `/openspec:apply`, `/openspec:archive`
Specmatic	API contract testing	specmatic.io	MCP agent available
Spec-to-Code Factory	Greenfield, tool-backed enforcement	github.com/SylvainChabaud/spec-to-code-factory	Multi-agent reference implementation (BREAK→MODEL→ACT→DEBRIEF)

Spec Kit (Greenfield)

5-phase workflow:

Constitution: /speckit.constitution → guardrails
Specify: /speckit.specify → requirements
Plan: /speckit.plan → architecture
Tasks: /speckit.tasks → decomposition
Implement: /speckit.implement → code

OpenSpec (Brownfield)

Two-folder architecture:

openspec/
├── specs/      ← Current truth (stable)
└── changes/    ← Proposals (temporary)

Workflow: Proposal → Review → Apply → Archive

Specmatic (API Contracts)

Contract as Test: Auto-generates 1000s of tests from OpenAPI spec
Contract as Stub: Mock server for parallel development
Backward Compatibility: Detects breaking changes

Writing Effective Specs

Based on analysis of 2,500+ agent configuration files. Source: Addy Osmani

The Six Essential Components

Component	What to Include	Example
Commands	Executable with flags	`npm test -- --coverage`
Testing	Framework, coverage, locations	`vitest, 80%, tests/`
Project structure	Explicit directories	`src/`, `lib/`, `tests/`
Code style	One example > paragraphs	Show a real function
Git workflow	Branch, commit, PR format	`feat/name`, conventional commits
Boundaries	Permission tiers	See below

Permission Tiers

Tier	Symbol	Use For
Always do	✅	Safe actions, no approval (lint, format)
Ask first	⚠️	High-impact changes (delete, publish)
Never do	🚫	Hard stops (commit secrets, force push main)
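The three tiers map naturally onto a lookup table. In this sketch the action names come from the examples in the table, while `Tier`, `POLICY`, `tier_for`, and the default for unknown actions are assumptions made for illustration.

```python
from enum import Enum


class Tier(Enum):
    ALWAYS = "always do"
    ASK_FIRST = "ask first"
    NEVER = "never do"


# Hypothetical policy built from the examples in the table above.
POLICY = {
    "lint": Tier.ALWAYS,
    "format": Tier.ALWAYS,
    "delete": Tier.ASK_FIRST,
    "publish": Tier.ASK_FIRST,
    "commit_secrets": Tier.NEVER,
    "force_push_main": Tier.NEVER,
}


def tier_for(action: str) -> Tier:
    """Look up an action's tier; unknown actions fall back to 'ask first'."""
    # Assumption: the cautious middle tier is the default for unlisted actions.
    return POLICY.get(action, Tier.ASK_FIRST)
```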

Curse of Instructions

⚠️ Research shows more instructions = worse adherence to each one.

Solution: Feed only relevant spec sections per task, not the entire document.
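One possible way to feed only relevant sections is to split the spec on its headings and keep those whose title shares a word with the task. This is a naive sketch: it assumes the spec uses `## ` headings, and the keyword-overlap rule and function names are hypothetical.

```python
def split_sections(spec: str) -> dict[str, str]:
    """Split a markdown spec into {heading: body} using '## ' headings."""
    sections: dict[str, list[str]] = {}
    heading = None
    for line in spec.splitlines():
        if line.startswith("## "):
            heading = line[3:].strip()
            sections[heading] = []
        elif heading is not None:
            sections[heading].append(line)
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}


def relevant_sections(spec: str, task: str) -> dict[str, str]:
    """Keep only sections whose heading shares at least one word with the task."""
    task_words = set(task.lower().split())
    return {
        name: body
        for name, body in split_sections(spec).items()
        if set(name.lower().split()) & task_words
    }
```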

Monolithic vs Modular Specs

Project Size	Approach
Small (<10 files)	Single spec file
Medium (10-50 files)	Sectioned spec, feed per task
Large (50+ files)	Sub-agent routing by domain

Combination Patterns

Recommended stacks by situation:

Situation	Recommended Stack	Notes
Solo MVP	SDD + TDD	Minimal overhead, quality focus
Team 5-10, greenfield	Spec Kit + TDD + BDD	Governance + quality + collaboration
Microservices	CDD + Specmatic	Contract-first, parallel dev
Existing SaaS (100+ features)	OpenSpec + BDD	Change tracking, no spec drift
Enterprise 10+	BMAD + Spec Kit + Specmatic	Full governance + contracts
LLM-native product	Eval-Driven + Multi-Agent	Self-improving systems

Quick Reference Table

Methodology	Level	Primary Focus	Team Size	Learning Curve
BMAD	Orchestration	Governance	10+	High
SDD	Specification	Contracts	Any	Medium
Doc-Driven	Specification	Alignment	Any	Low
Req-Driven	Specification	Context	5+	Medium
DDD	Specification	Domain	5+	Very High
BDD	Behavior	Collaboration	5+	Medium
ATDD	Behavior	Compliance	5+	Medium
CDD	Behavior	APIs	5+	Medium
FDD	Delivery	Features	10+	Medium
Context Eng.	Delivery	AI sessions	Any	Low
TDD	Implementation	Quality	Any	Low
Eval-Driven	Implementation	AI outputs	Any	Medium
Multi-Agent	Implementation	Complexity	Any	Medium
Iterative	Optimization	Refinement	Any	Low
Prompt Eng.	Optimization	Foundation	Any	Very Low

Sources

Official Documentation (Tier 1)

Anthropic: Claude Code Best Practices
Anthropic: Effective Context Engineering for AI Agents
Anthropic: Demystifying Evals for AI Agents
GitHub: Spec-Driven Development Toolkit
Microsoft: Spec-Driven Development with Spec Kit

Methodology References (Tier 2)

SDD & Spec-First

Addy Osmani: How to Write Good Specs for AI Agents
Addy Osmani: My AI Coding Workflow in 2026 — End-to-end workflow: spec-first, context packing, TDD, git checkpoints
Martin Fowler: SDD Tools Analysis
InfoQ: Spec-Driven Development
Kinde: Beyond TDD - Why SDD is the Next Step
Tessl.io: Spec-Driven Dev with Claude Code

BMAD

GMO Recruit: The BMAD Method
Benny Cheung: BMAD - Reclaiming Control in AI Dev
GitHub: BMAD-AT-CLAUDE

TDD with AI

Steve Kinney: TDD with Claude
Nathan Fox: Taming GenAI Agents
Alex Op: Custom TDD Workflow Claude Code

BDD & DDD

Alex Soyes: BDD Behavior-Driven Development
Alex Soyes: DDD Domain-Driven Design
Inflectra: Behavior-Driven Development

Context Engineering

Intuition Labs: What is Context Engineering
Manus.im: Context Engineering for AI Agents

Eval-Driven & Multi-Agent

Fireworks AI: Eval-Driven Development with Claude Code
Brandon Casci: Transform into a Dev Team using Claude Code Agents
The Unwind AI: Claude Code's Multi-Agent Orchestration

Tools Documentation (Tier 1)

OpenSpec: github.com/Fission-AI/OpenSpec
Spec Kit: github.com/github/spec-kit
Specmatic: specmatic.io
Specmatic Article: Spec-Driven Development with GitHub Spec Kit and Specmatic MCP

Additional References

Talent500: Claude Code TDD Guide
Testlio: Acceptance Test-Driven Development
Monday.com: Feature-Driven Development
Paddo.dev: Ralph Wiggum Autonomous Loops
Walturn: Prompt Engineering for Claude
AWS: Prompt Engineering with Claude on Bedrock
