Development Methodologies Reference
Confidence: Tier 2 — Validated by multiple production reports and official documentation.
Last updated: February 2026
This is a quick reference for 15 structured development methodologies that have emerged for AI-assisted development in 2025-2026. For hands-on practical workflows, see workflows/.
Table of Contents
- Decision Tree
- The 15 Methodologies
- SDD Tools Reference
- Writing Effective Specs
- Combination Patterns
- Sources
Decision Tree: What Do You Need?
┌─ "I want quality code" ────────────→ workflows/tdd-with-claude.md
│
├─ "I want to spec before code" ─────→ workflows/spec-first.md
│
├─ "I need to plan architecture" ────→ workflows/plan-driven.md
│
├─ "I'm iterating on something" ─────→ workflows/iterative-refinement.md
│
└─ "I need methodology theory" ──────→ Continue reading below
The 15 Methodologies
Organized in a 6-tier pyramid from strategic orchestration down to optimization techniques.
Tier 1: Strategic Orchestration
| Name | What | Best For | Claude Fit |
|---|---|---|---|
| BMAD | Multi-agent governance with constitution as guardrail | Enterprise 10+ teams, long-term projects | ⭐⭐ Niche but powerful |
| GSD | Meta-prompting 6-phase workflow with fresh contexts per task | Solo devs, Claude Code CLI | ⭐⭐ Similar to patterns in guide |
BMAD (Breakthrough Method for Agile AI-Driven Development) inverts the traditional paradigm: documentation becomes the source of truth, not code. Uses specialized agents (Analyst, PM, Architect, Developer, QA) orchestrated with strict governance. Note: BMAD's role-based agent naming reflects their methodology; see §9.17 Agent Anti-Patterns for scope-focused alternatives.
- Key concept: Constitution.md as strategic guardrail
- When to use: Complex enterprise projects needing governance
- When to avoid: Small teams, MVPs, rapid prototyping
GSD (Get Shit Done) addresses context rot through a systematic 6-phase workflow (Initialize → Discuss → Plan → Execute → Verify → Complete) with a fresh 200k-token context per task. Its core concepts (multi-agent orchestration, fresh context management) overlap significantly with existing patterns like Ralph Loop, Gas Town, and BMAD. See resource evaluation for detailed comparison.
Emerging: Ralph Inferno implements autonomous multi-persona workflows (Analyst→PM→UX→Architect→Business) with VM-based execution and self-correcting E2E loops. Experimental but interesting for "vibe coding at scale".
Foundational Discipline: Plan-First Workflow
"Once the plan is good, the code is good." — Boris Cherny, creator of Claude Code
Not just a feature (/plan command) — a systematic discipline.
Context Engineering: Thoughtworks designates this broader approach "Context Engineering" in their Technology Radar (Nov 2025) [1]: the systematic design of information provided to LLMs during inference. Three core techniques: context setup (minimal system prompts, few-shot examples), context management for long-horizon tasks (summarization, external memories, sub-agent architectures), and dynamic information retrieval (JIT context loading). Related patterns in Claude Code: AGENTS.md, MCP Context7, Plan Mode.
The Mental Model:
Planning isn't optional for complex tasks. It's the difference between:
- ❌ 8 iterations of "try → fix → retry → fix again"
- ✅ 1 iteration of "plan → validate → execute cleanly"
When to plan first:
| Task Complexity | Plan First? | Why |
|---|---|---|
| >3 files modified | ✅ Yes | Cross-file dependencies need architecture |
| >50 lines changed | ✅ Yes | Enough complexity for mistakes |
| Architectural changes | ✅ Yes | Impact analysis required |
| Unfamiliar codebase | ✅ Yes | Need exploration before action |
| Typo/obvious fix | ❌ No | Planning overhead > task time |
| Single-line change | ❌ No | Just do it |
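The triggers in the table above can be expressed as a small predicate. This is a minimal sketch, not part of Claude Code: the `Task` dataclass and `should_plan_first` function are hypothetical names, and the thresholds are copied directly from the table.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a change request (illustrative only)."""
    files_modified: int = 1
    lines_changed: int = 1
    architectural: bool = False
    unfamiliar_codebase: bool = False


def should_plan_first(task: Task) -> bool:
    """Return True when any plan-first trigger from the table applies."""
    return (
        task.files_modified > 3        # cross-file dependencies
        or task.lines_changed > 50     # enough complexity for mistakes
        or task.architectural          # impact analysis required
        or task.unfamiliar_codebase    # exploration needed before action
    )


if __name__ == "__main__":
    print(should_plan_first(Task(files_modified=5)))  # multi-file change
    print(should_plan_first(Task(lines_changed=1)))   # single-line fix
```

A team could adapt the thresholds to match its own Planning Policy in CLAUDE.md; the values here are only the defaults suggested by the table.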
How plan-first works:
- Exploration phase (/plan mode):
  - Claude reads files, explores architecture
  - No edits allowed → forces thinking before action
  - Proposes approach with trade-offs
- Validation phase (you review):
  - Plan exposes assumptions and gaps
  - Easier to correct direction now vs after 100 lines written
  - Plan becomes contract for execution
- Execution phase (/execute):
  - Plan → code becomes mechanical translation
  - Fewer surprises, cleaner implementation
  - Faster overall despite "slower" start
Boris Cherny workflow:
"I run many sessions, start in plan mode, then switch into execution once the plan looks right. The signature upgrade is verification—giving Claude a way to test and confirm its own output."
Benefits over "just start coding":
- Fewer correction iterations: Plan catches issues before they become code
- Better architecture: Forced to think about structure first
- Clearer communication: Plan is shared understanding with team/Claude
- Reduced cost: One clean iteration < multiple messy iterations (even if plan phase costs tokens)
Integration with CLAUDE.md:
Document your team's plan-first triggers:
## Planning Policy
- ALWAYS plan first: API changes, database migrations, new features
- OPTIONAL planning: Bug fixes <10 lines, test additions
- NEVER skip: Changes affecting >2 modules
See also: Plan Mode documentation for /plan command usage.
Tier 2: Specification & Architecture
| Name | What | Best For | Claude Fit |
|---|---|---|---|
| SDD | Specs before code | APIs, contracts | ⭐⭐⭐ Core pattern |
| Doc-Driven | Docs = source of truth | Cross-team alignment | ⭐⭐⭐ CLAUDE.md native |
| Req-Driven | Rich artifact context (20+ artifacts) | Complex requirements | ⭐⭐ Heavy setup |
| DDD | Domain language first | Business logic | ⭐⭐ Design-time |
SDD (Spec-Driven Development) — Specifications BEFORE code. One well-structured iteration equals 8 unstructured ones. CLAUDE.md IS your spec file.
Doc-Driven Development — Living documentation versioned in git becomes the single source of truth. Changes to specs trigger implementation.
Requirements-Driven Development — Uses CLAUDE.md as comprehensive implementation guide with 20+ structured artifacts.
DDD (Domain-Driven Design) — Aligns software with business language through:
- Ubiquitous Language: Shared vocabulary in code
- Bounded Contexts: Isolated domain boundaries
- Domain Distillation: Core vs Support vs Generic domains
Tier 3: Behavior & Acceptance
| Name | What | Best For | Claude Fit |
|---|---|---|---|
| BDD | Given-When-Then scenarios | Stakeholder collaboration | ⭐⭐⭐ Tests & specs |
| ATDD | Acceptance criteria first | Compliance, regulated | ⭐⭐ Process-heavy |
| CDD | API contracts as interface | Microservices | ⭐⭐⭐ OpenAPI native |
BDD (Behavior-Driven Development) — Beyond testing: a collaboration process.
- Discovery: Involve devs and business experts
- Formulation: Write Given-When-Then examples
- Automation: Convert to executable tests (Gherkin/Cucumber)
Feature: Order Management
Scenario: Cannot buy without stock
Given product with 0 stock
When customer attempts purchase
Then system refuses with error message
ATDD (Acceptance Test-Driven Development) — Acceptance criteria defined BEFORE coding, collaboratively ("Three Amigos": Business, Dev, Test).
In agentic development, ATDD is particularly effective because agents need unambiguous success conditions. The flow maps cleanly to agent tasks:
- Define acceptance criteria in Gherkin (human-readable, machine-executable)
- Agent writes failing tests based on scenarios (not implementation)
- Agent implements until tests pass
Feature: Password Reset
Scenario: User resets via email
Given a registered user with email "user@example.com"
When they request a password reset
Then they receive a reset email within 60 seconds
And the reset link expires after 24 hours
This Gherkin scenario is the contract between intent and implementation. The agent cannot misinterpret scope because done is defined before a line of code is written.
Applied to agents: Pass the Gherkin file to Claude Code before implementing. "Write failing tests for this feature file, then implement until they pass." The scenario writer role (human or agent) forces explicit scope before execution starts.
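To make "done is defined before a line of code is written" concrete, the scenario can be turned into an explicit checklist before it is handed to an agent. The sketch below is an assumption-laden illustration: `parse_steps` is a hypothetical helper that handles only the simple Given/When/Then/And shape shown above, not the full Gherkin grammar. A real project would more likely rely on Cucumber or a similar runner.

```python
def parse_steps(scenario: str) -> dict[str, list[str]]:
    """Group scenario lines under Given / When / Then.

    'And' and 'But' lines are attached to the most recent keyword.
    Lines such as 'Feature:' or 'Scenario:' are ignored.
    """
    steps: dict[str, list[str]] = {"Given": [], "When": [], "Then": []}
    current = None
    for raw in scenario.splitlines():
        line = raw.strip()
        if not line:
            continue
        keyword, _, rest = line.partition(" ")
        if keyword in steps:
            current = keyword
        elif keyword in ("And", "But") and current is not None:
            pass  # continuation of the previous keyword
        else:
            continue  # header or unrecognised line
        steps[current].append(rest)
    return steps


SCENARIO = """
Feature: Password Reset
Scenario: User resets via email
Given a registered user with email "user@example.com"
When they request a password reset
Then they receive a reset email within 60 seconds
And the reset link expires after 24 hours
"""

if __name__ == "__main__":
    # The 'Then' entries are the acceptance criteria the agent must satisfy.
    for criterion in parse_steps(SCENARIO)["Then"]:
        print("-", criterion)
```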
CDD (Contract-Driven Development) — API contracts (OpenAPI specs) as executable interface between teams. Patterns: Contract as Test, Contract as Stub.
Tier 4: Feature Delivery
| Name | What | Best For | Claude Fit |
|---|---|---|---|
| FDD | Feature-by-feature delivery | Large teams 10+ | ⭐⭐ Structure |
| Context Eng. | Context as first-class design | Long sessions | ⭐⭐⭐ Fundamental |
FDD (Feature-Driven Development) — Five processes:
- Develop Overall Model
- Build Features List
- Plan by Feature
- Design by Feature
- Build by Feature
Strict iteration: 2 weeks max per feature.
Context Engineering — Treat context as design element:
- Progressive Disclosure: Let agent discover incrementally
- Memory Management: Conversation vs persistent memory
- Dynamic Refresh: Rewrite TODO list before response
Tier 5: Implementation
| Name | What | Best For | Claude Fit |
|---|---|---|---|
| TDD | Red-Green-Refactor | Quality code | ⭐⭐⭐ Core workflow |
| Eval-Driven | Evals for LLM outputs | AI products | ⭐⭐⭐ Agents |
| Multi-Agent | Orchestrate sub-agents | Complex tasks | ⭐⭐⭐ Task tool |
TDD (Test-Driven Development) — The classic cycle:
- Red: Write failing test
- Green: Minimal code to pass
- Refactor: Clean up, tests stay green
With Claude: Be explicit. "Write FAILING tests that don't exist yet."
Verification Loops — A formalized pattern for autonomous iteration (broader than TDD):
Core principle: Give Claude a mechanism to verify its own output.
Code generated → Verification tool → Feedback loop → Improvement
Why it works (Boris Cherny): "An agent that can 'see' what it has done produces better results."
Verification mechanisms by domain:
| Domain | Verification Tool | What Claude "Sees" |
|---|---|---|
| Frontend | Browser preview (live reload) | Visual rendering, layout, interactions |
| Backend | Tests (unit/integration) | Pass/fail status, error messages |
| Types | TypeScript compiler | Type errors, incompatibilities |
| Style | Linters (ESLint, Prettier) | Style violations, formatting issues |
| Performance | Profilers, benchmarks | Execution time, memory usage |
| Accessibility | axe-core, screen readers | WCAG violations, navigation issues |
| Security | Static analyzers (Semgrep) | Vulnerability patterns |
| UX | User testing, recordings | Usability problems, confusion points |
TDD as canonical example:
- Claude writes tests for the feature
- Claude iterates code until tests pass
- Continue until explicit completion criteria met
Official guidance: "Tell Claude to keep going until all tests pass. It will usually take a few iterations." — Anthropic Best Practices
Implementation patterns:
- Hooks: PostToolUse hook runs verification after each edit
- Browser extension: Claude in Chrome sees rendered output
- Test watchers: Jest/Vitest watch mode provides instant feedback
- CI/CD gates: GitHub Actions runs full validation suite
- Multi-Claude verification: One Claude codes, another reviews
Anti-pattern: Blind iteration without feedback. Without a verification mechanism, Claude cannot converge toward a correct solution; it can only guess.
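The generate → verify → feedback cycle described in this section can be sketched as a plain loop. This is an illustrative sketch only: `verification_loop`, `generate`, and `verify` are hypothetical names, with `generate` standing in for a model call and `verify` standing in for any tool from the table (tests, compiler, linter). It is not how Claude Code implements hooks.

```python
from typing import Callable, Optional


def verification_loop(
    generate: Callable[[Optional[str]], str],
    verify: Callable[[str], Optional[str]],
    max_iterations: int = 5,
) -> tuple[str, int]:
    """Iterate until verify() reports no problem.

    generate(feedback) returns a candidate; feedback is None on the first call.
    verify(candidate) returns None on success, or an error message to feed back.
    Returns the accepted candidate and the number of attempts used.
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_iterations + 1):
        candidate = generate(feedback)
        feedback = verify(candidate)
        if feedback is None:
            return candidate, attempt
    raise RuntimeError(
        f"no passing candidate after {max_iterations} attempts: {feedback}"
    )


if __name__ == "__main__":
    # Stub generator: first attempt is wrong, second uses the feedback.
    def fake_generate(feedback: Optional[str]) -> str:
        return "return a + b" if feedback else "return a - b"

    def fake_verify(code: str) -> Optional[str]:
        return None if "+" in code else "test_add failed: expected 3, got -1"

    print(verification_loop(fake_generate, fake_verify))
```

The explicit `max_iterations` bound matters: it is the "explicit completion criteria" from the list above, expressed as a hard stop so a non-converging loop fails loudly.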
Eval-Driven Development — TDD for LLMs. Test agent behaviors via evals:
- Code-based: output == golden_answer
- LLM-based: Another Claude evaluates
- Human grading: Reference, slow
Eval Harness — The infrastructure that runs evaluations end-to-end: providing instructions and tools, running tasks concurrently, recording steps, grading outputs, and aggregating results.
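A very small harness covering those responsibilities (run tasks concurrently, record outputs, grade, aggregate) might look like the sketch below. It uses only the code-based grader (`output == golden_answer`). `EvalCase`, `run_evals`, and the `agent` callable are hypothetical names chosen for illustration and do not correspond to any specific eval framework.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    prompt: str
    golden_answer: str


def run_evals(
    agent: Callable[[str], str],
    cases: list[EvalCase],
    workers: int = 4,
) -> dict:
    """Run every case through the agent concurrently, then grade and aggregate."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so outputs line up with cases.
        outputs = list(pool.map(lambda case: agent(case.prompt), cases))

    results = [
        {"prompt": case.prompt, "output": out, "passed": out == case.golden_answer}
        for case, out in zip(cases, outputs)
    ]
    passed = sum(1 for r in results if r["passed"])
    pass_rate = passed / len(cases) if cases else 0.0
    return {"pass_rate": pass_rate, "results": results}


if __name__ == "__main__":
    # Stub agent: upper-cases the prompt instead of calling a model.
    cases = [EvalCase("abc", "ABC"), EvalCase("def", "xyz")]
    report = run_evals(str.upper, cases)
    print(report["pass_rate"])
```

Swapping the equality check for an LLM-based or human grader changes only the `passed` computation; the run/record/aggregate structure stays the same.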
See Anthropic's comprehensive guide: Demystifying Evals for AI Agents
Multi-Agent Orchestration — From single assistant to orchestrated team:
Meta-Agent (Orchestrator)
├── Analyst (requirements)
├── Architect (design)
├── Developer (code)
└── Reviewer (validation)
ADR-Driven Development
Pattern: Write plain English ADRs → Feed to implement-adr skill → Execute natively
Architecture Decision Records (ADRs) combined with Claude Code skills create a workflow where architectural decisions drive implementation directly.
Workflow Steps:
- Document decision in ADR format (context, decision, consequences)
- Create implementation skill (generic or implement-adr specialized)
- Feed ADR as prompt to skill with clear acceptance criteria
- Claude executes based on architectural guidance in ADR
Example ADR Template:
# ADR-001: Database Migration Strategy
## Context
Legacy MySQL schema needs migration to PostgreSQL for better JSON support.
## Decision
Use incremental dual-write pattern with feature flags.
## Consequences
- Positive: Zero-downtime migration
- Negative: Temporary code complexity during transition
Implementation Workflow:
# 1. Write ADR (plain English)
vim docs/adr/001-database-migration.md
# 2. Feed to implementation skill
/implement-adr docs/adr/001-database-migration.md
# 3. Claude executes based on ADR guidance
# → Creates migration scripts
# → Updates ORM configuration
# → Adds feature flags
# → Implements dual-write logic
Benefits:
- ✅ Documentation-driven: Architecture and code stay synchronized
- ✅ Native execution: No external frameworks needed
- ✅ Traceable decisions: Clear audit trail from decision to implementation
- ✅ Team alignment: ADRs communicate intent to both humans and AI
Source: Gur Sannikov embedded engineering workflow
Tier 6: Optimization
| Name | What | Best For | Claude Fit |
|---|---|---|---|
| Iterative Loops | Autonomous refinement | Optimization | ⭐⭐⭐ Core |
| Fresh Context | Reset per task, state in files | Long autonomous sessions | ⭐⭐⭐ Power users |
| Prompt Engineering | Technique foundation | Everything | ⭐⭐⭐ Prerequisite |
Iterative Refinement Loops — Autonomous convergence:
- Execute prompt
- Observe result
- If result ≠ "DONE" → refine and repeat
Prompt Engineering — Foundations for ALL Claude usage:
- Zero-Shot Chain of Thought: "Think step by step"
- Few-Shot Learning: 2-3 examples of expected pattern
- Structured Prompts: XML tags for organization
- Position Matters: For long docs, place question at end
Fresh Context Pattern (Ralph Loop) — Solves context rot by spawning fresh agent instances per task. State persists in git + progress files, not chat history. Ideal for long autonomous sessions (migrations, overnight runs). See Ultimate Guide - Fresh Context Pattern for implementation.
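The key idea, state on disk rather than in chat history, can be sketched as follows. Everything here is an assumption made for illustration: the JSON layout of the progress file, the `run_fresh_context_loop` name, and the `run_task` callable, which stands in for spawning a fresh agent instance. The actual Ralph Loop implementation is described in the Ultimate Guide, not here.

```python
import json
from pathlib import Path
from typing import Callable


def run_fresh_context_loop(
    progress_file: Path,
    run_task: Callable[[str], bool],
) -> list[str]:
    """Process pending tasks one at a time, persisting state after each.

    The progress file is re-read on every iteration, so no state is carried
    in memory between tasks (mirroring a fresh context per task).
    Assumed file shape: {"tasks": [{"name": str, "done": bool}, ...]}
    """
    completed: list[str] = []
    while True:
        state = json.loads(progress_file.read_text())
        pending = [t for t in state["tasks"] if not t["done"]]
        if not pending:
            return completed
        task = pending[0]
        if not run_task(task["name"]):  # stand-in for a fresh agent run
            return completed            # stop on failure; state stays on disk
        task["done"] = True
        progress_file.write_text(json.dumps(state, indent=2))
        completed.append(task["name"])
```

Because progress is written after every task, an interrupted overnight run can resume by invoking the loop again with the same file.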
SDD Tools Reference
Four tools have emerged to formalize Spec-Driven Development:
| Tool | Use Case | Official Docs | Claude Integration |
|---|---|---|---|
| Spec Kit | Greenfield, governance | github.blog/spec-kit | /speckit.constitution, /speckit.specify, /speckit.plan |
| OpenSpec | Brownfield, changes | github.com/Fission-AI/OpenSpec | /openspec:proposal, /openspec:apply, /openspec:archive |
| Specmatic | API contract testing | specmatic.io | MCP agent available |
| Spec-to-Code Factory | Greenfield, tooled enforcement | github.com/SylvainChabaud/spec-to-code-factory | Multi-agent reference implementation (BREAK→MODEL→ACT→DEBRIEF) |
Spec Kit (Greenfield)
5-phase workflow:
- Constitution: /speckit.constitution → guardrails
- Specify: /speckit.specify → requirements
- Plan: /speckit.plan → architecture
- Tasks: /speckit.tasks → decomposition
- Implement: /speckit.implement → code
OpenSpec (Brownfield)
Two-folder architecture:
openspec/
├── specs/ ← Current truth (stable)
└── changes/ ← Proposals (temporary)
Workflow: Proposal → Review → Apply → Archive
Specmatic (API Contracts)
- Contract as Test: Auto-generates 1000s of tests from OpenAPI spec
- Contract as Stub: Mock server for parallel development
- Backward Compatibility: Detects breaking changes
Writing Effective Specs
Based on analysis of 2,500+ agent configuration files. Source: Addy Osmani
The Six Essential Components
| Component | What to Include | Example |
|---|---|---|
| Commands | Executable with flags | npm test -- --coverage |
| Testing | Framework, coverage, locations | vitest, 80%, tests/ |
| Project structure | Explicit directories | src/, lib/, tests/ |
| Code style | One example > paragraphs | Show a real function |
| Git workflow | Branch, commit, PR format | feat/name, conventional commits |
| Boundaries | Permission tiers | See below |
Permission Tiers
| Tier | Symbol | Use For |
|---|---|---|
| Always do | ✅ | Safe actions, no approval (lint, format) |
| Ask first | ⚠️ | High-impact changes (delete, publish) |
| Never do | 🚫 | Hard stops (commit secrets, force push main) |
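The three tiers amount to a lookup plus an approval check. The sketch below is a hypothetical illustration: the `POLICY` mapping uses only the example actions from the table, and treating unknown actions as "ask first" is an assumption, not something the source prescribes.

```python
from enum import Enum


class Tier(Enum):
    ALWAYS = "always do"
    ASK_FIRST = "ask first"
    NEVER = "never do"


# Example actions taken from the table above.
POLICY = {
    "lint": Tier.ALWAYS,
    "format": Tier.ALWAYS,
    "delete": Tier.ASK_FIRST,
    "publish": Tier.ASK_FIRST,
    "commit secrets": Tier.NEVER,
    "force push main": Tier.NEVER,
}


def is_allowed(action: str, approved: bool = False) -> bool:
    """Return True if the action may proceed under the permission tiers."""
    # Assumption: actions missing from the policy default to "ask first".
    tier = POLICY.get(action, Tier.ASK_FIRST)
    if tier is Tier.ALWAYS:
        return True
    if tier is Tier.ASK_FIRST:
        return approved
    return False  # Tier.NEVER: a hard stop, even with approval
```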
Curse of Instructions
⚠️ Research shows more instructions = worse adherence to each one.
Solution: Feed only relevant spec sections per task, not the entire document.
Monolithic vs Modular Specs
| Project Size | Approach |
|---|---|
| Small (<10 files) | Single spec file |
| Medium (10-50 files) | Sectioned spec, feed per task |
| Large (50+ files) | Sub-agent routing by domain |
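For the medium-size case ("sectioned spec, feed per task"), one way to feed only the relevant sections is to split the spec on its headings and keep those whose titles share a word with the task description. This is a deliberately naive sketch under stated assumptions: the spec uses `## ` headings, matching is by exact title word, and `split_sections` and `select_sections` are hypothetical helper names.

```python
def split_sections(spec: str) -> dict[str, str]:
    """Split a spec into {section title: body} using '## ' headings."""
    sections: dict[str, list[str]] = {}
    title = None
    for line in spec.splitlines():
        if line.startswith("## "):
            title = line[3:].strip()
            sections[title] = []
        elif title is not None:
            sections[title].append(line)
    return {t: "\n".join(body).strip() for t, body in sections.items()}


def select_sections(spec: str, task: str) -> dict[str, str]:
    """Keep only sections whose title shares a word with the task text."""
    task_words = {w.lower() for w in task.split()}
    return {
        title: body
        for title, body in split_sections(spec).items()
        if any(word in task_words for word in title.lower().split())
    }


SPEC = """## Testing
Use vitest, 80% coverage, tests/ directory.
## Git workflow
Branches as feat/name, conventional commits.
## Code style
Show a real function.
"""

if __name__ == "__main__":
    # Only the Git workflow section is passed along for this task.
    print(select_sections(SPEC, "set up git hooks"))
```

Keyword matching on titles is crude; it is shown only to illustrate the per-task filtering step, and larger projects would move to the sub-agent routing listed in the table.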
Combination Patterns
Recommended stacks by situation:
| Situation | Recommended Stack | Notes |
|---|---|---|
| Solo MVP | SDD + TDD | Minimal overhead, quality focus |
| Team 5-10, greenfield | Spec Kit + TDD + BDD | Governance + quality + collaboration |
| Microservices | CDD + Specmatic | Contract-first, parallel dev |
| Existing SaaS (100+ features) | OpenSpec + BDD | Change tracking, no spec drift |
| Enterprise 10+ | BMAD + Spec Kit + Specmatic | Full governance + contracts |
| LLM-native product | Eval-Driven + Multi-Agent | Self-improving systems |
Quick Reference Table
| Methodology | Level | Primary Focus | Team Size | Learning Curve |
|---|---|---|---|---|
| BMAD | Orchestration | Governance | 10+ | High |
| SDD | Specification | Contracts | Any | Medium |
| Doc-Driven | Specification | Alignment | Any | Low |
| Req-Driven | Specification | Context | 5+ | Medium |
| DDD | Specification | Domain | 5+ | Very High |
| BDD | Behavior | Collaboration | 5+ | Medium |
| ATDD | Behavior | Compliance | 5+ | Medium |
| CDD | Behavior | APIs | 5+ | Medium |
| FDD | Delivery | Features | 10+ | Medium |
| Context Eng. | Delivery | AI sessions | Any | Low |
| TDD | Implementation | Quality | Any | Low |
| Eval-Driven | Implementation | AI outputs | Any | Medium |
| Multi-Agent | Implementation | Complexity | Any | Medium |
| Iterative | Optimization | Refinement | Any | Low |
| Prompt Eng. | Optimization | Foundation | Any | Very Low |
Sources
Official Documentation (Tier 1)
- Anthropic: Claude Code Best Practices
- Anthropic: Effective Context Engineering for AI Agents
- Anthropic: Demystifying Evals for AI Agents
- GitHub: Spec-Driven Development Toolkit
- Microsoft: Spec-Driven Development with Spec Kit
Methodology References (Tier 2)
SDD & Spec-First
- Addy Osmani: How to Write Good Specs for AI Agents
- Addy Osmani: My AI Coding Workflow in 2026 — End-to-end workflow: spec-first, context packing, TDD, git checkpoints
- Martin Fowler: SDD Tools Analysis
- InfoQ: Spec-Driven Development
- Kinde: Beyond TDD - Why SDD is the Next Step
- Tessl.io: Spec-Driven Dev with Claude Code
BMAD
- GMO Recruit: The BMAD Method
- Benny Cheung: BMAD - Reclaiming Control in AI Dev
- GitHub: BMAD-AT-CLAUDE
TDD with AI
- Steve Kinney: TDD with Claude
- Nathan Fox: Taming GenAI Agents
- Alex Op: Custom TDD Workflow Claude Code
BDD & DDD
- Alex Soyes: BDD Behavior-Driven Development
- Alex Soyes: DDD Domain-Driven Design
- Inflectra: Behavior-Driven Development
Context Engineering
- Intuition Labs: What is Context Engineering
- Manus.im: Context Engineering for AI Agents
Eval-Driven & Multi-Agent
- Fireworks AI: Eval-Driven Development with Claude Code
- Brandon Casci: Transform into a Dev Team using Claude Code Agents
- The Unwind AI: Claude Code's Multi-Agent Orchestration
Tools Documentation (Tier 1)
- OpenSpec: github.com/Fission-AI/OpenSpec
- Spec Kit: github.com/github/spec-kit
- Specmatic: specmatic.io
- Specmatic Article: Spec-Driven Development with GitHub Spec Kit and Specmatic MCP
Additional References
- Talent500: Claude Code TDD Guide
- Testlio: Acceptance Test-Driven Development
- Monday.com: Feature-Driven Development
- Paddo.dev: Ralph Wiggum Autonomous Loops
- Walturn: Prompt Engineering for Claude
- AWS: Prompt Engineering with Claude on Bedrock
See Also
- workflows/tdd-with-claude.md — Practical TDD guide
- workflows/spec-first.md — Spec-first development
- workflows/plan-driven.md — Using /plan mode
- workflows/iterative-refinement.md — Refinement loops
- ultimate-guide.md#912 — Section 9.12 summary
[1] Thoughtworks Technology Radar Vol 33, Nov 2025 (PDF). See also: Macro trends blog post.