docs: add everything-claude-code to ecosystem + verification loops pattern

- Add affaan-m/everything-claude-code to ecosystem (16k+ stars)
  - Note: author hackathon win was for Zenith project, not this repo
  - Caveat: Node.js hooks not officially recommended by Anthropic
- Document "Verification Loops" pattern in methodologies.md
  - Official Anthropic guidance: iterate until tests pass
- Document "Eval Harness" concept with source link
  - Reference: anthropic.com/engineering/demystifying-evals-for-ai-agents
- Add deep_dive index entries for quick lookup

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-23 08:55:36 +01:00
commit 73e371e237
parent 2a63230c95
2 changed files with 29 additions and 0 deletions
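
Review note: the Verification Loops pattern this commit documents (tests as the termination condition) can be sketched roughly as below. This is a minimal illustration, not the guide's or Anthropic's implementation. The names `run_tests`, `verification_loop`, `revise`, and `max_revisions` are hypothetical; `revise` stands in for whatever step asks Claude to change the code given the failing test output.

```python
import subprocess
from typing import Callable, Optional, Sequence, Tuple


def run_tests(cmd: Sequence[str]) -> Tuple[bool, str]:
    """Run the test command and return (passed, combined stdout/stderr)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr


def verification_loop(
    test_cmd: Sequence[str],
    revise: Callable[[str], None],  # hypothetical: asks the agent to fix the code
    max_revisions: int = 5,         # assumed budget; the source does not name one
) -> Optional[int]:
    """Iterate until the tests pass.

    Returns the number of revisions that were needed (0 if the tests
    passed on the first run), or None if the tests still fail after
    max_revisions revisions.
    """
    for revisions in range(max_revisions + 1):
        passed, output = run_tests(test_cmd)
        if passed:
            return revisions  # explicit completion criterion: exit code 0
        if revisions < max_revisions:
            revise(output)  # feed the failure output back for another attempt
    return None
```

A caller might pass `["pytest", "-q"]` as `test_cmd`. The revision budget is a design choice added here so the loop cannot run forever if the tests never pass.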
--- a/guide/methodologies.md
+++ b/guide/methodologies.md
@@ -142,11 +142,26 @@ Strict iteration: 2 weeks max per feature.

 With Claude: Be explicit. "Write FAILING tests that don't exist yet."

+> **Verification Loops** — A formalized pattern for autonomous iteration:
+>
+> Use testing as termination condition:
+> 1. Claude writes tests for the feature
+> 2. Claude iterates code until tests pass
+> 3. Continue until explicit completion criteria met
+>
+> **Official guidance**: *"Tell Claude to keep going until all tests pass. It will usually take a few iterations."* — [Anthropic Best Practices](https://www.anthropic.com/engineering/claude-code-best-practices)
+>
+> Implementation: Can be enforced via Stop hooks, multi-Claude verification, or explicit "DONE" markers in prompts.
+
 **Eval-Driven Development** — TDD for LLMs. Test agent behaviors via evals:
 - Code-based: `output == golden_answer`
 - LLM-based: Another Claude evaluates
 - Human grading: Reference, slow

+> **Eval Harness** — The infrastructure that runs evaluations end-to-end: providing instructions and tools, running tasks concurrently, recording steps, grading outputs, and aggregating results.
+>
+> See Anthropic's comprehensive guide: [Demystifying Evals for AI Agents](https://www.anthropic.com/engineering/demystifying-evals-for-ai-agents)
+
 **Multi-Agent Orchestration** — From single assistant to orchestrated team:
 ```
 Meta-Agent (Orchestrator)
--- a/machine-readable/reference.yaml
+++ b/machine-readable/reference.yaml
@@ -105,6 +105,11 @@ deep_dive:
  # Automatic skill generation (meta-skill)
  claudeception: "https://github.com/blader/Claudeception"
  claudeception_guide: 5095
+  # Verification Loops & Eval Harness (added 2026-01-23)
+  verification_loops: "guide/methodologies.md:145"
+  verification_loops_source: "https://www.anthropic.com/engineering/claude-code-best-practices"
+  eval_harness: "guide/methodologies.md:161"
+  eval_harness_source: "https://www.anthropic.com/engineering/demystifying-evals-for-ai-agents"
  # DevOps/SRE Guide (guide/devops-sre.md)
  devops_sre_guide: "guide/devops-sre.md"
  devops_fire_framework: "guide/devops-sre.md:50"
@@ -453,6 +458,15 @@ ecosystem:
    claude_code_everything:
      url: "github.com/wesammustafa/Claude-Code-Everything"
      focus: "Visual walkthrough - Screenshots, BMAD method"
+    everything_claude_code:
+      url: "github.com/affaan-m/everything-claude-code"
+      author: "Affaan Mustafa (Anthropic hackathon winner - Zenith project)"
+      focus: "Distribution - Plugin-ready configs collection"
+      stars: "16k+"
+      created: "2026-01-18"
+      unique: ["Node.js cross-platform hooks", "15 MCP configs", "Plugin marketplace format"]
+      note: "Consolidates existing patterns; author hackathon win was for Zenith project, not this repo"
+      caveat: "Node.js hooks not officially recommended by Anthropic (shell preferred)"
    coding_agents_matrix:
      url: "coding-agents-matrix.dev"
      github: "github.com/PackmindHub/coding-agents-matrix"
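
Review note: the Eval Harness described in guide/methodologies.md (provide instructions, run tasks concurrently, record steps, grade outputs, aggregate results) could look roughly like the sketch below. It is a toy under stated assumptions, not the harness from Anthropic's article: `Task`, `Result`, `run_task`, `run_eval`, and the agent signature are all hypothetical, tool provisioning is omitted, and grading uses only the code-based check `output == golden_answer` from the guide.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical agent interface: takes instructions and a step log to append to,
# and returns its final output as a string.
Agent = Callable[[str, List[str]], str]


@dataclass
class Task:
    name: str
    instructions: str
    golden_answer: str


@dataclass
class Result:
    task: str
    output: str
    passed: bool
    steps: List[str] = field(default_factory=list)


def run_task(agent: Agent, task: Task) -> Result:
    """Run one task, record the agent's steps, and grade the output."""
    steps: List[str] = []
    output = agent(task.instructions, steps)
    # Code-based grading: exact match against the golden answer.
    return Result(task.name, output, output == task.golden_answer, steps)


def run_eval(agent: Agent, tasks: List[Task], max_workers: int = 4) -> Dict[str, object]:
    """Run all tasks concurrently and aggregate the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves task order in the returned results.
        results = list(pool.map(lambda t: run_task(agent, t), tasks))
    passed = sum(r.passed for r in results)
    total = len(results)
    return {
        "results": results,
        "passed": passed,
        "total": total,
        "pass_rate": passed / total if total else 0.0,
    }
```

An LLM-based or human grader would replace the equality check in `run_task`; the concurrency and aggregation steps would stay the same.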