From 73e371e237ef1a5e9549267ddbde4bc604d869b8 Mon Sep 17 00:00:00 2001
From: Florian BRUNIAUX
Date: Fri, 23 Jan 2026 08:55:36 +0100
Subject: [PATCH] docs: add everything-claude-code to ecosystem + verification loops pattern

- Add affaan-m/everything-claude-code to ecosystem (16k+ stars)
  - Note: author hackathon win was for Zenith project, not this repo
  - Caveat: Node.js hooks not officially recommended by Anthropic
- Document "Verification Loops" pattern in methodologies.md
  - Official Anthropic guidance: iterate until tests pass
- Document "Eval Harness" concept with source link
  - Reference: anthropic.com/engineering/demystifying-evals-for-ai-agents
- Add deep_dive index entries for quick lookup

Co-Authored-By: Claude
---
 guide/methodologies.md          | 15 +++++++++++++++
 machine-readable/reference.yaml | 14 ++++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/guide/methodologies.md b/guide/methodologies.md
index 2739a69..1cafa4d 100644
--- a/guide/methodologies.md
+++ b/guide/methodologies.md
@@ -142,11 +142,26 @@ Strict iteration: 2 weeks max per feature.
 
 With Claude: Be explicit. "Write FAILING tests that don't exist yet."
 
+> **Verification Loops** — A formalized pattern for autonomous iteration:
+>
+> Use testing as termination condition:
+> 1. Claude writes tests for the feature
+> 2. Claude iterates code until tests pass
+> 3. Continue until explicit completion criteria met
+>
+> **Official guidance**: *"Tell Claude to keep going until all tests pass. It will usually take a few iterations."* — [Anthropic Best Practices](https://www.anthropic.com/engineering/claude-code-best-practices)
+>
+> Implementation: Can be enforced via Stop hooks, multi-Claude verification, or explicit "DONE" markers in prompts.
+
 **Eval-Driven Development** — TDD for LLMs. Test agent behaviors via evals:
 - Code-based: `output == golden_answer`
 - LLM-based: Another Claude evaluates
 - Human grading: Reference, slow
 
+> **Eval Harness** — The infrastructure that runs evaluations end-to-end: providing instructions and tools, running tasks concurrently, recording steps, grading outputs, and aggregating results.
+>
+> See Anthropic's comprehensive guide: [Demystifying Evals for AI Agents](https://www.anthropic.com/engineering/demystifying-evals-for-ai-agents)
+
 **Multi-Agent Orchestration** — From single assistant to orchestrated team:
 ```
 Meta-Agent (Orchestrator)
diff --git a/machine-readable/reference.yaml b/machine-readable/reference.yaml
index 070382a..e27c447 100644
--- a/machine-readable/reference.yaml
+++ b/machine-readable/reference.yaml
@@ -105,6 +105,11 @@ deep_dive:
   # Automatic skill generation (meta-skill)
   claudeception: "https://github.com/blader/Claudeception"
   claudeception_guide: 5095
+  # Verification Loops & Eval Harness (added 2026-01-23)
+  verification_loops: "guide/methodologies.md:145"
+  verification_loops_source: "https://www.anthropic.com/engineering/claude-code-best-practices"
+  eval_harness: "guide/methodologies.md:161"
+  eval_harness_source: "https://www.anthropic.com/engineering/demystifying-evals-for-ai-agents"
   # DevOps/SRE Guide (guide/devops-sre.md)
   devops_sre_guide: "guide/devops-sre.md"
   devops_fire_framework: "guide/devops-sre.md:50"
@@ -453,6 +458,15 @@ ecosystem:
   claude_code_everything:
     url: "github.com/wesammustafa/Claude-Code-Everything"
     focus: "Visual walkthrough - Screenshots, BMAD method"
+  everything_claude_code:
+    url: "github.com/affaan-m/everything-claude-code"
+    author: "Affaan Mustafa (Anthropic hackathon winner - Zenith project)"
+    focus: "Distribution - Plugin-ready configs collection"
+    stars: "16k+"
+    created: "2026-01-18"
+    unique: ["Node.js cross-platform hooks", "15 MCP configs", "Plugin marketplace format"]
+    note: "Consolidates existing patterns; author hackathon win was for Zenith project, not this repo"
+    caveat: "Node.js hooks not officially recommended by Anthropic (shell preferred)"
   coding_agents_matrix:
     url: "coding-agents-matrix.dev"
     github: "github.com/PackmindHub/coding-agents-matrix"