release: v3.24.0 - Agent Evaluation Framework

Major addition: Complete agent evaluation framework with production-ready template. ## Added - **Resource Evaluation**: nao framework (score 3/5) - Identified critical gap: agent evaluation not documented - Technical challenge adjusted score 2/5 → 3/5 - All claims fact-checked (TypeScript 58.9%, Python 38.5%) - **Guide Section**: Agent Evaluation (guide/agent-evaluation.md, ~3K tokens) - Metrics: response quality, tool usage, performance, satisfaction - Patterns: logging hooks, unit tests, A/B testing, feedback loops - Example: analytics agent with built-in metrics - Tools: nao framework reference, Claude Code hooks integration - **AI Ecosystem**: Section 8.2 Domain-Specific Agent Frameworks - nao (Analytics Agents): Database-agnostic, built-in evaluation - Transposable patterns: context builder, evaluation hooks, DB integrations - **Template**: Analytics Agent with Evaluation (5 files, ~1K lines) - README: setup, usage, troubleshooting - Agent: SQL generator with evaluation criteria, safety rules - Hook: automated metrics logging (safety, performance, errors) - Script: analysis with stats, safety reports, recommendations - Report template: monthly evaluation format ## Changed - Agent Evaluation Guide: updated template references, verified links - Landing Site: templates count 110 → 114 - Version: 3.23.5 → 3.24.0 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-10 11:52:13 +01:00 · 2026-02-10 11:52:13 +01:00 · ef7cdd899e
commit ef7cdd899e
parent 1fb783ebb8
15 changed files with 1782 additions and 16 deletions
--- a/machine-readable/reference.yaml
+++ b/machine-readable/reference.yaml
@ -3,7 +3,7 @@
 # Source: guide/ultimate-guide.md
 # Purpose: Condensed index for LLMs to quickly answer user questions about Claude Code

-version: "3.23.5"
+version: "3.24.0"
 updated: "2026-02-09"

 # ════════════════════════════════════════════════════════════════
@ -173,7 +173,7 @@ deep_dive:
  third_party_toad: "https://github.com/batrachianai/toad"
  third_party_conductor: "https://docs.conductor.build"
  # Configuration Management & Backup (Added 2026-02-02)
-  config_management_guide: "guide/ultimate-guide.md:4085"  # Section 3.23.5
+  config_management_guide: "guide/ultimate-guide.md:4085"  # Section 3.24.0
  config_hierarchy: "guide/ultimate-guide.md:4095"  # Global → Project → Local precedence
  config_git_strategy_project: "guide/ultimate-guide.md:4110"  # What to commit in .claude/
  config_git_strategy_global: "guide/ultimate-guide.md:4133"  # Version control ~/.claude/
@ -1169,7 +1169,7 @@ ecosystem:
      - "Cross-links modified → Update all 4 repos"
    history:
      - date: "2026-01-20"
-        event: "Code Landing sync v3.23.5, 66 templates, cross-links"
+        event: "Code Landing sync v3.24.0, 66 templates, cross-links"
        commit: "5b5ce62"
      - date: "2026-01-20"
        event: "Cowork Landing fix (paths, README, UI badges)"
@ -1181,7 +1181,7 @@ ecosystem:
 onboarding_matrix_meta:
  version: "2.0.0"
  last_updated: "2026-02-05"
-  aligned_with_guide: "3.23.5"
+  aligned_with_guide: "3.24.0"
  changelog:
    - version: "2.0.0"
      date: "2026-02-05"
@ -1209,7 +1209,7 @@ onboarding_matrix:
      core: [rules, sandbox_native_guide, commands]
      time_budget: "5 min"
      topics_max: 3
-      note: "SECURITY FIRST - sandbox before commands (v3.23.5 critical fix)"
+      note: "SECURITY FIRST - sandbox before commands (v3.24.0 critical fix)"

    beginner_15min:
      core: [rules, sandbox_native_guide, workflow, essential_commands]
@ -1294,7 +1294,7 @@ onboarding_matrix:
        - default: agent_validation_checklist
      time_budget: "60 min"
      topics_max: 6
-      note: "Dual-instance pattern for quality workflows (v3.23.5)"
+      note: "Dual-instance pattern for quality workflows (v3.24.0)"

  learn_security:
    intermediate_30min:
@ -1305,7 +1305,7 @@ onboarding_matrix:
        - default: permission_modes
      time_budget: "30 min"
      topics_max: 4
-      note: "NEW goal (v3.23.5) - Security-focused learning path"
+      note: "NEW goal (v3.24.0) - Security-focused learning path"

    power_60min:
      core: [sandbox_native_guide, mcp_secrets_management, security_hardening]
@ -1330,7 +1330,7 @@ onboarding_matrix:
      core: [rules, sandbox_native_guide, workflow, essential_commands, context_management, plan_mode]
      time_budget: "60 min"
      topics_max: 6
-      note: "Security foundation + core workflow (v3.23.5 sandbox added)"
+      note: "Security foundation + core workflow (v3.24.0 sandbox added)"

    intermediate_120min:
      core: [plan_mode, agents, skills, config_hierarchy, git_mcp_guide, hooks, mcp_servers]