claude-code-ultimate-guide/guide/README.md
Florian BRUNIAUX ef7cdd899e release: v3.24.0 - Agent Evaluation Framework
Major addition: Complete agent evaluation framework with production-ready template.

## Added

- **Resource Evaluation**: nao framework (score 3/5)
  - Identified critical gap: agent evaluation not documented
  - Technical challenge adjusted score 2/5 → 3/5
  - All claims fact-checked (TypeScript 58.9%, Python 38.5%)

- **Guide Section**: Agent Evaluation (guide/agent-evaluation.md, ~3K tokens)
  - Metrics: response quality, tool usage, performance, satisfaction
  - Patterns: logging hooks, unit tests, A/B testing, feedback loops
  - Example: analytics agent with built-in metrics
  - Tools: nao framework reference, Claude Code hooks integration

- **AI Ecosystem**: Section 8.2 Domain-Specific Agent Frameworks
  - nao (Analytics Agents): Database-agnostic, built-in evaluation
  - Transposable patterns: context builder, evaluation hooks, DB integrations

- **Template**: Analytics Agent with Evaluation (5 files, ~1K lines)
  - README: setup, usage, troubleshooting
  - Agent: SQL generator with evaluation criteria, safety rules
  - Hook: automated metrics logging (safety, performance, errors)
  - Script: analysis with stats, safety reports, recommendations
  - Report template: monthly evaluation format

## Changed

- Agent Evaluation Guide: updated template references, verified links
- Landing Site: templates count 110 → 114
- Version: 3.23.5 → 3.24.0

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-10 11:52:13 +01:00

65 lines
4.5 KiB
Markdown

# Guide Documentation
Core documentation for mastering Claude Code.
## Contents
| File | Description | Time |
|------|-------------|------|
| [ultimate-guide.md](./ultimate-guide.md) | Complete reference covering all Claude Code features | ~3 hours |
| [mcp-servers-ecosystem.md](./mcp-servers-ecosystem.md) | **Community MCP servers**: 8 validated servers (Playwright, Semgrep, Kubernetes, etc.) with production configs | 25 min |
| [third-party-tools.md](./third-party-tools.md) | **Community tools**: GUIs, TUIs, config managers, token trackers, alternative UIs | 15 min |
| [claude-code-releases.md](./claude-code-releases.md) | Official release history (condensed) | 10 min |
| [known-issues.md](./known-issues.md) | **Critical bugs tracker**: security issues, token consumption, verified community reports | 15 min |
| [cheatsheet.md](./cheatsheet.md) | 1-page printable quick reference | 5 min |
| [visual-reference.md](./visual-reference.md) | Visual cheatsheet — ASCII diagrams for key concepts | 5 min |
| [architecture.md](./architecture.md) | How Claude Code works internally (master loop, tools, context) | 25 min |
| [learning-with-ai.md](./learning-with-ai.md) | Guide for juniors on using AI without losing skills | 15 min |
| [adoption-approaches.md](./adoption-approaches.md) | Implementation strategies for teams | 15 min |
| [agent-evaluation.md](./agent-evaluation.md) | **Agent quality metrics**: Measuring custom agent effectiveness with hooks, tests, and feedback loops | 20 min |
| [data-privacy.md](./data-privacy.md) | Data retention and privacy guide | 10 min |
| [observability.md](./observability.md) | Session monitoring and cost tracking | 15 min |
| [methodologies.md](./methodologies.md) | 15 development methodologies reference (TDD, SDD, BDD, etc.) | 20 min |
| [security-hardening.md](./security-hardening.md) | Security threats, MCP vetting, injection defense | 25 min |
| [ai-traceability.md](./ai-traceability.md) | AI attribution, disclosure policies, git-ai, compliance | 20 min |
| [devops-sre.md](./devops-sre.md) | FIRE framework for infrastructure diagnosis and incident response | 30 min |
| [sandbox-isolation.md](./sandbox-isolation.md) | Docker Sandboxes, cloud alternatives, safe autonomy workflows | 10 min |
| [ai-ecosystem.md](./ai-ecosystem.md) | Complementary AI tools (Perplexity, Gemini, Kimi, NotebookLM, TTS) | 30 min |
| [cowork.md](./cowork.md) | Claude Cowork: Summary (see [dedicated repo](https://github.com/FlorianBruniaux/claude-cowork-guide) for full docs) | 5 min |
| [workflows/](./workflows/) | Practical workflow guides for Claude Code | 30 min |
### Cowork Documentation
For knowledge workers using Claude Cowork (agentic desktop):
| Resource | Description |
|----------|-------------|
| **[Cowork Hub](https://github.com/FlorianBruniaux/claude-cowork-guide/blob/main/README.md)** | Complete Cowork documentation |
| [Getting Started](https://github.com/FlorianBruniaux/claude-cowork-guide/blob/main/guide/01-getting-started.md) | Setup and first workflow |
| [Capabilities](https://github.com/FlorianBruniaux/claude-cowork-guide/blob/main/guide/02-capabilities.md) | What Cowork can/cannot do |
| [Security Guide](https://github.com/FlorianBruniaux/claude-cowork-guide/blob/main/guide/03-security.md) | Safe usage practices |
| [Prompt Library](https://github.com/FlorianBruniaux/claude-cowork-guide/tree/main/prompts) | 50+ ready-to-use prompts |
| [Cheatsheet](https://github.com/FlorianBruniaux/claude-cowork-guide/blob/main/reference/cheatsheet.md) | 1-page quick reference |
### Workflows
Hands-on guides for effective development patterns:
| File | Description |
|------|-------------|
| [workflows/tdd-with-claude.md](./workflows/tdd-with-claude.md) | Test-Driven Development with Claude |
| [workflows/spec-first.md](./workflows/spec-first.md) | Spec-First Development (SDD) |
| [workflows/plan-driven.md](./workflows/plan-driven.md) | Using /plan mode effectively |
| [workflows/iterative-refinement.md](./workflows/iterative-refinement.md) | Iterative improvement loops |
| [workflows/tts-setup.md](./workflows/tts-setup.md) | Add text-to-speech narration to Claude Code (18 min) |
| [workflows/task-management.md](./workflows/task-management.md) | Multi-session task tracking, TodoWrite migration |
## Recommended Reading Order
1. **New users**: Start with Quick Start section in `ultimate-guide.md`
2. **Daily reference**: Print `cheatsheet.md`
3. **Team leads**: Read `adoption-approaches.md` for rollout strategies
---
*Back to [main README](../README.md)*