marketing-shibata50/claude-code-ultimate-guide

Florian BRUNIAUX e60b24d27c docs(examples): add YAML frontmatter to 19 miscellaneous example files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-15 19:21:00 +01:00

5.3 KiB

Raw Permalink Blame History

title

description

Analytics Agent Evaluation Report

Month: [YYYY-MM] Report Date: [YYYY-MM-DD] Evaluator: [Your Name] Agent Version: 1.0

Executive Summary

[2-3 sentence overview of agent performance this month]

Key Metrics:

Total queries: [X]
Safety pass rate: [Y]%
Avg execution time: [Z]s

Status: 🟢 Healthy / 🟡 Needs Attention / 🔴 Critical

Metrics Overview

Volume

Metric	Value
Total queries generated	[X]
Unique users/sessions	[Y]
Queries per day (avg)	[Z]
Growth vs last month	[+/-]%

Quality Metrics

Metric	Target	Actual	Status
Safety pass rate	>95%	[X]%	🟢/🟡/🔴
Query correctness	>90%	[Y]%	🟢/🟡/🔴
User satisfaction	>4.0/5	[Z]/5	🟢/🟡/🔴

Performance Metrics

Metric	Target	Actual	Status
Mean execution time	<3s	[X]s	🟢/🟡/🔴
P95 execution time	<5s	[Y]s	🟢/🟡/🔴
P99 execution time	<10s	[Z]s	🟢/🟡/🔴

Safety Analysis

Safety Check Results

Total: [X] queries
- PASS: [Y] ([Z]%)
- FAIL: [A] ([B]%)

Top Safety Failures

[Failure Type] - [X] occurrences
- Example: [SQL query snippet]
- Root cause: [Brief explanation]
- Action: [What was done to fix]
[Failure Type] - [Y] occurrences
- Example: [SQL query snippet]
- Root cause: [Brief explanation]
- Action: [What was done to fix]

Trends

[Graph or description showing safety pass rate over time]

Performance Analysis

Execution Time Distribution

Mean:   [X]s
Median: [Y]s
P95:    [Z]s
P99:    [A]s
Max:    [B]s

Slowest Queries

[Query description] - [X]s
```
[SQL query]
```
- Reason: [Why slow]
- Optimization: [What could improve it]
[Query description] - [Y]s
```
[SQL query]
```
- Reason: [Why slow]
- Optimization: [What could improve it]

User Feedback

Explicit Feedback

Positive: [X] responses
- Common praise: "[Theme 1]", "[Theme 2]"
Negative: [Y] responses
- Common complaints: "[Theme 1]", "[Theme 2]"

Implicit Signals

Query retry rate: [X]% (users re-running queries)
Query modification rate: [Y]% (users editing generated queries)
Adoption rate: [Z] queries/user/week

Notable Feedback

"[User quote 1]" — [User name/role, if available]

"[User quote 2]" — [User name/role, if available]

Incident Log

Critical Issues

Date	Issue	Impact	Resolution
[YYYY-MM-DD]	[Brief description]	[High/Medium/Low]	[What was done]

Near-Misses

[List of queries that almost caused problems but were caught by safety checks]

Improvements Made

Agent Instruction Updates

[Update 1]
- Reason: [Why needed]
- Change: [What was modified in agent instructions]
- Impact: [Expected improvement]
[Update 2]
- Reason: [Why needed]
- Change: [What was modified]
- Impact: [Expected improvement]

Hook/Metrics Updates

[Any changes to metrics collection or analysis]

A/B Test Results (if applicable)

Test: [Description]

Period: [Start date] to [End date]

Variants:

Control (A): [Description]
Experiment (B): [Description]

Metrics:

Metric	Control (A)	Experiment (B)	Change
Safety pass rate	[X]%	[Y]%	[+/-]%
Avg exec time	[X]s	[Y]s	[+/-]s
User satisfaction	[X]/5	[Y]/5	[+/-]

Decision: ✅ Promote B / ❌ Keep A / ⏸️ Needs more data

Rationale: [Why this decision]

Recommendations

High Priority

[Recommendation 1]
- Current state: [Problem description]
- Proposed change: [What to do]
- Expected impact: [Improvement estimate]
- Effort: Low/Medium/High

Medium Priority

[Recommendation 2]
- Current state: [Problem description]
- Proposed change: [What to do]
- Expected impact: [Improvement estimate]
- Effort: Low/Medium/High

Low Priority / Future

[Quick list of nice-to-have improvements]

Next Month Goals

[Goal 1]: [Specific, measurable target]
[Goal 2]: [Specific, measurable target]
[Goal 3]: [Specific, measurable target]

Appendix

Methodology

Data sources:

.claude/logs/analytics-metrics.jsonl (automated metrics)
User feedback forms
Manual query reviews

Analysis tools:

eval/metrics.sh for automated reporting
SQL queries for deep-dive analysis
Manual review of safety failures

Limitations:

[Any known gaps in data collection]
[Potential biases in analysis]

Raw Data

Export: analytics-metrics-[YYYY-MM].json

Query:

jq 'select(.timestamp >= "2026-MM-01" and .timestamp < "2026-MM+1-01")' \
  .claude/logs/analytics-metrics.jsonl > analytics-metrics-2026-MM.json

Previous Reports: [Link to folder with past reports]

Questions? Contact [evaluation team email/slack]

5.3 KiB Raw Permalink Blame History