feat: improve skill descriptions from PR #9 (selective merge)
Cherry-pick description improvements and allowed-tools fixes from @popey's PR #9, while preserving reference documentation in skills that serve as templates (audit-agents-skills, ccboard, design-patterns).

Co-Authored-By: Alan Pope <alan@popey.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
parent
be52e232b3
commit
40213f0a7e
23 changed files with 1994 additions and 197 deletions
76
examples/memory/icm-session-starter.md
Normal file
|
|
@ -0,0 +1,76 @@
|
|||
# ICM Session Starter
|
||||
> Paste this at the beginning of any Claude Code session to activate ICM context.
|
||||
> Requires ICM installed and configured: `brew tap rtk-ai/tap && brew install icm`
|
||||
> then `icm init --mode mcp && icm init --mode hook && icm init --mode skill`
|
||||
|
||||
---
|
||||
|
||||
# Context — ICM (Infinite Context Memory) active in this session
|
||||
|
||||
ICM is installed and configured on this machine. Use it to store and retrieve persistent
|
||||
memory across sessions, bypassing context window limits.
|
||||
|
||||
## Available MCP tools
|
||||
|
||||
The `icm` MCP server is running. You have access to 22 `icm_*` tools for storing,
|
||||
recalling, and managing persistent memory.
|
||||
|
||||
**Direct CLI** (via Bash if needed):
|
||||
|
||||
```bash
# Store a memory (--importance is one of: critical, high, medium, low)
icm store --topic "<project-slug>" --content "<fact>" --importance high

# Recall by semantic query
icm recall "<natural language query>"

# Inspect
icm stats    # count, topics, avg weight
icm topics   # list all topics
icm list     # list recent memories

# Manage
icm forget <id>   # delete by ID
icm decay         # apply temporal decay
icm prune         # remove low-weight entries
```
|
||||
|
||||
**Important (v0.5.0 syntax)**:
|
||||
- `--importance` is an enum: `critical / high / medium / low` — not a float
|
||||
- No `memory` subcommand — use `icm store`, `icm recall` directly
|
||||
- Permanent knowledge graph: `icm memoir` (separate layer, no decay)
|
||||
|
||||
## Slash commands
|
||||
|
||||
- `/recall <query>` — search ICM memory
|
||||
- `/remember <content>` — store a memory in ICM
|
||||
|
||||
## How memories work
|
||||
|
||||
Two layers:
|
||||
- **Memories** (episodic): timestamped entries with temporal decay based on importance.
|
||||
`critical` importance never decays. `low` fades over time.
|
||||
- **Memoir** (semantic): permanent knowledge graph with typed relations
|
||||
(`depends_on`, `contradicts`, `superseded_by`, `part_of`, and 5 others).
|
||||
|
||||
Search is hybrid: BM25 full-text (30%) + vector similarity (70%).
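To make the 30/70 split concrete, here is a minimal sketch of how two scores could be blended. It assumes both scores are already normalized to the range 0 to 1. The function name `hybrid_score`, the normalization, and the sample candidates are illustrative assumptions, not ICM's actual implementation.

```python
def hybrid_score(bm25_score: float, vector_score: float,
                 bm25_weight: float = 0.3, vector_weight: float = 0.7) -> float:
    """Blend a full-text score and a vector-similarity score.

    Assumption: both inputs are already normalized to [0, 1].
    The 0.3 / 0.7 defaults mirror the split described above.
    """
    return bm25_weight * bm25_score + vector_weight * vector_score


# Hypothetical candidates: (memory id, normalized BM25, normalized similarity)
candidates = [("m1", 0.9, 0.2), ("m2", 0.4, 0.8), ("m3", 0.1, 0.5)]

# Rank by blended score, best first
ranked = sorted(candidates, key=lambda c: hybrid_score(c[1], c[2]), reverse=True)
```

With this weighting, a memory that matches semantically but shares few exact words can still outrank one with a strong keyword match.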
|
||||
|
||||
A `PostToolUse` hook runs automatically — every N tool calls, ICM extracts context
|
||||
and stores it without any explicit action from you.
|
||||
|
||||
## Suggested usage in this session
|
||||
|
||||
```bash
# At session start — recall relevant context
icm recall "<current feature or topic>"

# When making a key decision
icm store --topic "<project>" --content "<decision and rationale>" --importance high

# For permanent architectural facts
icm memoir add-concept -m "<project>" -n "<concept>"
```
|
||||
|
||||
## DB location
|
||||
|
||||
`~/Library/Application Support/dev.icm.icm/memories.db`
|
||||
|
|
@ -8,9 +8,25 @@ version: 1.0.0
|
|||
tags: [quality, audit, agents, skills, validation, production-readiness]
|
||||
---
|
||||
|
||||
# Audit Agents/Skills/Commands
|
||||
# Audit Agents/Skills/Commands (Advanced Skill)
|
||||
|
||||
Score Claude Code agents, skills, and commands across 16 weighted criteria. Outputs production readiness grades (A-F) with actionable fix suggestions.
|
||||
Comprehensive quality audit system for Claude Code agents, skills, and commands. Provides quantitative scoring, comparative analysis, and production readiness grading based on industry best practices.
|
||||
|
||||
## Purpose
|
||||
|
||||
**Problem**: Manual validation of agents/skills is error-prone and inconsistent. According to the LangChain Agent Report 2026, 29.5% of organizations deploy agents without systematic evaluation, leading to "agent bugs" as the top challenge (18% of teams).
|
||||
|
||||
**Solution**: Automated quality scoring across 16 weighted criteria with production readiness thresholds (80% = Grade B minimum for production deployment).
|
||||
|
||||
**Key Features**:
|
||||
- Quantitative scoring (32 points for agents/skills, 20 for commands)
|
||||
- Weighted criteria (Identity 3x, Prompt 2x, Validation 1x, Design 2x)
|
||||
- Production readiness grading (A-F scale with 80% threshold)
|
||||
- Comparative analysis vs reference templates
|
||||
- JSON/Markdown dual output for programmatic integration
|
||||
- Fix suggestions for failing criteria
|
||||
|
||||
---
|
||||
|
||||
## Modes
|
||||
|
||||
|
|
@ -22,126 +38,420 @@ Score Claude Code agents, skills, and commands across 16 weighted criteria. Outp
|
|||
|
||||
**Default**: Full Audit (recommended for first run)
|
||||
|
||||
---
|
||||
|
||||
## Methodology
|
||||
|
||||
### Why These Criteria?
|
||||
|
||||
The 16-criteria framework is derived from:
|
||||
1. **Claude Code Best Practices** (Ultimate Guide line 4921: Agent Validation Checklist)
|
||||
2. **Industry Data** (LangChain Agent Report 2026: evaluation gaps)
|
||||
3. **Production Failures** (Community feedback on hardcoded paths, missing error handling)
|
||||
4. **Composition Patterns** (Skills should reference other skills, agents should be modular)
|
||||
|
||||
### Scoring Philosophy
|
||||
|
||||
**Weight Rationale**:
|
||||
- **Identity (3x)**: If users can't find/invoke the agent, quality is irrelevant (discoverability > quality)
|
||||
- **Prompt (2x)**: Determines reliability and accuracy of outputs
|
||||
- **Validation (1x)**: Improves robustness but is secondary to core functionality
|
||||
- **Design (2x)**: Impacts long-term maintainability and scalability
|
||||
|
||||
**Grade Standards**:
|
||||
- **A (90-100%)**: Production-ready, minimal risk
|
||||
- **B (80-89%)**: Good, meets production threshold
|
||||
- **C (70-79%)**: Needs improvement before production
|
||||
- **D (60-69%)**: Significant gaps, not production-ready
|
||||
- **F (<60%)**: Critical issues, requires major refactoring
|
||||
|
||||
**Industry Alignment**: The 80% threshold aligns with software engineering best practices for production deployment (e.g., code coverage >80%, security scan pass rates).
|
||||
|
||||
---
|
||||
|
||||
## Workflow
|
||||
|
||||
### Phase 1: Discovery
|
||||
|
||||
Scan and classify files from:
|
||||
```
|
||||
.claude/agents/ .claude/skills/ .claude/commands/
|
||||
examples/agents/ examples/skills/ examples/commands/
|
||||
```
|
||||
1. **Scan directories**:
|
||||
```
|
||||
.claude/agents/
|
||||
.claude/skills/
|
||||
.claude/commands/
|
||||
examples/agents/ (if exists)
|
||||
examples/skills/ (if exists)
|
||||
examples/commands/ (if exists)
|
||||
```
|
||||
|
||||
**Checkpoint**: Confirm file count and types before proceeding to scoring.
|
||||
2. **Classify files** by type (agent/skill/command)
|
||||
|
||||
### Phase 2: Scoring
|
||||
3. **Load reference templates** (for Comparative mode):
|
||||
```
|
||||
guide/examples/agents/ (benchmark files)
|
||||
guide/examples/skills/ (benchmark files)
|
||||
guide/examples/commands/ (benchmark files)
|
||||
```
|
||||
|
||||
### Phase 2: Scoring Engine
|
||||
|
||||
Load scoring criteria from `scoring/criteria.yaml`:
|
||||
|
||||
```yaml
agents:
  max_points: 32
  categories:
    identity:
      weight: 3
      criteria:
        - id: A1.1
          name: "Clear name"
          points: 3
          detection: "frontmatter.name exists and is descriptive"
        # ... (16 total criteria)
```
|
||||
|
||||
For each file:
|
||||
1. Parse YAML frontmatter
|
||||
1. Parse frontmatter (YAML)
|
||||
2. Extract content sections
|
||||
3. Run detection patterns (regex, keyword search)
|
||||
4. Calculate score: `(points / max_points) x 100`
|
||||
5. Assign grade: A (90-100%), B (80-89%), C (70-79%), D (60-69%), F (<60%)
|
||||
|
||||
**Checkpoint**: Verify at least one file scores successfully before batch processing.
|
||||
|
||||
**Grade threshold**: 80% (Grade B) = minimum for production deployment.
|
||||
4. Calculate score: `(points / max_points) × 100`
|
||||
5. Assign grade (A-F)
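The last two steps can be sketched as follows. This is a minimal illustration using the grade bands listed in this skill. The function names are hypothetical, and treating each band's lower bound as inclusive (so 89.5% is a B) is an assumption.

```python
def score_percent(points: float, max_points: float) -> float:
    """Score as a percentage: (points / max_points) x 100."""
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    return points / max_points * 100


def assign_grade(percent: float) -> str:
    """Map a percentage to a letter grade (lower bounds inclusive, an assumption)."""
    if percent >= 90:
        return "A"
    if percent >= 80:
        return "B"
    if percent >= 70:
        return "C"
    if percent >= 60:
        return "D"
    return "F"


def is_production_ready(percent: float, threshold: float = 80.0) -> bool:
    """Grade B (80%) is the stated minimum for production deployment."""
    return percent >= threshold
```

For example, an agent with 25 of 32 points scores 78.125%, which is a C and falls below the production threshold.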
|
||||
|
||||
### Phase 3: Comparative Analysis (Comparative Mode Only)
|
||||
|
||||
1. Match each project file to closest reference template
|
||||
For each project file:
|
||||
1. Find closest matching template (by description similarity)
|
||||
2. Compare scores per criterion
|
||||
3. Flag gaps >10 points
|
||||
3. Identify gaps: `template_score - project_score`
|
||||
4. Flag significant gaps (>10 points difference)
|
||||
|
||||
**Example**:
|
||||
```
|
||||
Project file: .claude/agents/debugging-specialist.md (Score: 78%, Grade C)
|
||||
Closest template: examples/agents/debugging-specialist.md (Score: 94%, Grade A)
|
||||
|
||||
Gaps:
|
||||
- Anti-hallucination measures: -2 points (template has, project missing)
|
||||
- Edge cases documented: -1 point (template has 5 examples, project has 1)
|
||||
- Integration documented: -1 point (template references 3 skills, project none)
|
||||
|
||||
Total gap: 16 percentage points (78% vs 94%; explains C vs A difference)
|
||||
```
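The gap calculation above can be sketched as below. The criterion keys and function names are hypothetical, and applying the ">10 points" rule to the overall percentage scores is an assumption, since the threshold's unit is not spelled out.

```python
def criterion_gaps(project: dict, template: dict) -> dict:
    """Per-criterion gap (template_score - project_score), positive gaps only."""
    gaps = {}
    for criterion, template_points in template.items():
        gap = template_points - project.get(criterion, 0)
        if gap > 0:
            gaps[criterion] = gap
    return gaps


def is_significant_gap(project_percent: float, template_percent: float,
                       threshold: float = 10.0) -> bool:
    """Flag a file whose overall score trails its template by more than threshold."""
    return (template_percent - project_percent) > threshold


# Hypothetical per-criterion points for one project file and its template
template_points = {"clear_name": 3, "anti_hallucination": 2, "edge_cases": 1, "integration": 1}
project_points = {"clear_name": 3}

gaps = criterion_gaps(project_points, template_points)
```

Under these assumptions the 78% vs 94% pair is flagged, while an 85% vs 91% pair is not.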
|
||||
|
||||
### Phase 4: Report Generation
|
||||
|
||||
Output both `audit-report.md` (human-readable) and `audit-report.json` (programmatic):
|
||||
**Markdown Report** (`audit-report.md`):
|
||||
- Summary table (overall + by type)
|
||||
- Individual scores with top issues
|
||||
- Detailed breakdown per file (collapsible)
|
||||
- Prioritized recommendations
|
||||
|
||||
**JSON Output** (`audit-report.json`):
|
||||
```json
|
||||
{
|
||||
"metadata": {
|
||||
"project_path": "/path/to/project",
|
||||
"audit_date": "2026-02-07",
|
||||
"mode": "full",
|
||||
"version": "1.0.0"
|
||||
},
|
||||
"summary": {
|
||||
"overall_score": 82.5,
|
||||
"overall_grade": "B",
|
||||
"total_files": 15,
|
||||
"production_ready_count": 10
|
||||
"production_ready_count": 10,
|
||||
"production_ready_percentage": 66.7
|
||||
},
|
||||
"by_type": {
|
||||
"agents": { "count": 5, "avg_score": 85.2, "grade": "B" },
|
||||
"skills": { "count": 8, "avg_score": 78.9, "grade": "C" },
|
||||
"commands": { "count": 2, "avg_score": 92.0, "grade": "A" }
|
||||
},
|
||||
"files": [
|
||||
{
|
||||
"path": ".claude/agents/debugging-specialist.md",
|
||||
"type": "agent",
|
||||
"score": 78.1,
|
||||
"grade": "C",
|
||||
"points_obtained": 25,
|
||||
"points_max": 32,
|
||||
"failed_criteria": [
|
||||
{ "id": "A2.4", "name": "Anti-hallucination measures", "points_lost": 2 }
|
||||
{
|
||||
"id": "A2.4",
|
||||
"name": "Anti-hallucination measures",
|
||||
"points_lost": 2,
|
||||
"recommendation": "Add section on source verification"
|
||||
}
|
||||
]
|
||||
}
|
||||
],
|
||||
"top_issues": [
|
||||
{
|
||||
"issue": "Missing error handling",
|
||||
"affected_files": 8,
|
||||
"impact": "Runtime failures unhandled",
|
||||
"priority": "high"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
**Checkpoint**: Verify report file is written and contains all scanned files.
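As a rough sketch, the `summary` block could be derived from the per-file entries like this. Using an unweighted mean for `overall_score` is an assumption (the aggregation is not specified here), and `summarize` is a hypothetical helper name.

```python
def summarize(files: list, threshold: float = 80.0) -> dict:
    """Build summary fields from per-file results (each a dict with a 'score')."""
    total = len(files)
    ready = sum(1 for f in files if f["score"] >= threshold)
    # Assumption: overall score is the plain average of file scores
    overall = sum(f["score"] for f in files) / total if total else 0.0
    return {
        "overall_score": round(overall, 1),
        "total_files": total,
        "production_ready_count": ready,
        "production_ready_percentage": round(ready / total * 100, 1) if total else 0.0,
    }


# Hypothetical input: 10 files at 90% and 5 files at 60%
sample_files = [{"score": 90.0}] * 10 + [{"score": 60.0}] * 5
summary = summarize(sample_files)
```

With 10 of 15 files at or above the threshold, `production_ready_percentage` comes out at 66.7, matching the figure in the example report.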
|
||||
|
||||
### Phase 5: Fix Suggestions (Optional)
|
||||
|
||||
For each failing criterion, generate actionable fix with the specific section to add and detection keywords to verify the fix.
|
||||
For each failing criterion, generate **actionable fix**:
|
||||
|
||||
```markdown
|
||||
### File: .claude/agents/debugging-specialist.md
|
||||
**Issue**: Missing anti-hallucination measures (2 points lost)
|
||||
|
||||
**Fix**:
|
||||
Add this section after "Methodology":
|
||||
|
||||
## Source Verification
|
||||
|
||||
- Always cite sources for technical claims
|
||||
- Use phrases: "According to [documentation]...", "Based on [tool output]..."
|
||||
- If uncertain, state: "I don't have verified information on..."
|
||||
- Never invent: statistics, version numbers, API signatures, stack traces
|
||||
|
||||
**Detection**: Grep for keywords: "verify", "cite", "source", "evidence"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Scoring Criteria
|
||||
|
||||
See `scoring/criteria.yaml` for complete definitions. Summary:
|
||||
|
||||
### Agents (32 points max)
|
||||
|
||||
| Category | Weight | Max Points | Key Criteria |
|
||||
|----------|--------|------------|--------------|
|
||||
| Identity | 3x | 12 | Clear name, description with triggers, role defined |
|
||||
| Prompt Quality | 2x | 8 | 3+ examples, anti-hallucination measures |
|
||||
| Validation | 1x | 4 | Error handling, no hardcoded paths |
|
||||
| Design | 2x | 8 | Single responsibility, integration documented |
|
||||
| Category | Weight | Criteria Count | Max Points |
|
||||
|----------|--------|----------------|------------|
|
||||
| Identity | 3x | 4 | 12 |
|
||||
| Prompt Quality | 2x | 4 | 8 |
|
||||
| Validation | 1x | 4 | 4 |
|
||||
| Design | 2x | 4 | 8 |
|
||||
|
||||
**Key Criteria**:
|
||||
- Clear name (3 pts): Not generic like "agent1"
|
||||
- Description with triggers (3 pts): Contains "when"/"use"
|
||||
- Role defined (2 pts): "You are..." statement
|
||||
- 3+ examples (1 pt): Usage scenarios documented
|
||||
- Single responsibility (2 pts): Focused, not "general purpose"
|
||||
|
||||
### Skills (32 points max)
|
||||
|
||||
| Category | Weight | Max Points | Key Criteria |
|
||||
|----------|--------|------------|--------------|
|
||||
| Structure | 3x | 12 | Valid SKILL.md, valid name, methodology section |
|
||||
| Content | 2x | 8 | Clear triggers, usage examples |
|
||||
| Technical | 1x | 4 | No hardcoded paths, token budget |
|
||||
| Design | 2x | 8 | Modular, references other skills |
|
||||
| Category | Weight | Criteria Count | Max Points |
|
||||
|----------|--------|----------------|------------|
|
||||
| Structure | 3x | 4 | 12 |
|
||||
| Content | 2x | 4 | 8 |
|
||||
| Technical | 1x | 4 | 4 |
|
||||
| Design | 2x | 4 | 8 |
|
||||
|
||||
**Key Criteria**:
|
||||
- Valid SKILL.md (3 pts): Proper naming
|
||||
- Name valid (3 pts): Lowercase, 1-64 chars, no spaces
|
||||
- Methodology described (2 pts): Workflow section exists
|
||||
- No hardcoded paths (1 pt): No `/Users/`, `/home/`
|
||||
- Clear triggers (2 pts): "When to use" section
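Two of these checks are simple enough to sketch directly. The helpers below are illustrative, not the skill's actual detection code: the name check encodes only the three stated rules (lowercase, 1-64 characters, no spaces), and the path check looks only for the two prefixes named above.

```python
import re

# Only the two prefixes named in the criteria; extend as needed
HARDCODED_PATH = re.compile(r"/(?:Users|home)/")


def is_valid_skill_name(name: str) -> bool:
    """Lowercase, 1-64 characters, no whitespace."""
    return (
        1 <= len(name) <= 64
        and name == name.lower()
        and not any(ch.isspace() for ch in name)
    )


def has_hardcoded_paths(text: str) -> bool:
    """True if the text contains a /Users/ or /home/ path segment."""
    return HARDCODED_PATH.search(text) is not None
```

A name such as `audit-agents-skills` passes, while `Audit Skills` fails on both case and whitespace.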
|
||||
|
||||
### Commands (20 points max)
|
||||
|
||||
| Category | Weight | Max Points | Key Criteria |
|
||||
|----------|--------|------------|--------------|
|
||||
| Structure | 3x | 12 | Valid frontmatter, argument hint, step-by-step |
|
||||
| Quality | 2x | 8 | Error handling, mentions failure modes |
|
||||
| Category | Weight | Criteria Count | Max Points |
|
||||
|----------|--------|----------------|------------|
|
||||
| Structure | 3x | 4 | 12 |
|
||||
| Quality | 2x | 4 | 8 |
|
||||
|
||||
**Key Criteria**:
|
||||
- Valid frontmatter (3 pts): name + description
|
||||
- Argument hint (3 pts): If uses `$ARGUMENTS`
|
||||
- Step-by-step workflow (3 pts): Numbered sections
|
||||
- Error handling (2 pts): Mentions failure modes
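The category tables are consistent with max points = weight x criteria count. The sketch below checks that reading against the stated totals. It is an inference from the numbers in the tables, not a rule quoted from `scoring/criteria.yaml`, and the `CATEGORIES` structure is hypothetical.

```python
# (weight, criteria count) per category, copied from the tables above
CATEGORIES = {
    "agents": {"Identity": (3, 4), "Prompt Quality": (2, 4),
               "Validation": (1, 4), "Design": (2, 4)},
    "skills": {"Structure": (3, 4), "Content": (2, 4),
               "Technical": (1, 4), "Design": (2, 4)},
    "commands": {"Structure": (3, 4), "Quality": (2, 4)},
}


def max_points(file_type: str) -> int:
    """Sum of weight x criteria count over a file type's categories."""
    return sum(weight * count for weight, count in CATEGORIES[file_type].values())


totals = {file_type: max_points(file_type) for file_type in CATEGORIES}
```

This reproduces the 32 / 32 / 20 maxima given in the section headings.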
|
||||
|
||||
---
|
||||
|
||||
## Detection Patterns
|
||||
|
||||
### Frontmatter Parsing
|
||||
|
||||
```python
import re

import yaml  # PyYAML


def parse_frontmatter(content):
    """Return the YAML frontmatter as parsed data, or None if there is none."""
    match = re.search(r'^---\n(.*?)\n---', content, re.DOTALL)
    if match:
        return yaml.safe_load(match.group(1))
    return None
```
|
||||
|
||||
### Keyword Detection
|
||||
|
||||
```python
def has_keywords(text, keywords):
    """Case-insensitive substring match: True if any keyword occurs in text."""
    text_lower = text.lower()
    return any(kw in text_lower for kw in keywords)


# Example (description and content are strings extracted from the file)
has_trigger = has_keywords(description, ['when', 'use', 'trigger'])
has_error_handling = has_keywords(content, ['error', 'failure', 'fallback'])
```
|
||||
|
||||
### Overlap Detection (Duplication Check)
|
||||
|
||||
```python
def jaccard_similarity(text1, text2):
    """Jaccard index of the two texts' lowercase word sets (0 to 1)."""
    words1 = set(text1.lower().split())
    words2 = set(text2.lower().split())
    intersection = words1 & words2
    union = words1 | words2
    return len(intersection) / len(union) if union else 0


# Flag if similarity > 0.5 (50% keyword overlap)
if jaccard_similarity(desc1, desc2) > 0.5:
    issues.append("High overlap with another file")
```
|
||||
|
||||
### Token Counting (Approximate)
|
||||
|
||||
```python
def estimate_tokens(text):
    # Rough estimate: 1 token ≈ 0.75 words, i.e. about 1.3 tokens per word
    word_count = len(text.split())
    return int(word_count * 1.3)


# Check budget
tokens = estimate_tokens(file_content)
if tokens > 5000:
    issues.append("File too large (>5K tokens)")
```
|
||||
|
||||
---
|
||||
|
||||
## Industry Context
|
||||
|
||||
**Source**: LangChain Agent Report 2026 (public report, page 14-22)
|
||||
|
||||
**Key Findings**:
|
||||
- **29.5%** of organizations deploy agents without systematic evaluation
|
||||
- **18%** cite "agent bugs" as their primary challenge
|
||||
- **Only 12%** use automated quality checks (88% manual or none)
|
||||
- **43%** report difficulty maintaining agent quality over time
|
||||
- **Top issues**: Hallucinations (31%), poor error handling (28%), unclear triggers (22%)
|
||||
|
||||
**Implications**:
|
||||
1. **Automation gap**: Most teams rely on manual checklists (error-prone at scale)
|
||||
2. **Quality debt**: Agents deployed without validation accumulate technical debt
|
||||
3. **Maintenance burden**: 43% struggle with quality over time (no tracking system)
|
||||
|
||||
**This skill addresses**:
|
||||
- Automation: Replaces manual checklists with quantitative scoring
|
||||
- Tracking: JSON output enables trend analysis over time
|
||||
- Standards: 80% threshold provides clear production gate
|
||||
|
||||
---
|
||||
|
||||
## Output Examples
|
||||
|
||||
### Quick Audit (Top-5 Criteria)
|
||||
|
||||
```markdown
|
||||
# Quick Audit: Agents/Skills/Commands
|
||||
|
||||
**Files**: 15 (5 agents, 8 skills, 2 commands)
|
||||
**Critical Issues**: 3 files fail top-5 criteria
|
||||
|
||||
## Top-5 Criteria (Pass/Fail)
|
||||
|
||||
| File | Valid Name | Has Triggers | Error Handling | No Hardcoded Paths | Examples |
|
||||
|------|------------|--------------|----------------|--------------------|----------|
|
||||
| agent1.md | ✅ | ✅ | ❌ | ✅ | ❌ |
|
||||
| skill2/ | ✅ | ❌ | ✅ | ❌ | ✅ |
|
||||
|
||||
## Action Required
|
||||
|
||||
1. **Add error handling**: 5 files
|
||||
2. **Remove hardcoded paths**: 3 files
|
||||
3. **Add usage examples**: 4 files
|
||||
```
|
||||
|
||||
### Full Audit
|
||||
|
||||
See Phase 4: Report Generation above for full structure.
|
||||
|
||||
### Comparative (Full + Benchmarks)
|
||||
|
||||
```markdown
|
||||
# Comparative Audit
|
||||
|
||||
## Project vs Templates
|
||||
|
||||
| File | Project Score | Template Score | Gap | Top Missing |
|
||||
|------|---------------|----------------|-----|-------------|
|
||||
| debugging-specialist.md | 78% (C) | 94% (A) | -16 pts | Anti-hallucination, edge cases |
|
||||
| testing-expert/ | 85% (B) | 91% (A) | -6 pts | Integration docs |
|
||||
|
||||
## Recommendations
|
||||
|
||||
Focus on these gaps to reach template quality:
|
||||
1. **Anti-hallucination measures** (8 files): Add source verification sections
|
||||
2. **Edge case documentation** (5 files): Add failure scenario examples
|
||||
3. **Integration documentation** (4 files): List compatible agents/skills
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Usage
|
||||
|
||||
### Basic (Full Audit)
|
||||
|
||||
```bash
|
||||
# Full audit (default)
|
||||
# In Claude Code
|
||||
Use skill: audit-agents-skills
|
||||
|
||||
# Specify path
|
||||
Use skill: audit-agents-skills for ~/projects/my-app
|
||||
```
|
||||
|
||||
# Quick audit
|
||||
### With Options
|
||||
|
||||
```bash
|
||||
# Quick audit (fast)
|
||||
Use skill: audit-agents-skills with mode=quick
|
||||
|
||||
# Comparative with benchmarks
|
||||
# Comparative (benchmark analysis)
|
||||
Use skill: audit-agents-skills with mode=comparative
|
||||
|
||||
# Generate fixes
|
||||
Use skill: audit-agents-skills with fixes=true
|
||||
|
||||
# JSON output for CI/CD
|
||||
# Custom output path
|
||||
Use skill: audit-agents-skills with output=~/Desktop/audit.json
|
||||
```
|
||||
|
||||
### JSON Output Only
|
||||
|
||||
```bash
|
||||
# For programmatic integration
|
||||
Use skill: audit-agents-skills with format=json output=audit.json
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Integration with CI/CD
|
||||
|
||||
### Pre-commit Hook
|
||||
|
||||
```bash
#!/bin/bash
# .git/hooks/pre-commit

# Run quick audit on changed agent/skill/command files
changed_files=$(git diff --cached --name-only | grep -E "^\.claude/(agents|skills|commands)/")

if [ -n "$changed_files" ]; then
  echo "Running quick audit on changed files..."
  # Run audit (requires Claude Code CLI wrapper)
  # Exit with 1 if any file scores <80%
fi
```
|
||||
|
|
@ -158,10 +468,80 @@ jobs:
|
|||
- uses: actions/checkout@v3
|
||||
- name: Run quality audit
|
||||
run: |
|
||||
# Run audit skill, parse JSON, fail if overall_score < 80
|
||||
# Run audit skill
|
||||
# Parse JSON output
|
||||
# Fail if overall_score < 80
|
||||
```
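The "parse JSON, fail if below 80" step could look roughly like the sketch below. It assumes the report follows the `audit-report.json` structure shown in Phase 4. The helper name `gate_exit_code` is hypothetical, and how the audit itself is invoked from CI is left open.

```python
import json


def gate_exit_code(report_text: str, threshold: float = 80.0) -> int:
    """Return 0 if summary.overall_score meets the threshold, else 1."""
    report = json.loads(report_text)
    score = report["summary"]["overall_score"]
    return 0 if score >= threshold else 1


# Wiring it into a CI step (the file name is an assumption):
#   import sys
#   from pathlib import Path
#   sys.exit(gate_exit_code(Path("audit-report.json").read_text(encoding="utf-8")))
```

A non-zero exit code is what makes the workflow step fail.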
|
||||
|
||||
---
|
||||
|
||||
## Comparison: Command vs Skill
|
||||
|
||||
| Aspect | Command (`/audit-agents-skills`) | Skill (this file) |
|
||||
|--------|----------------------------------|-------------------|
|
||||
| **Scope** | Current project only | Multi-project, comparative |
|
||||
| **Output** | Markdown report | Markdown + JSON |
|
||||
| **Speed** | Fast (5-10 min) | Slower (10-20 min with comparative) |
|
||||
| **Depth** | Standard 16 criteria | Same + benchmark analysis |
|
||||
| **Fix suggestions** | Via `--fix` flag | Built-in with recommendations |
|
||||
| **Programmatic** | Terminal output | JSON for CI/CD integration |
|
||||
| **Best for** | Quick checks, dev workflow | Deep audits, quality tracking |
|
||||
|
||||
**Recommendation**: Use command for daily checks, skill for release gates and quality tracking.
|
||||
|
||||
---
|
||||
|
||||
## Maintenance
|
||||
|
||||
### Updating Criteria
|
||||
|
||||
Edit `scoring/criteria.yaml`:
|
||||
```yaml
agents:
  categories:
    identity:
      criteria:
        - id: A1.5  # New criterion
          name: "API versioning specified"
          points: 3
          detection: "mentions API version or compatibility"
```
|
||||
|
||||
Version bump: Increment `version` in frontmatter when criteria change.
|
||||
|
||||
### Adding File Types
|
||||
|
||||
To support new file types (e.g., "workflows"):
|
||||
1. Add to `scoring/criteria.yaml`:
|
||||
   ```yaml
   workflows:
     max_points: 24
     categories: [...]
   ```
|
||||
2. Update detection logic (file path patterns)
|
||||
3. Update report templates
|
||||
|
||||
---
|
||||
|
||||
## Related
|
||||
|
||||
- **Command version**: `.claude/commands/audit-agents-skills.md` (quick checks, dev workflow)
|
||||
- **Command version**: `.claude/commands/audit-agents-skills.md`
|
||||
- **Agent Validation Checklist**: guide line 4921 (manual 16 criteria)
|
||||
- **Skill Validation**: guide line 5491 (spec documentation)
|
||||
- **Reference templates**: `examples/agents/`, `examples/skills/`, `examples/commands/`
|
||||
|
||||
---
|
||||
|
||||
## Changelog
|
||||
|
||||
**v1.0.0** (2026-02-07):
|
||||
- Initial release
|
||||
- 16-criteria framework (agents/skills/commands)
|
||||
- 3 audit modes (quick/full/comparative)
|
||||
- JSON + Markdown output
|
||||
- Fix suggestions
|
||||
- Industry context (LangChain 2026 report)
|
||||
|
||||
---
|
||||
|
||||
**Skill ready for use**: `audit-agents-skills`
|
||||
|
|
|
|||
|
|
@ -9,20 +9,38 @@ tags: [dashboard, tui, monitoring, claude-code, costs]
|
|||
|
||||
# ccboard - Claude Code Dashboard
|
||||
|
||||
TUI/Web dashboard for monitoring Claude Code usage: sessions, costs, tokens, MCP servers, and configuration.
|
||||
Comprehensive TUI/Web dashboard for monitoring and managing your Claude Code usage.
|
||||
|
||||
## Prerequisites
|
||||
## Overview
|
||||
|
||||
ccboard provides a unified interface to visualize and explore all your Claude Code data:
|
||||
|
||||
- **Sessions**: Browse all conversations across your projects
|
||||
- **Statistics**: Real-time token usage, cache hit rates, activity trends
|
||||
- **MCP Servers**: Monitor and manage Model Context Protocol servers
|
||||
- **Costs**: Track spending with detailed token breakdown and pricing
|
||||
- **Configuration**: View cascading settings (Global > Project > Local)
|
||||
- **Hooks**: Explore pre/post execution hooks and automation
|
||||
- **Agents**: Manage custom agents, commands, and skills
|
||||
- **History**: Search across all messages with full-text search
|
||||
|
||||
## Installation
|
||||
|
||||
### Via Cargo (Recommended)
|
||||
|
||||
```bash
|
||||
# Using Claude Code command
|
||||
/ccboard-install
|
||||
|
||||
# Or manually
|
||||
cargo install ccboard
|
||||
```
|
||||
|
||||
### Requirements
|
||||
|
||||
- Rust 1.70+ and Cargo
|
||||
- Claude Code installed (reads from `~/.claude/`)
|
||||
|
||||
```bash
|
||||
# Install
|
||||
cargo install ccboard
|
||||
# Or via Claude Code command
|
||||
/ccboard-install
|
||||
```
|
||||
|
||||
## Commands
|
||||
|
||||
| Command | Description | Shortcut |
|
||||
|
|
@ -34,78 +52,347 @@ cargo install ccboard
|
|||
| `/ccboard-web` | Launch web UI | `ccboard web` |
|
||||
| `/ccboard-install` | Install/update ccboard | - |
|
||||
|
||||
## Tabs Overview
|
||||
## Features
|
||||
|
||||
| Tab | Key | What It Shows |
|
||||
|-----|-----|---------------|
|
||||
| Dashboard | `1` | Token stats, cache ratio, 7-day sparkline, model gauges |
|
||||
| Sessions | `2` | Project tree + session list, search with `/`, edit with `e` |
|
||||
| Config | `3` | Cascading settings: Global / Project / Local / Merged |
|
||||
| Hooks | `4` | Event-based hooks, script preview, match patterns |
|
||||
| Agents | `5` | Agents, commands, skills with frontmatter extraction |
|
||||
| Costs | `6` | Overview, by-model breakdown, daily trend |
|
||||
| History | `7` | Full-text search across all sessions |
|
||||
| MCP | `8` | Server status (Running/Stopped), details, quick actions |
|
||||
### 8 Interactive Tabs
|
||||
|
||||
## Navigation
|
||||
#### 1. Dashboard (Press `1`)
|
||||
- Token usage statistics
|
||||
- Session count
|
||||
- Messages sent
|
||||
- Cache hit ratio
|
||||
- MCP server count
|
||||
- 7-day activity sparkline
|
||||
- Top 5 models usage gauges
|
||||
|
||||
| Keys | Action |
|
||||
|------|--------|
|
||||
| `1-8` | Jump to tab |
|
||||
| `Tab` / `Shift+Tab` | Navigate tabs |
|
||||
| `h/j/k/l` or arrows | Navigate within tab |
|
||||
| `Enter` | View details / Focus pane |
|
||||
| `e` | Edit file in `$EDITOR` |
|
||||
| `o` | Reveal file in finder |
|
||||
| `/` | Search (Sessions/History) |
|
||||
| `F5` | Refresh data |
|
||||
| `q` | Quit |
|
||||
#### 2. Sessions (Press `2`)
|
||||
- Dual-pane: Project tree + Session list
|
||||
- Metadata: timestamps, duration, tokens, models
|
||||
- Search: Filter by project, message, or model (press `/`)
|
||||
- File operations: `e` to edit JSONL, `o` to reveal in finder
|
||||
|
||||
#### 3. Config (Press `3`)
|
||||
- 4-column cascading view: Global | Project | Local | Merged
|
||||
- Settings inheritance visualization
|
||||
- MCP servers configuration
|
||||
- Rules (CLAUDE.md) preview
|
||||
- Permissions, hooks, environment variables
|
||||
- Edit config with `e` key
|
||||
|
||||
#### 4. Hooks (Press `4`)
|
||||
- Event-based hook browsing (PreToolUse, UserPromptSubmit)
|
||||
- Hook bash script preview
|
||||
- Match patterns and conditions
|
||||
- File path tracking for easy editing
|
||||
|
||||
#### 5. Agents (Press `5`)
|
||||
- 3 sub-tabs: Agents (12) | / Commands (5) | ★ Skills (0)
|
||||
- Frontmatter metadata extraction
|
||||
- File preview and editing
|
||||
- Recursive directory scanning
|
||||
|
||||
#### 6. Costs (Press `6`)
|
||||
- 3 views: Overview | By Model | Daily Trend
|
||||
- Token breakdown: input, output, cache read/write
|
||||
- Pricing: total estimated costs
|
||||
- Model distribution breakdown
|
||||
|
||||
#### 7. History (Press `7`)
|
||||
- Full-text search across all sessions
|
||||
- Activity by hour histogram (24h)
|
||||
- 7-day sparkline
|
||||
- All messages searchable
|
||||
|
||||
#### 8. MCP (Press `8`) **NEW**
|
||||
- Dual-pane: Server list (35%) | Details (65%)
|
||||
- Live status detection: ● Running, ○ Stopped, ? Unknown
|
||||
- Full server details: command, args, environment vars
|
||||
- Quick actions: `e` edit config, `o` reveal file, `r` refresh status
|
||||
|
||||
### Navigation
|
||||
|
||||
**Global Keys**:
|
||||
- `1-8` : Jump to tab
|
||||
- `Tab` / `Shift+Tab` : Navigate tabs
|
||||
- `q` : Quit
|
||||
- `F5` : Refresh data
|
||||
|
||||
**Vim-style**:
|
||||
- `h/j/k/l` : Navigate (left/down/up/right)
|
||||
- `←/→/↑/↓` : Arrow alternatives
|
||||
|
||||
**Common Actions**:
|
||||
- `Enter` : View details / Focus pane
|
||||
- `e` : Edit file in $EDITOR
|
||||
- `o` : Reveal file in finder
|
||||
- `/` : Search (in Sessions/History tabs)
|
||||
- `Esc` : Close popup / Cancel
|
||||
|
||||
### Real-time Monitoring
|
||||
|
||||
ccboard includes a file watcher that monitors `~/.claude/` for changes:
|
||||
|
||||
- **Stats updates**: Live refresh when `stats-cache.json` changes
|
||||
- **Session updates**: New sessions appear automatically
|
||||
- **Config updates**: Settings changes reflected in UI
|
||||
- **500ms debounce**: Prevents excessive updates
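For readers unfamiliar with debouncing, here is a minimal sketch of the idea: a burst of file events triggers one refresh, and only after 500 ms of quiet. ccboard itself is written in Rust, so this Python class is purely illustrative and not its implementation. The `Debouncer` name and the injectable clock are assumptions made to keep the example testable.

```python
import time


class Debouncer:
    """Fire once after `delay_s` seconds without new events."""

    def __init__(self, delay_s: float = 0.5, clock=time.monotonic):
        self.delay_s = delay_s
        self.clock = clock
        self._last_event = None  # time of the most recent pending event

    def notify(self) -> None:
        """Record that a file-change event arrived (restarts the quiet period)."""
        self._last_event = self.clock()

    def should_fire(self) -> bool:
        """True exactly once, when the quiet period has elapsed."""
        if self._last_event is None:
            return False
        if self.clock() - self._last_event >= self.delay_s:
            self._last_event = None
            return True
        return False
```

A watcher loop would call `notify()` on every file event and poll `should_fire()` before refreshing the UI.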
|
||||
|
||||
### File Editing
|
||||
|
||||
Press `e` on any item to open in your preferred editor:
|
||||
|
||||
- Uses `$VISUAL` > `$EDITOR` > platform default (nano/notepad)
|
||||
- Supports: Sessions (JSONL), Config (JSON), Hooks (Shell), Agents (Markdown)
|
||||
- Terminal state preserved (alternate screen mode)
|
||||
- Cross-platform (macOS, Linux, Windows)
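The editor fallback order described above can be sketched as follows. This is an illustration of the lookup order only, not ccboard's Rust code. The function name `resolve_editor` and the choice of `nano` on every non-Windows platform are assumptions.

```python
import os
import sys


def resolve_editor(env=None, platform=None) -> str:
    """Pick an editor: $VISUAL, then $EDITOR, then a platform default."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    for var in ("VISUAL", "EDITOR"):
        value = env.get(var, "").strip()
        if value:
            return value
    # Assumed defaults: notepad on Windows, nano elsewhere
    return "notepad" if platform.startswith("win") else "nano"
```

Passing `env` and `platform` explicitly makes the fallback easy to test without touching the real environment.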
|
||||
|
||||
### MCP Server Management
|
||||
|
||||
The MCP tab provides comprehensive server monitoring:
|
||||
|
||||
**Status Detection** (Unix):
|
||||
- Checks running processes via `ps aux`
|
||||
- Extracts package name from command
|
||||
- Displays PID when running
|
||||
- Windows shows "Unknown" status
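A process check of this kind could work roughly as sketched below. The code parses text in the usual 11-column `ps aux` layout and looks for a package name in the command column. The function names, the sample output, and the `example-mcp-server` package are hypothetical, and ccboard's real matching logic may differ.

```python
def find_pid(ps_output: str, package: str):
    """Return the PID of the first process whose command mentions package."""
    for line in ps_output.splitlines():
        # ps aux: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skips the header row and malformed lines
        if package in fields[10]:
            return int(fields[1])
    return None


def server_status(ps_output: str, package: str) -> str:
    """'Running' if a matching process exists, otherwise 'Stopped'."""
    return "Running" if find_pid(ps_output, package) is not None else "Stopped"


# Hypothetical ps aux output
SAMPLE = """USER   PID %CPU %MEM  VSZ RSS TTY STAT START TIME COMMAND
alice 4242  0.0  0.1 1000 500 ?   S    10:00 0:00 node /opt/example-mcp-server/index.js
alice 4300  0.0  0.1 1000 500 ?   S    10:01 0:00 bash
"""
```

On a real system the input would come from something like `subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout`.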
|
||||
|
||||
**Server Details**:
|
||||
- Full command and arguments
|
||||
- Environment variables with values
|
||||
- Config file path (`~/.claude/claude_desktop_config.json`)
|
||||
- Quick edit/reveal actions
|
||||
|
||||
**Navigation**:
|
||||
- `h/l` or `←/→` : Switch between list and details
|
||||
- `j/k` or `↑/↓` : Select server
|
||||
- `Enter` : Focus detail pane
|
||||
- `e` : Edit MCP config
|
||||
- `o` : Reveal config in finder
|
||||
- `r` : Refresh server status
|
||||
|
||||
## Usage Examples
|
||||
|
||||
### Daily Monitoring
|
||||
|
||||
```bash
|
||||
# Launch dashboard
|
||||
/dashboard
|
||||
# Press '1' for overview, '6' for costs, '7' for history
|
||||
|
||||
# Check activity and costs
|
||||
# Press '1' for overview
|
||||
# Press '6' for costs breakdown
|
||||
# Press '7' for recent history
|
||||
```
|
||||
|
||||
### MCP Troubleshooting
|
||||
|
||||
```bash
|
||||
# Open MCP tab
|
||||
/mcp-status
|
||||
# Check server status (green = running)
|
||||
# Press 'e' to edit config, 'r' to refresh status
|
||||
|
||||
# Or: ccboard then press '8'
|
||||
|
||||
# Check server status (● green = running)
|
||||
# Press 'e' to edit config if needed
|
||||
# Press 'r' to refresh status after changes
|
||||
```
|
||||
|
||||
### Session Analysis
|
||||
|
||||
```bash
|
||||
# Browse sessions
|
||||
/sessions
|
||||
# Press '/' to search by project, model, or message content
|
||||
# Press 'e' on a session to view full JSONL
|
||||
|
||||
# Press '/' to search
|
||||
# Filter by project: /my-project
|
||||
# Filter by model: /opus
|
||||
# Press 'e' on session to view full JSONL
|
||||
```
|
||||
|
||||
### Cost Tracking
|
||||
|
||||
```bash
|
||||
# View costs
|
||||
/costs
|
||||
|
||||
# Press '1' for overview
|
||||
# Press '2' for breakdown by model
|
||||
# Press '3' for daily trend
|
||||
|
||||
# Identify expensive sessions
|
||||
# Track cache efficiency (99.9% hit rate)
|
||||
```
|
||||
|
||||
## Web Interface
|
||||
|
||||
Launch browser-based interface for remote monitoring:
|
||||
|
||||
```bash
|
||||
/ccboard-web # Launch at http://localhost:3333
|
||||
ccboard web --port 8080 # Custom port
|
||||
ccboard both --port 3333 # TUI + Web simultaneously
|
||||
# Launch web UI
|
||||
/ccboard-web
|
||||
|
||||
# Or with custom port
|
||||
ccboard web --port 8080
|
||||
|
||||
# Access at http://localhost:3333
|
||||
```
|
||||
|
||||
## Validation
|
||||
**Features**:
|
||||
- Same data as TUI (shared backend)
|
||||
- Server-Sent Events (SSE) for live updates
|
||||
- Responsive design (desktop/tablet/mobile)
|
||||
- Concurrent multi-user access
|
||||
|
||||
After launching, verify ccboard is working:
|
||||
**Run both simultaneously**:
|
||||
```bash
|
||||
ccboard both --port 3333
|
||||
```
|
||||
|
||||
1. Run `/dashboard` and confirm token stats load on tab `1`
|
||||
2. Press `2` and verify sessions are listed
|
||||
3. Press `6` and confirm cost data appears
|
||||
4. If no data: check `ls ~/.claude/` and `cat ~/.claude/stats-cache.json`
|
||||
## Architecture
|
||||
|
||||
ccboard is a single Rust binary with dual frontends:
|
||||
|
||||
```
|
||||
ccboard/
|
||||
├── ccboard-core/ # Parsers, models, data store, watcher
|
||||
├── ccboard-tui/ # Ratatui frontend (8 tabs)
|
||||
└── ccboard-web/ # Axum + Leptos frontend
|
||||
```
|
||||
|
||||
**Data Sources**:
|
||||
- `~/.claude/stats-cache.json` - Statistics
|
||||
- `~/.claude/claude_desktop_config.json` - MCP config
|
||||
- `~/.claude/projects/*/` - Session JSONL files
|
||||
- `~/.claude/settings.json` - Global settings
|
||||
- `.claude/settings.json` - Project settings
|
||||
- `.claude/settings.local.json` - Local overrides
|
||||
- `.claude/CLAUDE.md` - Rules and behavior
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
- **ccboard not found**: Run `/ccboard-install` or `cargo install ccboard`
|
||||
- **No data visible**: Verify `~/.claude/` exists and contains `stats-cache.json`
|
||||
- **MCP shows "Unknown"**: Status detection requires Unix; Windows shows "Unknown" by default
|
||||
- **File watcher issues**: Check file permissions on `~/.claude/`, restart ccboard
|
||||
### ccboard not found
|
||||
|
||||
```bash
|
||||
# Check installation
|
||||
which ccboard
|
||||
|
||||
# Install if needed
|
||||
/ccboard-install
|
||||
```
|
||||
|
||||
### No data visible
|
||||
|
||||
```bash
|
||||
# Verify Claude Code is installed
|
||||
ls ~/.claude/
|
||||
|
||||
# Check stats file exists
|
||||
cat ~/.claude/stats-cache.json
|
||||
|
||||
# Run with specific project
|
||||
ccboard --project ~/path/to/project
|
||||
```
|
||||
|
||||
### MCP status shows "Unknown"
|
||||
|
||||
- Status detection requires Unix (macOS/Linux)
|
||||
- Windows shows "Unknown" by default
|
||||
- Check if server process is actually running: `ps aux | grep <server-name>`
|
||||
|
||||
### File watcher not working
|
||||
|
||||
- Ensure `notify` crate supports your platform
|
||||
- Check file permissions on `~/.claude/`
|
||||
- Restart ccboard if file system events were missed
|
||||
|
||||
## Advanced Usage
|
||||
|
||||
### Command-line Options
|
||||
|
||||
```bash
|
||||
ccboard --help # Show all options
|
||||
ccboard --claude-home PATH # Custom Claude directory
|
||||
ccboard --project PATH # Specific project
|
||||
ccboard stats # Print stats and exit
|
||||
ccboard web --port 8080 # Web UI on port 8080
|
||||
ccboard both # TUI + Web simultaneously
|
||||
```
|
||||
|
||||
### Environment Variables
|
||||
|
||||
```bash
|
||||
# Editor preference
|
||||
export EDITOR=vim
|
||||
export VISUAL=code
|
||||
|
||||
# Custom Claude home
|
||||
export CLAUDE_HOME=~/custom/.claude
|
||||
```
|
||||
|
||||
### Integration with Claude Code
|
||||
|
||||
ccboard accesses Claude Code directories in **read-only** mode:
|
||||
|
||||
- Non-invasive monitoring
|
||||
- No modifications to Claude data
|
||||
- Safe to run concurrently with Claude Code
|
||||
- File watcher detects changes in real-time
|
||||
|
||||
## Performance
|
||||
|
||||
- **Binary size**: 2.4MB (release build)
|
||||
- **Initial load**: <2s for 1,000+ sessions
|
||||
- **Memory**: ~50MB typical usage
|
||||
- **CPU**: <5% during monitoring
|
||||
- **Lazy loading**: Session content loaded on-demand
|
||||
|
||||
## Limitations
|
||||
|
||||
Current version (0.1.0):
|
||||
|
||||
- **Read-only**: No write operations to Claude data
|
||||
- **MCP status**: Unix only (Windows shows "Unknown")
|
||||
- **Web UI**: In development (TUI is primary interface)
|
||||
- **Search**: Basic substring matching (no fuzzy search yet)
|
||||
|
||||
Future roadmap:
|
||||
|
||||
- Enhanced MCP server management (start/stop)
|
||||
- MCP protocol health checks
|
||||
- Export reports (PDF, JSON, CSV)
|
||||
- Config editing (write settings.json)
|
||||
- Session resume integration
|
||||
- Enhanced search with fuzzy matching
|
||||
|
||||
## Contributing
|
||||
|
||||
ccboard is open source (MIT OR Apache-2.0).
|
||||
|
||||
Repository: https://github.com/{OWNER}/ccboard
|
||||
|
||||
Contributions welcome:
|
||||
- Bug reports and feature requests
|
||||
- Pull requests for new features
|
||||
- Documentation improvements
|
||||
- Platform-specific testing (Windows, Linux)
|
||||
|
||||
## Credits
|
||||
|
||||
Built with:
|
||||
- [Ratatui](https://ratatui.rs/) - Terminal UI framework
|
||||
- [Axum](https://github.com/tokio-rs/axum) - Web framework
|
||||
- [Leptos](https://leptos.dev/) - Reactive frontend
|
||||
- [Notify](https://github.com/notify-rs/notify) - File watcher
|
||||
- [Serde](https://serde.rs/) - Serialization
|
||||
|
||||
## License
|
||||
|
||||
MIT OR Apache-2.0
|
||||
|
||||
---
|
||||
|
||||
**Questions?**
|
||||
|
||||
- GitHub Issues: https://github.com/{OWNER}/ccboard/issues
|
||||
- Documentation: https://github.com/{OWNER}/ccboard
|
||||
- Claude Code: https://claude.ai/code
|
||||
|
|
|
|||
|
|
@ -38,30 +38,39 @@ Check that the log file exists (or that log content was provided inline). If the
|
|||
|
||||
### Step 2 — Spawn Log Ingestor
|
||||
|
||||
Use the Agent tool to spawn the `log-ingestor` agent:
|
||||
|
||||
```
|
||||
Agent(tool="Task", prompt="Parse the log file at [log_path] and write structured events to cyber-defense-events.json.", agent="log-ingestor", model="haiku")
|
||||
Task: Parse the log file at [log_path] and write structured events to cyber-defense-events.json.
|
||||
Log path: [log_path]
|
||||
```
|
||||
|
||||
Wait for completion. Confirm `cyber-defense-events.json` was created.
|
||||
|
||||
### Step 3 — Spawn Anomaly Detector
|
||||
|
||||
Use the Agent tool to spawn the `anomaly-detector` agent:
|
||||
|
||||
```
|
||||
Agent(tool="Task", prompt="Read cyber-defense-events.json and detect anomalies. Write results to cyber-defense-anomalies.json.", agent="anomaly-detector", model="sonnet")
|
||||
Task: Read cyber-defense-events.json and detect anomalies. Write results to cyber-defense-anomalies.json.
|
||||
```
|
||||
|
||||
Wait for completion. If `anomalies_found: 0`, skip to Step 5 (reporter still runs).
|
||||
|
||||
### Step 4 — Spawn Risk Classifier
|
||||
|
||||
Use the Agent tool to spawn the `risk-classifier` agent:
|
||||
|
||||
```
|
||||
Agent(tool="Task", prompt="Read cyber-defense-anomalies.json and classify overall risk. Write result to cyber-defense-risk.json.", agent="risk-classifier", model="sonnet")
|
||||
Task: Read cyber-defense-anomalies.json and classify overall risk. Write result to cyber-defense-risk.json.
|
||||
```
|
||||
|
||||
### Step 5 — Spawn Threat Reporter
|
||||
|
||||
Use the Agent tool to spawn the `threat-reporter` agent:
|
||||
|
||||
```
|
||||
Agent(tool="Task", prompt="Read all 3 JSON files (events, anomalies, risk). Generate a complete incident report and save to cyber-defense-report.md.", agent="threat-reporter", model="sonnet")
|
||||
Task: Read cyber-defense-events.json, cyber-defense-anomalies.json, and cyber-defense-risk.json. Generate a complete incident report and save it to cyber-defense-report.md.
|
||||
```
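
The stage ordering and the "skip to Step 5" rule can be summarised in a small sketch. The agent names, models, and output files come from the steps above; the `Stage` type and `stages_after_detection` helper are hypothetical and only illustrate the control flow.

```python
from typing import List, NamedTuple


class Stage(NamedTuple):
    agent: str
    model: str
    output: str


PIPELINE = [
    Stage("log-ingestor", "haiku", "cyber-defense-events.json"),
    Stage("anomaly-detector", "sonnet", "cyber-defense-anomalies.json"),
    Stage("risk-classifier", "sonnet", "cyber-defense-risk.json"),
    Stage("threat-reporter", "sonnet", "cyber-defense-report.md"),
]


def stages_after_detection(anomalies_found: int) -> List[Stage]:
    """Stages still to run once the anomaly detector has finished."""
    remaining = PIPELINE[2:]
    if anomalies_found == 0:
        # No anomalies: skip the risk classifier, but the reporter still runs.
        remaining = [s for s in remaining if s.agent != "risk-classifier"]
    return remaining
```

With zero anomalies the reporter runs without `cyber-defense-risk.json`, so in this sketch it would have to tolerate that file being absent.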
|
||||
|
||||
### Step 6 — Summarize for User
|
||||
|
|
|
|||
|
|
@ -33,7 +33,11 @@ agent: specialist
|
|||
5. JSON Report Generation
|
||||
```
|
||||
|
||||
**Example**: `/design-patterns detect src/`
|
||||
**Example invocation**:
|
||||
```
|
||||
/design-patterns detect src/
|
||||
/design-patterns analyze --format=json
|
||||
```
|
||||
|
||||
### Mode 2: Suggestion
|
||||
|
||||
|
|
@ -49,7 +53,11 @@ agent: specialist
|
|||
5. Markdown Report with Code Examples
|
||||
```
|
||||
|
||||
**Example**: `/design-patterns suggest src/payment/`
|
||||
**Example invocation**:
|
||||
```
|
||||
/design-patterns suggest src/payment/
|
||||
/design-patterns refactor --focus=creational
|
||||
```
|
||||
|
||||
### Mode 3: Evaluation
|
||||
|
||||
|
|
@ -65,7 +73,11 @@ agent: specialist
|
|||
5. JSON Report with Recommendations
|
||||
```
|
||||
|
||||
**Example**: `/design-patterns evaluate src/services/singleton.ts`
|
||||
**Example invocation**:
|
||||
```
|
||||
/design-patterns evaluate src/services/singleton.ts
|
||||
/design-patterns quality --pattern=observer
|
||||
```
|
||||
|
||||
## Methodology
|
||||
|
||||
|
|
@ -274,6 +286,134 @@ ELSE IF pattern_implemented_incorrectly:
|
|||
}
|
||||
```
|
||||
|
||||
### Suggestion Mode (Markdown)
|
||||
|
||||
```markdown
|
||||
# Design Pattern Suggestions
|
||||
|
||||
**Scope**: `src/payment/`
|
||||
**Stack**: React 18 + TypeScript + Stripe
|
||||
**Date**: 2026-01-21
|
||||
|
||||
---
|
||||
|
||||
## High Priority
|
||||
|
||||
### 1. Strategy Pattern → `src/payment/processor.ts:45-89`
|
||||
|
||||
**Code Smell**: Switch statement on payment type (4 cases, 78 lines)
|
||||
|
||||
**Current Implementation** (lines 52-87):
|
||||
```typescript
|
||||
switch (paymentType) {
|
||||
case 'credit':
|
||||
// 20 lines of credit card logic
|
||||
break;
|
||||
case 'paypal':
|
||||
// 15 lines of PayPal logic
|
||||
break;
|
||||
case 'crypto':
|
||||
// 18 lines of crypto logic
|
||||
break;
|
||||
case 'bank':
|
||||
// 12 lines of bank transfer logic
|
||||
break;
|
||||
}
|
||||
```
|
||||
|
||||
**Recommended (React-adapted Strategy)**:
|
||||
```typescript
|
||||
// Define strategy interface
|
||||
interface PaymentStrategy {
|
||||
process: (amount: number) => Promise<PaymentResult>;
|
||||
}
|
||||
|
||||
// Custom hooks as strategies
|
||||
const useCreditPayment = (): PaymentStrategy => ({
|
||||
process: async (amount) => { /* credit logic */ }
|
||||
});
|
||||
|
||||
const usePaypalPayment = (): PaymentStrategy => ({
|
||||
process: async (amount) => { /* PayPal logic */ }
|
||||
});
|
||||
|
||||
// Strategy selection hook
|
||||
const usePaymentStrategy = (type: PaymentType): PaymentStrategy => {
|
||||
const strategies = {
|
||||
credit: useCreditPayment(),
|
||||
paypal: usePaypalPayment(),
|
||||
crypto: useCryptoPayment(),
|
||||
bank: useBankPayment(),
|
||||
};
|
||||
return strategies[type];
|
||||
};
|
||||
|
||||
// Usage in component
|
||||
const PaymentForm = ({ type }: Props) => {
|
||||
const strategy = usePaymentStrategy(type);
|
||||
const handlePay = () => strategy.process(amount);
|
||||
// ...
|
||||
};
|
||||
```
|
||||
|
||||
**Impact**:
|
||||
- **Complexity**: Reduces cyclomatic complexity from 12 to 2
|
||||
- **Extensibility**: New payment methods = new hook, no modification to existing code
|
||||
- **Testability**: Each strategy hook can be tested in isolation
|
||||
- **Effort**: ~2 hours (extract logic into hooks, add tests)
|
||||
|
||||
---
|
||||
|
||||
## Medium Priority
|
||||
|
||||
### 2. Observer Pattern → `src/cart/CartManager.ts:23-156`
|
||||
|
||||
**Code Smell**: Manual notification logic scattered across 8 methods
|
||||
|
||||
**Current**: Manual loops calling update functions
|
||||
**Recommended**: Use Zustand store (already in dependencies)
|
||||
|
||||
```typescript
|
||||
// Instead of custom observer:
|
||||
import { create } from 'zustand';
|
||||
|
||||
interface CartStore {
|
||||
items: CartItem[];
|
||||
addItem: (item: CartItem) => void;
|
||||
removeItem: (id: string) => void;
|
||||
// Zustand automatically notifies subscribers
|
||||
}
|
||||
|
||||
export const useCartStore = create<CartStore>((set) => ({
|
||||
items: [],
|
||||
addItem: (item) => set((state) => ({ items: [...state.items, item] })),
|
||||
removeItem: (id) => set((state) => ({ items: state.items.filter(i => i.id !== id) })),
|
||||
}));
|
||||
|
||||
// Components auto-subscribe:
|
||||
const CartDisplay = () => {
|
||||
const items = useCartStore((state) => state.items);
|
||||
// Re-renders automatically on cart changes
|
||||
};
|
||||
```
|
||||
|
||||
**Impact**:
|
||||
- **LOC**: Reduces from 156 to ~25 lines
|
||||
- **Stack-native**: Uses existing Zustand dependency
|
||||
- **Testability**: Zustand stores are easily tested
|
||||
- **Effort**: ~1.5 hours
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
- **Total suggestions**: 4
|
||||
- **High priority**: 1 (Strategy)
- **Medium priority**: 3 (Observer, Builder, Facade)
|
||||
- **Estimated total effort**: ~6 hours
|
||||
- **Primary benefits**: Reduced complexity, improved testability, stack-native idioms
|
||||
```
|
||||
|
||||
### Evaluation Mode (JSON)
|
||||
|
||||
```json
|
||||
|
|
@ -363,11 +503,40 @@ ELSE IF pattern_implemented_incorrectly:
|
|||
|
||||
## Usage Examples
|
||||
|
||||
### Basic Detection
|
||||
```bash
|
||||
/design-patterns detect src/ # Detect all patterns
|
||||
/design-patterns detect src/ --category=creational # Creational only
|
||||
/design-patterns suggest src/payment/ # Suggestions for module
|
||||
/design-patterns evaluate src/services/api-client.ts # Evaluate specific file
|
||||
# Detect all patterns in src/
|
||||
/design-patterns detect src/
|
||||
|
||||
# Detect only creational patterns
|
||||
/design-patterns detect src/ --category=creational
|
||||
|
||||
# Focus on specific pattern
|
||||
/design-patterns detect src/ --pattern=singleton
|
||||
```
|
||||
|
||||
### Targeted Suggestions
|
||||
```bash
|
||||
# Get suggestions for payment module
|
||||
/design-patterns suggest src/payment/
|
||||
|
||||
# Focus on specific smell
|
||||
/design-patterns suggest src/ --smell=switch-on-type
|
||||
|
||||
# High priority only
|
||||
/design-patterns suggest src/ --priority=high
|
||||
```
|
||||
|
||||
### Quality Evaluation
|
||||
```bash
|
||||
# Evaluate specific file
|
||||
/design-patterns evaluate src/services/api-client.ts
|
||||
|
||||
# Evaluate all singletons
|
||||
/design-patterns evaluate src/ --pattern=singleton
|
||||
|
||||
# Full quality report
|
||||
/design-patterns evaluate src/ --detailed
|
||||
```
|
||||
|
||||
## Integration with Other Skills
|
||||
|
|
|
|||
|
|
@ -115,11 +115,33 @@ Scan each open PR body for references to the issue number:
|
|||
|
||||
#### 3. Duplicate Detection via Jaccard Similarity
|
||||
|
||||
Compare each open issue against all other open issues AND the 20 most recent closed issues using Jaccard similarity (self-contained, no external library).
|
||||
**Algorithm (self-contained — no external library)**:
|
||||
|
||||
**Steps**: Normalize text (lowercase, strip prefixes like "feat:"/"fix:", remove punctuation) → Tokenize (split on whitespace, remove stop words and tokens <3 chars) → Compute `|A ∩ B| / |A ∪ B|` on token sets from title + first 300 chars of body.
|
||||
For each open issue, compute Jaccard similarity against all other open issues AND the 20 most recent closed issues.
|
||||
|
||||
**Threshold**: Jaccard >= 0.60 → flag as potential duplicate. Keep the older issue as canonical. Report: "Similar to #N (Jaccard: 0.72)". Computed at runtime on fetched data — no additional API calls.
|
||||
```
|
||||
Step 1 — Normalize title + first 300 chars of body:
|
||||
- Lowercase the full text
|
||||
- Strip category prefixes: "feat:", "fix:", "bug:", "chore:", "docs:", "test:", "refactor:"
|
||||
- Remove punctuation: .,!?;:'"()[]{}-_/\@#
|
||||
|
||||
Step 2 — Tokenize:
|
||||
- Split on whitespace
|
||||
- Remove stop words: the a an is in on to for of and or with this that it can not no be
|
||||
- Remove tokens shorter than 3 characters
|
||||
|
||||
Step 3 — Compute Jaccard:
|
||||
tokens_A = set of tokens from issue A
|
||||
tokens_B = set of tokens from issue B
|
||||
jaccard = |tokens_A ∩ tokens_B| / |tokens_A ∪ tokens_B|
|
||||
|
||||
Step 4 — Flag:
|
||||
- If jaccard >= 0.60: mark as potential duplicate
|
||||
- Report: "Similar to #N (Jaccard: 0.72)"
|
||||
- Keep the OLDER issue as canonical; newer = duplicate candidate
|
||||
```
|
||||
|
||||
Jaccard is computed at runtime using the fetched data — no API calls beyond Phase 1 gather.
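
A self-contained sketch of the four steps, using only the standard library. The stop words, prefixes, punctuation set, 300-character body window, and 0.60 threshold are taken from the description above. The function names are hypothetical, and two details are assumptions: punctuation is replaced with a space rather than deleted, and the category prefix is stripped only at the start of the title.

```python
from typing import Set

STOP_WORDS = set(
    "the a an is in on to for of and or with this that it can not no be".split()
)
PREFIXES = ("feat:", "fix:", "bug:", "chore:", "docs:", "test:", "refactor:")
PUNCTUATION = ".,!?;:'\"()[]{}-_/\\@#"
_PUNCT_TABLE = str.maketrans({ch: " " for ch in PUNCTUATION})


def tokenize(title: str, body: str = "") -> Set[str]:
    """Steps 1-2: normalize title + first 300 chars of body, then tokenize."""
    text = f"{title} {body[:300]}".lower().strip()
    for prefix in PREFIXES:
        if text.startswith(prefix):  # strip one leading category prefix
            text = text[len(prefix):]
            break
    text = text.translate(_PUNCT_TABLE)  # punctuation becomes whitespace
    return {t for t in text.split() if len(t) >= 3 and t not in STOP_WORDS}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Step 3: |A ∩ B| / |A ∪ B|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def is_potential_duplicate(a: Set[str], b: Set[str], threshold: float = 0.60) -> bool:
    """Step 4: flag when similarity reaches the threshold."""
    return jaccard(a, b) >= threshold
```

For instance, "feat: Add dark mode toggle" and "Add dark mode switch" share three of five distinct tokens, giving 3/5 = 0.6, which just meets the threshold.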
|
||||
|
||||
#### 4. Risk Classification
|
||||
|
||||
|
|
@ -383,14 +405,18 @@ If "None" → `No actions executed. Workflow complete.`
|
|||
|
||||
## Edge Cases
|
||||
|
||||
- **0 open issues**: Display `No open issues.` and stop
|
||||
- **Empty body**: Category = Unclear, always request details first
|
||||
- **Collaborator reporter**: Protect from auto-close, flag in table
|
||||
- **Jaccard 0.55–0.65**: Flag as "possible duplicate — verify manually"
|
||||
- **Label not in repo**: Skip label action, notify user to create it
|
||||
- **Collaborators API 403/404**: Fallback to last 10 merged PR authors
|
||||
- **Large body (>5000 chars)**: Truncate with `[truncated]` note
|
||||
- **Milestoned issues**: Never close without explicit confirmation
|
||||
| Situation | Behavior |
|-----------|----------|
| 0 open issues | Display `No open issues.` + stop |
| Body empty | Category = Unclear, action = request details, never assume |
| Collaborator as reporter | Protect from auto-close, flag explicitly in table |
| Jaccard inconclusive (0.55–0.65) | Flag as "possible duplicate — verify manually" |
| Label not in repo | Skip label action, notify user to create the label first |
| Issue already closed during workflow | Skip silently, note in summary |
| `gh api .../collaborators` 403/404 | Fallback to last 10 merged PR authors |
| Parallel agents unavailable | Run sequential analysis, notify user |
| Very large body (>5000 chars) | Truncate to 5000 chars with `[truncated]` note |
| Milestone assigned | Include in table, never close milestoned issues without confirmation |
|
||||
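
Two of the edge cases above are mechanical enough to sketch: the inconclusive Jaccard band and body truncation. The helper names and return labels are hypothetical. The sketch also assumes the 0.55–0.65 band takes precedence over the 0.60 duplicate threshold, which the text does not state explicitly.

```python
def classify_similarity(score: float) -> str:
    """Map a Jaccard score to a triage label (labels are illustrative)."""
    if 0.55 <= score <= 0.65:
        return "verify-manually"  # inconclusive band: possible duplicate
    if score > 0.65:
        return "duplicate"        # clearly above the 0.60 threshold
    return "distinct"


def truncate_body(body: str, limit: int = 5000) -> str:
    """Cap very large issue bodies and mark that text was dropped."""
    if len(body) <= limit:
        return body
    return body[:limit] + " [truncated]"
```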
|
||||
---
|
||||
|
||||
|
|
|
|||
|
|
@ -7,86 +7,138 @@ tags: [optimization, tokens, efficiency, git]
|
|||
|
||||
# RTK Optimizer Skill
|
||||
|
||||
Automatically suggest and apply RTK (Rust Token Killer) wrappers for verbose commands, reducing token usage by ~73% on average.
|
||||
**Purpose**: Automatically suggest RTK wrappers for high-verbosity commands to reduce token consumption.
|
||||
|
||||
## How It Works
|
||||
|
||||
1. **Detect high-verbosity commands** in user requests
|
||||
2. **Suggest RTK wrapper** with expected savings
|
||||
2. **Suggest RTK wrapper** if applicable
|
||||
3. **Execute with RTK** when user confirms
|
||||
4. **Track savings** over session via `rtk gain`
|
||||
|
||||
## Prerequisites
|
||||
|
||||
```bash
|
||||
rtk --version # Requires rtk 0.16.0+
|
||||
|
||||
# Install if needed:
|
||||
brew install rtk-ai/tap/rtk # macOS/Linux
|
||||
cargo install rtk # All platforms
|
||||
```
|
||||
4. **Track savings** over session
|
||||
|
||||
## Supported Commands
|
||||
|
||||
| Command | RTK Equivalent | Reduction |
|---------|---------------|-----------|
| `git log` | `rtk git log` | 92% (13,994 -> 1,076 chars) |
| `git status` | `rtk git status` | 76% |
| `git diff` | `rtk git diff` | 56% (15,815 -> 6,982 chars) |
| `find` | `rtk find` | 76% |
| `cat <large-file>` | `rtk read <file>` | 63% (163K -> 61K chars) |
| `pnpm list` | `rtk pnpm list` | 82% |
| `vitest run` / `pnpm test` | `rtk vitest run` | 90% |
| `cargo test` | `rtk cargo test` | 90% |
| `cargo build` | `rtk cargo build` | 80% |
| `cargo clippy` | `rtk cargo clippy` | 80% |
| `pytest` | `rtk python pytest` | 90% |
| `go test` | `rtk go test` | 90% |
| `gh pr view` | `rtk gh pr view` | 87% |
| `gh pr checks` | `rtk gh pr checks` | 79% |
| `ls` | `rtk ls` | condensed |
| `grep` | `rtk grep` | filtered |
|
||||
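
The command mapping can be sketched as a small rewrite function. The pairs are copied from the table; the `to_rtk` helper and its prefix-matching approach are assumptions for illustration and are not part of RTK itself. When `rtk` is not on `PATH`, the sketch returns the raw command unchanged.

```python
import shutil
from typing import Optional

# Commands whose RTK form is not simply "rtk " + the original command.
SPECIAL = {
    "cat": "rtk read",
    "pytest": "rtk python pytest",
    "pnpm test": "rtk vitest run",
}

# Commands wrapped by prepending "rtk".
WRAPPED = (
    "git log", "git status", "git diff", "find", "pnpm list", "vitest run",
    "cargo test", "cargo build", "cargo clippy", "go test",
    "gh pr view", "gh pr checks", "ls", "grep",
)


def _matches(command: str, prefix: str) -> bool:
    return command == prefix or command.startswith(prefix + " ")


def to_rtk(command: str, rtk_available: Optional[bool] = None) -> str:
    """Rewrite a shell command to its RTK equivalent, or return it unchanged."""
    if rtk_available is None:
        rtk_available = shutil.which("rtk") is not None
    if not rtk_available:
        return command  # fall back to the raw command
    for prefix, replacement in SPECIAL.items():
        if _matches(command, prefix):
            return replacement + command[len(prefix):]
    for prefix in WRAPPED:
        if _matches(command, prefix):
            return "rtk " + command
    return command  # unsupported command: leave as-is
```

Under these assumptions, `to_rtk("cat README.md", True)` yields `rtk read README.md`, and any command outside the table passes through untouched.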
### Git (>70% reduction)
|
||||
- `git log` → `rtk git log` (92.3% reduction)
|
||||
- `git status` → `rtk git status` (76.0% reduction)
|
||||
- `find` → `rtk find` (76.3% reduction)
|
||||
|
||||
## Usage Pattern
|
||||
### Medium-Value (50-70% reduction)
|
||||
- `git diff` → `rtk git diff` (55.9% reduction)
|
||||
- `cat <large-file>` → `rtk read <file>` (62.5% reduction)
|
||||
|
||||
```markdown
|
||||
# When user requests a verbose command:
|
||||
### JS/TS Stack (70-90% reduction)
|
||||
- `pnpm list` → `rtk pnpm list` (82% reduction)
|
||||
- `pnpm test` / `vitest run` → `rtk vitest run` (90% reduction)
|
||||
|
||||
1. Acknowledge the request
|
||||
2. Suggest RTK: "I'll use `rtk git log` to reduce token usage by ~92%"
|
||||
3. Execute the RTK-wrapped command
|
||||
4. Report savings: "Saved ~13K tokens (baseline: 14K, RTK: 1K)"
|
||||
```
|
||||
### Rust Toolchain (80-90% reduction)
|
||||
- `cargo test` → `rtk cargo test` (90% reduction)
|
||||
- `cargo build` → `rtk cargo build` (80% reduction)
|
||||
- `cargo clippy` → `rtk cargo clippy` (80% reduction)
|
||||
|
||||
### Python & Go (90% reduction)
|
||||
- `pytest` → `rtk python pytest` (90% reduction)
|
||||
- `go test` → `rtk go test` (90% reduction)
|
||||
|
||||
### GitHub CLI (79-87% reduction)
|
||||
- `gh pr view` → `rtk gh pr view` (87% reduction)
|
||||
- `gh pr checks` → `rtk gh pr checks` (79% reduction)
|
||||
|
||||
### File Operations
|
||||
- `ls` → `rtk ls` (condensed output)
|
||||
- `grep` → `rtk grep` (filtered output)
|
||||
|
||||
## Activation Examples
|
||||
|
||||
**User**: "Show me the git history"
|
||||
**Action**: Detect `git log` -> execute `rtk git log` -> report 92% savings
|
||||
**Skill**: Detects `git log` → Suggests `rtk git log` → Explains 92.3% token savings
|
||||
|
||||
**User**: "Run the test suite"
|
||||
**Action**: Detect `cargo test` / `pytest` -> execute `rtk cargo test` -> report 90% savings
|
||||
**User**: "Find all markdown files"
|
||||
**Skill**: Detects `find` → Suggests `rtk find "*.md" .` → Explains 76.3% savings
|
||||
|
||||
## When to Skip RTK
|
||||
## Installation Check
|
||||
|
||||
- **Small outputs** (<100 chars): Overhead not worth it
|
||||
- **Claude built-in tools**: Grep/Read tools are already optimized
|
||||
- **Interactive commands**: RTK is for batch/non-interactive output only
|
||||
- **Multiple piped commands**: Wrap the outermost command, not each step
|
||||
Before first use, verify RTK is installed:
|
||||
```bash
|
||||
rtk --version # Should output: rtk 0.16.0+
|
||||
```
|
||||
|
||||
## Error Handling
|
||||
If not installed:
|
||||
```bash
|
||||
# Homebrew (macOS/Linux)
|
||||
brew install rtk-ai/tap/rtk
|
||||
|
||||
- If `rtk` is not found, fall back to the raw command and suggest installation
|
||||
- If RTK output is empty or malformed, re-run without RTK and report the issue
|
||||
- If RTK version is outdated, warn about potential breaking changes (rapid release cadence)
|
||||
# Cargo (all platforms)
|
||||
cargo install rtk
|
||||
```
|
||||
|
||||
## Usage Pattern
|
||||
|
||||
```markdown
|
||||
# When user requests high-verbosity command:
|
||||
|
||||
1. Acknowledge request
|
||||
2. Suggest RTK optimization:
|
||||
"I'll use `rtk git log` to reduce token usage by ~92%"
|
||||
3. Execute RTK command
|
||||
4. Track savings (optional):
|
||||
"Saved ~13K tokens (baseline: 14K, RTK: 1K)"
|
||||
```
|
||||
|
||||
## Session Tracking
|
||||
|
||||
Optional: track cumulative savings across the session:
|
||||
|
||||
```bash
|
||||
rtk gain # Shows cumulative token savings for the session (SQLite-backed)
|
||||
# At session end
|
||||
rtk gain # Shows total token savings for session (SQLite-backed)
|
||||
```
|
||||
|
||||
## Edge Cases
|
||||
|
||||
- **Small outputs** (<100 chars): Skip RTK (overhead not worth it)
|
||||
- **Already using Claude tools**: Grep/Read tools are already optimized
|
||||
- **Multiple commands**: Batch with RTK wrapper once, not per command
|
||||
|
||||
## Configuration
|
||||
|
||||
Enable via CLAUDE.md:
|
||||
```markdown
|
||||
## Token Optimization
|
||||
|
||||
Use RTK (Rust Token Killer) for high-verbosity commands:
|
||||
- git operations (log, status, diff)
|
||||
- package managers (pnpm, npm)
|
||||
- build tools (cargo, go)
|
||||
- test frameworks (vitest, pytest)
|
||||
- file finding and reading
|
||||
```
|
||||
|
||||
## Metrics (Verified)
|
||||
|
||||
Based on real-world testing:
|
||||
- `git log`: 13,994 chars → 1,076 chars (92.3% reduction)
|
||||
- `git status`: 100 chars → 24 chars (76.0% reduction)
|
||||
- `find`: 780 chars → 185 chars (76.3% reduction)
|
||||
- `git diff`: 15,815 chars → 6,982 chars (55.9% reduction)
|
||||
- `read file`: 163,587 chars → 61,339 chars (62.5% reduction)
|
||||
|
||||
**Average: 72.6% token reduction**
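
The percentages and the average can be reproduced from the character counts listed above. This sketch assumes the average is an unweighted mean of the five per-command reductions; note that the measurements are in characters, used here as a proxy for tokens.

```python
# (before, after) character counts from the measurements above.
MEASUREMENTS = {
    "git log": (13_994, 1_076),
    "git status": (100, 24),
    "find": (780, 185),
    "git diff": (15_815, 6_982),
    "read file": (163_587, 61_339),
}


def reduction_pct(before: int, after: int) -> float:
    """Percentage of characters removed relative to the baseline."""
    return (before - after) / before * 100


reductions = {cmd: reduction_pct(b, a) for cmd, (b, a) in MEASUREMENTS.items()}
average = sum(reductions.values()) / len(reductions)  # unweighted mean

for cmd, pct in reductions.items():
    print(f"{cmd}: {pct:.1f}%")
print(f"average: {average:.1f}%")
```

A mean weighted by output size would be dominated by the large file read and would come out lower, so the 72.6% figure is best read as a per-command average.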
|
||||
|
||||
## Limitations
|
||||
|
||||
- 446 stars on GitHub, actively maintained (30 releases in 23 days)
|
||||
- Not suitable for interactive commands
|
||||
- Rapid development cadence (check for breaking changes)
|
||||
|
||||
## Recommendation
|
||||
|
||||
**Use RTK for**: git workflows, file operations, test frameworks, build tools, package managers
|
||||
**Skip RTK for**: small outputs, quick exploration, interactive commands
|
||||
|
||||
## References
|
||||
|
||||
- RTK GitHub: https://github.com/rtk-ai/rtk
|
||||
- RTK Website: https://www.rtk-ai.app/
|
||||
- Evaluation: `docs/resource-evaluations/rtk-evaluation.md`
|
||||
- CLAUDE.md template: `examples/claude-md/rtk-optimized.md`
|
||||