feat: improve skill descriptions from PR #9 (selective merge)
Cherry-pick description improvements and allowed-tools fixes from @popey's PR #9, while preserving reference documentation in skills that serve as templates (audit-agents-skills, ccboard, design-patterns).

Co-Authored-By: Alan Pope <alan@popey.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Parent: be52e232b3
Commit: 40213f0a7e
23 changed files with 1994 additions and 197 deletions
25
CHANGELOG.md
@ -6,8 +6,33 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
|
|||
|
||||
## [Unreleased]
|
||||
|
||||
- **New guide section §5.5 — Registry-based Discovery: ctx7 CLI** (`guide/ultimate-guide.md`): Context7's CLI companion (`npx ctx7`) for automated skill discovery and MCP setup. Documents `ctx7 skills suggest` (dependency-aware skill recommendations), `ctx7 skills install owner/repo`, `ctx7 setup --claude` wizard, and `ctx7 docs` terminal lookup. Clarifies agentskills.io (open spec) vs context7.com/skills (registry) relationship. Cross-reference note added to `guide/ecosystem/mcp-servers-ecosystem.md` Context7 section. Resource evaluation: `docs/resource-evaluations/2026-03-17-context7-cli.md` (score 4/5).
|
||||
|
||||
- **Doc audit — stats sync**: corrected stale counts across guide + landing. Templates: 204/216/217/218/222/232 → unified to 217 (per `check-landing-sync.sh` logic). Guide lines: "22K" → "23K+" (actual: 23,422). Quiz: reference.yaml `quiz_count` and llms*.txt had 311 → corrected to 271 (actual count). Version in llms.txt / llms-full.txt / machine-readable/llms.txt bumped 3.36.0 → 3.37.0. Landing updated: FeaturesGrid, GuideComparison, WhyGuide, McpDemo, cheatsheet page, index.astro, compare page, and guide content files (00-introduction, index, 09-advanced-patterns, 12-appendices).
|
||||
|
||||
## [3.37.0] - 2026-03-17
|
||||
|
||||
- **ICM v0.5.0 — setup guide + session starter template**: corrected `icm init` documentation (3 explicit modes: `--mode mcp`, `--mode hook`, `--mode skill` — not a single interactive command); fixed CLI syntax (`--importance` is an enum `critical|high|medium|low`, not a float; no `memory` subcommand); added `examples/memory/icm-session-starter.md` ready-to-use onboarding prompt to paste at the start of any session.
|
||||
|
||||
- **New guide section — MCP vs CLI Decision Guide** (`guide/ecosystem/mcp-vs-cli.md`): full comparison of MCP servers vs CLI tools across 4 decision dimensions (user type, model capability, observability needs, schema stability), decision matrix by situation, token cost analysis, tooling overview (RTK, MCPorter, mcp2cli), and practitioner quotes. Cross-linked from `mcp-servers-ecosystem.md`. Landing page published at `cc.bruniaux.com/ecosystem/mcp-vs-cli/` with 4 decision cards, 15-row collapsible guidance table, practitioner quotes section, and tooling micro-section. `check-landing-sync.sh` extended with section 7 for MCP vs CLI sync tracking.
|
||||
|
||||
- **Resource evaluations (3)**: mcp2cli (3/5, MCP/OpenAPI/GraphQL to runtime CLI, 96-99% token savings claim, 8-day-old tool with structural mismatch against Claude Code native MCP architecture — watch list + document schema overhead insight); MCPorter by Steinberger (3/5, TypeScript MCP toolkit with auto-discovery, CLI generation and TS codegen — useful companion for testing MCP servers and hook scripts); CircleCI MCP vs CLI blog (3/5, inner loop / outer loop decision framework, 6-question guide, directional browser automation benchmark — worth borrowing the vocabulary, not the benchmark numbers).
|
||||
|
||||
- **WP10 v1.2.0 — Marc Sélince feedback (DAF/finance)**: 6 corrections FR+EN on `10-budget-ia.qmd` / `10-ai-budget.qmd`. New `## Pour le DAF/CFO` section (ROI + OpEx/CapEx framing, replaces placeholder callout). New `## Freins COMEX au-delà du coût` Q&A section (vendor dependency, IP risk, lock-in pricing). §3.1 reframed "Attraction et rétention des top performers" (market tight for seniors/experts, not all profiles). §3.2 CTO: new "ROI des heures d'ingénieur" sub-point (LLMs on mechanical code free engineering time for architecture). §4.1 Budget: option 4 added (replace paid tool with OS equivalent for net-zero pilot), "200-500$/mois" figure removed from discretionary budget.
|
||||
|
||||
- **Recap cards — EN translations created + FR fixes**: 57 EN recap cards created from scratch (`whitepapers/recap-cards/en/`) by translating all FR cards. FR cards batch-updated: `guide-version` and `version` fields bumped `3.32.1` → `3.36.0` across all 57 FR cards. Factual fixes: T19 (context window) corrected "1M beta, API only" → "1M GA for Max/Team/Enterprise CC plans (v2.1.75, no header needed)"; T01 (essential commands) updated with `/plan`, `/effort`, `/branch`, `/rename`, `/loop`, `/voice`, `/fast`, removed non-existent `/cost`, corrected keyboard shortcuts. `docs/for-cto.md` updated: "whitepapers coming soon" → links to `florian.bruniaux.com/guides` in all 4 occurrences.
|
||||
|
||||
- **Fix dead link** (`guide/ultimate-guide.md` §3.5): Packmind anchor `../ecosystem/third-party-tools.md#packmind` corrected to `ecosystem/third-party-tools.md#packmind` (wrong `../` prefix was resolving outside `guide/`).
|
||||
|
||||
- **Whitepapers v2.2 — Guide content sync (7 WPs updated)**: synced WP content with guide v3.27.6 → v3.36.0 delta.
  - **WP00** (v1.2.0): 1M context corrected "beta" → GA (v2.1.75); 7 major features table added (Tasks API, Auto-memories, Agent Teams, LSP Tool, Remote Control, MCP Elicitation)
  - **WP03** (v1.1.0): PreToolUse security fix callout (v2.1.77 — `"allow"` bypassed enterprise `deny`); `allowRead` sandbox parameter added
  - **WP05** (v1.2.0): Native Code Review section (Research Preview, Teams/Enterprise) — multi-agent, 3 trigger modes, `REVIEW.md`, ~$15-25/PR pricing
  - **WP07** (v1.1.0): 12 new slash commands, 7 new hook events, extended CLI flags, Remote Control section, 1M GA correction
  - **WP08** (v1.2.0): Identity drift after compaction pattern added (UserPromptSubmit hook + agent-identity.txt re-injection)
  - **WP09** (v1.1.0): Review bottleneck inversion section; Regulatory Exposure section (EU AI Act GPAI/high-risk, FDA AI/ML Guidance)
  - WP02: no hook events section in scope; WP01/WP04/WP06/WP10: no gaps identified
|
||||
|
||||
- **Cheatsheet + reference.yaml maintenance**: date updated February → March 2026 in `guide/cheatsheet.md`; "Command not found" fix updated to use native installer (`curl | sh`); `machine-readable/reference.yaml` `updated` field bumped to 2026-03-17.
|
||||
|
||||
- **Whitepapers v2 — Reviewer corrections** (6 reviewers: Edouard, Mat, Nicolas, Marc, Anthony, Emmanuel): 8-phase correction plan applied across 10 whitepapers (WP00–WP10) FR+EN.
|
||||
|
|
|
|||
|
|
@ -6,7 +6,7 @@
|
|||
|
||||
<p align="center">
|
||||
<a href="https://github.com/FlorianBruniaux/claude-code-ultimate-guide/stargazers"><img src="https://img.shields.io/github/stars/FlorianBruniaux/claude-code-ultimate-guide?style=for-the-badge" alt="Stars"/></a>
|
||||
-<a href="./CHANGELOG.md"><img src="https://img.shields.io/badge/Updated-Mar_17,_2026_·_v3.36.0-brightgreen?style=for-the-badge" alt="Last Update"/></a>
+<a href="./CHANGELOG.md"><img src="https://img.shields.io/badge/Updated-Mar_17,_2026_·_v3.37.0-brightgreen?style=for-the-badge" alt="Last Update"/></a>
|
||||
<a href="./quiz/"><img src="https://img.shields.io/badge/Quiz-271_questions-orange?style=for-the-badge" alt="Quiz"/></a>
|
||||
<a href="./examples/"><img src="https://img.shields.io/badge/Templates-204-green?style=for-the-badge" alt="Templates"/></a>
|
||||
<a href="./guide/security/security-hardening.md"><img src="https://img.shields.io/badge/🛡️_Threat_DB-15_vulnerabilities_·_655_malicious_skills-red?style=for-the-badge" alt="Threat Database"/></a>
|
||||
|
|
@ -872,7 +872,7 @@ See [CONTRIBUTING.md](./CONTRIBUTING.md) for guidelines.
|
|||
|
||||
---
|
||||
|
||||
-*Version 3.36.0 | Updated daily · Mar 17, 2026 | Crafted with Claude*
+*Version 3.37.0 | Updated daily · Mar 17, 2026 | Crafted with Claude*
|
||||
|
||||
<!-- SEO Keywords -->
|
||||
<!-- claude code, claude code tutorial, anthropic cli, ai coding assistant, claude code mcp,
|
||||
|
|
|
|||
2
VERSION
@ -1 +1 @@
|
|||
-3.36.0
+3.37.0
|
||||
|
|
|
|||
|
|
@ -25,7 +25,7 @@ Claude Code runs locally. It does **not** send your codebase to Anthropic — on
|
|||
- Access control: granular permissions per project, per user, per tool
|
||||
- Audit trail: every action logged via hooks
|
||||
|
||||
-Full breakdown: WP06 — Privacy & GDPR Compliance *(whitepaper, coming soon)* (20 min)
+Full breakdown: WP06 — Privacy & GDPR Compliance (20 min) — [florian.bruniaux.com/guides](https://www.florian.bruniaux.com/guides)
|
||||
|
||||
### Threat landscape
|
||||
|
||||
|
|
@ -35,7 +35,7 @@ This is the only public resource tracking AI coding tool vulnerabilities: **15 v
|
|||
- Supply chain attacks via MCP servers (treat like npm packages)
|
||||
- Overpermissive configs in CI/CD pipelines
|
||||
|
||||
-Mitigation framework: WP03 — Security in Production *(whitepaper, coming soon)* (25 min)
+Mitigation framework: WP03 — Security in Production (25 min) — [florian.bruniaux.com/guides](https://www.florian.bruniaux.com/guides)
|
||||
|
||||
### Team adoption
|
||||
|
||||
|
|
@ -43,13 +43,13 @@ The ROI scales with structure. An individual developer gets 2-3× productivity o
|
|||
|
||||
Realistic adoption timeline: 4-6 weeks to full team competency with structured onboarding.
|
||||
|
||||
-WP05 — Deploying with a Team *(whitepaper, coming soon)* (25 min)
+WP05 — Deploying with a Team (25 min) — [florian.bruniaux.com/guides](https://www.florian.bruniaux.com/guides)
|
||||
|
||||
---
|
||||
|
||||
## Recommended reading path (60 min total)
|
||||
|
||||
-> Whitepapers are currently in private access — public release coming soon.
+> Whitepapers are available at [florian.bruniaux.com/guides](https://www.florian.bruniaux.com/guides)
|
||||
|
||||
| Document | Time | What you'll get |
|
||||
|----------|------|----------------|
|
||||
|
|
@ -102,7 +102,7 @@ If you want to accelerate adoption or get an independent assessment of your curr
|
|||
|
||||
## Quick links
|
||||
|
||||
-- Whitepapers — 10 focused deep-dives *(coming soon)*
+- Whitepapers — 10 focused deep-dives: [florian.bruniaux.com/guides](https://www.florian.bruniaux.com/guides)
|
||||
- [Security Hardening Guide](../guide/security/security-hardening.md)
|
||||
|
||||
← [Back to main README](../README.md)
|
||||
|
|
|
|||
|
|
@ -0,0 +1,133 @@
|
|||
# Resource Evaluation #081 — Rippletide Code: Runtime Rule Enforcement for Claude Code
|
||||
|
||||
**Source:** LinkedIn post (Patrick Joubert, CEO Rippletide) + [rippletide.com/dev](https://www.rippletide.com/dev)
|
||||
**Type:** Commercial tool — hook-native rule enforcement layer for Claude Code
|
||||
**Evaluated:** 2026-03-17
|
||||
**Note:** Distinct from eval 072 (2026-02-28) which covered Rippletide's MCP/eval/decision runtime SaaS. This is a different product: a CLI enforcement tool (`npx rippletide-code`), hook-native, no MCP overhead.
|
||||
|
||||
---
|
||||
|
||||
## 📄 Content Summary
|
||||
|
||||
1. **Problem addressed**: CLAUDE.md rules degrade at scale — after ~40 rules, Claude Code follows them inconsistently; context compaction causes rule loss between sessions. Per Rippletide: "50% of Claude Code CLAUDE.md issues are about rules being ignored" (18+ public GitHub reports cited).
|
||||
|
||||
2. **Core mechanism**: Reads codebase and existing CLAUDE.md → builds a "Context Graph" stored outside the LLM context window → uses Claude Code hooks to intercept tool calls pre-execution → blocks violations before they run.
|
||||
|
||||
3. **Architecture**: Hook-native (not MCP), avoiding token injection overhead. Pre-execution blocking (not post-execution logging). Example output: `[BLOCKED] Rule: "DO NOT modify .env files"`.
|
||||
|
||||
4. **Installation**: `npx rippletide-code` — free beta, no API key, no sign-up required. Graph builds in "less than 5 seconds" (company claim).
|
||||
|
||||
5. **Background**: Founded February 2024, SF + Paris, team of 8. Won OpenAI Codex Hackathon. Co-founders: Patrick Joubert (CEO) + Yann Bilien (Chief Scientist). Enterprise tier available (custom pricing, "2-week validation sprint").
|
||||
|
||||
---
|
||||
|
||||
## 🎯 Relevance Score
|
||||
|
||||
| Score | Meaning |
|-------|---------|
| 5 | Essential — Major gap in the guide |
| 4 | Very relevant — Significant improvement |
| **3** | **Pertinent — Useful complement** |
| 2 | Marginal — Secondary info |
| 1 | Out of scope |
|
||||
|
||||
**Score: 3/5**
|
||||
|
||||
**Justification**: Addresses a real, documented community pain point (CLAUDE.md rule degradation at scale, compaction-driven rule loss) that the guide acknowledges but does not cover with solutions. The hook-based enforcement pattern is genuinely novel in the Claude Code ecosystem — no other documented tool does pre-execution blocking. However, multiple claims are unverified (Context Graph compaction-resistance, "less than 5 seconds" graph build, "50% of issues"), the product is in free beta with no adoption signals, and the company has a prior pattern of publishing unverifiable performance claims (eval 072: "<1% hallucinations" without methodology).
|
||||
|
||||
Score does not exceed 3 because: the guide's credibility requires holding commercial tools to evidence standards, and this tool fails that bar without independent corroboration.
|
||||
|
||||
---
|
||||
|
||||
## ⚖️ Comparative
|
||||
|
||||
| Aspect | This resource | Our guide |
|--------|--------------|-----------|
| CLAUDE.md rule degradation at scale | ✅ Documented as core problem | ⚠️ Mentioned briefly (context compaction section), no dedicated coverage |
| Hook-based pre-execution blocking | ✅ Core feature | ✅ Hooks documented, but no enforcement pattern described |
| Rule enforcement tools | ✅ Full solution | ❌ No tool covers this (Known Gaps table has no "rule enforcement" entry) |
| Context compaction rule loss | ✅ Problem + solution claimed | ⚠️ Problem mentioned, no mitigation strategy |
| Security surface of enforcement layer | ❌ Not addressed | ✅ Security section covers hook security |
| Verifiable performance claims | ❌ Marketing without methodology | ✅ Stats with sources only |
|
||||
|
||||
---
|
||||
|
||||
## 📍 Recommendations
|
||||
|
||||
**Score 3 — integrate as limited entry with explicit caveats.**
|
||||
|
||||
### What to integrate (and how)
|
||||
|
||||
**Priority 1 — Document the pattern, not just the tool.**
|
||||
|
||||
The guide should cover "runtime rule enforcement via hooks" as a concept in the CLAUDE.md limitations section of ultimate-guide.md. This section currently documents compaction behavior and path-scoped CLAUDE.md files as mitigations, but has no entry for pre-execution enforcement. This gap exists regardless of Rippletide. The pattern: use `PreToolUse` hooks to validate tool calls against a rule set and exit with a blocking status to stop the call (in Claude Code's hook contract, exit code 2 blocks; other non-zero codes are reported as non-blocking errors). Rippletide is then one commercial implementation of this pattern.
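
A minimal sketch of the pattern described above, assuming Claude Code's hook contract as commonly documented: a `PreToolUse` hook receives the pending tool call as JSON on stdin (with `tool_name` and `tool_input` fields), and exit code 2 blocks the call while stderr is reported back. The rule list, the `file_path` lookup, and the function names are hypothetical and only illustrate the idea; this is not Rippletide's implementation.

```python
import json
import sys
from typing import Optional

# Hypothetical rule set: (tool name, path fragment that tool must not touch).
RULES = [
    ("Edit", ".env"),
    ("Write", ".env"),
]


def find_violation(event: dict) -> Optional[str]:
    """Return a block message if the pending tool call breaks a rule, else None."""
    tool_name = event.get("tool_name", "")
    tool_input = event.get("tool_input") or {}
    # Assumption: file-editing tools carry their target in tool_input["file_path"].
    file_path = str(tool_input.get("file_path", ""))
    for blocked_tool, fragment in RULES:
        if tool_name == blocked_tool and fragment in file_path:
            return f'[BLOCKED] Rule: "DO NOT modify {fragment} files"'
    return None


def main(stream=sys.stdin) -> int:
    """Read one hook event from the stream and return the hook's exit code."""
    event = json.loads(stream.read())
    message = find_violation(event)
    if message is None:
        return 0  # allow the tool call
    print(message, file=sys.stderr)  # the reason is written to stderr
    return 2  # assumed blocking exit code


# As a hook script, the file would end with: sys.exit(main())
```

Keeping the rule check in a pure function (`find_violation`) separate from the stdin/exit-code plumbing makes the rule set testable without running Claude Code at all.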
|
||||
|
||||
Do NOT create a section in the guide that only exists to justify one beta product.
|
||||
|
||||
**Priority 2 — Add "Rule enforcement" gap to third-party-tools.md Known Gaps table.**
|
||||
|
||||
The Known Gaps table has no entry for runtime rule enforcement. This should be added first. Then Rippletide can be cited under a new "Rule Enforcement" section as the only known implementation, with clear watch caveats (beta, unverified claims, no adoption signals).
|
||||
|
||||
**Where to integrate:**
|
||||
- `guide/ultimate-guide.md` CLAUDE.md limitations section: add 3-4 lines on the enforcement pattern + Rippletide reference
|
||||
- `guide/ecosystem/third-party-tools.md`: add "Rule Enforcement" section (after Hook Utilities or after Engineering Standards Distribution) + update Known Gaps table
|
||||
|
||||
**What NOT to integrate:**
|
||||
- Do not cite "50% of issues are about rule ignoring" as a fact — it is Rippletide's own framing
|
||||
- Do not cite "Context Graph persists across compaction" as confirmed — it is unverified
|
||||
- Do not use "less than 5 seconds" build time as a guide stat
|
||||
- Do not create a section solely for this tool without the Known Gaps entry first
|
||||
|
||||
---
|
||||
|
||||
## 🔥 Challenge (technical-writer)
|
||||
|
||||
**Score after challenge: 3/5 (held)**
|
||||
|
||||
Key points raised by challenge:
|
||||
|
||||
1. **Unverified claims embedded as facts**: The evaluation initially treated "Context Graph compaction-resistance" as a verified feature. It is Rippletide's own claim. The guide must not repeat it without qualification — same error that kept eval 072 at 2/5.
|
||||
|
||||
2. **Security surface not addressed**: A pre-execution hook in the critical path of every tool call has a real attack surface. Fail-open vs fail-closed behavior when the Context Graph service is unavailable is unspecified. Whether `npx rippletide-code` runs a persistent background process with access to tool inputs (which may contain secrets) is undocumented.
|
||||
|
||||
3. **"Free beta" is a risk flag**: No pricing page, no post-beta plan, no stated data handling policy for convention scanning. The guide documents Straude with data transmission caveats — Rippletide deserves identical scrutiny.
|
||||
|
||||
4. **"Auto-detects implicit conventions" is unexamined**: How? Does it send code to an external service? Local only? This is a security and privacy question before it is a feature.
|
||||
|
||||
5. **Integration sequence matters**: Do not add a "Rule Enforcement" section to third-party-tools.md without first adding "Rule enforcement" to the Known Gaps table. Category before tool, not tool creating category.
|
||||
|
||||
6. **The stronger integration point is the PATTERN**: The guide should document hook-based pre-execution enforcement as a concept. Rippletide is one implementation. A minimal DIY example (PreToolUse hook that checks a rule list and exits non-zero) would serve readers better than a commercial product endorsement.
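
Point 2 above asks what an enforcement hook does when its own rule check fails. A hedged sketch of that choice follows; the `decide` helper, its `check` callback, and the exit-code constants (0 to allow, 2 to block, assumed from Claude Code's hook contract) are illustrative and describe no existing tool.

```python
from typing import Callable

ALLOW = 0  # assumed "proceed" exit code
BLOCK = 2  # assumed "block the tool call" exit code


def decide(check: Callable[[dict], bool], event: dict, fail_closed: bool = True) -> int:
    """Return the exit code a PreToolUse hook should use for this event.

    check(event) returns True when the tool call violates a rule.
    If the check itself raises (rule store unreachable, malformed input),
    fail_closed selects between blocking the call and letting it through.
    """
    try:
        return BLOCK if check(event) else ALLOW
    except Exception:
        return BLOCK if fail_closed else ALLOW
```

Failing closed trades availability for safety: a broken rule store halts every tool call, which is why the behavior should be documented rather than left implicit.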
|
||||
|
||||
**Risks of NOT integrating**: Low-medium. The rule degradation problem is real and under-documented in the guide. Not covering it leaves a gap that practitioners regularly encounter. But the pattern can be documented without Rippletide — the risk is solved by covering the concept, not the product.
|
||||
|
||||
---
|
||||
|
||||
## ✅ Fact-Check
|
||||
|
||||
| Claim | Verified | Source |
|-------|----------|--------|
| `npx rippletide-code` installation command | ✅ | rippletide.com/dev — confirmed |
| Free beta, no API key required | ✅ | rippletide.com/dev — confirmed |
| Hook-native architecture (not MCP) | ✅ | rippletide.com/dev — confirmed |
| Co-founders: Patrick Joubert + Yann Bilien | ✅ | rippletide.com/dev — confirmed |
| Founded February 2024, SF + Paris, team of 8 | ✅ | rippletide.com/dev — confirmed |
| Won OpenAI Codex Hackathon | ✅ | rippletide.com/dev — confirmed |
| "50% of CLAUDE.md issues are about rules being ignored" | ⚠️ | Rippletide's own framing, no external source |
| "18+ public GitHub reports of non-compliance" | ⚠️ | Not linked, not verifiable from evaluation |
| Context Graph persists across compaction | ⚠️ | Rippletide claim only — no external confirmation |
| Graph builds in "less than 5 seconds" | ⚠️ | Rippletide claim — no benchmark published |
| Pre-execution blocking (not post-execution logging) | ✅ | Confirmed by hook architecture description |
| Coming soon: Cursor, Windsurf, Cline | ✅ | rippletide.com/dev — confirmed |
| Perplexity search returns no external coverage | ✅ | No independent coverage found as of 2026-03-17 |
|
||||
|
||||
**Corrections applied**: Claims marked ⚠️ removed from factual statements. "Context Graph compaction-resistance" and the "50%" stat are presented as Rippletide claims, not guide facts.
|
||||
|
||||
---
|
||||
|
||||
## 🎯 Final Decision
|
||||
|
||||
- **Score**: 3/5
|
||||
- **Action**: Integrate with caveats — pattern documentation in ultimate-guide.md + limited entry in third-party-tools.md + Known Gaps table update
|
||||
- **Confidence**: Medium (product verified to exist and work as described; performance and persistence claims unverified)
|
||||
- **Prerequisite**: Add "Rule enforcement" to Known Gaps table *before* adding tool entry
|
||||
- **Watch trigger for upgrade to 4/5**: GitHub repo becomes public + >100 stars OR independent practitioner write-up from production use + Context Graph compaction claim independently verified
|
||||
106
docs/resource-evaluations/2026-03-17-circleci-mcp-vs-cli-blog.md
Normal file
@ -0,0 +1,106 @@
|
|||
# Resource Evaluation: "MCP vs. CLI" (CircleCI Blog)
|
||||
|
||||
**Date**: 2026-03-17
|
||||
**Evaluator**: Claude Sonnet 4.6
|
||||
**Resource URL**: https://circleci.com/blog/mcp-vs-cli/
|
||||
**Resource Type**: Technical blog post
|
||||
**Author**: Jacob Schmitt (CircleCI)
|
||||
**Published**: 2026-03-11
|
||||
|
||||
---
|
||||
|
||||
## Executive Summary
|
||||
|
||||
Jacob Schmitt proposes a decision framework for choosing between MCP servers and CLI tools in agentic workflows, using the inner loop / outer loop distinction as the organizing principle. The post includes a browser automation benchmark (CLI: 33% better token efficiency and 77% vs 60% task completion), a 6-question decision guide, and a hybrid architecture example from CircleCI's own tooling. The framework aligns with how the guide already positions RTK and the CLI+MCP hybrid approach. The post adds useful external validation and a cleaner decision vocabulary than what currently exists in the guide, but does not introduce new technical ground for experienced Claude Code users.
|
||||
|
||||
---
|
||||
|
||||
## Content Summary
|
||||
|
||||
- **Core thesis**: inner loop (frequent, local, low-latency dev iteration) favors CLI; outer loop (shared systems, CI/CD, cross-team infrastructure) favors MCP
|
||||
- **Browser automation benchmark**: single test comparing agentic browser automation via CLI vs MCP. CLI: 77% task completion, 33% better token efficiency. MCP: 60% task completion. Methodology not detailed (single test, CircleCI-internal)
|
||||
- **6-question decision framework**:
  1. Who owns the feedback loop? (developer alone → CLI; multiple agents/team → MCP)
  2. How often does the schema change? (frequently → CLI overhead lower; stable → MCP investment worthwhile)
  3. Does the tool require auth/secrets management at runtime? (yes → MCP; no → CLI simpler)
  4. Do you need structured output consumed by another agent? (yes → MCP; no → CLI)
  5. Is this a team or individual tool? (team → MCP standardization; individual → CLI flexibility)
  6. How much context budget do you have? (tight → CLI; ample → MCP acceptable)
|
||||
- **CircleCI hybrid model**: Chunk CLI (local file ops) + Local CLI (shell + git) + CircleCI MCP Server (CI/CD system access) — each layer mapped to inner/outer loop
|
||||
- **Does NOT mention**: Claude Code, Anthropic, RTK, or any specific tool outside CircleCI's stack
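
The six questions summarized above can be sketched as a simple tally. The equal weighting, the `WorkflowAnswers` and `recommend` names, and the "hybrid" result on a tie are assumptions made for illustration; the post as summarized here does not prescribe a scoring rule.

```python
from dataclasses import dataclass, astuple


@dataclass
class WorkflowAnswers:
    """One boolean per question; True is the answer the summary maps to MCP."""
    shared_feedback_loop: bool   # Q1: multiple agents or a team own the loop
    stable_schema: bool          # Q2: the tool schema rarely changes
    runtime_secrets: bool        # Q3: auth/secrets management needed at runtime
    output_read_by_agents: bool  # Q4: structured output consumed by another agent
    team_tool: bool              # Q5: team tool rather than individual tool
    ample_context_budget: bool   # Q6: context budget is not tight


def recommend(answers: WorkflowAnswers) -> str:
    """Count MCP-leaning answers; equal weighting is an illustrative assumption."""
    mcp_votes = sum(astuple(answers))
    cli_votes = 6 - mcp_votes
    if mcp_votes > cli_votes:
        return "MCP"
    if cli_votes > mcp_votes:
        return "CLI"
    return "hybrid"  # a tie suggests splitting layers, as in the CircleCI example
```

A solo developer iterating locally with a tight context budget answers mostly `False` and lands on "CLI", which matches the inner-loop side of the core thesis.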
|
||||
|
||||
---
|
||||
|
||||
## Gap Analysis vs. Guide
|
||||
|
||||
| Area | CircleCI post | Guide coverage |
|------|---------------|----------------|
| Inner loop / outer loop vocabulary | ✅ Clean framework | ⚠️ Concept exists implicitly, not named this way |
| Decision framework (when CLI vs MCP) | ✅ 6-question guide | ⚠️ Philosophy covered, structured decision tool not present |
| Token cost of MCP tool schema | ✅ Mentioned as key driver | ❌ Not quantified anywhere in guide |
| Browser automation benchmark | ✅ Single data point | ❌ No benchmark data in guide |
| Hybrid CLI + MCP architecture | ✅ CircleCI example | ✅ Guide covers this philosophically |
| Claude Code-specific guidance | ❌ | ✅ Guide's primary differentiator |
|
||||
|
||||
**Real gap**: the guide lacks a structured decision framework for "should I use an MCP server or a CLI tool for this workflow?" The inner loop / outer loop vocabulary is clean and could be adopted directly, or serve as source inspiration for adding this framing to the guide.
|
||||
|
||||
---
|
||||
|
||||
## Quality Assessment
|
||||
|
||||
**Strengths**:
|
||||
- The inner loop / outer loop distinction is well-established in dev productivity literature (ring-fencing fast local iteration vs. shared system operations) and applies cleanly to MCP vs. CLI
|
||||
- The 6-question framework is actionable and maps directly to real workflow decisions
|
||||
- CircleCI's own hybrid architecture is a credible worked example
|
||||
- Published on a high-traffic engineering blog — will be referenced by practitioners
|
||||
|
||||
**Weaknesses**:
|
||||
- The browser automation benchmark is a single internal test with no methodology disclosure. 77% vs 60% task completion difference could reflect implementation quality as much as CLI vs MCP architecture
|
||||
- The post does not distinguish between different LLM hosts. Claude Code's MCP integration has different overhead characteristics than, say, a custom agent using the raw API
|
||||
- The CircleCI MCP server recommendation at the end is vendor content (mild but present)
|
||||
- Does not address cost (token price per call) — only token count, not dollars
|
||||
|
||||
---
|
||||
|
||||
## Score
|
||||
|
||||
**Score: 3/5** (Moderate — reference as external validation)
|
||||
|
||||
Solid framework from a credible source. The inner loop / outer loop vocabulary is worth borrowing. The benchmark data is too thin to cite as evidence but useful as a directional signal. The decision framework would be a meaningful addition to the guide's cost optimization or MCP section — either as inspiration for a new section or as an external reference link.
|
||||
|
||||
---
|
||||
|
||||
## Challenge
|
||||
|
||||
**Challenge**: "The benchmark is methodologically thin and CircleCI is selling their MCP server. This is marketing dressed as engineering. Score should be 2/5."
|
||||
|
||||
**Response**: The marketing angle is real but mild — the post's core content (decision framework, inner/outer loop model) stands independently of the CircleCI MCP product pitch at the end. The benchmark is not cited here as evidence; it's noted as a directional signal with the caveat that methodology is undisclosed. The decision framework and vocabulary are the primary value, and those are clean. 3/5 stands. If the guide cites this resource, it should reference the framework, not the benchmark numbers.
|
||||
|
||||
---
|
||||
|
||||
## Fact-Check
|
||||
|
||||
| Claim | Verified | Source |
|-------|----------|--------|
| Author: Jacob Schmitt, CircleCI | ✅ | Blog byline |
| Published 2026-03-11 | ✅ | Blog post date |
| CLI: 77% task completion, 33% better token efficiency | ⚠️ | Cited in post, no methodology link — treat as directional |
| MCP: 60% task completion | ⚠️ | Same benchmark — same caveat |
| 6-question decision framework | ✅ | Read from post directly |
| CircleCI hybrid model (3 layers) | ✅ | Post section "How CircleCI uses both" |
|
||||
|
||||
---
|
||||
|
||||
## Decision
|
||||
|
||||
**Score: 3/5 — Reference as external validation; borrow the inner loop / outer loop vocabulary.**
|
||||
|
||||
**Immediate actions**:
|
||||
1. Consider adding "inner loop / outer loop" framing to the guide's section on CLI vs. MCP tradeoffs — it is a cleaner mental model than what the guide currently uses
|
||||
2. The 6-question decision framework is a good template; a Claude Code-specific version would be higher value than citing the original (the guide's audience needs Claude Code-specific guidance, not generic agentic framework advice)
|
||||
|
||||
**What NOT to do**: do not cite the benchmark numbers (77% vs 60%) without disclosing that the methodology is undisclosed and the test is CircleCI-internal.
|
||||
|
||||
**Placement**: footnote or "See also" in `guide/core/` cost optimization section, or in the MCP ecosystem section when discussing when to use MCP vs. CLI patterns.
|
||||
|
||||
**Confidence**: High on framework quality. Low on benchmark reliability.
|
||||
34
docs/resource-evaluations/2026-03-17-context7-cli.md
Normal file
@ -0,0 +1,34 @@
|
|||
# Resource Evaluation: Context7 CLI (ctx7)
|
||||
|
||||
**URL**: https://context7.com/docs/clients/cli
|
||||
**Date**: 2026-03-17
|
||||
**Evaluator**: Claude Sonnet 4.6
|
||||
**Score**: 4/5
|
||||
|
||||
## Summary
|
||||
|
||||
CLI companion to the Context7 MCP server (Upstash). Covers three functions: fetch library docs in terminal, manage skills (install/search/suggest/generate), configure Context7 for Claude Code.
|
||||
|
||||
Key commands:
|
||||
- `npx ctx7 skills suggest` — auto-detects project deps, recommends matching skills
|
||||
- `npx ctx7 skills install owner/repo` — install from any GitHub repository
|
||||
- `npx ctx7 setup --claude` — wizard for MCP or CLI+Skills mode configuration
|
||||
- `npx ctx7 library [name]` / `ctx7 docs [id] [query]` — doc lookup without browser
|
||||
|
||||
## Decision
|
||||
|
||||
**Integrated** into `guide/ultimate-guide.md` §5.5 as new subsection "Registry-based Discovery: ctx7 CLI" (~60 lines) and a cross-reference note in `guide/ecosystem/mcp-servers-ecosystem.md` Context7 section.
|
||||
|
||||
## Key Finding
|
||||
|
||||
The existing workflow (curl/unzip from GitHub) is replaced by `ctx7 skills suggest` + `ctx7 skills install`, which adds dependency-awareness and trust scores. The guide was documenting a 2024 manual workflow for a 2025 ecosystem.
|
||||
|
||||
## Fact-Check Note
|
||||
|
||||
First WebFetch call hallucinated "Built by Anthropic" for Context7 — this is false. Context7 is an Upstash product (confirmed via mcp-servers-ecosystem.md: `@upstash/context7-mcp`). Corrected before integration.
|
||||
|
||||
## Registry Relationship
|
||||
|
||||
- `agentskills.io` = open spec (30+ platforms, defined skill format) — guide §5.1
|
||||
- `context7.com/skills` = hosted registry of conforming skills with trust scores
|
||||
- These are complementary, not competing. Documented in the guide integration.
|
||||
|
|
@ -0,0 +1,95 @@
|
|||
# Resource Evaluation: mcp2cli (knowsuchagency)
|
||||
|
||||
**Date**: 2026-03-17
|
||||
**Evaluator**: Claude Sonnet 4.6
|
||||
**Resource URL**: https://github.com/knowsuchagency/mcp2cli
|
||||
**Resource Type**: Open-source CLI tool (GitHub)
|
||||
**Author**: knowsuchagency (Stephan Fitzpatrick)
|
||||
**Published**: 2026-03-09
|
||||
**Stars**: 1,261 | **License**: MIT | **Language**: Python
|
||||
|
||||
---
|
||||
|
||||
## Executive Summary
|
||||
|
||||
mcp2cli converts MCP servers, OpenAPI specs, and GraphQL schemas into runtime CLI commands, eliminating tool schema injection from LLM prompts. The project claims 96-99% token savings by removing schema overhead, with an additional 40-60% via TOON encoding on array output. Created 8 days ago, it has 1,261 stars and a Claude Code skill integration. The architectural insight is real — MCP tool schema injection is a documented cost driver in agentic workflows — but the tool is too new for production recommendation. There is also a structural mismatch with Claude Code's internal MCP architecture: Claude Code manages MCP connections natively, so mcp2cli's schema-elimination approach doesn't map cleanly onto the standard Claude Code workflow.
|
||||
|
||||
---
|
||||
|
||||
## Content Summary
|
||||
|
||||
- **Core mechanic**: converts MCP server definitions, OpenAPI specs, and GraphQL schemas into runtime CLI tools with zero codegen, calling the underlying server on invocation
|
||||
- **Token savings claims**: 96-99% reduction by removing tool schema injection from prompts; 40-60% additional via TOON (Tree-Optimized Output Notation) on array output
|
||||
- **Features**: OAuth support, spec caching, secrets management, `bake` mode (batch commands), jq integration for filtering
|
||||
- **Claude Code skill**: `npx skills add knowsuchagency/mcp2cli --skill mcp2cli` — installs as a skill for direct use in sessions
|
||||
- **Supported sources**: MCP servers (stdio, HTTP/SSE), OpenAPI 2.x/3.x, GraphQL schemas
|
||||
- **Contributors**: 3 | **Open issues**: 0 | **Last commit**: 2026-03-17
|
||||
|
||||
---
|
||||
|
||||
## Gap Analysis vs. Guide
|
||||
|
||||
| Area | mcp2cli | Guide coverage |
|------|---------|----------------|
| MCP schema overhead documentation | ✅ Addresses directly | ❌ Not documented anywhere |
| Token cost of MCP tool injection | ✅ Core value prop | ❌ Gap: not mentioned in cost section |
| CLI vs MCP tradeoff pattern | ✅ Practical tool for this | ⚠️ Mentioned at concept level, no concrete tooling |
| RTK-style output filtering | ❌ Different mechanism (schema removal, not output filtering) | ✅ RTK covered |
| Claude Code MCP integration | ⚠️ Structural mismatch (see Risk) | ✅ Covered in mcp-servers-ecosystem.md |
| OpenAPI/GraphQL → CLI conversion | ✅ | ❌ Not covered |
|
||||
|
||||
**Real gap**: the guide does not document MCP tool schema overhead as a cost driver. This is worth adding to the cost optimization or MCP ecosystem sections, independent of whether this specific tool is recommended.
|
||||
|
||||
---
|
||||
|
||||
## Risk Assessment
|
||||
|
||||
**Structural mismatch with Claude Code**: Claude Code manages MCP connections internally via its own runtime. mcp2cli's primary value proposition — replacing MCP tool injection with CLI calls — does not apply to the standard Claude Code workflow where tool schemas are injected by the host. The Claude Code skill (`npx skills add`) is the intended integration path, but it positions mcp2cli as a complementary tool rather than a replacement for native MCP. Users expecting "install mcp2cli, save 96% tokens in Claude Code" will be disappointed. The actual use case is closer to: use mcp2cli in scripts, hooks, or non-Claude Code contexts where you control the tool injection.
|
||||
|
||||
**Maturity**: 8 days old, with no production track record. Zero open issues could mean the tool is solid, or that it has not been stress-tested yet. The Python dependency stack (typer, httpx, pydantic) is mature, but the integration surface with arbitrary MCP servers is wide.
|
||||
|
||||
**Token savings claims**: the 96-99% figure references an external blog post and does not include a controlled benchmark against a defined baseline. TOON encoding savings are plausible for array-heavy output but not independently verified.
|
||||
|
||||
---
|
||||
|
||||
## Score
|
||||
|
||||
**Score: 3/5** (Moderate — Watch list)
|
||||
|
||||
Real problem, real approach, credible engineering (MIT, active dev, 1K+ stars in one week). The structural mismatch with Claude Code's architecture and the 8-day maturity are the limiting factors. The architectural insight about schema overhead is worth documenting in the guide even if the tool itself isn't ready for a primary recommendation.
|
||||
|
||||
---
|
||||
|
||||
## Challenge
|
||||
|
||||
**Challenge**: "1,261 stars in 8 days is a recency bubble. The guide has a track record problem with early hype. Score should be 2/5."
|
||||
|
||||
**Response**: The concern is valid for production recommendation. However, 3/5 here reflects "watch list + document the insight," not "integrate now." The architectural gap (MCP schema overhead is not mentioned anywhere in the guide) is real and independent of mcp2cli's maturity. If the tool were 2/5, the insight would still be worth documenting. Score stays at 3/5 with explicit maturity caveat.
|
||||
|
||||
---
|
||||
|
||||
## Fact-Check
|
||||
|
||||
| Claim | Verified | Source |
|-------|----------|--------|
| 1,261 stars | ✅ | GitHub API (created 2026-03-09, checked 2026-03-17) |
| MIT license | ✅ | LICENSE file in repo |
| Claude Code skill integration | ✅ | `npx skills add knowsuchagency/mcp2cli --skill mcp2cli` in README |
| 96-99% token savings | ⚠️ | Claimed in README, references external blog; not independently verified |
| TOON encoding | ⚠️ | Described in README, no independent benchmark |
| 3 contributors, 0 open issues | ✅ | GitHub API |
| Python, typer, httpx | ✅ | pyproject.toml |
|
||||
|
||||
---
|
||||
|
||||
## Decision
|
||||
|
||||
**Score: 3/5 — Watch list. Document the architectural insight now, revisit the tool recommendation in 3 months.**
|
||||
|
||||
**Immediate action (guide)**: Add a note in `guide/ecosystem/mcp-servers-ecosystem.md` (or the cost optimization section of the ultimate guide) documenting MCP tool schema overhead as a token cost driver. This insight exists independently of mcp2cli.
|
||||
|
||||
**Deferred action**: If mcp2cli reaches 200+ stars sustained after the initial wave, has 10+ contributors, and has documented real-world usage with Claude Code specifically, revisit for mention in `guide/ecosystem/third-party-tools.md`.
|
||||
|
||||
**What NOT to do**: Do not mention mcp2cli as a production tool for Claude Code token savings without clarifying the structural mismatch with Claude Code's native MCP architecture.
|
||||
|
||||
**Confidence**: High on architecture analysis. Medium on token savings claims (unverified externally). High on maturity assessment.
|
||||
99
docs/resource-evaluations/2026-03-17-mcporter-mcp-toolkit.md
Normal file
|
|
@ -0,0 +1,99 @@
|
|||
# Resource Evaluation: MCPorter (steipete)
|
||||
|
||||
**Date**: 2026-03-17
|
||||
**Evaluator**: Claude Sonnet 4.6
|
||||
**Resource URL**: https://github.com/steipete/mcporter
|
||||
**Resource Type**: Open-source TypeScript toolkit (GitHub)
|
||||
**Author**: Peter Steinberger (PSPDFKit founder)
|
||||
**Stars**: 2,966 | **License**: MIT | **Language**: TypeScript
|
||||
**Website**: mcporter.dev
|
||||
|
||||
---
|
||||
|
||||
## Executive Summary
|
||||
|
||||
MCPorter is a TypeScript runtime and CLI toolkit for MCP servers: it calls any MCP server programmatically, generates CLI wrappers, and emits typed TypeScript clients. Peter Steinberger (already referenced in the guide for practitioner insights) built it as a developer companion for testing and integrating MCP servers outside IDE environments. At 2,966 stars and 12+ contributors with a 2-week track record, it is meaningfully more mature than mcp2cli. The tool has genuine utility for power users writing hooks or scripts that need MCP server access without a running Claude Code session, but it is not a Claude Code workflow tool in the primary sense.
|
||||
|
||||
---
|
||||
|
||||
## Content Summary
|
||||
|
||||
- **Three operating modes**:
  - Runtime calling: call any MCP server tool programmatically from TypeScript/Node
  - CLI generation (`mcporter generate-cli`): generates shell-callable CLIs from MCP server definitions
  - TypeScript codegen (`mcporter emit-ts`): generates typed TS clients for MCP servers
|
||||
- **Auto-discovery**: reads MCP configs from Claude Desktop, Cursor, Codex, Windsurf, VS Code, OpenCode — detects which servers are configured and connects to them
|
||||
- **Transport support**: stdio and HTTP/SSE, unified interface regardless of server transport
|
||||
- **Connection pooling**: reuses connections across calls for efficiency
|
||||
- **OAuth**: full OAuth2 flow for HTTP-based MCP servers requiring auth
|
||||
- **Target use cases per README**: testing MCP servers, CI/CD pipelines needing MCP access, scripts and hooks, TypeScript apps consuming MCP services
|
||||
- **Contributors**: 12+ | **Last commit**: 2026-03-03 | **Open issues**: tracked actively
|
||||
|
||||
---
|
||||
|
||||
## Gap Analysis vs. Guide
|
||||
|
||||
| Area | MCPorter | Guide coverage |
|------|----------|----------------|
| Testing MCP servers outside Claude Code | ✅ Primary use case | ❌ Not documented |
| MCP servers in hooks and scripts | ✅ `generate-cli` covers this | ⚠️ Hooks documented, MCP-in-hooks not |
| Typed TypeScript clients for MCP | ✅ `emit-ts` | ❌ Not covered |
| Auto-discovery of Claude MCP config | ✅ Reads claude_desktop_config.json | ⚠️ Config file location documented, not programmatic access |
| Debugging MCP servers during development | ✅ Useful companion | ⚠️ MCP Inspector mentioned, MCPorter not |
| Connection pooling / transport abstraction | ✅ | ❌ Not covered |
|
||||
|
||||
**Real gap**: the guide documents MCP server configuration and usage within Claude Code sessions, but does not cover accessing MCP servers programmatically from scripts, hooks, or external tools. MCPorter fills this gap for TypeScript environments.
|
||||
|
||||
---
|
||||
|
||||
## Steinberger Context
|
||||
|
||||
Peter Steinberger is the founder of PSPDFKit (now Nutrient), a well-known iOS/macOS SDK vendor. He is already cited in the guide for sharing operational insights on Claude Code usage in production (multi-agent workflows, cost management). That he built and published MCPorter suggests that MCP server access from non-IDE contexts is a real workflow need among practitioners rather than a marginal one. The 12-contributor count and mcporter.dev website suggest this is not a weekend experiment.
|
||||
|
||||
---
|
||||
|
||||
## Score
|
||||
|
||||
**Score: 3/5** (Moderate — mention when covering MCP power-user workflows)
|
||||
|
||||
The tool is solid and the author is credible. The limiting factor is that it is a companion/debug tool, not a core Claude Code workflow tool. Most Claude Code users accessing MCP servers through the standard interface will never need MCPorter. The target audience is narrower: developers building MCP servers, writing complex hooks that need MCP access, or integrating Claude Code into CI/CD pipelines.
|
||||
|
||||
---
|
||||
|
||||
## Challenge
|
||||
|
||||
**Challenge**: "The auto-discovery of Claude Desktop config is the most interesting feature for the guide's audience, not the TypeScript codegen. The evaluation undersells the debugging angle."
|
||||
|
||||
**Response**: Valid. The debug/testing angle (testing MCP server behavior without a running IDE) is probably the highest-value use case for the guide's audience. A developer building a custom MCP server needs a way to call it and inspect responses without restarting Claude Code every time. MCPorter fills that gap cleanly. The `generate-cli` mode is also directly relevant to hook authors. Integration recommendation updated to lead with these two angles.
|
||||
|
||||
---
|
||||
|
||||
## Fact-Check
|
||||
|
||||
| Claim | Verified | Source |
|-------|----------|--------|
| 2,966 stars | ✅ | GitHub API |
| MIT license | ✅ | LICENSE file |
| Peter Steinberger / PSPDFKit | ✅ | GitHub profile + mcporter.dev About |
| 12+ contributors | ✅ | GitHub contributors graph |
| Auto-discovery: Claude, Cursor, Codex, Windsurf, VS Code, OpenCode | ✅ | README config-discovery section |
| Last commit 2026-03-03 | ✅ | GitHub |
| TypeScript, stdio + HTTP/SSE | ✅ | README + source |
| Connection pooling | ✅ | README features |
|
||||
|
||||
---
|
||||
|
||||
## Decision
|
||||
|
||||
**Score: 3/5 — Mention in third-party-tools.md or mcp-servers-ecosystem.md for MCP power-user workflows.**
|
||||
|
||||
**Integration angles** (in order of relevance for the guide's audience):
|
||||
1. Testing and debugging MCP servers during development — use MCPorter to call server tools directly without restarting Claude Code
|
||||
2. Hook scripts needing MCP server access — `generate-cli` creates shell-callable wrappers from an MCP server definition
|
||||
3. TypeScript apps or CI/CD pipelines consuming MCP services — `emit-ts` for typed clients
|
||||
|
||||
**Placement**: a callout or "See also" in `guide/ecosystem/mcp-servers-ecosystem.md` under a "Testing and debugging MCP servers" paragraph, or in `guide/ecosystem/third-party-tools.md` under a developer tools subsection.
|
||||
|
||||
**When to revisit**: already at 3K stars with established author and active contributors — this is ready for a guide mention. The 3/5 score reflects scope (companion tool, not primary workflow) rather than maturity concerns.
|
||||
|
||||
**Confidence**: High. Author credibility verified, claims verified against source, use case is clear.
|
||||
|
|
@ -21,6 +21,8 @@ Trigger reached → re-evaluation → Integrate (Graduated) / Drop (Dropped)
|
|||
| [ICM](https://github.com/rtk-ai/icm) | MCP | 2026-02-12 | Pre-v1 (1 star, 11 commits) | First release + >20 stars |
|
||||
| [System Prompts](https://github.com/x1xhlol/system-prompts-and-models-of-ai-tools) | Tool | 2026-01-26 | Redundant with official sources. Re-evaluated 2026-02-13 (Opus 4.6 update): still 2/5. | Anthropic confirms CLI prompts not published |
|
||||
| [o16g — Outcome Engineering](https://o16g.com/) | Manifesto | 2026-02-13 | Emerging framework by Cory Ondrejka (CTO Onebrief, co-creator Second Life, ex-VP Google/Meta). 16 principles for shifting from "code writing" to "outcome engineering". Honeycomb endorsement. No Claude Code-specific content yet. Memetic potential (naming follows i18n/k8s pattern). | Term adopted in >3 independent AI engineering resources OR author publishes tool-specific implementation |
|
||||
| [Fabro](https://github.com/fabro-sh/fabro) | Tool | 2026-03-17 | Graph-based workflow orchestrator for AI coding agents (Rust binary, zero deps, MIT). DOT graph pipelines + multi-model routing (CSS stylesheets) + Git checkpointing per stage (unique, no equivalent found) + Daytona cloud sandboxes. Direct Claude Code integration via `curl \| claude`. Score 3/5. Eval: [079](079-fabro-workflow-orchestration.md) | >200 GitHub stars OR practitioner write-up from production use |
|
||||
| [Rippletide Code](https://www.rippletide.com/dev) | Tool | 2026-03-17 | Hook-native runtime rule enforcement for Claude Code. Builds a Context Graph outside LLM context window, uses PreToolUse hooks to block violations pre-execution. Addresses CLAUDE.md degradation at scale (40+ rules) and compaction-driven rule loss. Free beta (`npx rippletide-code`, no signup). Distinct from eval 072 (MCP/SaaS). Score 3/5. Eval: [081](081-rippletide-code-rule-enforcement.md) | Public GitHub repo >100 stars OR independent practitioner write-up from production |
|
||||
|
||||
## Graduated
|
||||
|
||||
|
|
|
|||
76
examples/memory/icm-session-starter.md
Normal file
|
|
@ -0,0 +1,76 @@
|
|||
# ICM Session Starter
|
||||
> Paste this at the beginning of any Claude Code session to activate ICM context.
> Requires ICM installed and configured: `brew tap rtk-ai/tap && brew install icm`
> then `icm init --mode mcp && icm init --mode hook && icm init --mode skill`
|
||||
|
||||
---
|
||||
|
||||
# Context — ICM (Infinite Context Memory) active in this session
|
||||
|
||||
ICM is installed and configured on this machine. Use it to store and retrieve persistent
memory across sessions, bypassing context window limits.
|
||||
|
||||
## Available MCP tools
|
||||
|
||||
The `icm` MCP server is running. You have access to 22 `icm_*` tools for storing,
recalling, and managing persistent memory.
|
||||
|
||||
**Direct CLI** (via Bash if needed):
|
||||
|
||||
```bash
# Store a memory (--importance takes exactly one of: critical, high, medium, low)
icm store --topic "<project-slug>" --content "<fact>" --importance <critical|high|medium|low>

# Recall by semantic query
icm recall "<natural language query>"

# Inspect
icm stats     # count, topics, avg weight
icm topics    # list all topics
icm list      # list recent memories

# Manage
icm forget <id>   # delete by ID
icm decay         # apply temporal decay
icm prune         # remove low-weight entries
```
|
||||
|
||||
**Important (v0.5.0 syntax)**:
|
||||
- `--importance` is an enum: `critical / high / medium / low` — not a float
|
||||
- No `memory` subcommand — use `icm store`, `icm recall` directly
|
||||
- Permanent knowledge graph: `icm memoir` (separate layer, no decay)
|
||||
|
||||
## Slash commands
|
||||
|
||||
- `/recall <query>` — search ICM memory
|
||||
- `/remember <content>` — store a memory in ICM
|
||||
|
||||
## How memories work
|
||||
|
||||
Two layers:
- **Memories** (episodic): timestamped entries with temporal decay based on importance.
  `critical` importance never decays. `low` fades over time.
- **Memoir** (semantic): permanent knowledge graph with typed relations
  (`depends_on`, `contradicts`, `superseded_by`, `part_of`, and 5 others).
|
||||
|
||||
Search is hybrid: BM25 full-text (30%) + vector similarity (70%).
|
||||
|
||||
A `PostToolUse` hook runs automatically: every N tool calls, ICM extracts context
and stores it without any explicit action from you.
|
||||
|
||||
## Suggested usage in this session
|
||||
|
||||
```bash
# At session start: recall relevant context
icm recall "<current feature or topic>"

# When making a key decision
icm store --topic "<project>" --content "<decision and rationale>" --importance high

# For permanent architectural facts
icm memoir add-concept -m "<project>" -n "<concept>"
```
|
||||
|
||||
## DB location
|
||||
|
||||
`~/Library/Application Support/dev.icm.icm/memories.db`
|
||||
|
|
@ -8,9 +8,25 @@ version: 1.0.0
|
|||
tags: [quality, audit, agents, skills, validation, production-readiness]
|
||||
---
|
||||
|
||||
# Audit Agents/Skills/Commands
|
||||
# Audit Agents/Skills/Commands (Advanced Skill)
|
||||
|
||||
Score Claude Code agents, skills, and commands across 16 weighted criteria. Outputs production readiness grades (A-F) with actionable fix suggestions.
|
||||
Comprehensive quality audit system for Claude Code agents, skills, and commands. Provides quantitative scoring, comparative analysis, and production readiness grading based on industry best practices.
|
||||
|
||||
## Purpose
|
||||
|
||||
**Problem**: Manual validation of agents/skills is error-prone and inconsistent. According to the LangChain Agent Report 2026, 29.5% of organizations deploy agents without systematic evaluation, leading to "agent bugs" as the top challenge (18% of teams).
|
||||
|
||||
**Solution**: Automated quality scoring across 16 weighted criteria with production readiness thresholds (80% = Grade B minimum for production deployment).
|
||||
|
||||
**Key Features**:
|
||||
- Quantitative scoring (32 points for agents/skills, 20 for commands)
|
||||
- Weighted criteria (Identity 3x, Prompt 2x, Validation 1x, Design 2x)
|
||||
- Production readiness grading (A-F scale with 80% threshold)
|
||||
- Comparative analysis vs reference templates
|
||||
- JSON/Markdown dual output for programmatic integration
|
||||
- Fix suggestions for failing criteria
|
||||
|
||||
---
|
||||
|
||||
## Modes
|
||||
|
||||
|
|
@ -22,126 +38,420 @@ Score Claude Code agents, skills, and commands across 16 weighted criteria. Outp
|
|||
|
||||
**Default**: Full Audit (recommended for first run)
|
||||
|
||||
---
|
||||
|
||||
## Methodology
|
||||
|
||||
### Why These Criteria?
|
||||
|
||||
The 16-criteria framework is derived from:
|
||||
1. **Claude Code Best Practices** (Ultimate Guide line 4921: Agent Validation Checklist)
|
||||
2. **Industry Data** (LangChain Agent Report 2026: evaluation gaps)
|
||||
3. **Production Failures** (Community feedback on hardcoded paths, missing error handling)
|
||||
4. **Composition Patterns** (Skills should reference other skills, agents should be modular)
|
||||
|
||||
### Scoring Philosophy
|
||||
|
||||
**Weight Rationale**:
|
||||
- **Identity (3x)**: If users can't find/invoke the agent, quality is irrelevant (discoverability > quality)
|
||||
- **Prompt (2x)**: Determines reliability and accuracy of outputs
|
||||
- **Validation (1x)**: Improves robustness but is secondary to core functionality
|
||||
- **Design (2x)**: Impacts long-term maintainability and scalability
|
||||
|
||||
**Grade Standards**:
|
||||
- **A (90-100%)**: Production-ready, minimal risk
|
||||
- **B (80-89%)**: Good, meets production threshold
|
||||
- **C (70-79%)**: Needs improvement before production
|
||||
- **D (60-69%)**: Significant gaps, not production-ready
|
||||
- **F (<60%)**: Critical issues, requires major refactoring
|
||||
|
||||
**Industry Alignment**: The 80% threshold mirrors gates commonly applied elsewhere in software engineering before production deployment, such as an 80% code-coverage target.
|
||||
|
||||
---
|
||||
|
||||
## Workflow
|
||||
|
||||
### Phase 1: Discovery
|
||||
|
||||
Scan and classify files from:
|
||||
```
.claude/agents/    .claude/skills/    .claude/commands/
examples/agents/   examples/skills/   examples/commands/
```
|
||||
1. **Scan directories**:
|
||||
   ```
   .claude/agents/
   .claude/skills/
   .claude/commands/
   examples/agents/    (if exists)
   examples/skills/    (if exists)
   examples/commands/  (if exists)
   ```
|
||||
|
||||
**Checkpoint**: Confirm file count and types before proceeding to scoring.
|
||||
2. **Classify files** by type (agent/skill/command)
|
||||
|
||||
### Phase 2: Scoring
|
||||
3. **Load reference templates** (for Comparative mode):
|
||||
   ```
   guide/examples/agents/    (benchmark files)
   guide/examples/skills/    (benchmark files)
   guide/examples/commands/  (benchmark files)
   ```
|
||||
|
||||
### Phase 2: Scoring Engine
|
||||
|
||||
Load scoring criteria from `scoring/criteria.yaml`:
|
||||
|
||||
```yaml
agents:
  max_points: 32
  categories:
    identity:
      weight: 3
      criteria:
        - id: A1.1
          name: "Clear name"
          points: 3
          detection: "frontmatter.name exists and is descriptive"
        # ... (16 total criteria)
```
|
||||
|
||||
For each file:
|
||||
1. Parse YAML frontmatter
|
||||
1. Parse frontmatter (YAML)
|
||||
2. Extract content sections
|
||||
3. Run detection patterns (regex, keyword search)
|
||||
4. Calculate score: `(points / max_points) x 100`
|
||||
5. Assign grade: A (90-100%), B (80-89%), C (70-79%), D (60-69%), F (<60%)
|
||||
|
||||
**Checkpoint**: Verify at least one file scores successfully before batch processing.
|
||||
|
||||
**Grade threshold**: 80% (Grade B) = minimum for production deployment.
|
||||
4. Calculate score: `(points / max_points) × 100`
|
||||
5. Assign grade (A-F)
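The scoring steps above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: it assumes each criterion carries its own `points` value, as in the `scoring/criteria.yaml` excerpt. The `Criterion` class, the `score_file` and `assign_grade` names, and the criterion id `A1.2` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One scoring criterion (hypothetical structure mirroring criteria.yaml)."""
    id: str
    points: int


def assign_grade(percent: float) -> str:
    """Map a percentage to the A-F scale used in this document."""
    if percent >= 90:
        return "A"
    if percent >= 80:
        return "B"
    if percent >= 70:
        return "C"
    if percent >= 60:
        return "D"
    return "F"


def score_file(passed_ids: set[str], criteria: list[Criterion]) -> tuple[float, str]:
    """Return (percentage, grade) given the ids of the criteria a file passed."""
    max_points = sum(c.points for c in criteria)
    if max_points == 0:
        raise ValueError("criteria must carry at least one point")
    earned = sum(c.points for c in criteria if c.id in passed_ids)
    percent = earned / max_points * 100
    return percent, assign_grade(percent)


# Illustrative subset only; the real list has 16 criteria totalling 32 points.
criteria = [Criterion("A1.1", 3), Criterion("A1.2", 3), Criterion("A2.4", 2)]
percent, grade = score_file({"A1.1", "A1.2"}, criteria)
print(f"{percent:.1f}% -> {grade}")
```

Keeping `assign_grade` separate from `score_file` lets the same grade boundaries be reused for per-type averages and the overall score.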
|
||||
|
||||
### Phase 3: Comparative Analysis (Comparative Mode Only)
|
||||
|
||||
1. Match each project file to closest reference template
|
||||
For each project file:
|
||||
1. Find closest matching template (by description similarity)
|
||||
2. Compare scores per criterion
|
||||
3. Flag gaps >10 points
|
||||
3. Identify gaps: `template_score - project_score`
|
||||
4. Flag significant gaps (>10 points difference)
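The comparison steps above might look like the sketch below. It assumes the ">10 points" threshold applies to the overall percentage gap between template and project, which is an interpretation rather than something the text states. The `compare_to_template` name, the input shape, and the criterion id `A4.2` are hypothetical.

```python
def compare_to_template(project, template, threshold=10.0):
    """Compare a project file's audit result with its closest template.

    Both arguments are dicts with an overall "score" (percent) and a
    "points" mapping of criterion id -> points obtained.
    """
    gap = template["score"] - project["score"]
    behind = {}
    for criterion_id, template_points in template["points"].items():
        diff = template_points - project["points"].get(criterion_id, 0)
        if diff > 0:
            # The template earns more points than the project on this criterion.
            behind[criterion_id] = diff
    return {"gap": gap, "significant": gap > threshold, "criteria_behind": behind}


# Illustrative inputs only.
project = {"score": 78.0, "points": {"A2.4": 0, "A4.2": 1}}
template = {"score": 94.0, "points": {"A2.4": 2, "A4.2": 2}}
result = compare_to_template(project, template)
print(result)
```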
|
||||
|
||||
**Example**:
|
||||
```
Project file: .claude/agents/debugging-specialist.md (Score: 78%, Grade C)
Closest template: examples/agents/debugging-specialist.md (Score: 94%, Grade A)

Gaps:
- Anti-hallucination measures: -2 points (template has, project missing)
- Edge cases documented: -1 point (template has 5 examples, project has 1)
- Integration documented: -1 point (template references 3 skills, project none)

Total gap: 16 percentage points (78% vs 94%; explains C vs A difference)
```
|
||||
|
||||
### Phase 4: Report Generation
|
||||
|
||||
Output both `audit-report.md` (human-readable) and `audit-report.json` (programmatic):
|
||||
**Markdown Report** (`audit-report.md`):
|
||||
- Summary table (overall + by type)
|
||||
- Individual scores with top issues
|
||||
- Detailed breakdown per file (collapsible)
|
||||
- Prioritized recommendations
|
||||
|
||||
**JSON Output** (`audit-report.json`):
|
||||
```json
{
  "metadata": {
    "project_path": "/path/to/project",
    "audit_date": "2026-02-07",
    "mode": "full",
    "version": "1.0.0"
  },
  "summary": {
    "overall_score": 82.5,
    "overall_grade": "B",
    "total_files": 15,
    "production_ready_count": 10,
    "production_ready_percentage": 66.7
  },
  "by_type": {
    "agents": { "count": 5, "avg_score": 85.2, "grade": "B" },
    "skills": { "count": 8, "avg_score": 78.9, "grade": "C" },
    "commands": { "count": 2, "avg_score": 92.0, "grade": "A" }
  },
  "files": [
    {
      "path": ".claude/agents/debugging-specialist.md",
      "type": "agent",
      "score": 78.1,
      "grade": "C",
      "points_obtained": 25,
      "points_max": 32,
      "failed_criteria": [
        {
          "id": "A2.4",
          "name": "Anti-hallucination measures",
          "points_lost": 2,
          "recommendation": "Add section on source verification"
        }
      ]
    }
  ],
  "top_issues": [
    {
      "issue": "Missing error handling",
      "affected_files": 8,
      "impact": "Runtime failures unhandled",
      "priority": "high"
    }
  ]
}
```
|
||||
|
||||
**Checkpoint**: Verify report file is written and contains all scanned files.
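One way to run this checkpoint is sketched below. It assumes the report has already been loaded into a dict and that each entry in its `files` list carries a `path`, as in the JSON example. The `missing_from_report` name and the sample paths are hypothetical.

```python
import json


def missing_from_report(report: dict, scanned_paths: list[str]) -> list[str]:
    """Return the scanned paths that have no entry in the report's "files" list."""
    reported = {entry["path"] for entry in report.get("files", [])}
    return sorted(set(scanned_paths) - reported)


# Illustrative report; in practice this would be read from audit-report.json.
report = json.loads(
    '{"files": [{"path": ".claude/agents/debugging-specialist.md"}]}'
)
scanned = [
    ".claude/agents/debugging-specialist.md",
    ".claude/skills/testing-expert/SKILL.md",
]
print(missing_from_report(report, scanned))
```

An empty result means every scanned file made it into the report. Anything else lists the files that were dropped between discovery and report generation.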
|
||||
|
||||
### Phase 5: Fix Suggestions (Optional)
|
||||
|
||||
For each failing criterion, generate actionable fix with the specific section to add and detection keywords to verify the fix.
|
||||
For each failing criterion, generate **actionable fix**:
|
||||
|
||||
```markdown
### File: .claude/agents/debugging-specialist.md
**Issue**: Missing anti-hallucination measures (2 points lost)

**Fix**:
Add this section after "Methodology":

## Source Verification

- Always cite sources for technical claims
- Use phrases: "According to [documentation]...", "Based on [tool output]..."
- If uncertain, state: "I don't have verified information on..."
- Never invent: statistics, version numbers, API signatures, stack traces

**Detection**: Grep for keywords: "verify", "cite", "source", "evidence"
```
|
||||
|
||||
---
|
||||
|
||||
## Scoring Criteria
|
||||
|
||||
See `scoring/criteria.yaml` for complete definitions. Summary:
|
||||
|
||||
### Agents (32 points max)
|
||||
|
||||
| Category | Weight | Max Points | Key Criteria |
|----------|--------|------------|--------------|
| Identity | 3x | 12 | Clear name, description with triggers, role defined |
| Prompt Quality | 2x | 8 | 3+ examples, anti-hallucination measures |
| Validation | 1x | 4 | Error handling, no hardcoded paths |
| Design | 2x | 8 | Single responsibility, integration documented |

| Category | Weight | Criteria Count | Max Points |
|----------|--------|----------------|------------|
| Identity | 3x | 4 | 12 |
| Prompt Quality | 2x | 4 | 8 |
| Validation | 1x | 4 | 4 |
| Design | 2x | 4 | 8 |
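The per-category maximums in the Agents table can be checked with a few lines of arithmetic. The sketch assumes every criterion in a category is worth exactly the category weight in points; `scoring/criteria.yaml` remains the authoritative source. The constant and function names are hypothetical.

```python
# (weight, criteria_count) per category, copied from the Agents table.
AGENT_CATEGORIES = {
    "Identity": (3, 4),
    "Prompt Quality": (2, 4),
    "Validation": (1, 4),
    "Design": (2, 4),
}


def category_max_points(categories):
    """Max points per category = weight x number of criteria."""
    return {name: weight * count for name, (weight, count) in categories.items()}


max_points = category_max_points(AGENT_CATEGORIES)
total = sum(max_points.values())
print(max_points, total)
```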
|
||||
|
||||
**Key Criteria**:
|
||||
- Clear name (3 pts): Not generic like "agent1"
|
||||
- Description with triggers (3 pts): Contains "when"/"use"
|
||||
- Role defined (2 pts): "You are..." statement
|
||||
- 3+ examples (1 pt): Usage scenarios documented
|
||||
- Single responsibility (2 pts): Focused, not "general purpose"
|
||||
|
||||
### Skills (32 points max)
|
||||
|
||||
| Category | Weight | Max Points | Key Criteria |
|----------|--------|------------|--------------|
| Structure | 3x | 12 | Valid SKILL.md, valid name, methodology section |
| Content | 2x | 8 | Clear triggers, usage examples |
| Technical | 1x | 4 | No hardcoded paths, token budget |
| Design | 2x | 8 | Modular, references other skills |

| Category | Weight | Criteria Count | Max Points |
|----------|--------|----------------|------------|
| Structure | 3x | 4 | 12 |
| Content | 2x | 4 | 8 |
| Technical | 1x | 4 | 4 |
| Design | 2x | 4 | 8 |
|
||||
|
||||
**Key Criteria**:
|
||||
- Valid SKILL.md (3 pts): Proper naming
|
||||
- Name valid (3 pts): Lowercase, 1-64 chars, no spaces
|
||||
- Methodology described (2 pts): Workflow section exists
|
||||
- No hardcoded paths (1 pt): No `/Users/`, `/home/`
|
||||
- Clear triggers (2 pts): "When to use" section
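A minimal sketch of how the name and hardcoded-path checks above might be implemented. The function names are hypothetical, and the checks encode only the rules stated in the list (lowercase, 1-64 characters, no spaces; no `/Users/` or `/home/` paths). The skill's real detection patterns may differ.

```python
import re

# Matches an absolute path rooted in a user home directory, up to the next whitespace.
HARDCODED_PATH_RE = re.compile(r"(?:/Users/|/home/)\S*")


def is_valid_skill_name(name: str) -> bool:
    """Check the stated rules: lowercase, 1-64 characters, no whitespace."""
    if not 1 <= len(name) <= 64:
        return False
    if name != name.lower():
        return False
    return not any(ch.isspace() for ch in name)


def find_hardcoded_paths(text: str) -> list[str]:
    """Return every /Users/... or /home/... path found in the text."""
    return HARDCODED_PATH_RE.findall(text)


print(is_valid_skill_name("audit-agents-skills"))
print(find_hardcoded_paths("see /Users/alice/project and ./relative"))
```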
|
||||
|
||||
### Commands (20 points max)
|
||||
|
||||
| Category | Weight | Max Points | Key Criteria |
|----------|--------|------------|--------------|
| Structure | 3x | 12 | Valid frontmatter, argument hint, step-by-step |
| Quality | 2x | 8 | Error handling, mentions failure modes |

| Category | Weight | Criteria Count | Max Points |
|----------|--------|----------------|------------|
| Structure | 3x | 4 | 12 |
| Quality | 2x | 4 | 8 |
|
||||
|
||||
**Key Criteria**:
|
||||
- Valid frontmatter (3 pts): name + description
|
||||
- Argument hint (3 pts): If uses `$ARGUMENTS`
|
||||
- Step-by-step workflow (3 pts): Numbered sections
|
||||
- Error handling (2 pts): Mentions failure modes
|
||||
|
||||
---
|
||||
|
||||
## Detection Patterns
|
||||
|
||||
### Frontmatter Parsing
|
||||
|
||||
```python
import re

import yaml  # PyYAML


def parse_frontmatter(content):
    """Return the parsed YAML frontmatter, or None if the file has none."""
    # Frontmatter is the block between a leading '---' line and the next '---' line.
    match = re.search(r'^---\n(.*?)\n---', content, re.DOTALL)
    if match:
        return yaml.safe_load(match.group(1))
    return None
```
|
||||
|
||||
### Keyword Detection
|
||||
|
||||
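```python
def has_keywords(text, keywords):
    """Return True if any keyword occurs in text (case-insensitive)."""
    # Substring match: 'use' also matches 'user' or 'because'.
    text_lower = text.lower()
    return any(kw in text_lower for kw in keywords)


# Example (sample values for illustration; normally taken from the parsed file)
description = "Use when auditing agents before a release"
content = "On failure, fall back to the quick audit."
has_trigger = has_keywords(description, ['when', 'use', 'trigger'])
has_error_handling = has_keywords(content, ['error', 'failure', 'fallback'])
```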
|
||||
|
||||
### Overlap Detection (Duplication Check)
|
||||
|
||||
```python
def jaccard_similarity(text1, text2):
    """Jaccard similarity of the two texts' lowercase word sets (0 to 1)."""
    words1 = set(text1.lower().split())
    words2 = set(text2.lower().split())
    intersection = words1 & words2
    union = words1 | words2
    return len(intersection) / len(union) if union else 0


# Sample inputs for illustration; in the audit these are two files' descriptions.
desc1 = "Audit agents and skills for production readiness"
desc2 = "Audit agents and commands for production readiness"
issues = []

# Flag if similarity > 0.5 (50% keyword overlap)
if jaccard_similarity(desc1, desc2) > 0.5:
    issues.append("High overlap with another file")
```
|
||||
|
||||
### Token Counting (Approximate)
|
||||
|
||||
```python
def estimate_tokens(text):
    """Rough token estimate derived from the word count."""
    # Rule of thumb: 1 token ≈ 0.75 words, so tokens ≈ words / 0.75 ≈ words × 1.3
    word_count = len(text.split())
    return int(word_count * 1.3)


# Sample input for illustration; normally the full text of the audited file.
file_content = "word " * 4000
issues = []

# Check budget
tokens = estimate_tokens(file_content)
if tokens > 5000:
    issues.append("File too large (>5K tokens)")
```
|
||||
|
||||
---
|
||||
|
||||
## Industry Context
|
||||
|
||||
**Source**: LangChain Agent Report 2026 (public report, page 14-22)
|
||||
|
||||
**Key Findings**:
|
||||
- **29.5%** of organizations deploy agents without systematic evaluation
|
||||
- **18%** cite "agent bugs" as their primary challenge
|
||||
- **Only 12%** use automated quality checks (88% manual or none)
|
||||
- **43%** report difficulty maintaining agent quality over time
|
||||
- **Top issues**: Hallucinations (31%), poor error handling (28%), unclear triggers (22%)
|
||||
|
||||
**Implications**:
|
||||
1. **Automation gap**: Most teams rely on manual checklists (error-prone at scale)
|
||||
2. **Quality debt**: Agents deployed without validation accumulate technical debt
|
||||
3. **Maintenance burden**: 43% struggle with quality over time (no tracking system)
|
||||
|
||||
**This skill addresses**:
|
||||
- Automation: Replaces manual checklists with quantitative scoring
|
||||
- Tracking: JSON output enables trend analysis over time
|
||||
- Standards: 80% threshold provides clear production gate
|
||||
|
||||
---
|
||||
|
||||
## Output Examples
|
||||
|
||||
### Quick Audit (Top-5 Criteria)
|
||||
|
||||
```markdown
|
||||
# Quick Audit: Agents/Skills/Commands
|
||||
|
||||
**Files**: 15 (5 agents, 8 skills, 2 commands)
|
||||
**Critical Issues**: 3 files fail top-5 criteria
|
||||
|
||||
## Top-5 Criteria (Pass/Fail)
|
||||
|
||||
| File | Valid Name | Has Triggers | Error Handling | No Hardcoded Paths | Examples |
|------|------------|--------------|----------------|--------------------|----------|
| agent1.md | ✅ | ✅ | ❌ | ✅ | ❌ |
| skill2/ | ✅ | ❌ | ✅ | ❌ | ✅ |
|
||||
|
||||
## Action Required
|
||||
|
||||
1. **Add error handling**: 5 files
|
||||
2. **Remove hardcoded paths**: 3 files
|
||||
3. **Add usage examples**: 4 files
|
||||
```
|
||||
|
||||
### Full Audit
|
||||
|
||||
See Phase 4: Report Generation above for full structure.
|
||||
|
||||
### Comparative (Full + Benchmarks)
|
||||
|
||||
```markdown
|
||||
# Comparative Audit
|
||||
|
||||
## Project vs Templates
|
||||
|
||||
| File | Project Score | Template Score | Gap | Top Missing |
|------|---------------|----------------|-----|-------------|
| debugging-specialist.md | 78% (C) | 94% (A) | -16 pts | Anti-hallucination, edge cases |
| testing-expert/ | 85% (B) | 91% (A) | -6 pts | Integration docs |
|
||||
|
||||
## Recommendations
|
||||
|
||||
Focus on these gaps to reach template quality:
|
||||
1. **Anti-hallucination measures** (8 files): Add source verification sections
|
||||
2. **Edge case documentation** (5 files): Add failure scenario examples
|
||||
3. **Integration documentation** (4 files): List compatible agents/skills
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Usage
|
||||
|
||||
### Basic (Full Audit)
|
||||
|
||||
```bash
|
||||
# Full audit (default)
|
||||
# In Claude Code
|
||||
Use skill: audit-agents-skills
|
||||
|
||||
# Specify path
|
||||
Use skill: audit-agents-skills for ~/projects/my-app
|
||||
```
|
||||
|
||||
# Quick audit
|
||||
### With Options
|
||||
|
||||
```bash
|
||||
# Quick audit (fast)
|
||||
Use skill: audit-agents-skills with mode=quick
|
||||
|
||||
# Comparative with benchmarks
|
||||
# Comparative (benchmark analysis)
|
||||
Use skill: audit-agents-skills with mode=comparative
|
||||
|
||||
# Generate fixes
|
||||
Use skill: audit-agents-skills with fixes=true
|
||||
|
||||
# JSON output for CI/CD
|
||||
# Custom output path
|
||||
Use skill: audit-agents-skills with output=~/Desktop/audit.json
|
||||
```
|
||||
|
||||
### JSON Output Only
|
||||
|
||||
```bash
|
||||
# For programmatic integration
|
||||
Use skill: audit-agents-skills with format=json output=audit.json
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Integration with CI/CD
|
||||
|
||||
### Pre-commit Hook
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# .git/hooks/pre-commit
|
||||
|
||||
# Run quick audit on changed agent/skill/command files
|
||||
changed_files=$(git diff --cached --name-only | grep -E "^\.claude/(agents|skills|commands)/")
|
||||
|
||||
if [ -n "$changed_files" ]; then
|
||||
echo "Running quick audit on changed files..."
|
||||
# Run audit (requires Claude Code CLI wrapper)
|
||||
# Exit with 1 if any file scores <80%
|
||||
fi
|
||||
```
|
||||
|
|
@@ -158,10 +468,80 @@ jobs:
|
|||
- uses: actions/checkout@v3
|
||||
- name: Run quality audit
|
||||
run: |
|
||||
# Run audit skill, parse JSON, fail if overall_score < 80
|
||||
# Run audit skill
|
||||
# Parse JSON output
|
||||
# Fail if overall_score < 80
|
||||
```
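The "parse JSON, fail if overall_score < 80" step could be sketched as a small gate script. This is a minimal illustration under stated assumptions: it assumes the audit report is a JSON object with a numeric top-level `overall_score` field (the name used in the workflow comment); the function names and the `audit.json` path are hypothetical.

```python
import json
import sys

def gate_exit_code(report, threshold=80.0):
    """Return 0 when the report passes the quality gate, 1 otherwise."""
    score = report.get("overall_score")  # assumed field name
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        print("audit report has no numeric overall_score", file=sys.stderr)
        return 1
    if score < threshold:
        print(f"Quality gate failed: {score} < {threshold}", file=sys.stderr)
        return 1
    print(f"Quality gate passed: {score} >= {threshold}")
    return 0

def check_report_file(path, threshold=80.0):
    """Load a JSON report from disk and evaluate the gate."""
    with open(path, encoding="utf-8") as handle:
        return gate_exit_code(json.load(handle), threshold)

# In a CI step, the exit code decides whether the job fails, e.g.:
#   sys.exit(check_report_file("audit.json"))
```

A score of exactly 80 passes, matching the "fail if below 80" wording; a missing or non-numeric score is treated as a failure so a broken report cannot slip through.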
|
||||
|
||||
---
|
||||
|
||||
## Comparison: Command vs Skill
|
||||
|
||||
| Aspect | Command (`/audit-agents-skills`) | Skill (this file) |
|--------|----------------------------------|-------------------|
| **Scope** | Current project only | Multi-project, comparative |
| **Output** | Markdown report | Markdown + JSON |
| **Speed** | Fast (5-10 min) | Slower (10-20 min with comparative) |
| **Depth** | Standard 16 criteria | Same + benchmark analysis |
| **Fix suggestions** | Via `--fix` flag | Built-in with recommendations |
| **Programmatic** | Terminal output | JSON for CI/CD integration |
| **Best for** | Quick checks, dev workflow | Deep audits, quality tracking |
|
||||
|
||||
**Recommendation**: Use command for daily checks, skill for release gates and quality tracking.
|
||||
|
||||
---
|
||||
|
||||
## Maintenance
|
||||
|
||||
### Updating Criteria
|
||||
|
||||
Edit `scoring/criteria.yaml`:
|
||||
```yaml
|
||||
agents:
|
||||
categories:
|
||||
identity:
|
||||
criteria:
|
||||
- id: A1.5 # New criterion
|
||||
name: "API versioning specified"
|
||||
points: 3
|
||||
detection: "mentions API version or compatibility"
|
||||
```
|
||||
|
||||
Version bump: Increment `version` in frontmatter when criteria change.
|
||||
|
||||
### Adding File Types
|
||||
|
||||
To support new file types (e.g., "workflows"):
|
||||
1. Add to `scoring/criteria.yaml`:
|
||||
```yaml
|
||||
workflows:
|
||||
max_points: 24
|
||||
categories: [...]
|
||||
```
|
||||
2. Update detection logic (file path patterns)
|
||||
3. Update report templates
|
||||
|
||||
---
|
||||
|
||||
## Related
|
||||
|
||||
- **Command version**: `.claude/commands/audit-agents-skills.md` (quick checks, dev workflow)
|
||||
- **Command version**: `.claude/commands/audit-agents-skills.md`
|
||||
- **Agent Validation Checklist**: guide line 4921 (manual 16 criteria)
|
||||
- **Skill Validation**: guide line 5491 (spec documentation)
|
||||
- **Reference templates**: `examples/agents/`, `examples/skills/`, `examples/commands/`
|
||||
|
||||
---
|
||||
|
||||
## Changelog
|
||||
|
||||
**v1.0.0** (2026-02-07):
|
||||
- Initial release
|
||||
- 16-criteria framework (agents/skills/commands)
|
||||
- 3 audit modes (quick/full/comparative)
|
||||
- JSON + Markdown output
|
||||
- Fix suggestions
|
||||
- Industry context (LangChain 2026 report)
|
||||
|
||||
---
|
||||
|
||||
**Skill ready for use**: `audit-agents-skills`
|
||||
|
|
|
|||
|
|
@@ -9,20 +9,38 @@ tags: [dashboard, tui, monitoring, claude-code, costs]
|
|||
|
||||
# ccboard - Claude Code Dashboard
|
||||
|
||||
TUI/Web dashboard for monitoring Claude Code usage: sessions, costs, tokens, MCP servers, and configuration.
|
||||
Comprehensive TUI/Web dashboard for monitoring and managing your Claude Code usage.
|
||||
|
||||
## Prerequisites
|
||||
## Overview
|
||||
|
||||
ccboard provides a unified interface to visualize and explore all your Claude Code data:
|
||||
|
||||
- **Sessions**: Browse all conversations across your projects
|
||||
- **Statistics**: Real-time token usage, cache hit rates, activity trends
|
||||
- **MCP Servers**: Monitor and manage Model Context Protocol servers
|
||||
- **Costs**: Track spending with detailed token breakdown and pricing
|
||||
- **Configuration**: View cascading settings (Global > Project > Local)
|
||||
- **Hooks**: Explore pre/post execution hooks and automation
|
||||
- **Agents**: Manage custom agents, commands, and skills
|
||||
- **History**: Search across all messages with full-text search
|
||||
|
||||
## Installation
|
||||
|
||||
### Via Cargo (Recommended)
|
||||
|
||||
```bash
|
||||
# Using Claude Code command
|
||||
/ccboard-install
|
||||
|
||||
# Or manually
|
||||
cargo install ccboard
|
||||
```
|
||||
|
||||
### Requirements
|
||||
|
||||
- Rust 1.70+ and Cargo
|
||||
- Claude Code installed (reads from `~/.claude/`)
|
||||
|
||||
```bash
|
||||
# Install
|
||||
cargo install ccboard
|
||||
# Or via Claude Code command
|
||||
/ccboard-install
|
||||
```
|
||||
|
||||
## Commands
|
||||
|
||||
| Command | Description | Shortcut |
|
||||
|
|
@@ -34,78 +52,347 @@ cargo install ccboard
|
|||
| `/ccboard-web` | Launch web UI | `ccboard web` |
|
||||
| `/ccboard-install` | Install/update ccboard | - |
|
||||
|
||||
## Tabs Overview
|
||||
## Features
|
||||
|
||||
| Tab | Key | What It Shows |
|-----|-----|---------------|
| Dashboard | `1` | Token stats, cache ratio, 7-day sparkline, model gauges |
| Sessions | `2` | Project tree + session list, search with `/`, edit with `e` |
| Config | `3` | Cascading settings: Global / Project / Local / Merged |
| Hooks | `4` | Event-based hooks, script preview, match patterns |
| Agents | `5` | Agents, commands, skills with frontmatter extraction |
| Costs | `6` | Overview, by-model breakdown, daily trend |
| History | `7` | Full-text search across all sessions |
| MCP | `8` | Server status (Running/Stopped), details, quick actions |
|
||||
### 8 Interactive Tabs
|
||||
|
||||
## Navigation
|
||||
#### 1. Dashboard (Press `1`)
|
||||
- Token usage statistics
|
||||
- Session count
|
||||
- Messages sent
|
||||
- Cache hit ratio
|
||||
- MCP server count
|
||||
- 7-day activity sparkline
|
||||
- Top 5 models usage gauges
|
||||
|
||||
| Keys | Action |
|------|--------|
| `1-8` | Jump to tab |
| `Tab` / `Shift+Tab` | Navigate tabs |
| `h/j/k/l` or arrows | Navigate within tab |
| `Enter` | View details / Focus pane |
| `e` | Edit file in `$EDITOR` |
| `o` | Reveal file in finder |
| `/` | Search (Sessions/History) |
| `F5` | Refresh data |
| `q` | Quit |
|
||||
#### 2. Sessions (Press `2`)
|
||||
- Dual-pane: Project tree + Session list
|
||||
- Metadata: timestamps, duration, tokens, models
|
||||
- Search: Filter by project, message, or model (press `/`)
|
||||
- File operations: `e` to edit JSONL, `o` to reveal in finder
|
||||
|
||||
#### 3. Config (Press `3`)
|
||||
- 4-column cascading view: Global | Project | Local | Merged
|
||||
- Settings inheritance visualization
|
||||
- MCP servers configuration
|
||||
- Rules (CLAUDE.md) preview
|
||||
- Permissions, hooks, environment variables
|
||||
- Edit config with `e` key
|
||||
|
||||
#### 4. Hooks (Press `4`)
|
||||
- Event-based hook browsing (PreToolUse, UserPromptSubmit)
|
||||
- Hook bash script preview
|
||||
- Match patterns and conditions
|
||||
- File path tracking for easy editing
|
||||
|
||||
#### 5. Agents (Press `5`)
|
||||
- 3 sub-tabs: Agents (12) | / Commands (5) | ★ Skills (0)
|
||||
- Frontmatter metadata extraction
|
||||
- File preview and editing
|
||||
- Recursive directory scanning
|
||||
|
||||
#### 6. Costs (Press `6`)
|
||||
- 3 views: Overview | By Model | Daily Trend
|
||||
- Token breakdown: input, output, cache read/write
|
||||
- Pricing: total estimated costs
|
||||
- Model distribution breakdown
|
||||
|
||||
#### 7. History (Press `7`)
|
||||
- Full-text search across all sessions
|
||||
- Activity by hour histogram (24h)
|
||||
- 7-day sparkline
|
||||
- All messages searchable
|
||||
|
||||
#### 8. MCP (Press `8`) **NEW**
|
||||
- Dual-pane: Server list (35%) | Details (65%)
|
||||
- Live status detection: ● Running, ○ Stopped, ? Unknown
|
||||
- Full server details: command, args, environment vars
|
||||
- Quick actions: `e` edit config, `o` reveal file, `r` refresh status
|
||||
|
||||
### Navigation
|
||||
|
||||
**Global Keys**:
|
||||
- `1-8` : Jump to tab
|
||||
- `Tab` / `Shift+Tab` : Navigate tabs
|
||||
- `q` : Quit
|
||||
- `F5` : Refresh data
|
||||
|
||||
**Vim-style**:
|
||||
- `h/j/k/l` : Navigate (left/down/up/right)
|
||||
- `←/→/↑/↓` : Arrow alternatives
|
||||
|
||||
**Common Actions**:
|
||||
- `Enter` : View details / Focus pane
|
||||
- `e` : Edit file in $EDITOR
|
||||
- `o` : Reveal file in finder
|
||||
- `/` : Search (in Sessions/History tabs)
|
||||
- `Esc` : Close popup / Cancel
|
||||
|
||||
### Real-time Monitoring
|
||||
|
||||
ccboard includes a file watcher that monitors `~/.claude/` for changes:
|
||||
|
||||
- **Stats updates**: Live refresh when `stats-cache.json` changes
|
||||
- **Session updates**: New sessions appear automatically
|
||||
- **Config updates**: Settings changes reflected in UI
|
||||
- **500ms debounce**: Prevents excessive updates
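
The 500ms debounce can be pictured as collapsing a burst of file events into a single refresh. The sketch below shows one common (trailing-edge) way to do that, using explicit millisecond timestamps so the behavior is deterministic. It is an illustration only: ccboard is written in Rust and its actual watcher logic is not shown here, and the function name `refresh_times` is an assumption.

```python
def refresh_times(event_times_ms, window_ms=500):
    """Return when refreshes would fire for a sorted list of event timestamps.

    Trailing-edge debounce: a refresh fires window_ms after an event
    only if no newer event arrives within that window.
    """
    fires = []
    for index, current in enumerate(event_times_ms):
        is_last = index == len(event_times_ms) - 1
        if is_last or event_times_ms[index + 1] - current >= window_ms:
            fires.append(current + window_ms)
    return fires

# Three rapid writes, then one isolated write 1.3 s later
print(refresh_times([0, 100, 200, 1500]))  # [700, 2000]
```

Four file events result in two refreshes: one after the burst settles and one after the isolated write.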
|
||||
|
||||
### File Editing
|
||||
|
||||
Press `e` on any item to open in your preferred editor:
|
||||
|
||||
- Uses `$VISUAL` > `$EDITOR` > platform default (nano/notepad)
|
||||
- Supports: Sessions (JSONL), Config (JSON), Hooks (Shell), Agents (Markdown)
|
||||
- Terminal state preserved (alternate screen mode)
|
||||
- Cross-platform (macOS, Linux, Windows)
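
The editor lookup order (`$VISUAL`, then `$EDITOR`, then a platform default) can be sketched as follows. This is a hedged illustration of the precedence described above, not ccboard's actual code; the function name `resolve_editor` and the rule that empty values are skipped are assumptions.

```python
import os
import sys

def resolve_editor(env=None, platform=None):
    """Pick an editor: $VISUAL, then $EDITOR, then a platform default."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    for variable in ("VISUAL", "EDITOR"):
        value = env.get(variable, "").strip()
        if value:  # assumption: unset and empty are treated the same
            return value
    # Platform defaults named in the list above
    return "notepad" if platform.startswith("win") else "nano"

print(resolve_editor({"VISUAL": "code", "EDITOR": "vim"}, "linux"))  # code
print(resolve_editor({"EDITOR": "vim"}, "darwin"))                   # vim
print(resolve_editor({}, "win32"))                                   # notepad
```

Passing `env` and `platform` explicitly keeps the function easy to test without touching the real environment.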
|
||||
|
||||
### MCP Server Management
|
||||
|
||||
The MCP tab provides comprehensive server monitoring:
|
||||
|
||||
**Status Detection** (Unix):
|
||||
- Checks running processes via `ps aux`
|
||||
- Extracts package name from command
|
||||
- Displays PID when running
|
||||
- Windows shows "Unknown" status
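
The `ps aux` based detection can be approximated with a small parser. The sketch below is an illustration under assumptions, not ccboard's implementation: it assumes the standard 11-column `ps aux` layout with the command in the last column, and the function names, sample output, and server path are hypothetical.

```python
SAMPLE_PS = """\
USER   PID %CPU %MEM    VSZ  RSS TTY   STAT START TIME COMMAND
alice 4242  0.1  0.5 123456 7890 ?     Sl   10:00 0:01 node /opt/example-mcp-server/index.js
alice 4300  0.0  0.1   9876  543 pts/0 S+   10:05 0:00 vim notes.txt
"""

def find_server_pid(ps_output, package_name):
    """Return the PID of the first process whose command mentions package_name."""
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 10)        # 11 columns; the last is the command
        if len(fields) < 11:
            continue
        pid, command = fields[1], fields[10]
        if package_name in command and pid.isdigit():
            return int(pid)
    return None

def server_status(ps_output, package_name, platform):
    """Map the lookup to the statuses described above."""
    if platform.startswith("win"):
        return "Unknown"  # process detection is Unix-only
    pid = find_server_pid(ps_output, package_name)
    return f"Running (PID {pid})" if pid is not None else "Stopped"

print(server_status(SAMPLE_PS, "example-mcp-server", "linux"))  # Running (PID 4242)
print(server_status(SAMPLE_PS, "other-server", "linux"))        # Stopped
print(server_status(SAMPLE_PS, "example-mcp-server", "win32"))  # Unknown
```

On a real system the text would come from running `ps aux`; it is passed in as a string here so the parsing can be checked in isolation.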
|
||||
|
||||
**Server Details**:
|
||||
- Full command and arguments
|
||||
- Environment variables with values
|
||||
- Config file path (`~/.claude/claude_desktop_config.json`)
|
||||
- Quick edit/reveal actions
|
||||
|
||||
**Navigation**:
|
||||
- `h/l` or `←/→` : Switch between list and details
|
||||
- `j/k` or `↑/↓` : Select server
|
||||
- `Enter` : Focus detail pane
|
||||
- `e` : Edit MCP config
|
||||
- `o` : Reveal config in finder
|
||||
- `r` : Refresh server status
|
||||
|
||||
## Usage Examples
|
||||
|
||||
### Daily Monitoring
|
||||
|
||||
```bash
|
||||
# Launch dashboard
|
||||
/dashboard
|
||||
# Press '1' for overview, '6' for costs, '7' for history
|
||||
|
||||
# Check activity and costs
|
||||
# Press '1' for overview
|
||||
# Press '6' for costs breakdown
|
||||
# Press '7' for recent history
|
||||
```
|
||||
|
||||
### MCP Troubleshooting
|
||||
|
||||
```bash
|
||||
# Open MCP tab
|
||||
/mcp-status
|
||||
# Check server status (green = running)
|
||||
# Press 'e' to edit config, 'r' to refresh status
|
||||
|
||||
# Or: ccboard then press '8'
|
||||
|
||||
# Check server status (● green = running)
|
||||
# Press 'e' to edit config if needed
|
||||
# Press 'r' to refresh status after changes
|
||||
```
|
||||
|
||||
### Session Analysis
|
||||
|
||||
```bash
|
||||
# Browse sessions
|
||||
/sessions
|
||||
# Press '/' to search by project, model, or message content
|
||||
# Press 'e' on a session to view full JSONL
|
||||
|
||||
# Press '/' to search
|
||||
# Filter by project: /my-project
|
||||
# Filter by model: /opus
|
||||
# Press 'e' on session to view full JSONL
|
||||
```
|
||||
|
||||
### Cost Tracking
|
||||
|
||||
```bash
|
||||
# View costs
|
||||
/costs
|
||||
|
||||
# Press '1' for overview
|
||||
# Press '2' for breakdown by model
|
||||
# Press '3' for daily trend
|
||||
|
||||
# Identify expensive sessions
|
||||
# Track cache efficiency (99.9% hit rate)
|
||||
```
|
||||
|
||||
## Web Interface
|
||||
|
||||
Launch browser-based interface for remote monitoring:
|
||||
|
||||
```bash
|
||||
/ccboard-web # Launch at http://localhost:3333
|
||||
ccboard web --port 8080 # Custom port
|
||||
ccboard both --port 3333 # TUI + Web simultaneously
|
||||
# Launch web UI
|
||||
/ccboard-web
|
||||
|
||||
# Or with custom port
|
||||
ccboard web --port 8080
|
||||
|
||||
# Access at http://localhost:3333
|
||||
```
|
||||
|
||||
## Validation
|
||||
**Features**:
|
||||
- Same data as TUI (shared backend)
|
||||
- Server-Sent Events (SSE) for live updates
|
||||
- Responsive design (desktop/tablet/mobile)
|
||||
- Concurrent multi-user access
|
||||
|
||||
After launching, verify ccboard is working:
|
||||
**Run both simultaneously**:
|
||||
```bash
|
||||
ccboard both --port 3333
|
||||
```
|
||||
|
||||
1. Run `/dashboard` and confirm token stats load on tab `1`
|
||||
2. Press `2` and verify sessions are listed
|
||||
3. Press `6` and confirm cost data appears
|
||||
4. If no data: check `ls ~/.claude/` and `cat ~/.claude/stats-cache.json`
|
||||
## Architecture
|
||||
|
||||
ccboard is a single Rust binary with dual frontends:
|
||||
|
||||
```
|
||||
ccboard/
|
||||
├── ccboard-core/ # Parsers, models, data store, watcher
|
||||
├── ccboard-tui/ # Ratatui frontend (8 tabs)
|
||||
└── ccboard-web/ # Axum + Leptos frontend
|
||||
```
|
||||
|
||||
**Data Sources**:
|
||||
- `~/.claude/stats-cache.json` - Statistics
|
||||
- `~/.claude/claude_desktop_config.json` - MCP config
|
||||
- `~/.claude/projects/*/` - Session JSONL files
|
||||
- `~/.claude/settings.json` - Global settings
|
||||
- `.claude/settings.json` - Project settings
|
||||
- `.claude/settings.local.json` - Local overrides
|
||||
- `.claude/CLAUDE.md` - Rules and behavior
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
- **ccboard not found**: Run `/ccboard-install` or `cargo install ccboard`
|
||||
- **No data visible**: Verify `~/.claude/` exists and contains `stats-cache.json`
|
||||
- **MCP shows "Unknown"**: Status detection requires Unix; Windows shows "Unknown" by default
|
||||
- **File watcher issues**: Check file permissions on `~/.claude/`, restart ccboard
|
||||
### ccboard not found
|
||||
|
||||
```bash
|
||||
# Check installation
|
||||
which ccboard
|
||||
|
||||
# Install if needed
|
||||
/ccboard-install
|
||||
```
|
||||
|
||||
### No data visible
|
||||
|
||||
```bash
|
||||
# Verify Claude Code is installed
|
||||
ls ~/.claude/
|
||||
|
||||
# Check stats file exists
|
||||
cat ~/.claude/stats-cache.json
|
||||
|
||||
# Run with specific project
|
||||
ccboard --project ~/path/to/project
|
||||
```
|
||||
|
||||
### MCP status shows "Unknown"
|
||||
|
||||
- Status detection requires Unix (macOS/Linux)
|
||||
- Windows shows "Unknown" by default
|
||||
- Check if server process is actually running: `ps aux | grep <server-name>`
|
||||
|
||||
### File watcher not working
|
||||
|
||||
- Ensure `notify` crate supports your platform
|
||||
- Check file permissions on `~/.claude/`
|
||||
- Restart ccboard if file system events were missed
|
||||
|
||||
## Advanced Usage
|
||||
|
||||
### Command-line Options
|
||||
|
||||
```bash
|
||||
ccboard --help # Show all options
|
||||
ccboard --claude-home PATH # Custom Claude directory
|
||||
ccboard --project PATH # Specific project
|
||||
ccboard stats # Print stats and exit
|
||||
ccboard web --port 8080 # Web UI on port 8080
|
||||
ccboard both # TUI + Web simultaneously
|
||||
```
|
||||
|
||||
### Environment Variables
|
||||
|
||||
```bash
|
||||
# Editor preference
|
||||
export EDITOR=vim
|
||||
export VISUAL=code
|
||||
|
||||
# Custom Claude home
|
||||
export CLAUDE_HOME=~/custom/.claude
|
||||
```
|
||||
|
||||
### Integration with Claude Code
|
||||
|
||||
ccboard accesses Claude Code directories in **read-only** mode:
|
||||
|
||||
- Non-invasive monitoring
|
||||
- No modifications to Claude data
|
||||
- Safe to run concurrently with Claude Code
|
||||
- File watcher detects changes in real-time
|
||||
|
||||
## Performance
|
||||
|
||||
- **Binary size**: 2.4MB (release build)
|
||||
- **Initial load**: <2s for 1,000+ sessions
|
||||
- **Memory**: ~50MB typical usage
|
||||
- **CPU**: <5% during monitoring
|
||||
- **Lazy loading**: Session content loaded on-demand
|
||||
|
||||
## Limitations
|
||||
|
||||
Current version (0.1.0):
|
||||
|
||||
- **Read-only**: No write operations to Claude data
|
||||
- **MCP status**: Unix only (Windows shows "Unknown")
|
||||
- **Web UI**: In development (TUI is primary interface)
|
||||
- **Search**: Basic substring matching (no fuzzy search yet)
|
||||
|
||||
Future roadmap:
|
||||
|
||||
- Enhanced MCP server management (start/stop)
|
||||
- MCP protocol health checks
|
||||
- Export reports (PDF, JSON, CSV)
|
||||
- Config editing (write settings.json)
|
||||
- Session resume integration
|
||||
- Enhanced search with fuzzy matching
|
||||
|
||||
## Contributing
|
||||
|
||||
ccboard is open source (MIT OR Apache-2.0).
|
||||
|
||||
Repository: https://github.com/{OWNER}/ccboard
|
||||
|
||||
Contributions welcome:
|
||||
- Bug reports and feature requests
|
||||
- Pull requests for new features
|
||||
- Documentation improvements
|
||||
- Platform-specific testing (Windows, Linux)
|
||||
|
||||
## Credits
|
||||
|
||||
Built with:
|
||||
- [Ratatui](https://ratatui.rs/) - Terminal UI framework
|
||||
- [Axum](https://github.com/tokio-rs/axum) - Web framework
|
||||
- [Leptos](https://leptos.dev/) - Reactive frontend
|
||||
- [Notify](https://github.com/notify-rs/notify) - File watcher
|
||||
- [Serde](https://serde.rs/) - Serialization
|
||||
|
||||
## License
|
||||
|
||||
MIT OR Apache-2.0
|
||||
|
||||
---
|
||||
|
||||
**Questions?**
|
||||
|
||||
- GitHub Issues: https://github.com/{OWNER}/ccboard/issues
|
||||
- Documentation: https://github.com/{OWNER}/ccboard
|
||||
- Claude Code: https://claude.ai/code
|
||||
|
|
|
|||
|
|
@@ -38,30 +38,39 @@ Check that the log file exists (or that log content was provided inline). If the
|
|||
|
||||
### Step 2 — Spawn Log Ingestor
|
||||
|
||||
Use the Agent tool to spawn the `log-ingestor` agent:
|
||||
|
||||
```
|
||||
Agent(tool="Task", prompt="Parse the log file at [log_path] and write structured events to cyber-defense-events.json.", agent="log-ingestor", model="haiku")
|
||||
Task: Parse the log file at [log_path] and write structured events to cyber-defense-events.json.
|
||||
Log path: [log_path]
|
||||
```
|
||||
|
||||
Wait for completion. Confirm `cyber-defense-events.json` was created.
|
||||
|
||||
### Step 3 — Spawn Anomaly Detector
|
||||
|
||||
Use the Agent tool to spawn the `anomaly-detector` agent:
|
||||
|
||||
```
|
||||
Agent(tool="Task", prompt="Read cyber-defense-events.json and detect anomalies. Write results to cyber-defense-anomalies.json.", agent="anomaly-detector", model="sonnet")
|
||||
Task: Read cyber-defense-events.json and detect anomalies. Write results to cyber-defense-anomalies.json.
|
||||
```
|
||||
|
||||
Wait for completion. If `anomalies_found: 0`, skip to Step 5 (reporter still runs).
|
||||
|
||||
### Step 4 — Spawn Risk Classifier
|
||||
|
||||
Use the Agent tool to spawn the `risk-classifier` agent:
|
||||
|
||||
```
|
||||
Agent(tool="Task", prompt="Read cyber-defense-anomalies.json and classify overall risk. Write result to cyber-defense-risk.json.", agent="risk-classifier", model="sonnet")
|
||||
Task: Read cyber-defense-anomalies.json and classify overall risk. Write result to cyber-defense-risk.json.
|
||||
```
|
||||
|
||||
### Step 5 — Spawn Threat Reporter
|
||||
|
||||
Use the Agent tool to spawn the `threat-reporter` agent:
|
||||
|
||||
```
|
||||
Agent(tool="Task", prompt="Read all 3 JSON files (events, anomalies, risk). Generate a complete incident report and save to cyber-defense-report.md.", agent="threat-reporter", model="sonnet")
|
||||
Task: Read cyber-defense-events.json, cyber-defense-anomalies.json, and cyber-defense-risk.json. Generate a complete incident report and save it to cyber-defense-report.md.
|
||||
```
|
||||
|
||||
### Step 6 — Summarize for User
|
||||
|
|
|
|||
|
|
@@ -33,7 +33,11 @@ agent: specialist
|
|||
5. JSON Report Generation
|
||||
```
|
||||
|
||||
**Example**: `/design-patterns detect src/`
|
||||
**Example invocation**:
|
||||
```
|
||||
/design-patterns detect src/
|
||||
/design-patterns analyze --format=json
|
||||
```
|
||||
|
||||
### Mode 2: Suggestion
|
||||
|
||||
|
|
@@ -49,7 +53,11 @@ agent: specialist
|
|||
5. Markdown Report with Code Examples
|
||||
```
|
||||
|
||||
**Example**: `/design-patterns suggest src/payment/`
|
||||
**Example invocation**:
|
||||
```
|
||||
/design-patterns suggest src/payment/
|
||||
/design-patterns refactor --focus=creational
|
||||
```
|
||||
|
||||
### Mode 3: Evaluation
|
||||
|
||||
|
|
@@ -65,7 +73,11 @@ agent: specialist
|
|||
5. JSON Report with Recommendations
|
||||
```
|
||||
|
||||
**Example**: `/design-patterns evaluate src/services/singleton.ts`
|
||||
**Example invocation**:
|
||||
```
|
||||
/design-patterns evaluate src/services/singleton.ts
|
||||
/design-patterns quality --pattern=observer
|
||||
```
|
||||
|
||||
## Methodology
|
||||
|
||||
|
|
@@ -274,6 +286,134 @@ ELSE IF pattern_implemented_incorrectly:
|
|||
}
|
||||
```
|
||||
|
||||
### Suggestion Mode (Markdown)
|
||||
|
||||
```markdown
|
||||
# Design Pattern Suggestions
|
||||
|
||||
**Scope**: `src/payment/`
|
||||
**Stack**: React 18 + TypeScript + Stripe
|
||||
**Date**: 2026-01-21
|
||||
|
||||
---
|
||||
|
||||
## High Priority
|
||||
|
||||
### 1. Strategy Pattern → `src/payment/processor.ts:45-89`
|
||||
|
||||
**Code Smell**: Switch statement on payment type (4 cases, 78 lines)
|
||||
|
||||
**Current Implementation** (lines 52-87):
|
||||
```typescript
|
||||
switch (paymentType) {
|
||||
case 'credit':
|
||||
// 20 lines of credit card logic
|
||||
break;
|
||||
case 'paypal':
|
||||
// 15 lines of PayPal logic
|
||||
break;
|
||||
case 'crypto':
|
||||
// 18 lines of crypto logic
|
||||
break;
|
||||
case 'bank':
|
||||
// 12 lines of bank transfer logic
|
||||
break;
|
||||
}
|
||||
```
|
||||
|
||||
**Recommended (React-adapted Strategy)**:
|
||||
```typescript
|
||||
// Define strategy interface
|
||||
interface PaymentStrategy {
|
||||
process: (amount: number) => Promise<PaymentResult>;
|
||||
}
|
||||
|
||||
// Custom hooks as strategies
|
||||
const useCreditPayment = (): PaymentStrategy => ({
|
||||
process: async (amount) => { /* credit logic */ }
|
||||
});
|
||||
|
||||
const usePaypalPayment = (): PaymentStrategy => ({
|
||||
process: async (amount) => { /* PayPal logic */ }
|
||||
});
|
||||
|
||||
// Strategy selection hook
|
||||
const usePaymentStrategy = (type: PaymentType): PaymentStrategy => {
|
||||
const strategies = {
|
||||
credit: useCreditPayment(),
|
||||
paypal: usePaypalPayment(),
|
||||
crypto: useCryptoPayment(),
|
||||
bank: useBankPayment(),
|
||||
};
|
||||
return strategies[type];
|
||||
};
|
||||
|
||||
// Usage in component
|
||||
const PaymentForm = ({ type }: Props) => {
|
||||
const strategy = usePaymentStrategy(type);
|
||||
const handlePay = () => strategy.process(amount);
|
||||
// ...
|
||||
};
|
||||
```
|
||||
|
||||
**Impact**:
|
||||
- **Complexity**: Reduces cyclomatic complexity from 12 to 2
|
||||
- **Extensibility**: New payment methods = new hook, no modification to existing code
|
||||
- **Testability**: Each strategy hook can be tested in isolation
|
||||
- **Effort**: ~2 hours (extract logic into hooks, add tests)
|
||||
|
||||
---
|
||||
|
||||
## Medium Priority
|
||||
|
||||
### 2. Observer Pattern → `src/cart/CartManager.ts:23-156`
|
||||
|
||||
**Code Smell**: Manual notification logic scattered across 8 methods
|
||||
|
||||
**Current**: Manual loops calling update functions
|
||||
**Recommended**: Use Zustand store (already in dependencies)
|
||||
|
||||
```typescript
|
||||
// Instead of custom observer:
|
||||
import create from 'zustand';
|
||||
|
||||
interface CartStore {
|
||||
items: CartItem[];
|
||||
addItem: (item: CartItem) => void;
|
||||
removeItem: (id: string) => void;
|
||||
// Zustand automatically notifies subscribers
|
||||
}
|
||||
|
||||
export const useCartStore = create<CartStore>((set) => ({
|
||||
items: [],
|
||||
addItem: (item) => set((state) => ({ items: [...state.items, item] })),
|
||||
removeItem: (id) => set((state) => ({ items: state.items.filter(i => i.id !== id) })),
|
||||
}));
|
||||
|
||||
// Components auto-subscribe:
|
||||
const CartDisplay = () => {
|
||||
const items = useCartStore((state) => state.items);
|
||||
// Re-renders automatically on cart changes
|
||||
};
|
||||
```
|
||||
|
||||
**Impact**:
|
||||
- **LOC**: Reduces from 156 to ~25 lines
|
||||
- **Stack-native**: Uses existing Zustand dependency
|
||||
- **Testability**: Zustand stores are easily tested
|
||||
- **Effort**: ~1.5 hours
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
- **Total suggestions**: 4
|
||||
- **High priority**: 1 (Strategy)
- **Medium priority**: 3 (Observer, Builder, Facade)
|
||||
- **Estimated total effort**: ~6 hours
|
||||
- **Primary benefits**: Reduced complexity, improved testability, stack-native idioms
|
||||
```
|
||||
|
||||
### Evaluation Mode (JSON)
|
||||
|
||||
```json
|
||||
|
|
@@ -363,11 +503,40 @@ ELSE IF pattern_implemented_incorrectly:
|
|||
|
||||
## Usage Examples
|
||||
|
||||
### Basic Detection
|
||||
```bash
|
||||
/design-patterns detect src/ # Detect all patterns
|
||||
/design-patterns detect src/ --category=creational # Creational only
|
||||
/design-patterns suggest src/payment/ # Suggestions for module
|
||||
/design-patterns evaluate src/services/api-client.ts # Evaluate specific file
|
||||
# Detect all patterns in src/
|
||||
/design-patterns detect src/
|
||||
|
||||
# Detect only creational patterns
|
||||
/design-patterns detect src/ --category=creational
|
||||
|
||||
# Focus on specific pattern
|
||||
/design-patterns detect src/ --pattern=singleton
|
||||
```
|
||||
|
||||
### Targeted Suggestions
|
||||
```bash
|
||||
# Get suggestions for payment module
|
||||
/design-patterns suggest src/payment/
|
||||
|
||||
# Focus on specific smell
|
||||
/design-patterns suggest src/ --smell=switch-on-type
|
||||
|
||||
# High priority only
|
||||
/design-patterns suggest src/ --priority=high
|
||||
```
|
||||
|
||||
### Quality Evaluation
|
||||
```bash
|
||||
# Evaluate specific file
|
||||
/design-patterns evaluate src/services/api-client.ts
|
||||
|
||||
# Evaluate all singletons
|
||||
/design-patterns evaluate src/ --pattern=singleton
|
||||
|
||||
# Full quality report
|
||||
/design-patterns evaluate src/ --detailed
|
||||
```
|
||||
|
||||
## Integration with Other Skills
|
||||
|
|
|
|||
|
|
@@ -115,11 +115,33 @@ Scan each open PR body for references to the issue number:
|
|||
|
||||
#### 3. Duplicate Detection via Jaccard Similarity
|
||||
|
||||
Compare each open issue against all other open issues AND the 20 most recent closed issues using Jaccard similarity (self-contained, no external library).
|
||||
**Algorithm (self-contained — no external library)**:
|
||||
|
||||
**Steps**: Normalize text (lowercase, strip prefixes like "feat:"/"fix:", remove punctuation) → Tokenize (split on whitespace, remove stop words and tokens <3 chars) → Compute `|A ∩ B| / |A ∪ B|` on token sets from title + first 300 chars of body.
|
||||
For each open issue, compute Jaccard similarity against all other open issues AND the 20 most recent closed issues.
|
||||
|
||||
**Threshold**: Jaccard >= 0.60 → flag as potential duplicate. Keep the older issue as canonical. Report: "Similar to #N (Jaccard: 0.72)". Computed at runtime on fetched data — no additional API calls.
|
||||
```
|
||||
Step 1 — Normalize title + first 300 chars of body:
|
||||
- Lowercase the full text
|
||||
- Strip category prefixes: "feat:", "fix:", "bug:", "chore:", "docs:", "test:", "refactor:"
|
||||
- Remove punctuation: .,!?;:'"()[]{}-_/\@#
|
||||
|
||||
Step 2 — Tokenize:
|
||||
- Split on whitespace
|
||||
- Remove stop words: the a an is in on to for of and or with this that it can not no be
|
||||
- Remove tokens shorter than 3 characters
|
||||
|
||||
Step 3 — Compute Jaccard:
|
||||
tokens_A = set of tokens from issue A
|
||||
tokens_B = set of tokens from issue B
|
||||
jaccard = |tokens_A ∩ tokens_B| / |tokens_A ∪ tokens_B|
|
||||
|
||||
Step 4 — Flag:
|
||||
- If jaccard >= 0.60: mark as potential duplicate
|
||||
- Report: "Similar to #N (Jaccard: 0.72)"
|
||||
- Keep the OLDER issue as canonical; newer = duplicate candidate
|
||||
```
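
The four steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions rather than the skill's actual implementation: category prefixes are stripped only from the start of the title, punctuation is replaced by spaces (so `api-client` becomes two tokens), a lower issue number is taken to mean an older issue, and the names `tokenize`, `jaccard`, and `find_duplicates` plus the sample issues are hypothetical.

```python
import re
from itertools import combinations

PREFIX_RE = re.compile(r"^\s*(feat|fix|bug|chore|docs|test|refactor):\s*")
STOP_WORDS = frozenset(
    "the a an is in on to for of and or with this that it can not no be".split()
)
PUNCTUATION = ".,!?;:'\"()[]{}-_/\\@#"
STRIP_PUNCT = str.maketrans(PUNCTUATION, " " * len(PUNCTUATION))

def tokenize(title, body):
    """Steps 1-2: normalize title + first 300 chars of body, then tokenize."""
    title = PREFIX_RE.sub("", title.lower())
    text = f"{title} {body[:300].lower()}".translate(STRIP_PUNCT)
    return {tok for tok in text.split() if len(tok) >= 3 and tok not in STOP_WORDS}

def jaccard(tokens_a, tokens_b):
    """Step 3: |A ∩ B| / |A ∪ B| on token sets."""
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0

def find_duplicates(issues, threshold=0.60):
    """Step 4: return (duplicate, canonical, score) for each pair at or above the threshold."""
    token_sets = {i["number"]: tokenize(i["title"], i["body"]) for i in issues}
    flagged = []
    for older, newer in combinations(sorted(token_sets), 2):
        score = jaccard(token_sets[older], token_sets[newer])
        if score >= threshold:
            flagged.append((newer, older, round(score, 2)))
    return flagged

# Hypothetical issues, for illustration only
issues = [
    {"number": 12, "title": "fix: login page crashes on empty password", "body": ""},
    {"number": 30, "title": "bug: login page crashes with empty password field", "body": ""},
    {"number": 31, "title": "feat: add dark mode toggle", "body": ""},
]
print(find_duplicates(issues))  # [(30, 12, 0.83)]
```

Issues 12 and 30 share 5 of 6 distinct tokens, so #30 is reported as similar to the older #12 with a score of 0.83.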
|
||||
|
||||
Jaccard is computed at runtime using the fetched data — no API calls beyond Phase 1 gather.
|
||||
|
||||
#### 4. Risk Classification
|
||||
|
||||
|
|
@@ -383,14 +405,18 @@ If "None" → `No actions executed. Workflow complete.`
|
|||
|
||||
## Edge Cases
|
||||
|
||||
- **0 open issues**: Display `No open issues.` and stop
|
||||
- **Empty body**: Category = Unclear, always request details first
|
||||
- **Collaborator reporter**: Protect from auto-close, flag in table
|
||||
- **Jaccard 0.55–0.65**: Flag as "possible duplicate — verify manually"
|
||||
- **Label not in repo**: Skip label action, notify user to create it
|
||||
- **Collaborators API 403/404**: Fallback to last 10 merged PR authors
|
||||
- **Large body (>5000 chars)**: Truncate with `[truncated]` note
|
||||
- **Milestoned issues**: Never close without explicit confirmation
|
||||
| Situation | Behavior |
|
||||
|-----------|----------|
|
||||
| 0 open issues | Display `No open issues.` + stop |
|
||||
| Body empty | Category = Unclear, action = request details, never assume |
|
||||
| Collaborator as reporter | Protect from auto-close, flag explicitly in table |
|
||||
| Jaccard inconclusive (0.55–0.65) | Flag as "possible duplicate — verify manually" |
|
||||
| Label not in repo | Skip label action, notify user to create the label first |
|
||||
| Issue already closed during workflow | Skip silently, note in summary |
|
||||
| `gh api .../collaborators` 403/404 | Fallback to last 10 merged PR authors |
|
||||
| Parallel agents unavailable | Run sequential analysis, notify user |
|
||||
| Very large body (>5000 chars) | Truncate to 5000 chars with `[truncated]` note |
|
||||
| Milestone assigned | Include in table, never close milestoned issues without confirmation |
|
||||
|
||||
---
|
||||
|
||||
|
|
|
|||
|
|
@ -7,86 +7,138 @@ tags: [optimization, tokens, efficiency, git]
|
|||
|
||||
# RTK Optimizer Skill
|
||||
|
||||
Automatically suggest and apply RTK (Rust Token Killer) wrappers for verbose commands, reducing token usage by ~73% on average.
|
||||
**Purpose**: Automatically suggest RTK wrappers for high-verbosity commands to reduce token consumption.
|
||||
|
||||
## How It Works
|
||||
|
||||
1. **Detect high-verbosity commands** in user requests
|
||||
2. **Suggest RTK wrapper** with expected savings
|
||||
2. **Suggest RTK wrapper** if applicable
|
||||
3. **Execute with RTK** when user confirms
|
||||
4. **Track savings** over session via `rtk gain`
|
||||
|
||||
## Prerequisites
|
||||
|
||||
```bash
|
||||
rtk --version # Requires rtk 0.16.0+
|
||||
|
||||
# Install if needed:
|
||||
brew install rtk-ai/tap/rtk # macOS/Linux
|
||||
cargo install rtk # All platforms
|
||||
```
|
||||
4. **Track savings** over session
|
||||
|
||||
## Supported Commands
|
||||
|
||||
| Command | RTK Equivalent | Reduction |
|
||||
|---------|---------------|-----------|
|
||||
| `git log` | `rtk git log` | 92% (13,994 -> 1,076 chars) |
|
||||
| `git status` | `rtk git status` | 76% |
|
||||
| `git diff` | `rtk git diff` | 56% (15,815 -> 6,982 chars) |
|
||||
| `find` | `rtk find` | 76% |
|
||||
| `cat <large-file>` | `rtk read <file>` | 63% (163K -> 61K chars) |
|
||||
| `pnpm list` | `rtk pnpm list` | 82% |
|
||||
| `vitest run` / `pnpm test` | `rtk vitest run` | 90% |
|
||||
| `cargo test` | `rtk cargo test` | 90% |
|
||||
| `cargo build` | `rtk cargo build` | 80% |
|
||||
| `cargo clippy` | `rtk cargo clippy` | 80% |
|
||||
| `pytest` | `rtk python pytest` | 90% |
|
||||
| `go test` | `rtk go test` | 90% |
|
||||
| `gh pr view` | `rtk gh pr view` | 87% |
|
||||
| `gh pr checks` | `rtk gh pr checks` | 79% |
|
||||
| `ls` | `rtk ls` | condensed |
|
||||
| `grep` | `rtk grep` | filtered |
|
||||
### Git (>70% reduction)
|
||||
- `git log` → `rtk git log` (92.3% reduction)
|
||||
- `git status` → `rtk git status` (76.0% reduction)
|
||||
- `find` → `rtk find` (76.3% reduction)
|
||||
|
||||
## Usage Pattern
|
||||
### Medium-Value (50-70% reduction)
|
||||
- `git diff` → `rtk git diff` (55.9% reduction)
|
||||
- `cat <large-file>` → `rtk read <file>` (62.5% reduction)
|
||||
|
||||
```markdown
|
||||
# When user requests a verbose command:
|
||||
### JS/TS Stack (70-90% reduction)
|
||||
- `pnpm list` → `rtk pnpm list` (82% reduction)
|
||||
- `pnpm test` / `vitest run` → `rtk vitest run` (90% reduction)
|
||||
|
||||
1. Acknowledge the request
|
||||
2. Suggest RTK: "I'll use `rtk git log` to reduce token usage by ~92%"
|
||||
3. Execute the RTK-wrapped command
|
||||
4. Report savings: "Saved ~13K tokens (baseline: 14K, RTK: 1K)"
|
||||
```
|
||||
### Rust Toolchain (80-90% reduction)
|
||||
- `cargo test` → `rtk cargo test` (90% reduction)
|
||||
- `cargo build` → `rtk cargo build` (80% reduction)
|
||||
- `cargo clippy` → `rtk cargo clippy` (80% reduction)
|
||||
|
||||
### Python & Go (90% reduction)
|
||||
- `pytest` → `rtk python pytest` (90% reduction)
|
||||
- `go test` → `rtk go test` (90% reduction)
|
||||
|
||||
### GitHub CLI (79-87% reduction)
|
||||
- `gh pr view` → `rtk gh pr view` (87% reduction)
|
||||
- `gh pr checks` → `rtk gh pr checks` (79% reduction)
|
||||
|
||||
### File Operations
|
||||
- `ls` → `rtk ls` (condensed output)
|
||||
- `grep` → `rtk grep` (filtered output)
|
||||
|
||||
## Activation Examples
|
||||
|
||||
**User**: "Show me the git history"
|
||||
**Action**: Detect `git log` -> execute `rtk git log` -> report 92% savings
|
||||
**Skill**: Detects `git log` → Suggests `rtk git log` → Explains 92.3% token savings
|
||||
|
||||
**User**: "Run the test suite"
|
||||
**Action**: Detect `cargo test` / `pytest` -> execute the matching wrapper (`rtk cargo test` / `rtk python pytest`) -> report 90% savings
|
||||
**User**: "Find all markdown files"
|
||||
**Skill**: Detects `find` → Suggests `rtk find "*.md" .` → Explains 76.3% savings
|
||||
|
||||
## When to Skip RTK
|
||||
## Installation Check
|
||||
|
||||
- **Small outputs** (<100 chars): Overhead not worth it
|
||||
- **Claude built-in tools**: Grep/Read tools are already optimized
|
||||
- **Interactive commands**: RTK is for batch/non-interactive output only
|
||||
- **Multiple piped commands**: Wrap the outermost command, not each step
|
||||
Before first use, verify RTK is installed:
|
||||
```bash
|
||||
rtk --version # Should output: rtk 0.16.0+
|
||||
```
|
||||
|
||||
## Error Handling
|
||||
If not installed:
|
||||
```bash
|
||||
# Homebrew (macOS/Linux)
|
||||
brew install rtk-ai/tap/rtk
|
||||
|
||||
- If `rtk` is not found, fall back to the raw command and suggest installation
|
||||
- If RTK output is empty or malformed, re-run without RTK and report the issue
|
||||
- If RTK version is outdated, warn about potential breaking changes (rapid release cadence)
|
||||
# Cargo (all platforms)
|
||||
cargo install rtk
|
||||
```
|
||||
|
||||
## Usage Pattern
|
||||
|
||||
```markdown
|
||||
# When user requests high-verbosity command:
|
||||
|
||||
1. Acknowledge request
|
||||
2. Suggest RTK optimization:
|
||||
"I'll use `rtk git log` to reduce token usage by ~92%"
|
||||
3. Execute RTK command
|
||||
4. Track savings (optional):
|
||||
"Saved ~13K tokens (baseline: 14K, RTK: 1K)"
|
||||
```
|
||||
|
||||
## Session Tracking
|
||||
|
||||
Optional: Track cumulative savings across session:
|
||||
|
||||
```bash
|
||||
rtk gain # Shows cumulative token savings for the session (SQLite-backed)
|
||||
# At session end
|
||||
rtk gain # Shows total token savings for session (SQLite-backed)
|
||||
```
|
||||
|
||||
## Edge Cases
|
||||
|
||||
- **Small outputs** (<100 chars): Skip RTK (overhead not worth it)
|
||||
- **Already using Claude tools**: Grep/Read tools are already optimized
|
||||
- **Multiple commands**: Batch with RTK wrapper once, not per command
|
||||
|
||||
## Configuration
|
||||
|
||||
Enable via CLAUDE.md:
|
||||
```markdown
|
||||
## Token Optimization
|
||||
|
||||
Use RTK (Rust Token Killer) for high-verbosity commands:
|
||||
- git operations (log, status, diff)
|
||||
- package managers (pnpm, npm)
|
||||
- build tools (cargo, go)
|
||||
- test frameworks (vitest, pytest)
|
||||
- file finding and reading
|
||||
```
|
||||
|
||||
## Metrics (Verified)
|
||||
|
||||
Based on real-world testing:
|
||||
- `git log`: 13,994 chars → 1,076 chars (92.3% reduction)
|
||||
- `git status`: 100 chars → 24 chars (76.0% reduction)
|
||||
- `find`: 780 chars → 185 chars (76.3% reduction)
|
||||
- `git diff`: 15,815 chars → 6,982 chars (55.9% reduction)
|
||||
- `read file`: 163,587 chars → 61,339 chars (62.5% reduction)
|
||||
|
||||
**Average: 72.6% token reduction**
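
The percentages above can be re-derived from the listed character counts. The sketch below is only a sanity check, with hypothetical variable names; it assumes the average is an unweighted mean of the five per-command reductions, and it works on characters, so the token savings it implies are approximate.

```python
# Character counts (before, after) copied from the measurements above.
MEASUREMENTS = {
    "git log": (13_994, 1_076),
    "git status": (100, 24),
    "find": (780, 185),
    "git diff": (15_815, 6_982),
    "read file": (163_587, 61_339),
}


def reduction_pct(before: int, after: int) -> float:
    """Share of output removed by the wrapper, in percent."""
    return (1 - after / before) * 100


reductions = {cmd: reduction_pct(b, a) for cmd, (b, a) in MEASUREMENTS.items()}

# Unweighted mean: every command counts equally, whatever its output size.
average = sum(reductions.values()) / len(reductions)

# Size-weighted alternative: total characters removed over total characters.
total_before = sum(b for b, _ in MEASUREMENTS.values())
total_after = sum(a for _, a in MEASUREMENTS.values())
weighted = reduction_pct(total_before, total_after)

for cmd, pct in reductions.items():
    print(f"{cmd:<10} {pct:.1f}%")
print(f"unweighted average {average:.1f}%, size-weighted {weighted:.1f}%")
```

The unweighted mean reproduces the 72.6% figure. The size-weighted number comes out lower because the large `read file` sample has a below-average reduction, which is worth keeping in mind when a session is dominated by one kind of command.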
|
||||
|
||||
## Limitations
|
||||
|
||||
- 446 stars on GitHub, actively maintained (30 releases in 23 days)
|
||||
- Not suitable for interactive commands
|
||||
- Rapid development cadence (check for breaking changes)
|
||||
|
||||
## Recommendation
|
||||
|
||||
**Use RTK for**: git workflows, file operations, test frameworks, build tools, package managers
|
||||
**Skip RTK for**: small outputs, quick exploration, interactive commands
|
||||
|
||||
## References
|
||||
|
||||
- RTK GitHub: https://github.com/rtk-ai/rtk
|
||||
- RTK Website: https://www.rtk-ai.app/
|
||||
- Evaluation: `docs/resource-evaluations/rtk-evaluation.md`
|
||||
- CLAUDE.md template: `examples/claude-md/rtk-optimized.md`
|
||||
|
|
|
|||
|
|
@ -12,7 +12,7 @@ tags: [cheatsheet, reference]
|
|||
|
||||
**Written with**: Claude (Anthropic)
|
||||
|
||||
**Version**: 3.36.0 | **Last Updated**: February 2026
|
||||
**Version**: 3.37.0 | **Last Updated**: March 2026
|
||||
|
||||
---
|
||||
|
||||
|
|
@ -564,7 +564,7 @@ Deep analysis → Use Opus (thinking on by default)
|
|||
|
||||
| Problem | Solution |
|
||||
|---------|----------|
|
||||
| "Command not found" | Check PATH, reinstall npm global |
|
||||
| "Command not found" | Check PATH, reinstall: `curl -fsSL https://claude.ai/install.sh \| bash` |
|
||||
| Context too high (>70%) | `/compact` immediately |
|
||||
| Slow responses | `/compact` or `/clear` |
|
||||
| MCP not working | `claude mcp list`, check config |
|
||||
|
|
@ -639,4 +639,4 @@ Speed: `rg` (~20ms) → Serena (~100ms) → ast-grep (~200ms) → grepai (~500ms
|
|||
|
||||
**Author**: Florian BRUNIAUX | [@Méthode Aristote](https://methode-aristote.fr) | Written with Claude
|
||||
|
||||
*Last updated: February 2026 | Version 3.36.0*
|
||||
*Last updated: March 2026 | Version 3.37.0*
|
||||
|
|
|
|||
|
|
@ -10,6 +10,8 @@ tags: [mcp, reference, integration]
|
|||
|
||||
This guide covers validated community MCP servers beyond the official Anthropic servers. All servers listed have been evaluated for production readiness, maintenance activity, and security.
|
||||
|
||||
> **Not sure whether to use an MCP server or a CLI tool?** See the [MCP vs CLI Decision Guide](./mcp-vs-cli.md) for a full breakdown of tradeoffs, a decision matrix, and guidance by situation.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [Official vs Community Servers](#official-vs-community-servers)
|
||||
|
|
@ -908,6 +910,8 @@ Result: Official Python SDK docs + example code for streaming
|
|||
- **Official Site**: https://context7.com
|
||||
- **LobeHub Registry**: https://lobehub.com/mcp/upstash-context7
|
||||
|
||||
**ctx7 CLI companion**: Context7 also ships a CLI (`npx ctx7`) that handles skill discovery and MCP setup from the terminal. `ctx7 skills suggest` auto-detects project dependencies and recommends matching skills; `ctx7 setup --claude` runs a wizard that configures MCP or CLI+Skills mode automatically. See §5.5 of the ultimate guide for the full workflow.
|
||||
|
||||
---
|
||||
|
||||
### Project Management
|
||||
|
|
|
|||
184
guide/ecosystem/mcp-vs-cli.md
Normal file
|
|
@ -0,0 +1,184 @@
|
|||
---
|
||||
title: "MCP vs CLI — Decision Guide"
|
||||
description: "When to use MCP servers vs CLI tools in Claude Code workflows. Tradeoffs, decision dimensions, and guidance by situation."
|
||||
tags: [mcp, cli, tokens, architecture, decision]
|
||||
---
|
||||
|
||||
# MCP vs CLI — Decision Guide
|
||||
|
||||
**Last updated**: March 2026
|
||||
|
||||
> Interactive version with guidance table and practitioner quotes: [cc.bruniaux.com/ecosystem/mcp-vs-cli/](https://cc.bruniaux.com/ecosystem/mcp-vs-cli/)
|
||||
|
||||
This page compares two integration patterns for giving Claude Code access to external tools and services: MCP servers and CLI tools. Neither is universally better. The right choice depends on your context — and most real workflows end up using both.
|
||||
|
||||
---
|
||||
|
||||
## What each approach does
|
||||
|
||||
**MCP servers** inject tool schemas into Claude's context at session start. Claude sees a structured list of available tools with parameters, types, and descriptions. It then calls those tools natively, receiving structured responses.
|
||||
|
||||
**CLI tools** are shell commands that Claude invokes via Bash. Claude drives them the same way a developer would: constructing command strings, parsing text output. No schema injection at startup. The shell is the interface.
|
||||
|
||||
---
|
||||
|
||||
## Tradeoffs
|
||||
|
||||
### MCP strengths
|
||||
|
||||
| Advantage | Detail |
|
||||
|-----------|--------|
|
||||
| **Structured interface** | Tool schemas guide Claude precisely — fewer hallucinated flags or arguments |
|
||||
| **Complex auth** | OAuth, token refresh, secrets rotation handled by the server, not the prompt |
|
||||
| **Structured output** | JSON responses are directly parseable by Claude and downstream agents |
|
||||
| **Observability** | Remote MCP servers can log every call — essential for enterprise usage tracking and ROI attribution |
|
||||
| **Distribution at scale** | Update the server once, all connected clients get the change. No per-machine package management. |
|
||||
| **Non-technical users** | Users who never touch a terminal can access tools transparently via MCP connectors |
|
||||
| **Weaker models** | A structured schema compensates when the model is less capable of parsing CLI help text |
|
||||
|
||||
### CLI strengths
|
||||
|
||||
| Advantage | Detail |
|
||||
|-----------|--------|
|
||||
| **Zero context overhead** | No schema injected at startup — relevant when context budget is tight |
|
||||
| **Deterministic actions** | Explicit commands with predictable output are easier to audit and test |
|
||||
| **Human + AI use** | The same CLI wrapper works for a developer running it manually and for Claude |
|
||||
| **Frontier models** | Claude Opus/Sonnet 4.6 can drive complex CLIs (aws-cli, glab, gh) without a structured schema |
|
||||
| **Speed** | No connection setup, no MCP handshake — direct subprocess execution |
|
||||
| **Simplicity** | Easier to debug, log, and reason about than a remote server call chain |
|
||||
| **Skills encapsulation** | A CLI wrapped in a skill is transparent to the user and keeps the tool logic version-controlled |
|
||||
|
||||
### MCP weaknesses
|
||||
|
||||
| Weakness | Detail |
|
||||
|----------|--------|
|
||||
| **Schema token cost** | Every MCP server injects its full tool list into the context window at session start, whether or not those tools are used that session |
|
||||
| **Connection overhead** | Session startup takes longer with many MCP servers connected |
|
||||
| **Debugging difficulty** | Failures inside an MCP server are harder to trace than a failed shell command |
|
||||
| **Maintenance complexity** | Running, updating, and securing remote MCP servers adds infrastructure |
|
||||
| **Overkill for simple APIs** | A GitLab MCP that surfaces 20% of glab's functionality is worse than glab itself |
|
||||
|
||||
### CLI weaknesses
|
||||
|
||||
| Weakness | Detail |
|
||||
|----------|--------|
|
||||
| **No observability** | Shell commands on a local machine are invisible to ops/management tooling |
|
||||
| **Distribution problem** | Keeping CLIs updated across a team requires package management discipline (brew, scoop, etc.) |
|
||||
| **Weaker models struggle** | A less capable model may hallucinate flags or misread help text — schemas help |
|
||||
| **No multi-agent structure** | CLI output requires parsing; structured MCP responses are more reliable across agent-to-agent handoffs |
|
||||
| **Non-tech user barrier** | A non-technical user cannot be expected to have a configured CLI environment |
|
||||
|
||||
---
|
||||
|
||||
## The four decision dimensions
|
||||
|
||||
Before asking "MCP or CLI?", answer these four questions. They rank from most to least constraining.
|
||||
|
||||
### 1. Who is the end user?
|
||||
|
||||
This is the dominant variable. Everything else is secondary.
|
||||
|
||||
- **Non-technical user** (using a chat interface, no terminal) → **MCP or skill-encapsulated CLI**. You cannot expose a raw CLI to a non-dev user. Connectors must be MCP-based or wrapped invisibly in a skill that handles the CLI internally.
|
||||
- **Technical user / developer** → continue to question 2.
|
||||
|
||||
### 2. Which model is driving the tool?
|
||||
|
||||
- **Frontier model** (Claude Opus/Sonnet 4.6) → strong enough to drive complex CLIs directly. A structured MCP schema adds overhead without proportional benefit.
|
||||
- **Smaller or local model** (Qwen, Mistral, lighter deployments) → structured MCP schemas compensate for weaker CLI parsing ability. MCP is more reliable here.
|
||||
|
||||
### 3. Does your organization need observability?
|
||||
|
||||
- **Yes** (enterprise, C-level reporting, compliance, ROI attribution on AI spend) → **MCP Remote server**. Local CLI calls are invisible. A remote MCP server can log every tool invocation, associate it with a user, and feed dashboards. You cannot replicate this with CLIs on local machines.
|
||||
- **No** (individual dev, local workflow) → observability is not a constraint. CLI is fine.
|
||||
|
||||
### 4. How often does the tool schema change?
|
||||
|
||||
- **Stable API** (mature tool, versioned interface) → MCP investment pays off over time.
|
||||
- **Rapidly changing** or **thin wrapper** → CLI is cheaper to maintain. A hand-rolled glab wrapper that exposes only the 5 commands you actually use is more durable than a GitLab MCP that duplicates the full API surface.
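
The four questions can be read as a small ranked checklist. The sketch below encodes them in that order; the `Context` fields and the `leanings` function are hypothetical names, and the page does not say how to resolve conflicting answers, so the sketch simply reports one leaning per dimension instead of a single verdict.

```python
from dataclasses import dataclass


@dataclass
class Context:
    non_technical_user: bool   # question 1
    frontier_model: bool       # question 2
    needs_observability: bool  # question 3
    stable_api: bool           # question 4


def leanings(ctx: Context) -> list[tuple[str, str]]:
    """Return (dimension, leaning) pairs in ranked order."""
    if ctx.non_technical_user:
        # Question 1 dominates: a raw CLI cannot be exposed to this user.
        return [("end user", "MCP or skill-encapsulated CLI")]
    return [
        ("model", "CLI" if ctx.frontier_model else "MCP"),
        ("observability", "MCP Remote" if ctx.needs_observability else "CLI"),
        ("schema stability", "MCP" if ctx.stable_api else "CLI"),
    ]


solo_dev = Context(
    non_technical_user=False,
    frontier_model=True,
    needs_observability=False,
    stable_api=False,
)
for dimension, leaning in leanings(solo_dev):
    print(f"{dimension}: {leaning}")
```

Returning every leaning keeps disagreements visible, for example a frontier model that points to CLI alongside an observability requirement that points to MCP Remote.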
|
||||
|
||||
---
|
||||
|
||||
## Guidance by situation
|
||||
|
||||
Quick reference — not rules, but directional defaults.
|
||||
|
||||
| Situation | Lean toward | Rationale |
|
||||
|-----------|-------------|-----------|
|
||||
| Non-technical user, chat interface | **MCP / Skill** | CLI is inaccessible; connectors must be invisible |
|
||||
| Frontier model (Claude 4.x), developer workflow | **CLI** | Model handles it natively; schemas are overhead |
|
||||
| Smaller/local model | **MCP** | Schema guides the model reliably |
|
||||
| Enterprise, observability required | **MCP Remote** | Only way to log, attribute, and report on usage |
|
||||
| Team distribution (10+ devs) | **MCP** | Central update vs per-machine CLI maintenance |
|
||||
| Individual dev, local machine | **CLI or skill** | Simpler, faster, no infrastructure |
|
||||
| Deterministic actions (git, CI, deploy) | **CLI** | Explicit commands, predictable output, auditable |
|
||||
| Complex auth (OAuth, token refresh) | **MCP** | Server handles auth; CLI would require credential plumbing |
|
||||
| Tight context budget / many tools loaded | **CLI** | No schema injection at startup |
|
||||
| Agent-to-agent structured output | **MCP** | JSON responses are more reliable than parsed CLI text |
|
||||
| Debugging / prototyping a new integration | **CLI** | Easier to inspect, faster to iterate |
|
||||
| Browser automation (non-frontier model) | **MCP** | Playwright MCP structures interaction reliably |
|
||||
| Browser automation (frontier model, Claude Code) | **CLI + skill** | playwright-cli + skill reported faster and more efficient in practice |
|
||||
| GitLab / GitHub access | **CLI** (glab, gh) | Official CLIs are richer than most MCP wrappers |
|
||||
| Documentation lookup (Context7) | **MCP** | Claude retrieves structured docs automatically through the MCP server; the `ctx7 docs` CLI covers manual terminal lookups |
|
||||
|
||||
---
|
||||
|
||||
## The hybrid is the default
|
||||
|
||||
Most production workflows don't choose one. They use both, with each covering the layer it handles best.
|
||||
|
||||
**A practical example** (from practitioners):
|
||||
|
||||
- **Inner layer** (local dev iteration, git, file ops, shell scripts) → CLI, fast, deterministic, no overhead
|
||||
- **Outer layer** (CI/CD, shared infrastructure, cross-team services) → MCP Remote, observable, centralized, scalable
|
||||
- **Skill layer** (user-facing actions, CLI tools encapsulated for non-tech users) → CLIs wrapped in skills, transparent to the end user
|
||||
|
||||
The mistake is applying one answer to every layer. A solo developer building a Claude Code workflow for themselves should mostly use CLIs. A team deploying an AI assistant to non-technical colleagues should mostly use MCP.
|
||||
|
||||
---
|
||||
|
||||
## Token cost of MCP schemas — what the numbers look like
|
||||
|
||||
MCP servers inject their full tool list into the context at session start. This is not free.
|
||||
|
||||
A typical MCP server with 10-15 tools injects 500-2,000 tokens per session before any task starts. With 5 MCP servers connected, that is 2,500-10,000 tokens of overhead on every session, whether or not those tools are used.
|
||||
|
||||
The practical consequence: if you load 10 MCP servers but only use 2 in a given session, you are paying for 8 servers' worth of schema every time. This compounds with long sessions and high-frequency workflows.
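
The arithmetic behind these figures is a simple multiplication. The sketch below reproduces it under the assumption that every connected server falls in the quoted 500-2,000 token range; the function names are illustrative.

```python
# Per-server range quoted above for a typical 10-15 tool MCP server.
TOKENS_PER_SERVER = (500, 2_000)


def schema_overhead(servers_loaded: int) -> tuple[int, int]:
    """Low and high estimate of schema tokens injected at session start."""
    low, high = TOKENS_PER_SERVER
    return servers_loaded * low, servers_loaded * high


def unused_overhead(servers_loaded: int, servers_used: int) -> tuple[int, int]:
    """Schema tokens spent on servers that are loaded but never called."""
    return schema_overhead(servers_loaded - servers_used)


print(schema_overhead(5))      # (2500, 10000)
print(unused_overhead(10, 2))  # (4000, 16000)
```

With 10 servers loaded and 2 used, the estimate is 4,000 to 16,000 tokens of unused schema per session, which is the cost that per-project server selection avoids.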
|
||||
|
||||
**Mitigation strategies:**
|
||||
|
||||
- Load MCP servers selectively per project (project-level config vs global config)
|
||||
- Use CLI tools for high-frequency operations where schema overhead accumulates
|
||||
- Monitor token usage per session to identify which MCP schemas are loaded but unused
|
||||
- Consider a CLI wrapper for tools you use frequently in tight loops (compile → test → fix cycles)
|
||||
|
||||
---
|
||||
|
||||
## Tooling in this space
|
||||
|
||||
| Tool | What it does | Status |
|
||||
|------|-------------|--------|
|
||||
| **RTK** (Rust Token Killer) | Filters CLI output before it reaches Claude's context — reduces response verbosity, not schema overhead | Production-ready, actively maintained |
|
||||
| **MCPorter** (steipete) | TypeScript runtime for calling MCP servers from scripts, generating CLI wrappers, and emitting typed TS clients. Useful for testing MCP servers and writing hooks that need MCP access. | 3K stars, MIT, 2+ weeks old, ready to use |
|
||||
| **mcp2cli** (knowsuchagency) | Converts MCP/OpenAPI/GraphQL to runtime CLI, eliminating schema injection. Claims 96-99% token savings. | 1.2K stars, 8 days old — watch list, not production-ready yet |
|
||||
|
||||
Note on mcp2cli: the core claim (eliminate schema injection by converting MCP to CLI) is architecturally valid. But Claude Code manages MCP connections internally, so the token savings don't apply directly to the standard Claude Code workflow. The Claude Code skill integration (`npx skills add knowsuchagency/mcp2cli --skill mcp2cli`) is the practical entry point.
|
||||
|
||||
---
|
||||
|
||||
## What practitioners say
|
||||
|
||||
A few representative perspectives from experienced Claude Code users:
|
||||
|
||||
> "I prefer CLI for deterministic actions. For GitLab interactions I use glab (the GitLab MCP is too limited) wrapped in a custom CLI — usable by both humans and AI." — practitioner
|
||||
|
||||
> "On Claude Code with frontier models, fewer MCPs is better. I replaced playwright-mcp with playwright-cli + skill — faster and more effective. I still use context7-mcp only because I haven't found a CLI equivalent." — practitioner
|
||||
|
||||
> "The CLI vs MCP debate is only happening among devs doing dev things. But there's one fundamental constraint: you cannot propose a CLI solution to a non-technical user who just wants to use their tool simply." — practitioner
|
||||
|
||||
> "For enterprise industrialization, observability is non-negotiable. CLI on a local machine is a black box. MCP Remote gives you the logging that C-levels need to attribute investment." — practitioner
|
||||
|
||||
> "Frontier models are strong enough to drive a CLI directly. A weaker local model will struggle — that's where MCP schemas earn their overhead." — practitioner
|
||||
|
||||
---
|
||||
|
||||
*Back to [MCP Servers Ecosystem](./mcp-servers-ecosystem.md) | [Third-Party Tools](./third-party-tools.md) | [Main guide](../ultimate-guide.md)*
|
||||
|
|
@ -16,7 +16,7 @@ tags: [guide, reference, workflows, agents, hooks, mcp, security]
|
|||
|
||||
**Last updated**: January 2026
|
||||
|
||||
**Version**: 3.36.0
|
||||
**Version**: 3.37.0
|
||||
|
||||
---
|
||||
|
||||
|
|
@ -5166,7 +5166,7 @@ The `.claude/` folder is your project's Claude Code directory for memory, settin
|
|||
| Personal preferences | `CLAUDE.md` | ❌ Gitignore |
|
||||
| Personal permissions | `settings.local.json` | ❌ Gitignore |
|
||||
|
||||
### 3.36.0 Version Control & Backup
|
||||
### 3.37.0 Version Control & Backup
|
||||
|
||||
**Problem**: Without version control, losing your Claude Code configuration means hours of manual reconfiguration across agents, skills, hooks, and MCP servers.
|
||||
|
||||
|
|
@ -7576,6 +7576,69 @@ This skill is now installed in the Méthode Aristote repository at:
|
|||
|
||||
## 5.5 Community Skill Repositories
|
||||
|
||||
### Registry-based Discovery: ctx7 CLI
|
||||
|
||||
Before diving into specific repositories, Context7 provides a CLI companion (`ctx7`) that automates skill discovery and installation. Instead of manually cloning repos, `ctx7 skills suggest` analyzes your project's dependencies and recommends matching skills from the [context7.com/skills](https://context7.com/skills) registry — with trust scores to help evaluate quality.
|
||||
|
||||
**Install**:
|
||||
|
||||
```bash
|
||||
npx ctx7 --help # No install required (npx)
|
||||
npm install -g ctx7 # Global install
|
||||
```
|
||||
|
||||
**Discovery workflow**:
|
||||
|
||||
```bash
|
||||
# Auto-detect project deps and suggest matching skills
|
||||
npx ctx7 skills suggest
|
||||
|
||||
# Search by keyword
|
||||
npx ctx7 skills search terraform
|
||||
|
||||
# Install from any GitHub repository
|
||||
npx ctx7 skills install antonbabenko/terraform-skill
|
||||
npx ctx7 skills install owner/repo
|
||||
|
||||
# List / remove installed skills
|
||||
npx ctx7 skills list
|
||||
npx ctx7 skills remove skill-name
|
||||
```
|
||||
|
||||
**Setup wizard** (replaces manual `claude mcp add`):
|
||||
|
||||
```bash
|
||||
# Configure Context7 for Claude Code — detects editor, picks MCP or CLI+Skills mode
|
||||
npx ctx7 setup --claude
|
||||
```
|
||||
|
||||
`ctx7 setup` runs a wizard that configures Context7 in the right mode for your editor. Use it when setting up Context7 for the first time instead of writing `claude mcp add` manually. The `--claude` flag targets Claude Code specifically; `--cursor` and `--universal` are available for other editors.
|
||||
|
||||
**Registry vs. agentskills.io**: The [agentskills.io](https://agentskills.io) specification is the open standard defining the skill format (supported by 30+ platforms — see §5.1). The [context7.com/skills](https://context7.com/skills) registry is a hosted directory of skills conforming to that standard. The two are complementary: agentskills.io defines the format, context7.com/skills is one place to discover and share conforming skills. Skills installed via `ctx7` land in `~/.claude/skills/` and work identically to manually installed ones.
|
||||
|
||||
**Skill generation** (authenticated, rate-limited):
|
||||
|
||||
```bash
|
||||
npx ctx7 skills generate # AI-generated custom skill
|
||||
# Free: 6 generations/week — Pro: 10/week
|
||||
```
|
||||
|
||||
Generation is best reserved for skills with no equivalent in the registry. For team onboarding at scale, the `suggest` + `install` workflow is more practical than generation.
|
||||
|
||||
**CLI doc lookup** (alternative to MCP):
|
||||
|
||||
```bash
|
||||
# Search available libraries
|
||||
npx ctx7 library react
|
||||
|
||||
# Fetch docs for a specific library + query
|
||||
npx ctx7 docs /facebook/react "useEffect cleanup"
|
||||
```
|
||||
|
||||
This is the terminal equivalent of what the Context7 MCP server does. Useful when you want to look something up yourself without invoking Claude, or in environments where MCP is not configured. Claude Code users who already have the MCP server active don't need this — Claude handles it automatically.
|
||||
|
||||
---
|
||||
|
||||
### Cybersecurity Skills Repository
|
||||
|
||||
The Claude Code community has created specialized skill collections for specific domains. One notable collection focuses on cybersecurity and penetration testing.
|
||||
|
|
@ -11531,27 +11594,44 @@ curl -fsSL https://raw.githubusercontent.com/rtk-ai/icm/main/install.sh | sh
|
|||
cargo install --path crates/icm-cli
|
||||
```
|
||||
|
||||
**MCP Config** (auto-configured via `icm init`):
|
||||
**Setup** (3 separate modes, not a single interactive command):
|
||||
|
||||
```bash
|
||||
icm init # interactive setup — detects and configures your editor
|
||||
# Step 1: MCP server → auto-injects into ~/.claude.json (and 13 other editors)
|
||||
icm init --mode mcp
|
||||
|
||||
# Step 2: PostToolUse hook → auto-extracts context every N tool calls
|
||||
icm init --mode hook
|
||||
|
||||
# Step 3: /recall and /remember slash commands
|
||||
icm init --mode skill
|
||||
```
|
||||
|
||||
Restart Claude Code after running all three.
|
||||
|
||||
**Usage**:
|
||||
|
||||
```bash
|
||||
# Store episodic memory (auto-decay)
|
||||
icm store -t "my-project" -c "Use PostgreSQL for main DB" -i high
|
||||
# Store episodic memory (importance = critical|high|medium|low, not a float)
|
||||
icm store --topic "my-project" --content "Use PostgreSQL for main DB" --importance high
|
||||
|
||||
# Recall with hybrid search
|
||||
icm recall "database choice" --topic "my-project"
|
||||
icm recall "database choice"
|
||||
|
||||
# Build permanent knowledge graph
|
||||
icm memoir create -n "system-architecture"
|
||||
icm memoir add-concept -m "system-architecture" -n "auth-service"
|
||||
icm memoir link -m "system-architecture" --from "api-gateway" --to "auth-service" -r depends-on
|
||||
|
||||
# Session management
|
||||
icm stats # memory count, topics, avg weight
|
||||
icm topics # list all topics
|
||||
icm decay # apply temporal decay manually
|
||||
icm prune # remove low-weight entries
|
||||
```
|
||||
|
||||
**Onboarding prompt**: a ready-to-use session starter template is available at `examples/memory/icm-session-starter.md`.
|
||||
|
||||
**Performance** (1000 ops, 384d embeddings — vendor-reported):
|
||||
|
||||
| Operation | Latency |
|
||||
|
|
@ -23402,4 +23482,4 @@ We'll evaluate and add it to this section if it meets quality criteria.
|
|||
|
||||
**Contributions**: Issues and PRs welcome.
|
||||
|
||||
**Last updated**: January 2026 | **Version**: 3.36.0
|
||||
**Last updated**: January 2026 | **Version**: 3.37.0
|
||||
|
|
|
|||
|
|
@ -3,8 +3,8 @@
|
|||
# Source: guide/ultimate-guide.md
|
||||
# Purpose: Condensed index for LLMs to quickly answer user questions about Claude Code
|
||||
|
||||
version: "3.36.0"
|
||||
updated: "2026-03-13"
|
||||
version: "3.37.0"
|
||||
updated: "2026-03-17"
|
||||
|
||||
# ════════════════════════════════════════════════════════════════
|
||||
# DEEP DIVE - Line numbers in guide/ultimate-guide.md
|
||||
|
|
@ -1581,7 +1581,7 @@ ecosystem:
|
|||
- "Cross-links modified → Update all 4 repos"
|
||||
history:
|
||||
- date: "2026-01-20"
|
||||
event: "Code Landing sync v3.36.0, 66 templates, cross-links"
|
||||
event: "Code Landing sync v3.37.0, 66 templates, cross-links"
|
||||
commit: "5b5ce62"
|
||||
- date: "2026-01-20"
|
||||
event: "Cowork Landing fix (paths, README, UI badges)"
|
||||
|
|
@ -1593,7 +1593,7 @@ ecosystem:
onboarding_matrix_meta:
version: "2.1.0"
last_updated: "2026-03-09"
aligned_with_guide: "3.36.0"
aligned_with_guide: "3.37.0"
changelog:
- version: "2.1.0"
date: "2026-03-09"
@ -1624,7 +1624,7 @@ onboarding_matrix:
core: [rules, sandbox_native_guide, commands]
time_budget: "5 min"
topics_max: 3
note: "SECURITY FIRST - sandbox before commands (v3.36.0 critical fix)"
note: "SECURITY FIRST - sandbox before commands (v3.37.0 critical fix)"

beginner_15min:
core: [rules, sandbox_native_guide, workflow, essential_commands]
@ -1713,7 +1713,7 @@ onboarding_matrix:
- default: agent_validation_checklist
time_budget: "60 min"
topics_max: 6
note: "Dual-instance pattern for quality workflows (v3.36.0)"
note: "Dual-instance pattern for quality workflows (v3.37.0)"

learn_security:
intermediate_30min:
@ -1724,7 +1724,7 @@ onboarding_matrix:
- default: permission_modes
time_budget: "30 min"
topics_max: 4
note: "NEW goal (v3.36.0) - Security-focused learning path"
note: "NEW goal (v3.37.0) - Security-focused learning path"

power_60min:
core: [sandbox_native_guide, mcp_secrets_management, security_hardening]
@ -1749,7 +1749,7 @@ onboarding_matrix:
core: [rules, sandbox_native_guide, workflow, essential_commands, context_management, plan_mode]
time_budget: "60 min"
topics_max: 6
note: "Security foundation + core workflow (v3.36.0 sandbox added)"
note: "Security foundation + core workflow (v3.37.0 sandbox added)"

intermediate_120min:
core: [plan_mode, agents, skills, config_hierarchy, git_mcp_guide, hooks, mcp_servers]
@ -195,6 +195,42 @@ else
fi
echo ""

# ===================
# 7. MCP VS CLI PAGE SYNC
# ===================
MCP_GUIDE="$GUIDE_DIR/guide/ecosystem/mcp-vs-cli.md"
MCP_LANDING="$LANDING_DIR/src/pages/ecosystem/mcp-vs-cli.astro"

echo -e "${BLUE}7. MCP vs CLI page sync${NC}"

if [ ! -f "$MCP_GUIDE" ]; then
    echo -e " ${YELLOW}INFO${NC}: guide/ecosystem/mcp-vs-cli.md not found (skip)"
elif [ ! -f "$MCP_LANDING" ]; then
    echo -e " ${RED}MISSING${NC}: landing page src/pages/ecosystem/mcp-vs-cli.astro not found"
    ISSUES=$((ISSUES + 1))
else
    # Count H2 sections in guide
    GUIDE_H2=$(grep -c '^## ' "$MCP_GUIDE" || true)
    # Count guidance table rows (lines starting with "| " followed by a non-dash character; this skips separator rows, but header rows are still counted)
    GUIDE_TABLE_ROWS=$(grep -cE '^\| [^-]' "$MCP_GUIDE" || true)
    # Count <tr> rows in landing (approximate — includes header rows)
    LANDING_TR=$(grep -c '<tr>' "$MCP_LANDING" || true)

    echo " Guide H2 sections: $GUIDE_H2"
    echo " Guide table rows: $GUIDE_TABLE_ROWS"
    echo " Landing <tr> rows: $LANDING_TR"

    # Loose check: if landing has fewer than 5 <tr> rows, something is likely wrong
    if [ "$LANDING_TR" -lt 5 ]; then
        echo -e " ${RED}ERROR${NC}: Landing page has too few table rows — may be out of sync"
        ISSUES=$((ISSUES + 1))
    else
        echo -e " ${GREEN}OK${NC} (landing page exists, has table content)"
        echo " Tip: if you update the guide section, mirror changes in mcp-vs-cli.astro"
    fi
fi
echo ""

# ===================
# SUMMARY
# ===================