release: v3.37.0 — MCP vs CLI landing page + guide section

- New guide section: guide/ecosystem/mcp-vs-cli.md (4 decision dimensions,
  15-row guidance table, token cost analysis, practitioner quotes)
- New landing page: cc.bruniaux.com/ecosystem/mcp-vs-cli/ (4 decision cards,
  collapsible guidance table, zero JS, WCAG-compliant badges)
- ICM v0.5.0 setup guide corrections + icm-session-starter.md template
- 3 resource evaluations: mcp2cli, MCPorter, CircleCI MCP vs CLI blog
- WP10 v1.2.0 DAF/finance feedback corrections (FR+EN)
- Recap cards EN translations (57 cards) + FR version bump 3.32.1 → 3.36.0
- Whitepapers v2.2: 7 WPs synced with guide v3.27.6 → v3.36.0 delta
- check-landing-sync.sh: section 7 for MCP vs CLI sync tracking
- docs/for-cto.md: whitepapers links updated to florian.bruniaux.com/guides

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Florian BRUNIAUX 2026-03-17 15:55:44 +01:00
parent 728431cd4d
commit f5d78e1004
14 changed files with 663 additions and 27 deletions


@ -0,0 +1,106 @@
# Resource Evaluation: "MCP vs. CLI" (CircleCI Blog)
**Date**: 2026-03-17
**Evaluator**: Claude Sonnet 4.6
**Resource URL**: https://circleci.com/blog/mcp-vs-cli/
**Resource Type**: Technical blog post
**Author**: Jacob Schmitt (CircleCI)
**Published**: 2026-03-11
---
## Executive Summary
Jacob Schmitt proposes a decision framework for choosing between MCP servers and CLI tools in agentic workflows, using the inner loop / outer loop distinction as the organizing principle. The post includes a browser automation benchmark (CLI: 33% better token efficiency and 77% vs. 60% task completion), a 6-question decision guide, and a hybrid architecture example from CircleCI's own tooling. The framework aligns with how the guide already positions RTK and the CLI+MCP hybrid approach. The post adds useful external validation and a cleaner decision vocabulary than what currently exists in the guide, but does not introduce new technical ground for experienced Claude Code users.
---
## Content Summary
- **Core thesis**: inner loop (frequent, local, low-latency dev iteration) favors CLI; outer loop (shared systems, CI/CD, cross-team infrastructure) favors MCP
- **Browser automation benchmark**: single test comparing agentic browser automation via CLI vs MCP. CLI: 77% task completion, 33% better token efficiency. MCP: 60% task completion. Methodology not detailed (single test, CircleCI-internal)
- **6-question decision framework**:
  1. Who owns the feedback loop? (developer alone → CLI; multiple agents/team → MCP)
  2. How often does the schema change? (frequently → CLI overhead lower; stable → MCP investment worthwhile)
  3. Does the tool require auth/secrets management at runtime? (yes → MCP; no → CLI simpler)
  4. Do you need structured output consumed by another agent? (yes → MCP; no → CLI)
  5. Is this a team or individual tool? (team → MCP standardization; individual → CLI flexibility)
  6. How much context budget do you have? (tight → CLI; ample → MCP acceptable)
- **CircleCI hybrid model**: Chunk CLI (local file ops) + Local CLI (shell + git) + CircleCI MCP Server (CI/CD system access) — each layer mapped to inner/outer loop
- **Does NOT mention**: Claude Code, Anthropic, RTK, or any specific tool outside CircleCI's stack
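
The six questions above can be read as a simple vote between CLI and MCP. The sketch below is one hypothetical encoding, not something taken from the CircleCI post: the `Workflow` field names, the equal weighting of all six questions, and the "hybrid" outcome on a tie are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    """Answers to the six questions. Field names are illustrative."""
    shared_feedback_loop: bool  # Q1: multiple agents or a team own the loop
    schema_stable: bool         # Q2: the tool's schema rarely changes
    runtime_auth: bool          # Q3: needs auth/secrets management at runtime
    structured_output: bool     # Q4: output is consumed by another agent
    team_tool: bool             # Q5: team tool rather than individual tool
    ample_context: bool         # Q6: context budget is not tight


def recommend(w: Workflow) -> str:
    """Count answers pointing to MCP; equal weights are an assumption."""
    answers = [
        w.shared_feedback_loop,
        w.schema_stable,
        w.runtime_auth,
        w.structured_output,
        w.team_tool,
        w.ample_context,
    ]
    mcp_votes = sum(answers)
    cli_votes = len(answers) - mcp_votes
    if mcp_votes > cli_votes:
        return "MCP"
    if cli_votes > mcp_votes:
        return "CLI"
    return "hybrid"


# A solo developer iterating locally with a tight context budget.
inner_loop = Workflow(False, False, False, False, False, False)
print(recommend(inner_loop))
```

A real decision would likely weight the questions unequally (for example, a tight context budget may dominate), which is part of why a Claude Code-specific version would need its own calibration.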
---
## Gap Analysis vs. Guide
| Area | CircleCI post | Guide coverage |
|------|---------------|----------------|
| Inner loop / outer loop vocabulary | ✅ Clean framework | ⚠️ Concept exists implicitly, not named this way |
| Decision framework (when CLI vs MCP) | ✅ 6-question guide | ⚠️ Philosophy covered, structured decision tool not present |
| Token cost of MCP tool schema | ✅ Mentioned as key driver | ❌ Not quantified anywhere in guide |
| Browser automation benchmark | ✅ Single data point | ❌ No benchmark data in guide |
| Hybrid CLI + MCP architecture | ✅ CircleCI example | ✅ Guide covers this philosophically |
| Claude Code-specific guidance | ❌ | ✅ Guide's primary differentiator |
**Real gap**: the guide lacks a structured decision framework for "should I use an MCP server or a CLI tool for this workflow?" The inner loop / outer loop vocabulary is clean and could be adopted directly, or serve as source inspiration for adding this framing to the guide.
---
## Quality Assessment
**Strengths**:
- The inner loop / outer loop distinction is well-established in dev productivity literature (separating fast local iteration from shared system operations) and applies cleanly to MCP vs. CLI
- The 6-question framework is actionable and maps directly to real workflow decisions
- CircleCI's own hybrid architecture is a credible worked example
- Published on a high-traffic engineering blog — will be referenced by practitioners
**Weaknesses**:
- The browser automation benchmark is a single internal test with no methodology disclosure. 77% vs 60% task completion difference could reflect implementation quality as much as CLI vs MCP architecture
- The post does not distinguish between different LLM hosts. Claude Code's MCP integration has different overhead characteristics than, say, a custom agent using the raw API
- The CircleCI MCP server recommendation at the end is vendor content (mild but present)
- Does not address monetary cost: it reports token counts but not the price per call
---
## Score
**Score: 3/5** (Moderate — reference as external validation)
Solid framework from a credible source. The inner loop / outer loop vocabulary is worth borrowing. The benchmark data is too thin to cite as evidence but useful as a directional signal. The decision framework would be a meaningful addition to the guide's cost optimization or MCP section — either as inspiration for a new section or as an external reference link.
---
## Challenge
**Challenge**: "The benchmark is methodologically thin and CircleCI is selling their MCP server. This is marketing dressed as engineering. Score should be 2/5."
**Response**: The marketing angle is real but mild — the post's core content (decision framework, inner/outer loop model) stands independently of the CircleCI MCP product pitch at the end. The benchmark is not cited here as evidence; it's noted as a directional signal with the caveat that methodology is undisclosed. The decision framework and vocabulary are the primary value, and those are clean. 3/5 stands. If the guide cites this resource, it should reference the framework, not the benchmark numbers.
---
## Fact-Check
| Claim | Verified | Source |
|-------|----------|--------|
| Author: Jacob Schmitt, CircleCI | ✅ | Blog byline |
| Published 2026-03-11 | ✅ | Blog post date |
| CLI: 77% task completion, 33% better token efficiency | ⚠️ | Cited in post, no methodology link — treat as directional |
| MCP: 60% task completion | ⚠️ | Same benchmark — same caveat |
| 6-question decision framework | ✅ | Read from post directly |
| CircleCI hybrid model (3 layers) | ✅ | Post section "How CircleCI uses both" |
---
## Decision
**Score: 3/5 — Reference as external validation; borrow the inner loop / outer loop vocabulary.**
**Immediate actions**:
1. Consider adding "inner loop / outer loop" framing to the guide's section on CLI vs. MCP tradeoffs — it is a cleaner mental model than what the guide currently uses
2. The 6-question decision framework is a good template; a Claude Code-specific version would be higher value than citing the original (the guide's audience needs Claude Code-specific guidance, not generic agentic framework advice)
**What NOT to do**: do not cite the benchmark numbers (77% vs 60%) without disclosing that the methodology is undisclosed and the test is CircleCI-internal.
**Placement**: footnote or "See also" in `guide/core/` cost optimization section, or in the MCP ecosystem section when discussing when to use MCP vs. CLI patterns.
**Confidence**: High on framework quality. Low on benchmark reliability.


@ -0,0 +1,95 @@
# Resource Evaluation: mcp2cli (knowsuchagency)
**Date**: 2026-03-17
**Evaluator**: Claude Sonnet 4.6
**Resource URL**: https://github.com/knowsuchagency/mcp2cli
**Resource Type**: Open-source CLI tool (GitHub)
**Author**: knowsuchagency (Stephan Fitzpatrick)
**Published**: 2026-03-09
**Stars**: 1,261 | **License**: MIT | **Language**: Python
---
## Executive Summary
mcp2cli converts MCP servers, OpenAPI specs, and GraphQL schemas into runtime CLI commands, eliminating tool schema injection from LLM prompts. The project claims 96-99% token savings by removing schema overhead, with an additional 40-60% via TOON encoding on array output. Created on 2026-03-09, 8 days before this evaluation, it has 1,261 stars and a Claude Code skill integration. The architectural insight is real (MCP tool schema injection is a documented cost driver in agentic workflows), but the tool is too new for production recommendation. There is also a structural mismatch with Claude Code's internal MCP architecture: Claude Code manages MCP connections natively, so mcp2cli's schema-elimination approach doesn't map cleanly onto the standard Claude Code workflow.
---
## Content Summary
- **Core mechanic**: converts MCP server definitions, OpenAPI specs, and GraphQL schemas into runtime CLI tools with zero codegen, calling the underlying server on invocation
- **Token savings claims**: 96-99% reduction by removing tool schema injection from prompts; 40-60% additional via TOON (Token-Oriented Object Notation) on array output
- **Features**: OAuth support, spec caching, secrets management, `bake` mode (batch commands), jq integration for filtering
- **Claude Code skill**: `npx skills add knowsuchagency/mcp2cli --skill mcp2cli` — installs as a skill for direct use in sessions
- **Supported sources**: MCP servers (stdio, HTTP/SSE), OpenAPI 2.x/3.x, GraphQL schemas
- **Contributors**: 3 | **Open issues**: 0 | **Last commit**: 2026-03-17
---
## Gap Analysis vs. Guide
| Area | mcp2cli | Guide coverage |
|------|---------|----------------|
| MCP schema overhead documentation | ✅ Addresses directly | ❌ Not documented anywhere |
| Token cost of MCP tool injection | ✅ Core value prop | ❌ Gap — not mentioned in cost section |
| CLI vs MCP tradeoff pattern | ✅ Practical tool for this | ⚠️ Mentioned at concept level, no concrete tooling |
| RTK-style output filtering | ❌ Different mechanism (schema removal, not output filtering) | ✅ RTK covered |
| Claude Code MCP integration | ⚠️ Structural mismatch (see Risk) | ✅ Covered in mcp-servers-ecosystem.md |
| OpenAPI/GraphQL → CLI conversion | ✅ | ❌ Not covered |
**Real gap**: the guide does not document MCP tool schema overhead as a cost driver. This is worth adding to the cost optimization or MCP ecosystem sections, independent of whether this specific tool is recommended.
---
## Risk Assessment
**Structural mismatch with Claude Code**: Claude Code manages MCP connections internally via its own runtime. mcp2cli's primary value proposition — replacing MCP tool injection with CLI calls — does not apply to the standard Claude Code workflow where tool schemas are injected by the host. The Claude Code skill (`npx skills add`) is the intended integration path, but it positions mcp2cli as a complementary tool rather than a replacement for native MCP. Users expecting "install mcp2cli, save 96% tokens in Claude Code" will be disappointed. The actual use case is closer to: use mcp2cli in scripts, hooks, or non-Claude Code contexts where you control the tool injection.
**Maturity**: 8 days old. No production track record. 0 open issues could mean the tool is solid or could mean it hasn't been stress-tested yet. The Python dependency stack (typer, httpx, pydantic) is mature, but the integration surface with arbitrary MCP servers is wide.
**Token savings claims**: the 96-99% figure references an external blog post and does not include a controlled benchmark against a defined baseline. TOON encoding savings are plausible for array-heavy output but not independently verified.
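
To see why a savings percentage means little without a defined baseline, the arithmetic can be sketched as follows. Every number here is hypothetical (tool counts, tokens per schema, turns, and the CLI-side cost are invented for illustration), and the model assumes every tool schema counts toward input on every turn, which may not hold for a given host. It does not verify the 96-99% claim.

```python
def schema_overhead_tokens(num_tools: int, tokens_per_schema: int, turns: int) -> int:
    """Tokens spent on tool schemas, assuming all schemas are sent on every turn."""
    return num_tools * tokens_per_schema * turns


def savings_pct(baseline: int, alternative: int) -> float:
    """Percentage reduction of `alternative` relative to `baseline`."""
    return 100.0 * (baseline - alternative) / baseline


# Hypothetical CLI-side cost: a short usage note per turn plus a few help lookups.
cli_cost = 150 * 20 + 5 * 300  # 4,500 tokens over a 20-turn session

# Hypothetical large setup: 30 tools at 400 tokens per schema.
large = schema_overhead_tokens(num_tools=30, tokens_per_schema=400, turns=20)

# Hypothetical small setup: 3 tools at 400 tokens per schema.
small = schema_overhead_tokens(num_tools=3, tokens_per_schema=400, turns=20)

print(f"large setup savings: {savings_pct(large, cli_cost):.1f}%")
print(f"small setup savings: {savings_pct(small, cli_cost):.1f}%")
```

The same CLI-side cost yields very different percentages depending on how many schemas the baseline carries, which is why the claim needs a stated baseline before it can be cited.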
---
## Score
**Score: 3/5** (Moderate — Watch list)
Real problem, real approach, credible engineering (MIT, active dev, 1K+ stars in one week). The structural mismatch with Claude Code's architecture and the 8-day maturity are the limiting factors. The architectural insight about schema overhead is worth documenting in the guide even if the tool itself isn't ready for a primary recommendation.
---
## Challenge
**Challenge**: "1,261 stars in 8 days is a recency bubble. The guide has a track record problem with early hype. Score should be 2/5."
**Response**: The concern is valid for production recommendation. However, 3/5 here reflects "watch list + document the insight," not "integrate now." The architectural gap (MCP schema overhead is not mentioned anywhere in the guide) is real and independent of mcp2cli's maturity. If the tool were 2/5, the insight would still be worth documenting. Score stays at 3/5 with explicit maturity caveat.
---
## Fact-Check
| Claim | Verified | Source |
|-------|----------|--------|
| 1,261 stars | ✅ | GitHub API — created 2026-03-09, checked 2026-03-17 |
| MIT license | ✅ | LICENSE file in repo |
| Claude Code skill integration | ✅ | `npx skills add knowsuchagency/mcp2cli --skill mcp2cli` in README |
| 96-99% token savings | ⚠️ | Claimed in README, references external blog — not independently verified |
| TOON encoding | ⚠️ | Described in README, no independent benchmark |
| 3 contributors, 0 open issues | ✅ | GitHub API |
| Python, typer, httpx | ✅ | pyproject.toml |
---
## Decision
**Score: 3/5 — Watch list. Document the architectural insight now, revisit the tool recommendation in 3 months.**
**Immediate action (guide)**: Add a note in `guide/ecosystem/mcp-servers-ecosystem.md` (or the cost optimization section of the ultimate guide) documenting MCP tool schema overhead as a token cost driver. This insight exists independently of mcp2cli.
**Deferred action**: If mcp2cli reaches 200+ stars sustained after the initial wave, has 10+ contributors, and has documented real-world usage with Claude Code specifically, revisit for mention in `guide/ecosystem/third-party-tools.md`.
**What NOT to do**: Do not mention mcp2cli as a production tool for Claude Code token savings without clarifying the structural mismatch with Claude Code's native MCP architecture.
**Confidence**: High on architecture analysis. Medium on token savings claims (unverified externally). High on maturity assessment.


@ -0,0 +1,99 @@
# Resource Evaluation: MCPorter (steipete)
**Date**: 2026-03-17
**Evaluator**: Claude Sonnet 4.6
**Resource URL**: https://github.com/steipete/mcporter
**Resource Type**: Open-source TypeScript toolkit (GitHub)
**Author**: Peter Steinberger (PSPDFKit founder)
**Stars**: 2,966 | **License**: MIT | **Language**: TypeScript
**Website**: mcporter.dev
---
## Executive Summary
MCPorter is a TypeScript runtime and CLI toolkit for MCP servers: it calls any MCP server programmatically, generates CLI wrappers, and emits typed TypeScript clients. Peter Steinberger (already referenced in the guide for practitioner insights) built it as a developer companion for testing and integrating MCP servers outside IDE environments. At 2,966 stars and 12+ contributors with a 2-week track record, it is meaningfully more mature than mcp2cli. The tool has genuine utility for power users writing hooks or scripts that need MCP server access without a running Claude Code session, but it is not a Claude Code workflow tool in the primary sense.
---
## Content Summary
- **Three operating modes**:
  - Runtime calling: call any MCP server tool programmatically from TypeScript/Node
  - CLI generation (`mcporter generate-cli`): generates shell-callable CLIs from MCP server definitions
  - TypeScript codegen (`mcporter emit-ts`): generates typed TS clients for MCP servers
- **Auto-discovery**: reads MCP configs from Claude Desktop, Cursor, Codex, Windsurf, VS Code, OpenCode — detects which servers are configured and connects to them
- **Transport support**: stdio and HTTP/SSE, unified interface regardless of server transport
- **Connection pooling**: reuses connections across calls for efficiency
- **OAuth**: full OAuth2 flow for HTTP-based MCP servers requiring auth
- **Target use cases per README**: testing MCP servers, CI/CD pipelines needing MCP access, scripts and hooks, TypeScript apps consuming MCP services
- **Contributors**: 12+ | **Last commit**: 2026-03-03 | **Open issues**: tracked actively
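
The auto-discovery idea can be illustrated with a minimal sketch. This is not MCPorter's implementation: it assumes the `mcpServers` layout used by Claude Desktop's config file (a map of server name to `command` and `args`), and the sample server names, commands, and URL are hypothetical.

```python
import json

# Hypothetical config contents in the assumed `mcpServers` layout.
SAMPLE_CONFIG = """
{
  "mcpServers": {
    "example-local": {"command": "node", "args": ["server.js", "--stdio"]},
    "example-remote": {"url": "https://example.invalid/mcp"}
  }
}
"""


def list_mcp_servers(config_text: str) -> dict[str, list[str]]:
    """Map each stdio server name to its launch command line.

    Entries without a `command` (for example URL-based servers) are skipped.
    """
    config = json.loads(config_text)
    servers = config.get("mcpServers", {})
    return {
        name: [spec["command"], *spec.get("args", [])]
        for name, spec in servers.items()
        if "command" in spec
    }


for name, argv in list_mcp_servers(SAMPLE_CONFIG).items():
    print(name, "->", " ".join(argv))
```

A script or hook could read the real config file from disk and pass its text to `list_mcp_servers`; the file location differs per editor and platform, so it is left out here.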
---
## Gap Analysis vs. Guide
| Area | MCPorter | Guide coverage |
|------|----------|----------------|
| Testing MCP servers outside Claude Code | ✅ Primary use case | ❌ Not documented |
| MCP servers in hooks and scripts | ✅ `generate-cli` covers this | ⚠️ Hooks documented, MCP-in-hooks not |
| Typed TypeScript clients for MCP | ✅ `emit-ts` | ❌ Not covered |
| Auto-discovery of Claude MCP config | ✅ Reads `claude_desktop_config.json` | ⚠️ Config file location documented, not programmatic access |
| Debugging MCP servers during development | ✅ Useful companion | ⚠️ MCP Inspector mentioned, MCPorter not |
| Connection pooling / transport abstraction | ✅ | ❌ Not covered |
**Real gap**: the guide documents MCP server configuration and usage within Claude Code sessions, but does not cover accessing MCP servers programmatically from scripts, hooks, or external tools. MCPorter fills this gap for TypeScript environments.
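
For context on what "programmatic access" involves at the wire level, the sketch below builds the JSON-RPC 2.0 messages an MCP client sends. It assumes the stdio transport with one JSON message per line, and it only constructs messages: a real client must also spawn the server, complete the `initialize` handshake, and read responses. The tool name `search_docs` and its arguments are hypothetical.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def jsonrpc_request(method: str, params: Optional[dict] = None) -> str:
    """Serialize one JSON-RPC 2.0 request as a newline-terminated line."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return json.dumps(msg) + "\n"


def list_tools() -> str:
    """Request the server's tool catalogue."""
    return jsonrpc_request("tools/list")


def tool_call(name: str, arguments: dict) -> str:
    """Request execution of a named tool with the given arguments."""
    return jsonrpc_request("tools/call", {"name": name, "arguments": arguments})


print(list_tools(), end="")
print(tool_call("search_docs", {"query": "hooks"}), end="")
```

Tools such as MCPorter wrap this layer, along with transport handling and auth, so that scripts and hooks do not have to manage it by hand.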
---
## Steinberger Context
Peter Steinberger is the founder of PSPDFKit (now Nutrient), a well-known iOS/macOS SDK vendor. He is already cited in the guide for sharing operational insights on Claude Code usage in production (multi-agent workflows, cost management). That he built and published MCPorter signals that MCP server access from non-IDE contexts is a real workflow need among practitioners rather than a marginal use case. The 12-contributor count and mcporter.dev website suggest this is not a weekend experiment.
---
## Score
**Score: 3/5** (Moderate — mention when covering MCP power-user workflows)
The tool is solid and the author is credible. The limiting factor is that it is a companion/debug tool, not a core Claude Code workflow tool. Most Claude Code users accessing MCP servers through the standard interface will never need MCPorter. The target audience is narrower: developers building MCP servers, writing complex hooks that need MCP access, or integrating Claude Code into CI/CD pipelines.
---
## Challenge
**Challenge**: "The auto-discovery of Claude Desktop config is the most interesting feature for the guide's audience, not the TypeScript codegen. The evaluation undersells the debugging angle."
**Response**: Valid. The debug/testing angle (testing MCP server behavior without a running IDE) is probably the highest-value use case for the guide's audience. A developer building a custom MCP server needs a way to call it and inspect responses without restarting Claude Code every time. MCPorter fills that gap cleanly. The `generate-cli` mode is also directly relevant to hook authors. Integration recommendation updated to lead with these two angles.
---
## Fact-Check
| Claim | Verified | Source |
|-------|----------|--------|
| 2,966 stars | ✅ | GitHub API |
| MIT license | ✅ | LICENSE file |
| Peter Steinberger / PSPDFKit | ✅ | GitHub profile + mcporter.dev About |
| 12+ contributors | ✅ | GitHub contributors graph |
| Auto-discovery: Claude, Cursor, Codex, Windsurf, VS Code, OpenCode | ✅ | README config-discovery section |
| Last commit 2026-03-03 | ✅ | GitHub |
| TypeScript, stdio + HTTP/SSE | ✅ | README + source |
| Connection pooling | ✅ | README features |
---
## Decision
**Score: 3/5 — Mention in third-party-tools.md or mcp-servers-ecosystem.md for MCP power-user workflows.**
**Integration angles** (in order of relevance for the guide's audience):
1. Testing and debugging MCP servers during development — use MCPorter to call server tools directly without restarting Claude Code
2. Hook scripts needing MCP server access — `generate-cli` creates shell-callable wrappers from an MCP server definition
3. TypeScript apps or CI/CD pipelines consuming MCP services — `emit-ts` for typed clients
**Placement**: a callout or "See also" in `guide/ecosystem/mcp-servers-ecosystem.md` under a "Testing and debugging MCP servers" paragraph, or in `guide/ecosystem/third-party-tools.md` under a developer tools subsection.
**When to revisit**: already at 3K stars with established author and active contributors — this is ready for a guide mention. The 3/5 score reflects scope (companion tool, not primary workflow) rather than maturity concerns.
**Confidence**: High. Author credibility verified, claims verified against source, use case is clear.