release: v3.37.1 - threat-db v2.8.0, CC releases v2.1.78, 19 skills update, doc fixes

- threat-db v2.8.0: GhostClaw campaign, Fake OpenClaw Installer, CVE-2026-24910 (Bun), T017 Shadow MCP, T018 AI Search Poisoning, Jozu Agent Guard, MCP Sentinel - Claude Code releases tracked to v2.1.78 (StopFailure hook, plugin state, security fixes) - 19 skill descriptions improved (PR #9 selective merge, @popey/Tessl) - MCP vs CLI token overhead corrected (lazy loading, 85% reduction benchmark) - Agent Adoption Curve self-assessment (7-level maturity scale, Martignole framework) - ctx7 CLI section §5.5 + resource evals #079 #080 #081 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 15:49:01 +01:00 · 2026-03-18 15:49:01 +01:00 · 44818a3f04
commit 44818a3f04
parent eea5662a65
19 changed files with 785 additions and 87 deletions
--- a/guide/ecosystem/mcp-vs-cli.md
+++ b/guide/ecosystem/mcp-vs-cli.md
@ -40,7 +40,7 @@ This page compares two integration patterns for giving Claude Code access to ext

 | Advantage | Detail |
 |-----------|--------|
-| **Zero context overhead** | No schema injected at startup — relevant when context budget is tight |
+| **Zero context overhead** | No schema injected at startup. Since v2.1.7 lazy loading closes most of the gap, but CLI is still the absolute minimum. |
 | **Deterministic actions** | Explicit commands with predictable output are easier to audit and test |
 | **Human + AI use** | The same CLI wrapper works for a developer running it manually and for Claude |
 | **Frontier models** | Claude Opus/Sonnet 4.6 can drive complex CLIs (aws-cli, glab, gh) without a structured schema |
@ -52,7 +52,7 @@ This page compares two integration patterns for giving Claude Code access to ext

 | Weakness | Detail |
 |----------|--------|
-| **Schema token cost** | Every MCP server injects its full tool list into the context window at session start, whether or not those tools are used that session |
+| **Schema token cost** | Since v2.1.7, lazy loading (MCP Tool Search) means unused tools inject only their name, not their full schema. Cost is still non-zero: tool names load at startup, full schemas load on first use. The pre-v2.1.7 worst case (~55K tokens for a 5-server setup) now averages ~8.7K — an 85% reduction, but not zero. |
 | **Connection overhead** | Session startup takes longer with many MCP servers connected |
 | **Debugging difficulty** | Failures inside an MCP server are harder to trace than a failed shell command |
 | **Maintenance complexity** | Running, updating, and securing remote MCP servers adds infrastructure |
@ -112,7 +112,7 @@ Quick reference — not rules, but directional defaults.
 | Individual dev, local machine | **CLI or skill** | Simpler, faster, no infrastructure |
 | Deterministic actions (git, CI, deploy) | **CLI** | Explicit commands, predictable output, auditable |
 | Complex auth (OAuth, token refresh) | **MCP** | Server handles auth; CLI would require credential plumbing |
-| Tight context budget / many tools loaded | **CLI** | No schema injection at startup |
+| Tight context budget / many tools loaded | **CLI** | Still the minimum-overhead option. Lazy loading (v2.1.7+) reduces MCP cost significantly, but CLI has zero schema cost by design. |
 | Agent-to-agent structured output | **MCP** | JSON responses are more reliable than parsed CLI text |
 | Debugging / prototyping a new integration | **CLI** | Easier to inspect, faster to iterate |
 | Browser automation (non-frontier model) | **MCP** | Playwright MCP structures interaction reliably |
@ -138,18 +138,35 @@ The mistake is applying one answer to both layers. A solo developer building a C

 ## Token cost of MCP schemas — what the numbers look like

-MCP servers inject their full tool list into the context at session start. This is not free.
+Since v2.1.7 (January 2026), Claude Code uses **MCP Tool Search** (lazy loading) by default. This changes the token math significantly, but does not eliminate schema cost entirely.

-A typical MCP server with 10-15 tools injects 500-2,000 tokens per session before any task starts. With 5 MCP servers connected, that is 2,500-10,000 tokens of overhead on every session, whether or not those tools are used.
+**How lazy loading works:** instead of injecting all tool schemas at session start, Claude receives only tool names in an `<available-deferred-tools>` block. Full schemas are fetched via `ToolSearch` only when Claude decides to call a specific tool. Unused tools in a session cost only their name in context (~0 schema tokens), not the full definition.

-The practical consequence: if you load 10 MCP servers but only use 2 in a given session, you are paying for 8 servers worth of schema every time. This compounds with long sessions and high-frequency workflows.
+**Measured impact** (Anthropic benchmarks, 5-server setup):

-**Mitigation strategies:**
+| Scenario | Token overhead | Note |
+|----------|---------------|------|
+| Before v2.1.7 (eager loading) | ~55,000 tokens | All schemas preloaded |
+| After v2.1.7 (lazy loading) | ~8,700 tokens | 85% reduction |
+| CLI (no MCP) | ~0 tokens | Baseline |
+
+The old worst-case claim of "500-2,000 tokens per server" described eager loading, which is no longer the default. With lazy loading, the cost per unused server is near zero. The cost per *used* server (~600 tokens per tool schema loaded on demand) remains real, but is now pay-per-use rather than always-on.
+
+**What still adds overhead even with lazy loading:**
+
+- Tool names are still injected at startup (one line per tool per server)
+- Schemas load at first invocation — long sessions using many tools accumulate cost
+- Connection setup per server is unchanged (latency, not tokens)
+- Many connected MCP servers still means more names in context, even if schemas stay deferred
+
+**Configuration** (v2.1.9+): the `ENABLE_TOOL_SEARCH` environment variable controls the threshold. `auto:N` triggers lazy loading when MCP tools exceed N% of context (default 10%).
+
+**Mitigation strategies** (still relevant, lower urgency):

 - Load MCP servers selectively per project (project-level config vs global config)
- Use CLI tools for high-frequency operations where schema overhead accumulates
- Monitor token usage per session to identify which MCP schemas are loaded but unused
- Consider a CLI wrapper for tools you use frequently in tight loops (compile → test → fix cycles)
+- Use CLI tools for high-frequency tight loops where any overhead compounds (compile → test → fix)
+- Monitor token usage per session to identify which schemas are being loaded at invocation time
+- Consider a CLI wrapper for tools you use constantly but don't need structured output from

 ---