release: v3.20.6 - agentskills.io integration + 4 resource evaluations
- agentskills.io open standard: frontmatter table, skills-ref CLI, portability section
- Agent Skills supply chain risks (security-hardening.md §1.2)
- anthropics/skills (60K+★) added to complementary resources
- 16 new reference.yaml entries
- Resource evaluations: agentskills.io (4/5), Skill Doctor (2/5), dclaude (new), paddo (new)
- Sandbox isolation + README updates

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
parent 23e0ac476d
commit bc86c8ed7f
14 changed files with 625 additions and 26 deletions
|
|
@@ -1873,6 +1873,10 @@ An **on-machine AI coding agent** developed by Block (formerly Square), released
|
|||
- You value Claude's specific reasoning capabilities and can't substitute
|
||||
- You don't want to manage LLM API credentials
|
||||
|
||||
### Skill Portability
|
||||
|
||||
Both Claude Code and Goose support the [Agent Skills open standard](https://agentskills.io). Skills written as a `SKILL.md` file are portable across 26+ platforms, including Cursor, VS Code, GitHub, OpenAI Codex, and Gemini CLI. Other platforms ignore the Claude Code-specific fields (`context`, `agent`), so including them does not break compatibility.
|
||||
|
||||
### Trade-offs
|
||||
|
||||
| Goose Advantage | Goose Limitation |
|
||||
|
|
|
|||
|
|
@@ -6,7 +6,7 @@
|
|||
|
||||
**Written with**: Claude (Anthropic)
|
||||
|
||||
**Version**: 3.20.5 | **Last Updated**: January 2026
|
||||
**Version**: 3.20.6 | **Last Updated**: January 2026
|
||||
|
||||
---
|
||||
|
||||
|
|
@@ -484,4 +484,4 @@ where.exe claude; claude doctor; claude mcp list
|
|||
|
||||
**Author**: Florian BRUNIAUX | [@Méthode Aristote](https://methode-aristote.fr) | Written with Claude
|
||||
|
||||
*Last updated: January 2026 | Version 3.20.5*
|
||||
*Last updated: January 2026 | Version 3.20.6*
|
||||
|
|
|
|||
|
|
@@ -222,7 +222,7 @@ Triggered automatically if no credentials found. Use `/login` inside Claude Code
|
|||
### Limitations
|
||||
|
||||
- **macOS and Windows only** for microVM mode. Linux uses legacy container-based sandboxes (Docker Desktop 4.57+).
|
||||
- **Docker Desktop required** — not available with standalone Docker Engine.
|
||||
- **Docker Desktop required** — not available with standalone Docker Engine. Community alternatives like [dclaude](https://github.com/jedi4ever/dclaude) (Patrick Debois) wrap Claude Code in standard Docker containers for Docker Engine-only environments, but they use container isolation (not microVM) and mount the host Docker socket, which is a weaker security boundary.
|
||||
- **MCP Gateway not yet supported** inside sandboxes.
|
||||
- **No GPU passthrough** — not suitable for ML training workloads.
|
||||
- **Workspace sync is one-way**: changes inside the sandbox propagate to the host, but concurrent host edits may conflict.
|
||||
|
|
|
|||
|
|
@@ -118,7 +118,17 @@ Before adding any MCP server, complete this checklist:
|
|||
- Use read-only database credentials
|
||||
- Minimize environment variables exposed
|
||||
|
||||
### 1.2 Known Limitations of permissions.deny
|
||||
### 1.2 Agent Skills Supply Chain Risks
|
||||
|
||||
Third-party Agent Skills (installed via `npx add-skill` or plugin marketplaces) introduce supply chain risks similar to npm packages. Research by [SafeDep](https://safedep.io/agent-skills-threat-model) identified vulnerabilities in **8-14% of publicly available skills**, including prompt injection, data exfiltration, and privilege escalation.
|
||||
|
||||
**Mitigations**:
|
||||
- **Review SKILL.md before installing** — Check `allowed-tools` for unexpected access (especially `Bash`)
|
||||
- **Validate with skills-ref** — `skills-ref validate ./skill-dir` checks spec compliance ([agentskills.io](https://agentskills.io))
|
||||
- **Pin skill versions** — Use specific commit hashes when installing from GitHub
|
||||
- **Audit scripts/** — Executable scripts bundled with skills are the highest-risk component
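
The first and last mitigations can be partly scripted. The sketch below is a minimal example, assuming skills sit under a common directory and that `allowed-tools` is written on a single frontmatter line. The function names `flag_bash_skills` and `list_skill_scripts` are hypothetical helpers, not part of `skills-ref` or Claude Code.

```bash
#!/usr/bin/env bash
# Hypothetical audit helpers; not part of skills-ref or Claude Code.

# Print every SKILL.md under DIR whose allowed-tools line mentions Bash.
# Assumes allowed-tools sits on one line (e.g. "allowed-tools: Read, Bash").
flag_bash_skills() {
  local root="${1:-.}"
  # -r recurse, -l print matching file names only, -E extended regex
  grep -rlE --include='SKILL.md' '^allowed-tools:.*Bash' "$root"
}

# Print every file bundled under a scripts/ directory so it can be reviewed.
list_skill_scripts() {
  local root="${1:-.}"
  find "$root" -type f -path '*/scripts/*' | sort
}
```

For example, `flag_bash_skills .claude/skills` would list the skills to read first. A match is only a prompt for manual review; it does not mean the skill is malicious.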
|
||||
|
||||
### 1.3 Known Limitations of permissions.deny
|
||||
|
||||
The `permissions.deny` setting in `.claude/settings.json` is the official method to block Claude from accessing sensitive files. However, security researchers have documented architectural limitations.
|
||||
|
||||
|
|
@@ -176,7 +186,7 @@ Because `permissions.deny` alone cannot guarantee complete protection:
|
|||
|
||||
> **Bottom line**: `permissions.deny` is necessary but not sufficient. Treat it as one layer in a defense-in-depth strategy, not a complete solution.
|
||||
|
||||
### 1.3 Repository Pre-Scan
|
||||
### 1.4 Repository Pre-Scan
|
||||
|
||||
Before opening untrusted repositories, scan for injection vectors:
|
||||
|
||||
|
|
|
|||
|
|
@@ -10,7 +10,7 @@
|
|||
|
||||
**Last updated**: January 2026
|
||||
|
||||
**Version**: 3.20.5
|
||||
**Version**: 3.20.6
|
||||
|
||||
---
|
||||
|
||||
|
|
@@ -1593,6 +1593,18 @@ while :; do cat TASK.md PROGRESS.md | claude -p ; done
|
|||
- Design without clear spec
|
||||
- Tasks with slow/ambiguous feedback loops
|
||||
|
||||
**Variant: Session-per-Concern Pipeline**
|
||||
|
||||
Instead of looping the same task, dedicate a fresh session to each quality dimension:
|
||||
|
||||
1. **Plan session** — Architecture, scope, acceptance criteria
|
||||
2. **Test session** — Write unit, integration, and E2E tests first (TDD)
|
||||
3. **Implement session** — Code until all linters and tests pass
|
||||
4. **Review sessions** — Separate sessions for security audit, performance, code review
|
||||
5. **Repeat** — Iterate with scope adjustments as needed
|
||||
|
||||
This combines Fresh Context (a clean 200K-token context window per phase) with [OpusPlan](#62-opusplan-hybrid-mode) (Opus for review/strategy sessions, Sonnet for implementation). Each session writes progress artifacts that the next session reads.
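
A minimal sketch of the pipeline, assuming `TASK.md` and `PROGRESS.md` already exist in the working directory. The `run_pipeline` function, the phase prompt wording, and the `RUNNER` override are all hypothetical; `RUNNER` exists only so the loop can be dry-run without calling `claude -p`. Model selection per phase and the final "Repeat" step are left out.

```bash
#!/usr/bin/env bash
# Sketch of a session-per-concern pipeline (hypothetical helper).
# Each phase runs in a new process, so it starts with a fresh context;
# PROGRESS.md carries state from one phase to the next.
run_pipeline() {
  local phase out
  for phase in plan test implement review; do
    # Feed the task and accumulated progress on stdin, one session per phase.
    out=$(cat TASK.md PROGRESS.md | ${RUNNER:-claude -p} \
      "Phase: $phase. Do only this phase and summarize the result.") || return 1
    # Append only after the session ends, so the file is never read and
    # written at the same time.
    printf '## %s\n%s\n\n' "$phase" "$out" >> PROGRESS.md
  done
}
```

Stopping on the first failed phase (`|| return 1`) is a design choice for this sketch: a broken plan or test phase should not feed the implementation phase.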
|
||||
|
||||
#### Practical Implementation
|
||||
|
||||
**Option 1: Manual loop**
|
||||
|
|
@@ -3014,6 +3026,40 @@ cat claudedocs/templates/code-review.xml | \
|
|||
|
||||
> **Source**: [DeepTo Claude Code Guide - XML-Structured Prompts](https://cc.deeptoai.com/docs/en/best-practices/claude-code-comprehensive-guide)
|
||||
|
||||
### 2.6.1 Prompting as Provocation
|
||||
|
||||
The Claude Code team internally treats prompts as **challenges to a peer**, not instructions to an assistant. This subtle shift produces higher-quality outputs because it forces Claude to prove its reasoning rather than simply comply.
|
||||
|
||||
**Three challenge patterns from the team**:
|
||||
|
||||
**1. The Gatekeeper** — Force Claude to defend its work before shipping:
|
||||
|
||||
```
"Grill me on these changes and don't make a PR until I pass your test"
```
|
||||
|
||||
Claude reviews your diff, asks pointed questions about edge cases, and only proceeds when satisfied. This catches issues that passive review misses.
|
||||
|
||||
**2. The Proof Demand** — Require evidence, not assertions:
|
||||
|
||||
```
"Prove to me this works — show me the diff in behavior between main and this branch"
```
|
||||
|
||||
Claude runs both branches, compares outputs, and presents concrete evidence. Eliminates the "trust me, it works" failure mode.
|
||||
|
||||
**3. The Reset** — After a mediocre first attempt, invoke full-context rewrite:
|
||||
|
||||
```
"Knowing everything you know now, scrap this and implement the elegant solution"
```
|
||||
|
||||
This forces a substantive second attempt with accumulated context rather than incremental patches on a weak foundation. The key insight: Claude's second attempt with full context consistently outperforms iterative fixes.
|
||||
|
||||
**Why this works**: Provocation triggers deeper reasoning paths than polite requests. When Claude must *convince* rather than *comply*, it activates more thorough analysis and catches its own shortcuts.
|
||||
|
||||
> **Source**: [10 Tips from Inside the Claude Code Team](https://paddo.dev/blog/claude-code-team-tips/) (Boris Cherny thread, Feb 2026)
|
||||
|
||||
## 2.7 Semantic Anchors
|
||||
|
||||
LLMs are statistical pattern matchers trained on massive text corpora. Using **precise technical vocabulary** helps Claude activate the right patterns in its training data, leading to higher-quality outputs.
|
||||
|
|
@@ -5131,13 +5177,27 @@ agent: specialist
|
|||
---
|
||||
```
|
||||
|
||||
| Field | Description |
|-------|-------------|
| `name` | Kebab-case identifier |
| `description` | Activation trigger |
| `allowed-tools` | Tools this skill can use |
| `context` | `fork` (isolated) or `inherit` (shared) |
| `agent` | `specialist` (domain) or `general` (broad) |

| Field | Spec | Description |
|-------|------|-------------|
| `name` | [agentskills.io](https://agentskills.io) | Lowercase letters, digits, and hyphens; 1-64 chars; no `--`; must match directory name |
| `description` | [agentskills.io](https://agentskills.io) | What the skill does and when to use it (max 1024 chars) |
| `allowed-tools` | [agentskills.io](https://agentskills.io) | Tools this skill can use (experimental) |
| `license` | [agentskills.io](https://agentskills.io) | License name or reference to bundled file |
| `compatibility` | [agentskills.io](https://agentskills.io) | Environment requirements (max 500 chars) |
| `metadata` | [agentskills.io](https://agentskills.io) | Arbitrary key-value pairs (author, version, etc.) |
| `context` | **CC only** | `fork` (isolated) or `inherit` (shared) |
| `agent` | **CC only** | `specialist` (domain) or `general` (broad) |
|
||||
|
||||
> **Open standard**: Agent Skills follow the [agentskills.io specification](https://agentskills.io), created by Anthropic and supported by 26+ platforms (Cursor, VS Code, GitHub, Codex, Gemini CLI, Goose, Roo Code, etc.). Skills you create for Claude Code are portable. Fields marked **CC only** are Claude Code extensions ignored by other platforms.
|
||||
|
||||
### Validating Skills
|
||||
|
||||
Use the official [skills-ref](https://github.com/agentskills/agentskills/tree/main/skills-ref) CLI to validate your skill before publishing:
|
||||
|
||||
```bash
skills-ref validate ./my-skill     # Check frontmatter + naming conventions
skills-ref to-prompt ./my-skill    # Generate <available_skills> XML for agent prompts
```
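
As a rough illustration of what the naming rules for `name` amount to, here is a small pre-check. It is a sketch, not a substitute for `skills-ref validate`: the function name `check_skill_name` is hypothetical, and accepting digits is an assumption about the spec's "lowercase" rule.

```bash
#!/usr/bin/env bash
# Hypothetical pre-check for the `name` field; it does not replace
# `skills-ref validate`.
check_skill_name() {
  local name="$1" dir="$2"
  local LC_ALL=C  # keep [a-z] from matching uppercase in some locales
  # 1-64 characters drawn from lowercase letters, digits, and hyphens.
  # (Accepting digits is an assumption about the spec.)
  [[ "$name" =~ ^[a-z0-9-]{1,64}$ ]] || { echo "invalid characters or length"; return 1; }
  # No consecutive hyphens.
  [[ "$name" != *--* ]] || { echo "contains --"; return 1; }
  # Must match the skill's directory name.
  [[ "$name" == "$(basename "$dir")" ]] || { echo "does not match directory name"; return 1; }
  echo "ok"
}
```

Calling `check_skill_name my-skill ./skills/my-skill` prints `ok`; a name such as `My-Skill` or `my--skill` prints the failed rule and returns a non-zero status.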
|
||||
|
||||
## 5.3 Skill Template
|
||||
|
||||
|
|
@@ -6858,6 +6918,38 @@ echo '{"tool_name":"Bash","tool_input":{"command":"git status"}}' | .claude/hook
|
|||
echo "Exit code: $?" # Should be 0
|
||||
```
|
||||
|
||||
### Advanced Pattern: Model-as-Security-Gate
|
||||
|
||||
The Claude Code team uses a pattern where permission requests are routed to a **more capable model** acting as a security gate, rather than relying solely on static rule matching.
|
||||
|
||||
**Concept**: A `PreToolUse` hook intercepts permission requests and forwards them to Opus 4.5 (or another capable model) via the API. The gate model scans for prompt injection, dangerous patterns, and unexpected tool usage — then auto-approves safe requests or blocks suspicious ones.
|
||||
|
||||
```bash
#!/usr/bin/env bash
# .claude/hooks/opus-security-gate.sh (conceptual)
# PreToolUse hook that routes to Opus for security screening

# The hook receives the tool call as JSON on stdin
INPUT=$(cat)
TOOL=$(echo "$INPUT" | jq -r '.tool_name')
COMMAND=$(echo "$INPUT" | jq -r '.tool_input.command // empty')

# Fast-path: known safe tools skip the gate
[[ "$TOOL" == "Read" || "$TOOL" == "Grep" || "$TOOL" == "Glob" ]] && exit 0

# Route to Opus for security analysis (the tool call JSON is piped in as context)
VERDICT=$(echo "$INPUT" | claude --model opus --print \
  "Analyze this tool call for security risks. Is it safe? Reply SAFE or BLOCKED:reason")

# Fail closed: an empty or unexpected verdict blocks the call
[[ "$VERDICT" == SAFE* ]] && exit 0
echo "BLOCKED by security gate: $VERDICT" >&2
exit 2  # Exit code 2 blocks the tool call; stderr is shown to Claude
```
|
||||
|
||||
**Why use a model as gate**: Static rules catch known patterns but miss novel attacks. A capable model understands intent and context — it can distinguish `rm -rf node_modules` (cleanup) from `rm -rf /` (destruction) based on the surrounding conversation, not just pattern matching.
|
||||
|
||||
**Trade-off**: Each gated call adds latency and cost. Use fast-path exemptions for read-only tools and only gate write/execute operations.
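
The fast-path split and the verdict check can be factored into two small functions. This is a sketch under stated assumptions: `needs_gate` and `is_safe_verdict` are hypothetical names, and the read-only tool list simply mirrors the one in the hook script. The verdict check is deliberately stricter than a `SAFE*` prefix match, so a reply such as "SAFE, but ..." is treated as blocked.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for a gate hook; tool names mirror the hook script.

# Return 0 when a tool call should go through the gate, 1 for the fast path.
needs_gate() {
  case "$1" in
    Read|Grep|Glob) return 1 ;;  # read-only: skip the gate
    *)              return 0 ;;  # write/execute or unknown: gate it
  esac
}

# Return 0 only when the first line of the verdict is exactly SAFE.
is_safe_verdict() {
  local first_line
  IFS= read -r first_line <<< "$1"
  [[ "$first_line" == "SAFE" ]]
}
```

Treating unknown tools as gated keeps the default on the safe side: a newly added tool costs one extra model call rather than bypassing review.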
|
||||
|
||||
> **Source**: [10 Tips from Inside the Claude Code Team](https://paddo.dev/blog/claude-code-team-tips/) (Boris Cherny thread, Feb 2026)
|
||||
|
||||
## 7.5 Hook Examples
|
||||
|
||||
### Smart Hook Dispatching
|
||||
|
|
@@ -10636,6 +10728,20 @@ git worktree remove .worktrees/feature/new-api
|
|||
git worktree prune
|
||||
```
|
||||
|
||||
> **💡 Team tip — Shell aliases for fast worktree navigation**: The Claude Code team uses single-letter aliases to hop between worktrees instantly:
|
||||
>
|
||||
> ```bash
> # ~/.zshrc or ~/.bashrc
> # These paths are relative, so the aliases only work from the repository root
> alias za="cd .worktrees/feature-a"
> alias zb="cd .worktrees/feature-b"
> alias zc="cd .worktrees/feature-c"
> alias zlog="cd .worktrees/analysis"  # Dedicated worktree for logs & queries
> ```
|
||||
>
|
||||
> The dedicated "analysis" worktree is used for reviewing logs and running database queries without polluting active feature branches.
|
||||
>
|
||||
> **Source**: [10 Tips from Inside the Claude Code Team](https://paddo.dev/blog/claude-code-team-tips/)
|
||||
|
||||
**Claude Code context in worktrees:**
|
||||
|
||||
Each worktree maintains **independent Claude Code context**:
|
||||
|
|
@@ -11727,6 +11833,20 @@ Boris Cherny, creator of Claude Code, shared his workflow orchestrating 5-15 Cla
|
|||
|
||||
**Source**: [InfoQ - Claude Code Creator Workflow (Jan 2026)](https://www.infoq.com/news/2026/01/claude-code-creator-workflow/) | [Interview: I got a private lesson on Claude Cowork & Claude Code](https://www.youtube.com/watch?v=DW4a1Cm8nG4)
|
||||
|
||||
**Team patterns** (broader Claude Code team, Feb 2026):
|
||||
|
||||
The broader team extends Boris's individual workflow with institutional patterns:
|
||||
|
||||
- **Skills as institutional knowledge**: Anything done more than once daily becomes a skill checked into version control. Examples:
  - `/techdebt` — run at end of session to eliminate duplicate code
  - Context dump skills — sync 7 days of Slack, Google Drive, Asana, and GitHub into a single context
  - Analytics agents — dbt-powered skills that query BigQuery; one engineer reports not writing SQL manually for 6+ months
- **CLI and scripts over MCP**: The team prefers shell scripts and CLI integrations over MCP servers for external tool connections. Rationale: less magic, easier to debug, and more predictable behavior. MCP is reserved for cases where bidirectional communication is genuinely needed.
- **Re-plan when stuck**: Rather than pushing through a stalled implementation, the team switches back to Plan Mode. One engineer uses a secondary Claude instance to review plans "as a staff engineer" before resuming execution.
- **Claude writes its own rules**: After each correction, the team instructs Claude to update CLAUDE.md with the lesson learned. Over time, this compounds into a team-specific ruleset that prevents recurring mistakes.
|
||||
|
||||
> **Source**: [10 Tips from Inside the Claude Code Team](https://paddo.dev/blog/claude-code-team-tips/) (Boris Cherny thread, Feb 2026)
|
||||
|
||||
---
|
||||
|
||||
### Foundation: Git Worktrees (Non-Negotiable)
|
||||
|
|
@@ -16290,4 +16410,4 @@ We'll evaluate and add it to this section if it meets quality criteria.
|
|||
|
||||
**Contributions**: Issues and PRs welcome.
|
||||
|
||||
**Last updated**: January 2026 | **Version**: 3.20.5
|
||||
**Last updated**: January 2026 | **Version**: 3.20.6
|
||||
|
|
|
|||