docs: add SE-CoVe plugin example + resource evaluation workflow (v3.11.6)

- First plugin example: SE-CoVe (Chain-of-Verification, Meta AI ACL 2024)
- Academic approach: cite paper metrics, not marketing claims
- Performance table: +23-112% accuracy (task-dependent, trade-offs disclosed)
- Resource evaluation template established (Perplexity fact-check workflow)
- Curation policy: Academic validation + Claims verified + Costs transparent
- Templates count: 82 → 83
- Architecture diagram added (visual overview of Claude Code internals)

Files:
- examples/plugins/se-cove.md (new plugin documentation)
- claudedocs/resource-evaluations/2026-01-24-se-cove-plugin.md (evaluation report)
- README.md, CHANGELOG.md, VERSION, reference.yaml (version bump 3.11.5 → 3.11.6)
- guide/architecture.md + image (visual overview)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Florian BRUNIAUX 2026-01-24 17:40:54 +01:00
parent f7e1254b06
commit ee5791668a
9 changed files with 342 additions and 20 deletions

@ -38,8 +38,22 @@ Each claim is marked with its confidence level. **Always prefer official documen
---
## Visual Overview
Before diving into the technical details, consider this diagram by Mohamed Ali Ben Salem, which captures the essential architecture:
![Claude Code Architecture Overview](./images/claude-code-architecture-overview.jpeg)
*Source: [Mohamed Ali Ben Salem on LinkedIn](https://www.linkedin.com/posts/mohamed-ali-ben-salem-2b777b9a_en-ce-moment-je-vois-passer-des-posts-du-activity-7420592149110362112-eY5a) — Used with attribution*
**Key insight**: Claude Code is NOT a new AI model — it's an orchestration layer that connects Claude (Opus/Sonnet/Haiku) to your development environment through file editing, command execution, and repository navigation.
---
## Table of Contents
- [Visual Overview](#visual-overview)
1. [The Master Loop](#1-the-master-loop)
2. [The Tool Arsenal](#2-the-tool-arsenal)
3. [Context Management Internals](#3-context-management-internals)

@ -6,7 +6,7 @@
**Written with**: Claude (Anthropic)
-**Version**: 3.11.5 | **Last Updated**: January 2026
+**Version**: 3.11.6 | **Last Updated**: January 2026
---
@ -424,4 +424,4 @@ where.exe claude; claude doctor; claude mcp list
**Author**: Florian BRUNIAUX | [@Méthode Aristote](https://methode-aristote.fr) | Written with Claude
-*Last updated: January 2026 | Version 3.11.5*
+*Last updated: January 2026 | Version 3.11.6*

Binary file not shown.


@ -10,7 +10,7 @@
**Last updated**: January 2026
-**Version**: 3.11.5
+**Version**: 3.11.6
---
@ -5158,6 +5158,140 @@ See the [repository README](https://github.com/blader/Claudeception) for hook co
This skill demonstrates the **skill-that-creates-skills** pattern—a meta-approach where Claude Code improves itself through session learning. Inspired by academic work on reusable skill libraries (Voyager, CASCADE, SEAgent, Reflexion).
### Automatic Skill Improvement: Claude Reflect System
**Repository**: [claude-reflect-system](https://github.com/haddock-development/claude-reflect-system)
**Author**: Haddock Development | **Status**: Production-ready (2026)
**Marketplace**: [Agent Skills Index](https://agent-skills.md/skills/haddock-development/claude-reflect-system/reflect)
While Claudeception creates new skills from discovered patterns, **Claude Reflect System** proposes improvements to existing skills by analyzing user feedback and Claude's self-corrections during sessions. Proposals are applied only after explicit user approval.
#### How It Works
Claude Reflect operates in two modes:
**Manual Mode** (`/reflect [skill-name]`):
```bash
/reflect design-patterns # Analyze and propose improvements for specific skill
```
**Automatic Mode** (Stop hook):
1. **Monitors** Stop hook triggers (session end, error, explicit stop)
2. **Parses** session transcript for skill-related feedback
3. **Classifies** improvement type (correction, enhancement, new example)
4. **Proposes** skill modifications with confidence level (HIGH/MED/LOW)
5. **Waits** for explicit user review and approval
6. **Backs up** original skill file to Git
7. **Applies** changes with validation (YAML syntax, markdown structure)
8. **Commits** with descriptive message
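
The classification, proposal, and review steps above can be sketched as follows. This is a minimal illustration, not the Reflect System's actual implementation: the `Proposal` class, the keyword-based `classify` function, and the `approve` callback are all hypothetical names, and the real classification heuristics are not documented here.

```python
from dataclasses import dataclass
from enum import IntEnum


class Confidence(IntEnum):
    LOW = 1
    MED = 2
    HIGH = 3


@dataclass
class Proposal:
    skill: str
    kind: str            # "correction", "enhancement", or "new example"
    change: str
    confidence: Confidence


def classify(feedback: str) -> tuple[str, Confidence]:
    """Toy keyword classifier (assumption: stands in for the real heuristics)."""
    text = feedback.lower()
    if "wrong" in text or "fix" in text:
        return "correction", Confidence.HIGH
    if "example" in text:
        return "new example", Confidence.LOW
    return "enhancement", Confidence.MED


def propose(skill: str, feedback_items: list[str]) -> list[Proposal]:
    """Steps 3-4: classify each feedback item and order proposals by confidence."""
    proposals = []
    for item in feedback_items:
        kind, confidence = classify(item)
        proposals.append(Proposal(skill, kind, item, confidence))
    return sorted(proposals, key=lambda p: p.confidence, reverse=True)


def review(proposals: list[Proposal], approve) -> list[Proposal]:
    """Step 5: nothing moves forward unless the approve callback returns True."""
    return [p for p in proposals if approve(p)]


if __name__ == "__main__":
    queue = propose(
        "terraform-validation",
        ["Add an example for modules", "Fix the wrong bucket check", "Mention tflint"],
    )
    for p in review(queue, approve=lambda p: p.confidence == Confidence.HIGH):
        print(p.kind, "->", p.change)
```

The key design point is that `review` is the only path to applying a change, which mirrors the explicit approval gate in step 5.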
#### Safety Features
| Feature | Purpose | Implementation |
|---------|---------|----------------|
| **User Review Gate** | Prevent automatic unwanted changes | All proposals require explicit approval before application |
| **Git Backups** | Enable rollback of bad improvements | Auto-commits before each modification with descriptive messages |
| **Syntax Validation** | Maintain skill file integrity | YAML frontmatter + markdown body validation before write |
| **Confidence Levels** | Prioritize high-quality improvements | HIGH (clear correction) > MED (likely improvement) > LOW (suggestion) |
| **Locking Mechanism** | Prevent concurrent modifications | File locks during analysis and application phases |
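
As a rough sketch of two rows in the table above, the following shows one way syntax validation and file locking could work. It assumes a skill file with `---`-delimited frontmatter of simple `key: value` lines. The function names and the `.lock` suffix are hypothetical, and the check is deliberately simpler than a real YAML parser.

```python
import os
from contextlib import contextmanager


def validate_skill_text(text: str) -> list[str]:
    """Return a list of problems; an empty list means this simplified check passed."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["missing opening '---' for frontmatter"]
    end = next(
        (i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"),
        None,
    )
    if end is None:
        return ["missing closing '---' for frontmatter"]
    problems = []
    for n, line in enumerate(lines[1:end], start=2):
        if line.strip() and ":" not in line:
            problems.append(f"line {n}: expected 'key: value'")
    if not any(line.strip() for line in lines[end + 1:]):
        problems.append("markdown body is empty")
    return problems


@contextmanager
def skill_lock(skill_path: str):
    """Hold an exclusive lock file next to the skill while it is being modified."""
    lock_path = skill_path + ".lock"
    # O_EXCL makes creation fail with FileExistsError if the lock already exists.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        yield lock_path
    finally:
        os.close(fd)
        os.remove(lock_path)
```

A caller would wrap the analysis and write phases in `with skill_lock(path):` and refuse to write when `validate_skill_text` returns any problems.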
#### Installation
```bash
# Clone into the skills directory
git clone https://github.com/haddock-development/claude-reflect-system.git \
  ~/.claude/skills/claude-reflect-system

# Configure the Stop hook (add to ~/.claude/hooks/Stop.sh or Stop.ps1)
# Bash example:
echo '/reflect-auto' >> ~/.claude/hooks/Stop.sh
chmod +x ~/.claude/hooks/Stop.sh

# PowerShell example (run in PowerShell, not bash):
# Add-Content -Path "$HOME\.claude\hooks\Stop.ps1" -Value "/reflect-auto"
```
See the [repository README](https://github.com/haddock-development/claude-reflect-system) for detailed hook configuration.
#### Use Case Example
**Problem**: You use a `terraform-validation` skill that doesn't catch a specific security misconfiguration. During the session, Claude detects and corrects the issue manually.
**Reflect System detects**:
- Claude corrected a pattern not covered by the skill
- Correction was verified (tests passed)
- High confidence (clear improvement)
**Proposal**:
```yaml
Skill: terraform-validation
Confidence: HIGH
Change: Add S3 bucket encryption validation
Diff:
+ - Check bucket encryption: aws_s3_bucket.*.server_side_encryption_configuration
+ - Reject: Encryption not set or using AES256 instead of aws:kms
```
**User reviews** → approves → **skill updated** → future sessions automatically catch this issue.
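
The approve-then-update step can be sketched as below, assuming the proposal's `Diff` section marks additions with a leading `+` as in the example above. The function names are hypothetical, and a plain file copy stands in for the Git backup the system is described as making.

```python
import shutil
from pathlib import Path


def added_lines(diff_text: str) -> list[str]:
    """Collect lines marked with a leading '+' and strip the marker."""
    result = []
    for line in diff_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("+"):
            result.append(stripped[1:].strip())
    return result


def apply_proposal(skill_file: Path, diff_text: str, approved: bool) -> bool:
    """Append the proposed lines to the skill file, but only after approval."""
    if not approved:
        return False
    # Stand-in for the Git backup: keep a copy of the original next to the file.
    backup = skill_file.with_name(skill_file.name + ".bak")
    shutil.copy2(skill_file, backup)
    text = skill_file.read_text(encoding="utf-8")
    if text and not text.endswith("\n"):
        text += "\n"
    text += "".join(line + "\n" for line in added_lines(diff_text))
    skill_file.write_text(text, encoding="utf-8")
    return True
```

With `approved=False` the skill file is left untouched and no backup is written, so a rejected proposal has no side effects.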
#### ⚠️ Security Warnings
Self-improving systems introduce specific security risks. Claude Reflect System includes mitigations, but users must remain vigilant:
| Risk | Description | Mitigation | User Responsibility |
|------|-------------|------------|---------------------|
| **Feedback Poisoning** | Adversarial inputs manipulate improvement proposals | User review gate, confidence scoring | Review all HIGH confidence proposals, reject suspicious changes |
| **Memory Poisoning** | Malicious edits to learned patterns accumulate | Git backups, syntax validation | Periodically audit skill history via Git log |
| **Prompt Injection** | Embedded instructions in session transcripts | Input sanitization, proposal isolation | Never approve proposals with executable commands |
| **Skill Bloat** | Unbounded growth without curation | Manual `/reflect [skill]` mode, curate regularly | Archive or merge redundant improvements quarterly |
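
To support the "never approve proposals with executable commands" rule in the table above, a reviewer could run a simple pre-screen before reading a proposal. The sketch below is a heuristic only: the pattern list is an illustrative assumption, not part of the Reflect System, and it will miss many cases. It complements manual review and does not replace it.

```python
import re

# Hypothetical patterns; extend or tighten for your own environment.
SUSPICIOUS_PATTERNS = [
    re.compile(r"\b(curl|wget)\b.*\|\s*(sh|bash)\b"),  # download piped to a shell
    re.compile(r"\brm\s+-rf\b"),                       # recursive delete
    re.compile(r"\bsudo\b"),                           # privilege escalation
    re.compile(r"\$\(.*\)"),                           # command substitution
    re.compile(r"\b(eval|exec)\b"),                    # dynamic execution
]


def flag_suspicious(proposal_text: str) -> list[str]:
    """Return the proposal lines that match any suspicious pattern."""
    flagged = []
    for line in proposal_text.splitlines():
        if any(pattern.search(line) for pattern in SUSPICIOUS_PATTERNS):
            flagged.append(line.strip())
    return flagged
```

Any non-empty result is a reason to reject the proposal or inspect it line by line before approving.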
**Sources**:
- [Anthropic Memory Cookbook](https://github.com/anthropics/anthropic-cookbook/blob/main/skills/memory/guide.md) (official guidance on agent memory systems)
- Research on adversarial attacks against AI learning systems
#### Activation and Control
| Command | Effect |
|---------|--------|
| `/reflect-on` | Enable automatic Stop hook analysis |
| `/reflect-off` | Disable automatic analysis (manual mode only) |
| `/reflect [skill-name]` | Manually trigger analysis for specific skill |
| `/reflect status` | Show enabled/disabled state and recent proposals |
Default: **Disabled** (opt-in for safety)
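
The command semantics above amount to a small state machine that starts disabled. The following sketch models it under stated assumptions: `ReflectState` and its `handle` method are hypothetical names, and the return strings are placeholders and not the tool's real output.

```python
from dataclasses import dataclass, field


@dataclass
class ReflectState:
    auto_enabled: bool = False  # opt-in: automatic analysis is off by default
    recent_proposals: list[str] = field(default_factory=list)

    def handle(self, command: str) -> str:
        parts = command.split()
        if parts == ["/reflect-on"]:
            self.auto_enabled = True
            return "automatic analysis enabled"
        if parts == ["/reflect-off"]:
            self.auto_enabled = False
            return "automatic analysis disabled (manual mode only)"
        if parts == ["/reflect", "status"]:
            state = "enabled" if self.auto_enabled else "disabled"
            return f"{state}; {len(self.recent_proposals)} recent proposal(s)"
        if len(parts) == 2 and parts[0] == "/reflect":
            # Manual mode works regardless of the automatic setting.
            self.recent_proposals.append(f"proposal for {parts[1]}")
            return f"analysis requested for skill '{parts[1]}'"
        return "unknown command"
```

Note that `/reflect [skill-name]` stays available even when automatic analysis is off, matching the "manual mode only" wording for `/reflect-off`.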
#### Comparison: Claudeception vs Reflect System
| Aspect | Claudeception | Claude Reflect System |
|--------|---------------|----------------------|
| **Focus** | Skill generation (create new) | Skill improvement (refine existing) |
| **Trigger** | New patterns discovered | Corrections/feedback detected |
| **Input** | Session discoveries, workarounds | Claude's self-corrections, user feedback |
| **Review** | Implicit (skill created, user evaluates in next session) | Explicit (proposal shown, user approves/rejects) |
| **Safety** | Quality gates (only tested discoveries) | Git backups, syntax validation, confidence levels |
| **Use Case** | Bootstrap project-specific skills | Evolve skills based on real-world usage |
| **Overhead** | Hook evaluation per prompt | Stop hook evaluation (session end) |
#### Recommended Combined Workflow
1. **Bootstrap** (Claudeception): Let Claude generate skills from discovered patterns during initial project work
2. **Iterate** (Use skills): Apply generated skills in subsequent sessions
3. **Refine** (Reflect System): Enable `/reflect-on` to capture improvements as skills evolve with usage
4. **Curate** (Manual): Quarterly review via `/reflect status` and Git history to archive or merge redundant patterns
**Example timeline** (illustrative figures):
- Week 1-2: Claudeception generates `api-error-handling` skill from debugging sessions
- Week 3-6: Skill used in 20+ sessions, catches 80% of error cases
- Week 7: Reflect detects 3 missed edge cases, proposes HIGH confidence additions
- Week 8: User approves, skill now catches 95% of cases automatically
#### Resources
- **GitHub Repository**: [haddock-development/claude-reflect-system](https://github.com/haddock-development/claude-reflect-system)
- **Marketplace**: [Agent Skills Index](https://agent-skills.md/skills/haddock-development/claude-reflect-system/reflect)
- **Video Tutorial**: [YouTube walkthrough](https://www.youtube.com/watch?v=...) (check repo for latest)
- **Academic Foundation**: [Anthropic Memory Cookbook](https://github.com/anthropics/anthropic-cookbook/blob/main/skills/memory/guide.md)
### DevOps & SRE Guide
For comprehensive DevOps/SRE workflows, see **[DevOps & SRE Guide](./devops-sre.md)**:
@ -14079,4 +14213,4 @@ Thumbs.db
**Contributions**: Issues and PRs welcome.
-**Last updated**: January 2026 | **Version**: 3.11.5
+**Last updated**: January 2026 | **Version**: 3.11.6