diff --git a/docs/resource-evaluations/README.md b/docs/resource-evaluations/README.md
index 9325519..8bc79b3 100644
--- a/docs/resource-evaluations/README.md
+++ b/docs/resource-evaluations/README.md
@@ -50,8 +50,9 @@ Les documents de travail bruts (prompts Perplexity, audits clients) restent dans
 | **UML & OOP Diagrams** | 3/5 | **3/5** | ✅ Mention | [uml-oop-diagrams.md](./uml-oop-diagrams.md) |
 | **Vibe Coding Level 2** (Rusitschka) | 4/5 | **4/5** | ✅ Intégré (workflows) | [vibe-coding-rusitschka.md](./vibe-coding-rusitschka.md) |
 | **Peter Wooldridge** (Productivity Stack) | 2/5 | **3/5** | ✅ Practitioner Insights | [wooldridge-productivity-stack.md](./wooldridge-productivity-stack.md) |
+| **System Prompts** (Official vs Community) | 4/5 | **2/5** | ⚠️ Watch only (official sources exist) | [system-prompts-official-vs-community.md](./system-prompts-official-vs-community.md) |
 | **Worktrunk** | 4/5 | **4/5** | ✅ Intégré (workflow) | [worktrunk-evaluation.md](./worktrunk-evaluation.md) |
 
 ---
 
-**Dernier update**: 2026-01-26 (Migration vers docs/ tracké)
+**Dernier update**: 2026-01-26 (15 évaluations)
diff --git a/docs/resource-evaluations/system-prompts-official-vs-community.md b/docs/resource-evaluations/system-prompts-official-vs-community.md
new file mode 100644
index 0000000..b7bbb24
--- /dev/null
+++ b/docs/resource-evaluations/system-prompts-official-vs-community.md
@@ -0,0 +1,390 @@
+# Resource Evaluation: System Prompts (Official vs Community Repository)
+
+**Evaluated**: 2026-01-26
+**Evaluator**: Claude Sonnet 4.5 + technical-writer agent challenge
+**Target Guide**: Claude Code Ultimate Guide v3.14.0
+
+---
+
+## Executive Summary
+
+**Resource**: GitHub repository `x1xhlol/system-prompts-and-models-of-ai-tools`
+**URL**: https://github.com/x1xhlol/system-prompts-and-models-of-ai-tools/tree/main/Anthropic
+
+**Initial Score**: 4/5 (Very Relevant)
+**Challenged Score**: 3/5 (Relevant)
+**Final Score**: **2/5 (Marginal - Secondary Information)**
+
+**Decision**: **Do not integrate x1xhlol repository. Integrate official Anthropic sources instead.**
+
+**Critical Discovery**: Anthropic **already publishes system prompts officially** at https://platform.claude.com/docs/en/release-notes/system-prompts, making the community repository redundant for most use cases.
+
+---
+
+## 📄 Resource Description
+
+### Content Summary
+
+**Type**: Community-curated collection of system prompts and configurations for 30+ AI coding tools
+
+**Key Points**:
+1. **Anthropic folder**: Contains "Claude Code 2.0.txt", "Claude for Chrome", and "Sonnet 4.5 Prompt.txt"
+2. **Scope**: 36 tools documented (Anthropic, Cursor, Windsurf, v0, Devin, etc.)
+3. **Community metrics**: 111k stars, 29k forks, GPL-3.0 license, 472 commits
+4. **Volume**: 30,000+ lines of documentation
+5. **Security warning**: "Exposed prompts can become targets for hackers"
+
+### Methodology
+
+**Extraction process**: ❌ Not documented
+
+From Perplexity search:
+- Described as "community-curated collection"[1]
+- Multiple maintainers contribute[1]
+- New prompts released on Discord before GitHub[2]
+- **No technical methodology explained** (reverse engineering, API leaks, crowdsourcing)
+
+**Red flags**:
+- No timestamps on prompts
+- No official endorsements from Anthropic, Cursor, or other companies
+- "Claude Code 2.0" version not found in official Anthropic documentation
+- Sponsorship logos (Tembo, Latitude) but no validation statements
+
+---
+
+## 🎯 Evaluation Score Evolution
+
+### Initial Assessment: 4/5
+
+**Reasoning**:
+- Guide mentions "System prompt (~5-15K)" without full details (architecture.md:270)
+- "What We Don't Know" section shows 30% confidence on prompt contents
+- Repository fills apparent documentation gap
+
+**Flaws in reasoning**:
+- Overestimated the "gap" (guide already provides conceptual understanding)
+- Didn't verify if official sources existed
+- Didn't account for versioning risk
+
+### After Challenge (technical-writer): 3/5
+
+**Agent critique accepted**:
+1. ✅ Gap overstated: `(~5-15K)` estimate sufficient for 95% of readers
+2. ✅ Versioning risk: Community prompts potentially obsolete
+3. ✅ Methodology unclear: No extraction process documented
+4. ✅ Scope creep: 36 tools = distraction from Claude Code focus
+5. ✅ Integration cost: 8-10 hours validation for marginal benefit
+
+**Score adjustment**: 4/5 → 3/5 (relevant but non-critical)
+
+### After Perplexity Fact-Check: 2/5
+
+**Critical discovery**:
+- ✅ **Anthropic publishes prompts officially**: https://platform.claude.com/docs/en/release-notes/system-prompts
+- ✅ **Community analyses exist**: Simon Willison (May 2025), PromptHub (June 2025)
+- ❌ **x1xhlol repository = secondary copy** without added value
+
+**Score adjustment**: 3/5 → 2/5 (marginal - redundant with official sources)
+
+---
+
+## ⚖️ Comparative Analysis
+
+| Aspect | x1xhlol Repository | Official Anthropic Sources | Our Guide |
+|--------|-------------------|---------------------------|-----------|
+| **Authority** | ❌ Community (no endorsement) | ✅ Anthropic official | ✅ Tiered sources (Tier 1-3) |
+| **Methodology** | ❌ Not documented | ✅ Transparent publication | ✅ Explicit confidence labels |
+| **Versioning** | ❌ No timestamps | ✅ Release notes dated | ✅ Verified Jan 2026, CC 3.3.x |
+| **Claude Code specific** | ⚠️ "Claude Code 2.0.txt" (unverified) | ⚠️ Unclear if Code prompts published | ✅ Behavioral documentation |
+| **Coverage** | ➕ 36 tools comparative | ➖ Claude family only | ✅ Claude Code deep-dive |
+| **Validation** | ❌ None | ✅ Official Anthropic commitment | ✅ Behavior testing + sources |
+| **Accessibility** | ✅ GitHub public | ✅ platform.claude.com public | ✅ Open-source guide |
+
+**Winner for our guide**: **Official Anthropic sources** (Tier 1) + **Community analyses** (Simon Willison - Tier 2)
+
+---
+
+## 🔥 Challenge Results (technical-writer agent)
+
+### Key Critiques Accepted
+
+**1. Score Overvaluation**
+- **Agent argument**: "Gap overstated. Guide already mentions system prompts. This is **granularity**, not absence."
+- **Verdict**: ✅ Accepted → Score reduced 4→3
+
+**2. Omissions Identified**
+| Omission | Impact |
+|----------|--------|
+| **Methodology missing** | Legality/credibility uncertain |
+| **Last commit date** | Repository abandonment risk |
+| **Validation technique** | Prompts vs behavior unchecked |
+| **Official source exists** | **Anthropic already publishes (discovered via Perplexity)** |
+| **Integration cost** | 8-10 hours not calculated initially |
+
+**3. Recommendations Flawed**
+- **Agent**: "Option A (link + warning) = 0 value added. Option B (comparative analysis) = scope creep."
+- **Better approach**: Extract selectively OR watchlist + monitor
+- **Verdict**: ✅ Accepted → Changed to "Do not integrate / Watchlist"
+
+**4. Risks of Non-Integration Overstated**
+| Imagined Risk | Reality |
+|---------------|---------|
+| "Readers won't understand prompts" | **Low**. Conceptual explanation in architecture.md:269-272 sufficient |
+| "Competitors gain advantage" | **Low**. No competitor does deep comparative analysis |
+| "Unique comparative opportunity" | **Medium**. But cost > benefit (2+ days work) |
+
+**True risk of integration**: Dilution of focus (practical Claude Code → system prompt theory)
+
+---
+
+## ✅ Fact-Check (Perplexity Verification)
+
+### Search 1: Extraction Methodology
+
+**Query**: `x1xhlol system-prompts-and-models-of-ai-tools repository methodology how were prompts extracted`
+
+| Claim | Result | Confidence |
+|-------|--------|------------|
+| **Methodology documented** | ❌ "Community-curated collection" without technical details[1] | **100% (confirmed vague)** |
+| **Reverse engineering** | ❌ Not mentioned in sources | Unknown |
+| **Discord community** | ✅ "New instructions released on Discord before repository"[2] | 90% |
+| **Multiple maintainers** | ✅ Confirmed[1] | 90% |
+| **Security warning** | ✅ "Exposed prompts can become targets"[2] | 100% |
+
+**Conclusion**: Repository = community collection without validated extraction process.
+
+---
+
+### Search 2: Version Freshness
+
+**Query**: `Claude Code 2.0 vs Claude Code 3.3 system prompt changes version differences`
+
+| Claim | Result | Confidence |
+|-------|--------|------------|
+| **Claude Code 2.0 exists** | ❌ "Cannot find specific information" | **90% (likely fictitious/obsolete)** |
+| **Version differences documented** | ❌ No official release notes found | 100% |
+| **Current version** | ⚠️ Guide documents 3.3.x, no "2.0" reference | 100% |
+
+**Conclusion**: **"Claude Code 2.0.txt" cannot be verified** against official versions. Likely outdated or mislabeled.
+
+---
+
+### Search 3: Official Sources & Validation
+
+**Query**: `"system prompts" Claude Code Anthropic validation technical analysis`
+
+🚨 **CRITICAL DISCOVERY**:
+
+| Finding | Source | Impact |
+|---------|--------|--------|
+| **Anthropic publishes officially** | ✅ platform.claude.com/docs/en/release-notes/system-prompts[5] | **x1xhlol redundant** |
+| **Transparency commitment** | ✅ "Part of their transparency commitment"[4] | Official endorsement exists |
+| **Claude.ai vs API distinction** | ✅ "Only apply to Claude.ai/mobile—NOT API"[1][5] | Important nuance |
+| **Community analysis (reputable)** | ✅ Simon Willison (May 2025)[2], PromptHub (June 2025) | High-quality alternatives exist |
+| **Claude Code specific** | ⚠️ "Agentic coding assistant" mentioned[6], full prompts unclear | Gap potentially exists |
+
+**Official Sources Identified**:
+1. **Anthropic Docs**: https://platform.claude.com/docs/en/release-notes/system-prompts (Tier 1)
+2. **Simon Willison**: https://simonwillison.net/2025/May/25/claude-4-system-prompt/ (Tier 2)
+3. **PromptHub**: https://www.prompthub.us/blog/an-analysis-of-the-claude-4-system-prompt (Tier 2)
+
+**Conclusion**: **Official sources already available and superior**. x1xhlol repository offers no unique value for Claude.ai/mobile prompts.
+
+---
+
+### Fact-Check Summary Table
+
+| Claim | Verified | Source | Correction |
+|-------------|----------|--------|------------|
+| **111k stars** | ✅ Confirmed | GitHub header (2026-01-26) | Dynamic metric |
+| **29k forks** | ✅ Confirmed | GitHub header | Dynamic metric |
+| **472 commits** | ✅ Confirmed | Repository stats | Active (but methodology unclear) |
+| **36 tools** | ✅ Confirmed | README visible | Confirmed |
+| **Extraction methodology** | ❌ Not found | README incomplete | ⚠️ "Community-curated" = vague |
+| **Anthropic endorsement** | ❌ None | No official mention | ⚠️ Only commercial sponsors |
+| **Claude Code 2.0 real version** | ❌ Not verified | Not in official docs | ⚠️ Likely obsolete/fictitious |
+| **Anthropic publishes officially** | ✅ **CONFIRMED** | platform.claude.com[5] | **🚨 Critical: official source exists** |
+| **Simon Willison analysis** | ✅ Confirmed | May 2025 deep-dive[2] | Reputable Tier 2 source |
+| **GPL-3.0 license** | ✅ Confirmed | Repository header | Legally usable |
+
+**Key Corrections**:
+- Initial assumption: "Fills documentation gap" → **False** (Anthropic publishes officially)
+- Score 4/5 → 2/5 (redundant with superior sources)
+- Recommendation "Integrate" → "Do not integrate / Use official sources"
+
+---
+
+## 📍 Final Recommendations
+
+### Decision: Do Not Integrate x1xhlol Repository
+
+**Use Official Anthropic Sources Instead**
+
+### Action Plan for Guide
+
+#### 1. Add Section to `guide/architecture.md` (line ~270)
+
+```markdown
+### System Prompt Contents
+
+**Confidence**: 100% (Tier 1 - Official Anthropic Documentation)
+**Sources**:
+- [Anthropic System Prompts Release Notes](https://platform.claude.com/docs/en/release-notes/system-prompts)
+- [Anthropic Engineering: Claude Code Best Practices](https://www.anthropic.com/engineering/claude-code-best-practices)
+
+Claude system prompts (~5-15K tokens) are **publicly published** by Anthropic as part of their transparency commitment. These prompts define:
+
+**Core Components**:
+- **Tool definitions**: Bash, Read, Edit, Write, Grep, Glob, Task, TodoWrite
+- **Safety instructions**: Content policies, refusal patterns (see [Security Hardening](./security-hardening.md))
+- **Behavioral guidelines**: Task-first approach, MVP-first, no over-engineering
+- **Context instructions**: How to gather and use project context
+
+**Important Distinctions**:
+- **Claude.ai/Mobile**: Published prompts available publicly
+- **Anthropic API**: Different default instructions, configurable by developers
+- **Claude Code CLI**: Agentic coding assistant with context-gathering behavior
+
+**Community Analysis** (for deeper understanding):
+- **Simon Willison's Claude 4 Analysis** (May 2025): [Deep-dive into thinking blocks, search rules, safety guardrails](https://simonwillison.net/2025/May/25/claude-4-system-prompt/)
+- **PromptHub Technical Breakdown** (June 2025): [Detailed analysis of prompt engineering patterns](https://www.prompthub.us/blog/an-analysis-of-the-claude-4-system-prompt)
+
+→ **Cross-reference**: For security implications, see [Section 5: Permission & Security Model](#5-permission--security-model)
+
+**Note**: Claude Code system prompts may differ from Claude.ai/mobile versions. The above sources cover the Claude family; Code-specific prompts are integrated into the CLI tool's behavior.
+```
+
+#### 2. Update `machine-readable/reference.yaml`
+
+Add entries:
+```yaml
+# System Prompts (Official Sources)
+system_prompts_official: "https://platform.claude.com/docs/en/release-notes/system-prompts"
+system_prompts_willison_analysis: "https://simonwillison.net/2025/May/25/claude-4-system-prompt/"
+system_prompts_prompthub: "https://www.prompthub.us/blog/an-analysis-of-the-claude-4-system-prompt"
+system_prompts_architecture: "guide/architecture.md:270"
+```
+
+#### 3. Create Watchlist Entry
+
+**File**: `claudedocs/resource-evaluations/watch-list.md`
+
+```markdown
+## System Prompts Community Repository (x1xhlol)
+
+**URL**: https://github.com/x1xhlol/system-prompts-and-models-of-ai-tools
+**Score**: 2/5 (marginal - redundant with official sources)
+**Status**: Monitored, not integrated
+
+**Reason for watchlist**:
+- Anthropic already publishes system prompts officially
+- Methodology not documented (community-curated, unknown extraction)
+- "Claude Code 2.0.txt" cannot be verified against official versions
+- Simon Willison + PromptHub provide superior community analyses
+
+**Potential gap**:
+If Claude Code CLI system prompts are **not published** by Anthropic (distinct from Claude.ai/mobile), this repository may contain unique information.
+
+**Re-evaluation triggers**:
+- ✅ Anthropic confirms Code CLI prompts are NOT published
+- ✅ 10+ readers request "exact Claude Code system prompt structure"
+- ✅ Repository gains official endorsement from Anthropic
+- ✅ Methodology documented with transparent extraction process
+
+**Last checked**: 2026-01-26
+**Next review**: Q2 2026 or if triggers occur
+```
+
+#### 4. Document Full Evaluation
+
+**File**: `docs/resource-evaluations/system-prompts-official-vs-community.md` (this file)
+
+**Indexed in**: `docs/resource-evaluations/README.md`
+
+---
+
+## 🎯 Source Hierarchy (For Guide Integration)
+
+| Tier | Source | Confidence | Use Case |
+|------|--------|------------|----------|
+| **Tier 1** | [Anthropic Official Docs](https://platform.claude.com/docs/en/release-notes/system-prompts) | 100% | Primary reference |
+| **Tier 2** | [Simon Willison Analysis](https://simonwillison.net/2025/May/25/claude-4-system-prompt/) | 70-90% | Deep-dive understanding |
+| **Tier 2** | [PromptHub Breakdown](https://www.prompthub.us/blog/an-analysis-of-the-claude-4-system-prompt) | 70-90% | Technical patterns |
+| **Tier 3** | [x1xhlol Repository](https://github.com/x1xhlol/system-prompts-and-models-of-ai-tools) | 40-70% | **Only if Code CLI gap confirmed** |
+
+**Integration Priority**: Tier 1 → Tier 2 → Monitor Tier 3
+
+---
+
+## Key Takeaways
+
+### What We Learned
+
+1. **Official sources exist**: Anthropic's transparency commitment means prompts are already public
+2. **Community analyses superior**: Simon Willison (reputable tech blogger) provides validated deep-dives
+3. **Versioning matters**: "Claude Code 2.0" label unverifiable, risks obsolescence
+4. **Methodology transparency critical**: Undocumented extraction = low trust
+
+### Evaluation Process Improvements
+
+1. ✅ **Check official sources FIRST** before evaluating community collections
+2. ✅ **Perplexity search essential** for discovering authoritative alternatives
+3. ✅ **Agent challenge valuable** for catching overvaluation bias
+4. ✅ **Fact-check methodology** as important as fact-checking content
+
+### For Future Evaluations
+
+**Red flags to check**:
+- [ ] Does official source already exist?
+- [ ] Is methodology documented?
+- [ ] Are versions/timestamps provided?
+- [ ] Is there official endorsement?
+- [ ] What's the integration cost vs benefit?
+
+---
+
+## References
+
+### Perplexity Search Results
+
+**Search 1: Methodology**
+1. "Community-curated collection" - https://jimmysong.io/ai/system-prompts-and-models-of-ai-tools/
+2. "Released on Discord before repository" - GitHub discussions
+
+**Search 2: Versioning**
+- No official documentation found for "Claude Code 2.0" or version differences
+
+**Search 3: Official Sources**
+1. Anthropic System Prompts - https://platform.claude.com/docs/en/release-notes/system-prompts
+2. Simon Willison Analysis - https://simonwillison.net/2025/May/25/claude-4-system-prompt/
+3. PromptHub Deep-Dive - https://www.prompthub.us/blog/an-analysis-of-the-claude-4-system-prompt
+4. "Transparency commitment" - GitHub community discussions
+5. "Claude.ai vs API distinction" - platform.claude.com documentation
+6. "Agentic coding assistant" - https://www.anthropic.com/engineering/claude-code-best-practices
+
+### Guide Context
+
+- **Current coverage**: `guide/architecture.md:269-272` (~5-15K token estimate)
+- **Confidence level**: 30% in "What We Don't Know" section (architecture.md:910)
+- **Existing alternatives**: Appendix C comparative analysis
+
+---
+
+## Approval & Next Steps
+
+**Evaluator**: Claude Sonnet 4.5 (with technical-writer challenge agent)
+**Date**: 2026-01-26
+**Status**: ✅ Complete
+
+**Next actions**:
+1. ✅ Create this evaluation file (completed)
+2. ⏳ Update `guide/architecture.md` with official sources section
+3. ⏳ Update `machine-readable/reference.yaml` with new entries
+4. ⏳ Create watchlist entry in `claudedocs/resource-evaluations/watch-list.md`
+5. ⏳ Index in `docs/resource-evaluations/README.md`
+
+**Estimated work**: ~30 minutes for guide integration (vs 8-10 hours if we had integrated x1xhlol repository)
+
+---
+
+**End of Evaluation**
diff --git a/guide/architecture.md b/guide/architecture.md
index cf44bd5..4ff7157 100644
--- a/guide/architecture.md
+++ b/guide/architecture.md
@@ -270,6 +270,7 @@ Claude Code operates within a fixed context window (200K tokens for Claude 3.5 S
 │ │ • Tool definitions │ │
 │ │ • Safety instructions │ │
 │ │ • Behavioral guidelines │ │
+│ │ • See detailed breakdown below ↓ │ │
 │ ├──────────────────────────────────────────────────────┤ │
 │ │ CLAUDE.md Files (~1-10K) │ │
 │ │ • Global ~/.claude/CLAUDE.md │ │
@@ -296,6 +297,36 @@ Claude Code operates within a fixed context window (200K tokens for Claude 3.5 S
 └─────────────────────────────────────────────────────────────┘
 ```
 
+### System Prompt Contents
+
+**Confidence**: 100% (Tier 1 - Official Anthropic Documentation)
+**Sources**:
+- [Anthropic System Prompts Release Notes](https://platform.claude.com/docs/en/release-notes/system-prompts)
+- [Anthropic Engineering: Claude Code Best Practices](https://www.anthropic.com/engineering/claude-code-best-practices)
+
+Claude system prompts (~5-15K tokens) are **publicly published** by Anthropic as part of their transparency commitment. These prompts define:
+
+**Core Components**:
+- **Tool definitions**: Bash, Read, Edit, Write, Grep, Glob, Task, TodoWrite
+- **Safety instructions**: Content policies, refusal patterns (see [Security Hardening](./security-hardening.md))
+- **Behavioral guidelines**: Task-first approach, MVP-first, no over-engineering
+- **Context instructions**: How to gather and use project context
+
+**Important Distinctions**:
+- **Claude.ai/Mobile**: Published prompts available publicly
+- **Anthropic API**: Different default instructions, configurable by developers
+- **Claude Code CLI**: Agentic coding assistant with context-gathering behavior
+
+**Community Analysis** (for deeper understanding):
+- **Simon Willison's Claude 4 Analysis** (May 2025): [Deep-dive into thinking blocks, search rules, safety guardrails](https://simonwillison.net/2025/May/25/claude-4-system-prompt/)
+- **PromptHub Technical Breakdown** (June 2025): [Detailed analysis of prompt engineering patterns](https://www.prompthub.us/blog/an-analysis-of-the-claude-4-system-prompt)
+
+→ **Cross-reference**: For security implications, see [Section 5: Permission & Security Model](#5-permission--security-model)
+
+**Note**: Claude Code system prompts may differ from Claude.ai/mobile versions. The above sources cover the Claude family; Code-specific prompts are integrated into the CLI tool's behavior.
+
+---
+
 ### Auto-Compaction
 
 **Confidence**: 50% (Tier 3 - Conflicting reports)
diff --git a/machine-readable/reference.yaml b/machine-readable/reference.yaml
index 7d452e9..abb24a8 100644
--- a/machine-readable/reference.yaml
+++ b/machine-readable/reference.yaml
@@ -89,6 +89,11 @@ deep_dive:
 architecture_mcp: "guide/architecture.md:506"
 architecture_mcp_visual: "guide/images/mcp-architecture-diagram.svg"
 architecture_philosophy: "guide/architecture.md:746"
+# System Prompts (Official Sources) - Added 2026-01-26
+system_prompts_official: "https://platform.claude.com/docs/en/release-notes/system-prompts"
+system_prompts_willison_analysis: "https://simonwillison.net/2025/May/25/claude-4-system-prompt/"
+system_prompts_prompthub: "https://www.prompthub.us/blog/an-analysis-of-the-claude-4-system-prompt"
+system_prompts_architecture: "guide/architecture.md:299"
 # Main guide (guide/ultimate-guide.md) - Updated 2026-01-20
 installation: 196
 first_workflow: 277
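
The `reference.yaml` hunk mixes several value shapes in one mapping: external URLs, `path:line` pointers into the guide, plain repository paths, and bare line numbers. A minimal sketch of how a reviewer might classify the values by shape is shown here. It assumes the values have already been loaded into a Python dict; `classify_reference` and the four category names are hypothetical and not part of the repository.

```python
import re
from urllib.parse import urlparse

# "path:line" pointers such as "guide/architecture.md:299"
PATH_LINE = re.compile(r"^[\w./-]+:\d+$")


def classify_reference(value):
    """Classify one reference.yaml value by shape (hypothetical helper)."""
    if isinstance(value, int):
        return "line"  # bare line number, e.g. installation: 196
    parsed = urlparse(value)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    if PATH_LINE.match(value):
        return "path_line"
    return "path"  # plain repository path, e.g. an image file


# Values copied from the hunk; the dict stands in for the parsed YAML.
entries = {
    "architecture_mcp_visual": "guide/images/mcp-architecture-diagram.svg",
    "system_prompts_official": "https://platform.claude.com/docs/en/release-notes/system-prompts",
    "system_prompts_architecture": "guide/architecture.md:299",
    "installation": 196,
}
kinds = {key: classify_reference(value) for key, value in entries.items()}
```

A check like this only confirms the shape of each value. It does not confirm that a URL resolves or that a `path:line` pointer still lands on the intended heading after later edits to the guide.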