docs: add verified critical bugs tracker (known-issues.md)

NEW: guide/known-issues.md (285 lines) - GitHub issue auto-creation bug (Issue #13797, v2.0.65+, ACTIVE) * 17+ confirmed accidental public disclosures * Security/privacy risk documented * Workarounds: explicit repo, manual approval, pre-execution verification - Excessive token consumption (Issue #16856, v2.1.1+, Jan 2026) * 20+ reports of 4x+ faster consumption * Anthropic: "Not officially confirmed as bug" (investigating) * Workarounds: /context monitoring, shorter sessions, disable auto-compact - Model quality degradation (Aug-Sep 2025, RESOLVED) * Anthropic official postmortem: 3 infrastructure bugs * Community theories (quantization) debunked FACT-CHECKED: Perplexity Pro + GitHub API direct queries - Verified: 5,702 open issues (not 4,697), 527 invalid labels - Corrected: v2.1.1 token bug (not non-existent v2.0.61) - Sources: GitHub Issues, Anthropic postmortem, The Register UPDATED: - guide/README.md: Added known-issues.md to docs table - machine-readable/reference.yaml: 4 new entries for issue tracking - CHANGELOG.md: Documented integration process NEW: docs/resource-evaluations/023-community-discussions-report-jan2026.md - Full evaluation process documented - Fact-check methodology: Perplexity + GitHub API - Score: 2/5 (Marginal - partial integration only) - Lesson: Always verify community reports with primary sources Impact: Critical security awareness for users, actionable workarounds Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-28 17:59:16 +01:00 · 2026-01-28 17:59:16 +01:00 · 940caf3f1e
commit 940caf3f1e
parent a75c66a890
5 changed files with 592 additions and 0 deletions
--- a/docs/resource-evaluations/023-community-discussions-report-jan2026.md
+++ b/docs/resource-evaluations/023-community-discussions-report-jan2026.md
@ -0,0 +1,281 @@
+# Resource Evaluation: Community Discussions Analysis Report (January 2026)
+
+**Evaluated**: January 28, 2026
+**Resource Type**: Analytical report (copied text, not URL)
+**Target**: Claude Code Ultimate Guide
+**Evaluator**: Claude Sonnet 4.5 via /eval-resource skill
+
+---
+
+## 📄 Resource Summary
+
+Comprehensive analytical report titled "Analyse Mensuelle des Discussions Communautaires Claude Code - Janvier 2026" covering:
+- 7 months of community sentiment tracking (July 2025 - January 2026)
+- Top 5 technical problems (token consumption, context window, model quality, performance, GitHub issue bug)
+- Top 5 feature requests
+- Longitudinal data analysis across GitHub, Reddit, Discord, Twitter
+- Recommendations for Claude Code documentation
+
+**Claimed Coverage**: GitHub (4,697 open issues), Reddit sentiment (28-35/100), Discord discussions, Twitter mentions
+
+---
+
+## 🎯 Evaluation Score
+
+**Initial Score**: 5/5 (Critical - Major gap in guide)
+**Post-Challenge Score**: 3/5 (Relevant - Useful complement)
+**Post-Fact-Check Score**: **2/5** (Marginal - Minimal mention or skip)
+
+### Score Justification
+
+**Downgrade reasons**:
+1. **Major factual errors**: Version 2.0.61 doesn't exist (confused with v2.1.1)
+2. **Timing errors**: Token bug was January 2026, not December 2025
+3. **Unverifiable stats**: 4,697 issues (reality: 5,702), sentiment scores lack methodology
+4. **Ephemeral data**: Monthly community reports become obsolete quickly
+5. **Maintenance burden**: Would require monthly updates (unsustainable)
+
+**Upgrade reasons**:
+1. ✅ **Confirmed critical bugs**: GitHub issue auto-creation (Issue #13797), token consumption (Issue #16856)
+2. ✅ **Verified with sources**: Anthropic postmortem on Aug 2025 model degradation
+3. ✅ **Actionable workarounds**: Practical solutions for users
+4. ✅ **Security impact**: Privacy risks from accidental public disclosures
+
+---
+
+## ✅ Fact-Check Results
+
+### Verification Methods
+
+1. **Perplexity Pro searches** (4 queries):
+   - Token consumption bug v2.0.61
+   - GitHub issues count verification
+   - Accidental issue creation bug
+   - Model quality degradation August 2025
+
+2. **GitHub API direct queries**:
+   - `gh api repos/anthropics/claude-code` → Stats verification
+   - `gh search issues` → Bug confirmation, wrong repo issues count
+   - `gh issue view` → Specific issue details
+   - `gh api releases` → Version existence check
+
+### Key Findings
+
+| Claim | Status | Reality |
+|-------|--------|---------|
+| v2.0.61 token bug (Dec 2025) | ❌ **FALSE** | v2.0.61 doesn't exist; real bug: v2.1.1 (Jan 2026) |
+| 4,697 open issues | ❌ **FALSE** | 5,702 issues (as of Jan 28, 2026) |
+| 263 issues labeled "invalid" | ❌ **FALSE** | 527 issues with "invalid" label |
+| GitHub auto-creation bug | ✅ **TRUE** | Issue #13797 confirmed, 17+ examples found |
+| Token consumption issues | ✅ **PARTIAL** | 20+ reports found, but Anthropic denies official bug |
+| Model degradation Aug 2025 | ✅ **TRUE** | Anthropic official postmortem confirms 3 infrastructure bugs |
+
+### Sources Verified
+
+**✅ Confirmed**:
+- [Anthropic Postmortem](https://www.anthropic.com/engineering/a-postmortem-of-three-recent-issues) (Sept 17, 2025)
+- [Issue #13797](https://github.com/anthropics/claude-code/issues/13797) - GitHub auto-creation bug
+- [Issue #16856](https://github.com/anthropics/claude-code/issues/16856) - Token consumption v2.1.1
+- [The Register](https://www.theregister.com/2026/01/05/claude_devs_usage_limits/) - Holiday bonus context
+
+**❌ Not Found**:
+- No mention of v2.0.61 in any source
+- No public documentation of "263 invalid issues" stat
+- No verifiable methodology for "sentiment 28-35/100" score
+
+---
+
+## 🚨 Critical Errors in Report
+
+### Error #1: Version Confusion
+
+**Report claim**:
+> "Depuis décembre 2025 (version 2.0.61), les utilisateurs signalent une consommation de tokens 5-20x normale"
+
+**Reality**:
+- **v2.0.61 does not exist** in GitHub releases (only v2.0.73, v2.0.74, v2.0.76 found)
+- **Real bug**: v2.1.1 (published Jan 7, 2026)
+- **First report**: Issue #16856 on January 8, 2026
+- **Timing**: January 2026, not December 2025
+
+**Impact**: Critical factual error invalidating major section of report
+
+---
+
+### Error #2: Stats Inflation/Deflation
+
+| Metric | Report | Reality (Jan 28) | Variance |
+|--------|--------|------------------|----------|
+| Open issues | 4,697 | 5,702 | -1,005 (-17.6%) |
+| Issues "invalid" | 263 | 527 | -264 (-50%) |
+| Wrong repo issues | 116 (44% of 263) | 17+ confirmed | Overestimated |
+
+**Impact**: Undermines credibility of statistical analysis
+
+---
+
+### Error #3: Unverifiable Sentiment Scores
+
+**Report claim**: "Sentiment: 28-35/100 (janvier 2026)"
+
+**Problem**:
+- No methodology disclosed
+- No tool/source specified
+- Cannot be independently verified
+- Likely manual interpretation without systematic measurement
+
+**Impact**: Non-scientific claim presented as quantitative data
+
+---
+
+## ✅ What Was Integrated
+
+### Created: `guide/known-issues.md` (285 lines)
+
+**Section 1: Active Critical Issues**
+
+1. **GitHub Issue Auto-Creation Bug** (Issue #13797)
+   - Verified with 17+ examples
+   - Security/privacy risk documented
+   - Workarounds provided
+   - Examples of accidental disclosures
+
+2. **Excessive Token Consumption** (Issue #16856, v2.1.1)
+   - 20+ reports documented
+   - Anthropic response quoted
+   - Holiday bonus context clarified
+   - Workarounds for users
+
+**Section 2: Resolved Historical Issues**
+
+3. **Model Quality Degradation (Aug-Sep 2025)**
+   - Official Anthropic postmortem linked
+   - 3 infrastructure bugs detailed
+   - Community theories (quantization) debunked
+   - Resolution timeline confirmed
+
+**Section 3: Resources**
+
+- Issue statistics (verified via GitHub API)
+- Tracking commands for users
+- Official channels list
+- Contributing guidelines
+
+---
+
+## ❌ What Was Rejected
+
+1. **Version 2.0.61 references** (non-existent)
+2. **December 2025 timing** for token bug (incorrect)
+3. **Sentiment scores** without methodology
+4. **Unverifiable statistics** (4,697 issues, 263 invalid)
+5. **Recommendations for Anthropic** (out of scope for user guide)
+6. **Monthly update commitment** (unsustainable maintenance)
+
+---
+
+## 📊 Integration Impact
+
+### Files Modified
+
+1. **guide/known-issues.md** (NEW, 285 lines)
+   - Comprehensive critical bugs tracker
+   - Verified sources only
+   - Actionable workarounds
+   - Security awareness focus
+
+2. **guide/README.md** (1 line added)
+   - Added known-issues.md to table of contents
+   - Description: "Critical bugs tracker: security issues, token consumption, verified community reports"
+
+3. **machine-readable/reference.yaml** (4 entries added)
+   - `known_issues`: Main file reference
+   - `known_issues_github_bug`: Line 16 (GitHub auto-creation)
+   - `known_issues_token_consumption`: Line 136 (Token usage)
+   - `known_issues_model_quality_aug2025`: Line 231 (Aug 2025 resolved)
+
+4. **CHANGELOG.md** (16 lines added)
+   - Documented integration in [Unreleased] > Added
+   - Listed all 3 critical issues
+   - Noted fact-checking process
+   - Verified stats (5,702 issues, 527 invalid labels)
+
+### User Benefits
+
+1. **Security awareness**: Users warned about GitHub auto-creation bug (privacy risk)
+2. **Cost management**: Token consumption workarounds documented
+3. **Trust building**: Verified facts only, no speculation
+4. **Historical context**: Aug 2025 model degradation explained (resolved)
+5. **Actionable guidance**: Practical workarounds, not just problem descriptions
+
+---
+
+## 🔍 Methodology Evaluation
+
+### Strengths
+
+- Comprehensive multi-platform analysis (GitHub, Reddit, Discord, Twitter)
+- Longitudinal tracking (7 months)
+- Identified real patterns (GitHub bug, token issues, model degradation)
+- Detailed recommendations structure
+
+### Weaknesses
+
+- **Version confusion**: Mixed up v2.0.61, v2.0.65, v2.1.1
+- **Unverified stats**: 4,697 issues, sentiment scores lack source
+- **Timing errors**: December vs January for token bug
+- **No primary sources cited**: "Mentions 1,250+" without platform breakdown
+- **Survivorship bias**: Community discussions over-represent problems
+- **No control group**: No comparison with other tools' issue patterns
+
+### Lesson Learned
+
+**For future resource evaluations**:
+1. ✅ **Always fact-check claims** via Perplexity + direct API queries
+2. ✅ **Verify versions exist** before documenting bugs
+3. ✅ **Request methodology** for statistical claims
+4. ✅ **Cross-reference dates** with release timelines
+5. ✅ **Challenge auto-agents** to find flaws before integration
+6. ❌ **Don't trust community reports blindly** - verify with official sources
+
+---
+
+## 🎯 Final Decision
+
+**Action Taken**: **PARTIAL INTEGRATION** (verified facts only)
+
+**Rationale**:
+- Report contained valuable findings (3 real bugs verified)
+- But also contained critical errors (version confusion, stat errors)
+- Integration limited to fact-checked content only
+- Rejected speculative/unverifiable claims
+
+**Confidence Level**: **Medium** (verified sources exist, but report had errors)
+
+**Would Recommend This Resource**: ❌ NO (too many factual errors, use primary sources instead)
+
+**Better Alternative**: Direct GitHub Issues search + Anthropic official communications
+
+---
+
+## 📝 Evaluator Notes
+
+This evaluation demonstrates the importance of **systematic fact-checking** before integrating community-sourced content. Even comprehensive analytical reports can contain:
+- Version confusion
+- Timing errors
+- Unverifiable statistics
+- Methodology gaps
+
+**Best practice**: Treat analytical reports as **leads to investigate**, not facts to copy. Always verify with:
+1. Primary sources (GitHub Issues, official docs)
+2. API queries (GitHub API, not web search)
+3. Official communications (Anthropic blog, status page)
+4. Multiple independent sources for controversial claims
+
+**Result**: Successfully extracted 3 verified critical bugs while filtering out errors, maintaining guide credibility.
+
+---
+
+**Evaluation completed**: January 28, 2026
+**Time invested**: ~2 hours (research, fact-checking, integration, documentation)
+**Token cost**: ~130K tokens (Perplexity searches, GitHub queries, document creation)