NEW: guide/known-issues.md (285 lines)
- GitHub issue auto-creation bug (Issue #13797, v2.0.65+, ACTIVE)
  * 17+ confirmed accidental public disclosures
  * Security/privacy risk documented
  * Workarounds: explicit repo, manual approval, pre-execution verification
- Excessive token consumption (Issue #16856, v2.1.1+, Jan 2026)
  * 20+ reports of 4x+ faster consumption
  * Anthropic: "Not officially confirmed as bug" (investigating)
  * Workarounds: /context monitoring, shorter sessions, disable auto-compact
- Model quality degradation (Aug-Sep 2025, RESOLVED)
  * Anthropic official postmortem: 3 infrastructure bugs
  * Community theories (quantization) debunked

FACT-CHECKED: Perplexity Pro + GitHub API direct queries
- Verified: 5,702 open issues (not 4,697), 527 invalid labels
- Corrected: v2.1.1 token bug (not non-existent v2.0.61)
- Sources: GitHub Issues, Anthropic postmortem, The Register

UPDATED:
- guide/README.md: Added known-issues.md to docs table
- machine-readable/reference.yaml: 4 new entries for issue tracking
- CHANGELOG.md: Documented integration process

NEW: docs/resource-evaluations/023-community-discussions-report-jan2026.md
- Full evaluation process documented
- Fact-check methodology: Perplexity + GitHub API
- Score: 2/5 (Marginal - partial integration only)
- Lesson: Always verify community reports with primary sources

Impact: Critical security awareness for users, actionable workarounds

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Resource Evaluation: Community Discussions Analysis Report (January 2026)
Evaluated: January 28, 2026
Resource Type: Analytical report (copied text, not URL)
Target: Claude Code Ultimate Guide
Evaluator: Claude Sonnet 4.5 via /eval-resource skill
📄 Resource Summary
Comprehensive analytical report titled "Analyse Mensuelle des Discussions Communautaires Claude Code - Janvier 2026" ("Monthly Analysis of Claude Code Community Discussions - January 2026") covering:
- 7 months of community sentiment tracking (July 2025 - January 2026)
- Top 5 technical problems (token consumption, context window, model quality, performance, GitHub issue bug)
- Top 5 feature requests
- Longitudinal data analysis across GitHub, Reddit, Discord, Twitter
- Recommendations for Claude Code documentation
Claimed Coverage: GitHub (4,697 open issues), Reddit sentiment (28-35/100), Discord discussions, Twitter mentions
🎯 Evaluation Score
Initial Score: 5/5 (Critical - Major gap in guide)
Post-Challenge Score: 3/5 (Relevant - Useful complement)
Post-Fact-Check Score: 2/5 (Marginal - Minimal mention or skip)
Score Justification
Downgrade reasons:
- Major factual errors: Version 2.0.61 doesn't exist (confused with v2.1.1)
- Timing errors: Token bug was January 2026, not December 2025
- Inaccurate or unverifiable stats: 4,697 open issues claimed (actual: 5,702); sentiment scores lack methodology
- Ephemeral data: Monthly community reports become obsolete quickly
- Maintenance burden: Would require monthly updates (unsustainable)
Upgrade reasons:
- ✅ Confirmed critical bugs: GitHub issue auto-creation (Issue #13797), token consumption (Issue #16856)
- ✅ Verified with sources: Anthropic postmortem on Aug 2025 model degradation
- ✅ Actionable workarounds: Practical solutions for users
- ✅ Security impact: Privacy risks from accidental public disclosures
✅ Fact-Check Results
Verification Methods
- Perplexity Pro searches (4 queries):
  - Token consumption bug v2.0.61
  - GitHub issues count verification
  - Accidental issue creation bug
  - Model quality degradation August 2025
- GitHub API direct queries:
  - `gh api repos/anthropics/claude-code` → Stats verification
  - `gh search issues` → Bug confirmation, wrong repo issues count
  - `gh issue view` → Specific issue details
  - `gh api releases` → Version existence check
Key Findings
| Claim | Status | Reality |
|---|---|---|
| v2.0.61 token bug (Dec 2025) | ❌ FALSE | v2.0.61 doesn't exist; real bug: v2.1.1 (Jan 2026) |
| 4,697 open issues | ❌ FALSE | 5,702 issues (as of Jan 28, 2026) |
| 263 issues labeled "invalid" | ❌ FALSE | 527 issues with "invalid" label |
| GitHub auto-creation bug | ✅ TRUE | Issue #13797 confirmed, 17+ examples found |
| Token consumption issues | ✅ PARTIAL | 20+ reports found, but Anthropic has not officially confirmed a bug |
| Model degradation Aug 2025 | ✅ TRUE | Anthropic official postmortem confirms 3 infrastructure bugs |
Sources Verified
✅ Confirmed:
- Anthropic Postmortem (Sept 17, 2025)
- Issue #13797 - GitHub auto-creation bug
- Issue #16856 - Token consumption v2.1.1
- The Register - Holiday bonus context
❌ Not Found:
- No mention of v2.0.61 in any source
- No public documentation of "263 invalid issues" stat
- No verifiable methodology for "sentiment 28-35/100" score
🚨 Critical Errors in Report
Error #1: Version Confusion
Report claim:
"Since December 2025 (version 2.0.61), users have been reporting token consumption 5-20x the normal rate"
Reality:
- v2.0.61 does not exist in GitHub releases (only v2.0.73, v2.0.74, v2.0.76 found)
- Real bug: v2.1.1 (published Jan 7, 2026)
- First report: Issue #16856 on January 8, 2026
- Timing: January 2026, not December 2025
Impact: Critical factual error invalidating major section of report
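The version check behind this finding can be sketched as follows. This is a minimal illustration, not the evaluator's actual script: the helper names `normalize` and `version_exists` are hypothetical, and `FOUND_TAGS` holds only the tags this evaluation mentions (not a full release list).

```python
import re

# Tags named in this evaluation as existing releases (illustrative subset,
# not the complete release history of the repository).
FOUND_TAGS = ["v2.0.73", "v2.0.74", "v2.0.76", "v2.1.1"]


def normalize(tag: str) -> tuple[int, ...]:
    """Turn 'v2.0.61' or '2.0.61' into (2, 0, 61) so tags compare reliably."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", tag.strip())
    if match is None:
        raise ValueError(f"not a semantic version tag: {tag!r}")
    return tuple(int(part) for part in match.groups())


def version_exists(claimed: str, release_tags: list[str]) -> bool:
    """Return True only if the claimed version appears among the release tags."""
    known = {normalize(tag) for tag in release_tags}
    return normalize(claimed) in known


if __name__ == "__main__":
    for claimed in ("v2.0.61", "v2.1.1"):
        status = "exists" if version_exists(claimed, FOUND_TAGS) else "NOT FOUND"
        print(f"{claimed}: {status}")
```

In practice the tag list would come from the repository's release data rather than a hard-coded constant; normalizing before comparing avoids false negatives from a missing or extra `v` prefix.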
Error #2: Stats Inflation/Deflation
| Metric | Report | Reality (Jan 28) | Variance |
|---|---|---|---|
| Open issues | 4,697 | 5,702 | -1,005 (-17.6%) |
| Issues "invalid" | 263 | 527 | -264 (-50%) |
| Wrong repo issues | 116 (44% of 263) | 17+ confirmed | Overestimated |
Impact: Undermines credibility of statistical analysis
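The variance figures in the table can be reproduced with a short calculation. This is a sketch assuming variance is measured relative to the verified value; the function name `variance` is illustrative.

```python
def variance(reported: int, actual: int) -> tuple[int, float]:
    """Return (absolute difference, percentage difference relative to actual)."""
    diff = reported - actual
    pct = round(100 * diff / actual, 1)
    return diff, pct


if __name__ == "__main__":
    # Figures taken from the table above: (report value, verified value).
    metrics = {
        "Open issues": (4697, 5702),
        'Issues "invalid"': (263, 527),
    }
    for name, (reported, actual) in metrics.items():
        diff, pct = variance(reported, actual)
        print(f"{name}: {diff:+,} ({pct:+.1f}%)")
```

To one decimal place the second row comes out at -50.1%, which the table rounds to -50%. Both negative values mean the report understated the real counts.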
Error #3: Unverifiable Sentiment Scores
Report claim: "Sentiment: 28-35/100 (January 2026)"
Problem:
- No methodology disclosed
- No tool/source specified
- Cannot be independently verified
- Likely manual interpretation without systematic measurement
Impact: Non-scientific claim presented as quantitative data
✅ What Was Integrated
Created: guide/known-issues.md (285 lines)
Section 1: Active Critical Issues
- GitHub Issue Auto-Creation Bug (Issue #13797)
  - Verified with 17+ examples
  - Security/privacy risk documented
  - Workarounds provided
  - Examples of accidental disclosures
- Excessive Token Consumption (Issue #16856, v2.1.1)
  - 20+ reports documented
  - Anthropic response quoted
  - Holiday bonus context clarified
  - Workarounds for users
Section 2: Resolved Historical Issues
- Model Quality Degradation (Aug-Sep 2025)
  - Official Anthropic postmortem linked
  - 3 infrastructure bugs detailed
  - Community theories (quantization) debunked
  - Resolution timeline confirmed
Section 3: Resources
- Issue statistics (verified via GitHub API)
- Tracking commands for users
- Official channels list
- Contributing guidelines
❌ What Was Rejected
- Version 2.0.61 references (non-existent)
- December 2025 timing for token bug (incorrect)
- Sentiment scores without methodology
- Incorrect statistics (4,697 open issues, 263 invalid; both contradicted by GitHub API queries)
- Recommendations for Anthropic (out of scope for user guide)
- Monthly update commitment (unsustainable maintenance)
📊 Integration Impact
Files Modified
- guide/known-issues.md (NEW, 285 lines)
  - Comprehensive critical bugs tracker
  - Verified sources only
  - Actionable workarounds
  - Security awareness focus
- guide/README.md (1 line added)
  - Added known-issues.md to table of contents
  - Description: "Critical bugs tracker: security issues, token consumption, verified community reports"
- machine-readable/reference.yaml (4 entries added)
  - `known_issues`: Main file reference
  - `known_issues_github_bug`: Line 16 (GitHub auto-creation)
  - `known_issues_token_consumption`: Line 136 (Token usage)
  - `known_issues_model_quality_aug2025`: Line 231 (Aug 2025 resolved)
- CHANGELOG.md (16 lines added)
  - Documented integration in [Unreleased] > Added
  - Listed all 3 critical issues
  - Noted fact-checking process
  - Verified stats (5,702 issues, 527 invalid labels)
User Benefits
- Security awareness: Users warned about GitHub auto-creation bug (privacy risk)
- Cost management: Token consumption workarounds documented
- Trust building: Verified facts only, no speculation
- Historical context: Aug 2025 model degradation explained (resolved)
- Actionable guidance: Practical workarounds, not just problem descriptions
🔍 Methodology Evaluation
Strengths
- Comprehensive multi-platform analysis (GitHub, Reddit, Discord, Twitter)
- Longitudinal tracking (7 months)
- Identified real patterns (GitHub bug, token issues, model degradation)
- Detailed recommendations structure
Weaknesses
- Version confusion: Mixed up v2.0.61, v2.0.65, v2.1.1
- Unverified stats: 4,697 issues, sentiment scores lack source
- Timing errors: December vs January for token bug
- No primary sources cited: "Mentions 1,250+" without platform breakdown
- Selection bias: Community discussions over-represent problems, since users with complaints are more likely to post
- No control group: No comparison with other tools' issue patterns
Lesson Learned
For future resource evaluations:
- ✅ Always fact-check claims via Perplexity + direct API queries
- ✅ Verify versions exist before documenting bugs
- ✅ Request methodology for statistical claims
- ✅ Cross-reference dates with release timelines
- ✅ Challenge auto-agents to find flaws before integration
- ❌ Don't trust community reports blindly - verify with official sources
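The "cross-reference dates with release timelines" step can be illustrated with the token-bug example from this evaluation. This is a minimal sketch: the function name is hypothetical, and treating "December 2025" as a period ending on December 31 is an assumption made for the comparison.

```python
from datetime import date


def period_precedes_release(period_end: date, release_date: date) -> bool:
    """True if the whole claimed period ends before the release was published,
    meaning the claim cannot refer to that release."""
    return period_end < release_date


# Dates taken from this evaluation.
release_v2_1_1 = date(2026, 1, 7)    # v2.1.1 published
first_report = date(2026, 1, 8)      # Issue #16856 opened
claimed_end = date(2025, 12, 31)     # assumption: end of "December 2025"

if __name__ == "__main__":
    if period_precedes_release(claimed_end, release_v2_1_1):
        print("Claimed timing predates the release: flag for correction")
    if first_report >= release_v2_1_1:
        print("First report is consistent with the release date")
```

A claim whose entire time window closes before the affected version shipped is a signal to go back to the release list and the original issue before documenting anything.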
🎯 Final Decision
Action Taken: PARTIAL INTEGRATION (verified facts only)
Rationale:
- Report contained valuable findings (3 real bugs verified)
- But also contained critical errors (version confusion, stat errors)
- Integration limited to fact-checked content only
- Rejected speculative/unverifiable claims
Confidence Level: Medium (verified sources exist, but report had errors)
Would Recommend This Resource: ❌ NO (too many factual errors, use primary sources instead)
Better Alternative: Direct GitHub Issues search + Anthropic official communications
📝 Evaluator Notes
This evaluation demonstrates the importance of systematic fact-checking before integrating community-sourced content. Even comprehensive analytical reports can contain:
- Version confusion
- Timing errors
- Unverifiable statistics
- Methodology gaps
Best practice: Treat analytical reports as leads to investigate, not facts to copy. Always verify with:
- Primary sources (GitHub Issues, official docs)
- API queries (GitHub API, not web search)
- Official communications (Anthropic blog, status page)
- Multiple independent sources for controversial claims
Result: Successfully extracted 3 verified critical bugs while filtering out errors, maintaining guide credibility.
Evaluation completed: January 28, 2026
Time invested: ~2 hours (research, fact-checking, integration, documentation)
Token cost: ~130K tokens (Perplexity searches, GitHub queries, document creation)