- Add examples/scripts/smart-suggest-roi.py: stdlib-only analyzer correlating suggestion log with session JSONL files to measure command acceptance rate. 4 acceptance signals, tier breakdown, daily trend, --json/--since/--no-sessions CLI. - Tune Aristote smart-suggest hook: tighten 5 over-firing triggers (/tech:commit, /tech:sonarqube, /tech:dupes, /check-conventions a11y, /tech:worktree) - Guide: identity re-injection hook, context engineering maturity grid, code review workflow, 1M context window GA update, Spring Break promo, security audit patterns - Resource evaluations: Nick Tune hooks (3/5), VicKayro security audit (2/5), Karl Mazier CLAUDE.md templates, Paul Rayner ContextFlow, Siddhant agent trace, Andrew Yng context hub, JP Caparas 1M context window Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
5.4 KiB
Resource Evaluation: "What Claude's 1M Token Context Window Means for Your Work"
Source: https://reading.sh/what-claudes-1m-token-context-window-means-for-your-work-3c9f900f04c6 Author: JP Caparas (Medium) Published: ~March 15, 2026 (13 min read, 62 claps) Type: Plain-language explainer / pricing analysis Evaluated: 2026-03-15 Score: 2/5 (Marginal — do not integrate)
Summary
Plain-language explainer covering: what tokens are and what 1M tokens can hold (codebases, legal docs, academic papers), claim that 1M context went GA on March 13 for Opus 4.6, pricing comparison against OpenAI GPT-5.4 / Google Gemini 3.1 Pro / Meta Llama 4, Claude Code workflow impact (compaction reduction, full-codebase sessions), enterprise use cases (compliance review, codebase migration, incident response), "lost in the middle" effect as ongoing limitation, MRCR v2 score (78.3% Opus 4.6), and quotes from Jon Bell (CPO Codeium) and Anton Biryukov (Ramp).
Fact-Check
| Claim | Status | Notes |
|---|---|---|
| 1M context GA March 13, 2026 | Unconfirmed | Perplexity shows "beta" still active; March 13 maps to a usage promotion, not a GA announcement |
| Flat pricing $5/$25 MTok, no surcharge at any length | FALSE | 2x input / 1.5x output surcharge confirmed above 200K tokens (awesomeagents.ai, puter.com); this is the article's central thesis |
| Opus 4.6 base price $5/$25 MTok | Confirmed | Consistent across multiple pricing sources |
| Sonnet 4.6 $3/$15 MTok | Confirmed | Consistent across multiple pricing sources |
| Cached reads $0.50/MTok (90% discount) | Plausible | Standard Anthropic caching discount |
| Cache writes $6.25/MTok | Plausible | Standard Anthropic caching pricing |
| OpenAI GPT-5.4 2x rate limit above 272K | Unverified | No primary source found |
| Google Gemini 3.1 Pro surcharge above 200K | Partially confirmed | Google does surcharge; exact numbers differ from article |
| MRCR v2: 78.3% Opus 4.6 | Discrepancy | Guide cites 76% from Anthropic blog; article may reference a different variant or a revised benchmark run |
| Jon Bell (Codeium) — 15% compaction decrease | Unverifiable | No independent corroboration found |
| Anton Biryukov (Ramp) quote | Unverifiable | No independent corroboration found |
| 600 images/PDF per request (up from 100) | Unverified | Not confirmed in official API docs |
| Haiku caps at 200K | Confirmed | Consistent with known specs |
Critical finding: The article's differentiating claim is "Anthropic's bet is flat pricing — no surcharge regardless of length." This is factually wrong. Anthropic charges 2x input and 1.5x output above 200K tokens. The entire competitive pricing analysis built on this premise is unreliable.
Comparative Analysis
| Aspect | This resource | Our guide |
|---|---|---|
| Token basics (what is a token) | Covered, beginner-friendly | Not covered — assumes dev audience |
| 1M context capabilities | Generic scenarios | Covered with MRCR benchmarks + cost tables |
| Pricing vs competitors | Covered — but factually wrong on flat pricing | Partially covered (Gemini comparison, line 2053) |
| Compaction events | Mentions 15% reduction (unverifiable quote) | Covered in depth (architecture.md lines 391-438) |
| "Lost in the middle" effect | Mentioned with arxiv:2307.03172 reference | Not explicitly covered |
| Enterprise use cases | Hypothetical, no measured data | Covered across multiple sections |
| MRCR v2 benchmark | 78.3% (discrepancy with our 76%) | 76% from Anthropic blog |
| Claude Code workflow impact | Good framing (search phase = 100K+ tokens) | Covered via compaction + context engineering |
Challenge Notes
The technical-writer review agreed score 2/5 is justified. An initial proposal of 3/5 was revised downward because the fact-check demolishes the article's core value proposition — wrong pricing data cannot complement a guide that aims for accuracy. The Povilas Korop anecdote (83% context utilization on a Laravel project) is a nice real-world datapoint, but anecdotal and insufficient to shift the score.
Key insight from review: "Fix stale info in the guide" is the real action item, independent of this resource. If 1M context is actually GA, our guide still says "beta" at lines 2028-2070 — that needs a fix regardless of this article.
Decision
Do not integrate.
Specific exclusions
- Pricing comparison table (central claim on flat pricing is wrong)
- Enterprise use cases (hypothetical, no measured data)
- Unverifiable quotes (Jon Bell, Anton Biryukov)
Independent action items triggered by this review
- Verify 1M GA status via official Anthropic docs / API changelog
- If confirmed GA: Update
guide/ultimate-guide.mdlines 2028-2070 (currently says "beta") - If confirmed GA: Verify whether the 200K surcharge structure changed at all
- Consider adding: One line on "lost in the middle" effect in
guide/core/context-engineering.md(well-documented limitation, arxiv:2307.03172 — independent of this article)
Worth independent verification
The Ramp workflow pattern (search Datadog + Braintrust + DB + source = 100K tokens before writing a single fix) is a useful illustration of why large context windows matter for real engineering workflows — worth adding if an independent source confirms it.
Confidence: High. The fact-check identified a critical error in the central thesis; no ambiguity on the decision.