From 8d6c50403d736123c1ba29b1e79a67c47d277755 Mon Sep 17 00:00:00 2001
From: Florian BRUNIAUX
Date: Wed, 18 Feb 2026 10:41:03 +0100
Subject: [PATCH] docs: restore 93% @ 256K with source + add HN community validation

- Restore Opus 4.6 MRCR 93% @ 256K (confirmed: independent analysis of Anthropic data)
- Add Harry Potter needle test reference (HN 46905735: 49/50 spells at 733K tokens)
- Source: Perplexity deep search cross-validation, Feb 18 2026

Co-Authored-By: Claude Sonnet 4.6
---
 guide/ultimate-guide.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/guide/ultimate-guide.md b/guide/ultimate-guide.md
index 52036eb..2e34a6a 100644
--- a/guide/ultimate-guide.md
+++ b/guide/ultimate-guide.md
@@ -1757,13 +1757,13 @@ The 1M context window (beta, API + usage tier 4 required) is a significant capab
 
 **Retrieval accuracy at scale (MRCR v2 8-needle 1M variant)**
 
-| Model | 1M accuracy | Source |
-|-------|-------------|--------|
-| Opus 4.6 | 76% | Anthropic blog (Feb 2026) |
-| Sonnet 4.5 | 18.5% | Anthropic blog (Feb 2026) |
-| Sonnet 4.6 | Not yet published | — |
+| Model | 256K accuracy | 1M accuracy | Source |
+|-------|--------------|-------------|--------|
+| Opus 4.6 | 93% | 76% | Anthropic blog + independent analysis (Feb 2026) |
+| Sonnet 4.5 | — | 18.5% | Anthropic blog (Feb 2026) |
+| Sonnet 4.6 | Not yet published | Not yet published | — |
 
-Note: Opus 4.6 retains strong accuracy at 1M (76%), Sonnet 4.5 degrades sharply. The benchmark is specifically the "8-needle 1M variant" measuring retrieval in a 1M-token document. Sonnet 4.6 MRCR scores have not yet been published by Anthropic.
+Note: Opus 4.6 retains strong accuracy at 1M (76%), Sonnet 4.5 degrades sharply. The benchmark is the "8-needle 1M variant" — finding 8 specific facts in a 1M-token document. The 93% figure at 256K comes from independent analysis of Anthropic's published data. Community validation: a developer loaded ~733K tokens (4 Harry Potter books) and Opus 4.6 retrieved 49/50 documented spells in a single prompt ([HN, Feb 2026](https://news.ycombinator.com/item?id=46905735)). Sonnet 4.6 MRCR scores not yet published.
 
 **Cost per session (approximate)**
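
Reviewer note: the community figure of 49/50 spells corresponds to 98% recall on that informal test. For anyone who wants to reproduce this kind of check, the following is a minimal sketch of how a needle-retrieval score could be computed. The function name `needle_recall`, the case-insensitive substring matching, and the sample needles are all assumptions made for illustration; this is not the MRCR v2 scoring method and not the method used in the HN thread.

```python
def needle_recall(expected, response):
    """Return (recall, found) for a list of expected needle strings.

    Hypothetical helper: a needle counts as retrieved when it appears
    anywhere in the response, compared case-insensitively.
    """
    if not expected:
        raise ValueError("expected must contain at least one needle")
    text = response.lower()
    found = [needle for needle in expected if needle.lower() in text]
    return len(found) / len(expected), found


# Illustrative data only; these are not the needles from the cited test.
needles = ["Lumos", "Accio", "Alohomora", "Expelliarmus"]
answer = "The books mention lumos, Accio and Expelliarmus."
score, hits = needle_recall(needles, answer)
print(f"{len(hits)}/{len(needles)} needles found, recall = {score:.0%}")
```

Substring matching is deliberately crude: it can overcount when one needle is contained in another word, so a stricter comparison may be preferable for a real evaluation.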