From d7c29f49c91e10ff3990520d9fbd7d229880184f Mon Sep 17 00:00:00 2001
From: Florian BRUNIAUX
Date: Sat, 17 Jan 2026 15:29:33 +0100
Subject: [PATCH] docs: prioritize grepai over mgrep as recommended semantic search
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Reorder sections in ultimate-guide.md (grepai first, mgrep second)
- Update cheatsheet.md MCP table (mgrep → grepai)
- Update quiz question 08-013 to reference grepai
- Sync versions to 3.8.2

Rationale: grepai is fully open-source, runs locally (privacy-first), and offers call graph analysis that mgrep doesn't provide.

Co-Authored-By: Claude Opus 4.5
---
 CHANGELOG.md                       |   8 ++
 guide/cheatsheet.md                |   6 +-
 guide/ultimate-guide.md            | 175 +++++++++++++++++++++++------
 quiz/questions/08-mcp-servers.yaml |  23 ++--
 4 files changed, 161 insertions(+), 51 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 236f666..d32c954 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -6,6 +6,14 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

 ## [Unreleased]

+### Changed
+
+- **Semantic search tools priority**: grepai now recommended over mgrep
+  - `guide/ultimate-guide.md`: Sections reordered (grepai first as "Recommended", mgrep as "Alternative")
+  - `guide/cheatsheet.md`: MCP Servers table updated (mgrep → grepai)
+  - `quiz/questions/08-mcp-servers.yaml`: Question 08-013 updated to reference grepai
+  - Rationale: grepai is fully open-source, runs locally (privacy), and offers call graph analysis
+
 ---

 ## [3.8.2] - 2026-01-17
diff --git a/guide/cheatsheet.md b/guide/cheatsheet.md
index 8655d4a..d8c2837 100644
--- a/guide/cheatsheet.md
+++ b/guide/cheatsheet.md
@@ -6,7 +6,7 @@

 **Written with**: Claude (Anthropic)

-**Version**: 3.8.1 | **Last Updated**: January 2026
+**Version**: 3.8.2 | **Last Updated**: January 2026

 ---

@@ -193,7 +193,7 @@ Model: Sonnet | Ctx: 89.5k | Cost: $2.11 | Ctx(u): 56.0%
 | Server | Purpose |
 |--------|---------|
 | **Serena** | Indexation + session memory + symbol search |
-| **mgrep** | Semantic search by intent (alternative) |
+| **grepai** | Semantic search + call graph analysis |
 | **Context7** | Library documentation |
 | **Sequential** | Structured reasoning |
 | **Playwright** | Browser automation |
@@ -396,4 +396,4 @@ where.exe claude; claude doctor; claude mcp list

 **Author**: Florian BRUNIAUX | [@Méthode Aristote](https://methode-aristote.fr) | Written with Claude

-*Last updated: January 2026 | Version 3.8.1*
+*Last updated: January 2026 | Version 3.8.2*
diff --git a/guide/ultimate-guide.md b/guide/ultimate-guide.md
index 433f85e..52f1999 100644
--- a/guide/ultimate-guide.md
+++ b/guide/ultimate-guide.md
@@ -10,7 +10,7 @@

 **Last updated**: January 2026

-**Version**: 3.8.1
+**Version**: 3.8.2

 ---

@@ -1143,6 +1143,107 @@ Claude Code has two distinct memory systems. Understanding the difference is cru
 - **Persistent memory**: Decisions you'll need in future sessions
 - **CLAUDE.md**: Team conventions, project structure (versioned with git)

+### Fresh Context Pattern (Ralph Loop)
+
+#### The Problem: Context Rot
+
+Research shows LLM performance degrades significantly with accumulated context:
+- **20-30% performance gap** between focused and polluted prompts ([Chroma, 2025](https://research.trychroma.com/context-rot))
+- Degradation starts at ~16K tokens for Claude models
+- Failed attempts, error traces, and iteration history dilute attention
+
+Instead of managing context within a session, you can **restart with a fresh session per task** while persisting state externally.
+
+#### The Pattern
+
+```bash
+# Canonical "Ralph Loop" (Geoffrey Huntley)
+while :; do cat TASK.md PROGRESS.md | claude -p ; done
+```
+
+**State persists via**:
+- `TASK.md` — Current task definition with acceptance criteria
+- `PROGRESS.md` — Learnings, completed tasks, blockers
+- Git commits — Each iteration commits atomically
+
+| Traditional | Fresh Context |
+|-------------|---------------|
+| Accumulate in chat history | Reset per task |
+| `/compact` to compress | State in files + git |
+| Context bleeds across tasks | Each task gets full attention |
+
+#### When to Use
+
+| Situation | Use |
+|-----------|-----|
+| Context 70-90%, staying interactive | `/compact` |
+| Context 90%+, need fresh start | `/clear` then continue |
+| Long autonomous run, task-based | Fresh Context Pattern |
+| Overnight/AFK execution | Fresh Context Pattern |
+
+**Good fit**:
+- Autonomous sessions >1 hour
+- Migrations, large refactorings
+- Tasks with clear success criteria (tests pass, build succeeds)
+
+**Poor fit**:
+- Interactive exploration
+- Design without clear spec
+- Tasks with slow/ambiguous feedback loops
+
+#### Practical Implementation
+
+**Option 1: Manual loop**
+
+```bash
+# Simple fresh-context loop
+for i in {1..10}; do
+  echo "=== Iteration $i ==="
+  claude -p "$(cat TASK.md PROGRESS.md)"
+  git diff --stat # Check progress
+  read -p "Continue? (y/n) " -n 1 -r
+  [[ ! $REPLY =~ ^[Yy]$ ]] && break
+done
+```
+
+**Option 2: Script** (see `examples/scripts/fresh-context-loop.sh`)
+
+```bash
+./fresh-context-loop.sh 10 TASK.md PROGRESS.md
+```
+
+**Option 3: External orchestrators**
+
+- [AFK CLI](https://github.com/m0nkmaster/afk) — Zero-config orchestration across task sources
+
+#### Task Definition Template
+
+```markdown
+# TASK.md
+
+## Current Focus
+[Single atomic task with clear deliverable]
+
+## Acceptance Criteria
+- [ ] Tests pass
+- [ ] Build succeeds
+- [ ] [Specific verification]
+
+## Context
+- Related files: [paths]
+- Constraints: [rules]
+
+## Do NOT
+- Start other tasks
+- Refactor unrelated code
+```
+
+#### Key Insight
+
+`/compact` preserves conversation flow. Fresh context maximizes per-task attention at the cost of continuity.
+
+> **Sources**: [Chroma Research - Context Rot](https://research.trychroma.com/context-rot) | [Ralph Loop Origin](https://block.github.io/goose/docs/tutorials/ralph-loop/) | [METR - Long Task Capability](https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/) | [Anthropic - Context Engineering](https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents)
+
 ### What Consumes Context?

 | Action | Context Cost |
@@ -5046,44 +5147,11 @@ uvx --from git+https://github.com/oraios/serena serena project index

 > **Source**: [Serena GitHub](https://github.com/oraios/serena)

-### mgrep (Semantic Search Alternative)
-
-**Purpose**: Natural language semantic search across code, docs, PDFs, and images.
-
-**Why consider mgrep**: While Serena focuses on symbol-level analysis, mgrep excels at **intent-based search** — finding code by describing what it does rather than exact patterns. Their benchmarks show ~2x fewer tokens used compared to grep-based workflows.
-
-**Key Features**:
-
-| Feature | Description |
-|---------|-------------|
-| **Semantic search** | Find code by natural language description |
-| **Background indexing** | `mgrep watch` indexes respecting `.gitignore` |
-| **Multi-format** | Search code, PDFs, images, text |
-| **Web integration** | Web search fallback capability |
-
-**Example**:
-
-```bash
-# Traditional grep (exact match required)
-grep -r "authenticate.*user" .
-
-# mgrep (intent-based)
-mgrep "code that handles user authentication"
-```
-
-**Use when**:
-- Onboarding to unfamiliar codebases
-- Exploring code by intent, not exact patterns
-- Searching across mixed content (code + docs)
-
-> **Note**: I haven't tested mgrep personally. Consider it an alternative worth exploring.
-> **Source**: [mgrep GitHub](https://github.com/mixedbread-ai/mgrep)
-
-### grepai (Semantic Search + Call Graph)
+### grepai (Recommended Semantic Search)

 **Purpose**: Privacy-first semantic code search with call graph analysis.

-**Why consider grepai**: Unlike mgrep, grepai is **fully open-source** and runs entirely locally using Ollama embeddings. Its killer feature is **call graph analysis** — trace who calls what function and visualize dependencies.
+**Why grepai is recommended**: It's **fully open-source**, runs entirely locally using Ollama embeddings (no cloud/privacy concerns), and offers **call graph analysis** — trace who calls what function and visualize dependencies. This combination makes it the best choice for most semantic search needs.

 **Key Features**:

@@ -5175,6 +5243,39 @@ grepai search "session creation logic"

 > **Source**: [grepai GitHub](https://github.com/yoanbernabeu/grepai)

+### mgrep (Alternative Semantic Search)
+
+**Purpose**: Natural language semantic search across code, docs, PDFs, and images.
+
+**Why consider mgrep**: If you need **multi-format search** (code + PDFs + images) or prefer a cloud-based solution, mgrep is an alternative to grepai. Their benchmarks show ~2x fewer tokens used compared to grep-based workflows.
+
+**Key Features**:
+
+| Feature | Description |
+|---------|-------------|
+| **Semantic search** | Find code by natural language description |
+| **Background indexing** | `mgrep watch` indexes respecting `.gitignore` |
+| **Multi-format** | Search code, PDFs, images, text |
+| **Web integration** | Web search fallback capability |
+
+**Example**:
+
+```bash
+# Traditional grep (exact match required)
+grep -r "authenticate.*user" .
+
+# mgrep (intent-based)
+mgrep "code that handles user authentication"
+```
+
+**Use when**:
+- Need to search across mixed content (code + PDFs + images)
+- Prefer cloud-based embeddings over local Ollama setup
+- grepai's call graph analysis isn't needed
+
+> **Note**: I haven't tested mgrep personally. Consider it an alternative worth exploring.
+> **Source**: [mgrep GitHub](https://github.com/mixedbread-ai/mgrep)
+
 ### Context7 (Documentation Lookup)

 **Purpose**: Access official library documentation.
@@ -9634,4 +9735,4 @@ Thumbs.db

 **Contributions**: Issues and PRs welcome.

-**Last updated**: January 2026 | **Version**: 3.8.1
+**Last updated**: January 2026 | **Version**: 3.8.2
diff --git a/quiz/questions/08-mcp-servers.yaml b/quiz/questions/08-mcp-servers.yaml
index 12efd7a..41f16f4 100644
--- a/quiz/questions/08-mcp-servers.yaml
+++ b/quiz/questions/08-mcp-servers.yaml
@@ -311,32 +311,33 @@ questions:
   - id: "08-013"
     difficulty: "power"
     profiles: ["power"]
-    question: "What is mgrep's key differentiator compared to Serena?"
+    question: "What is grepai's key differentiator compared to Serena?"
     options:
       a: "It's faster"
-      b: "It provides semantic search by intent, not just patterns"
+      b: "It provides semantic search by intent plus call graph analysis"
       c: "It supports more languages"
       d: "It has better documentation"
     correct: "b"
     explanation: |
-      mgrep excels at intent-based search using natural language:
+      grepai excels at intent-based search using natural language, plus offers
+      call graph analysis to trace function dependencies:

       ```bash
-      # Traditional grep (exact match required)
-      grep -r "authenticate.*user" .
+      # Semantic search (finds code by meaning, not exact text)
+      grepai search "user authentication flow"

-      # mgrep (intent-based)
-      mgrep "code that handles user authentication"
+      # Who calls this function?
+      grepai trace callers "createSession"
       ```

-      While Serena focuses on symbol-level analysis, mgrep finds code by
-      describing what it does rather than exact patterns.
+      While Serena focuses on symbol-level analysis, grepai finds code by
+      describing what it does and traces caller/callee relationships.

-      Use mgrep for onboarding to unfamiliar codebases or exploring by intent.
+      Use grepai for exploring unfamiliar codebases or understanding dependencies.
     doc_reference:
       file: "guide/ultimate-guide.md"
       section: "8.2 Available Servers"
-      anchor: "#mgrep-semantic-search-alternative"
+      anchor: "#grepai-recommended-semantic-search"

   - id: "08-014"
     difficulty: "senior"