docs: add AI productivity research, trust calibration, and exploration workflow
## New Content ### Trust & Verification (ultimate-guide.md) - Section 1.7 "Trust Calibration: When and How Much to Verify" (~155 lines) - Research-backed stats (ACM, Veracode, CodeRabbit, Cortex.io) - Verification spectrum by code type - Solo vs Team strategies with workflow diagrams - "Prove It Works" checklist - New pitfall: "Trust AI output without proportional verification" - CLAUDE.md size guideline: 4-8KB optimal, >16K degrades coherence ### AI Productivity (learning-with-ai.md) - Section "The Reality of AI Productivity" (~55 lines) - Productivity curve phases (Wow Effect → Targeted Gains → Plateau) - High-gain vs low/negative-gain task categorization - Team success factors - Productivity trajectory table by pattern (Dependent/Avoidant/Augmented) - 5 new sources (GitHub, McKinsey, Stack Overflow, Uplevel, DORA) ### Session Limits (architecture.md) - "Session Degradation Limits" section - Turn limits (15-25), token thresholds (80-100K) - Success rates by scope (1-3 files: ~85%, 8+ files: ~40%) ### Exploration Workflow - NEW: guide/workflows/exploration-workflow.md - Anti-anchoring prompts, 3-5 approaches pattern - iterative-refinement.md: Script Generation Workflow (3-7 iteration pattern) - anchor-catalog.md: Anti-Anchoring Techniques, Exploration/Iteration Prompts ### Reference Updates - adoption-approaches.md: Empirical data section - reference.yaml: New deep_dive entries, updated line numbers Sources: MetalBear engineering blog, arXiv studies, Addy Osmani (Jan 2026) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
parent
a9d302326c
commit
fd17414abb
10 changed files with 775 additions and 20 deletions
318
guide/workflows/exploration-workflow.md
Normal file
318
guide/workflows/exploration-workflow.md
Normal file
|
|
@ -0,0 +1,318 @@
|
|||
# Exploration Before Implementation
|
||||
|
||||
> **Confidence**: Tier 2 — Validated by practitioner studies (+20-30% decision quality, +40% alternatives identified).
|
||||
> **Source**: [MetalBear Engineering Blog](https://metalbear.com/blog/engineering-ai-use/), arXiv practitioner studies
|
||||
|
||||
Before coding, ask Claude for multiple approaches with trade-offs. This prevents anchoring bias—the tendency to fixate on the first solution proposed.
|
||||
|
||||
---
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [TL;DR](#tldr)
|
||||
2. [The Pattern](#the-pattern)
|
||||
3. [Anti-Anchoring Prompts](#anti-anchoring-prompts)
|
||||
4. [When to Use](#when-to-use)
|
||||
5. [Integration with Claude Code](#integration-with-claude-code)
|
||||
6. [Anti-Patterns](#anti-patterns)
|
||||
7. [See Also](#see-also)
|
||||
|
||||
---
|
||||
|
||||
## TL;DR
|
||||
|
||||
```
|
||||
1. Describe problem (no code, no preconception)
|
||||
2. Request 3-5 approaches with trade-offs
|
||||
3. Ask for quantified comparison
|
||||
4. Choose approach
|
||||
5. Then implement
|
||||
```
|
||||
|
||||
Key insight: **Once a model proposes a concrete solution, it can unintentionally narrow your thinking.**
|
||||
|
||||
---
|
||||
|
||||
## The Pattern
|
||||
|
||||
### Step 1: Problem Statement Only
|
||||
|
||||
Start with the problem, not a solution direction:
|
||||
|
||||
```
|
||||
I need to handle user sessions in a Node.js API.
|
||||
Requirements:
|
||||
- Support 10K concurrent users
|
||||
- Session data: user ID, permissions, preferences
|
||||
- Must survive server restarts
|
||||
```
|
||||
|
||||
**Not this** (anchors on Redis):
|
||||
```
|
||||
I'm thinking of using Redis for sessions. How should I implement it?
|
||||
```
|
||||
|
||||
### Step 2: Request Multiple Approaches
|
||||
|
||||
```
|
||||
Give me 4 different approaches to solve this.
|
||||
For each, include:
|
||||
- Architecture overview
|
||||
- Pros and cons
|
||||
- Performance characteristics
|
||||
- Complexity to implement
|
||||
```
|
||||
|
||||
### Step 3: Quantified Comparison
|
||||
|
||||
```
|
||||
Now rank these approaches on a 1-10 scale for:
|
||||
- Latency (lower is better)
|
||||
- Scalability (10K → 100K users)
|
||||
- Operational complexity
|
||||
- Development time
|
||||
```
|
||||
|
||||
### Step 4: Choose, Then Implement
|
||||
|
||||
```
|
||||
I'll go with approach B (JWT + Redis hybrid).
|
||||
Now implement it following our existing patterns in src/auth/.
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Anti-Anchoring Prompts
|
||||
|
||||
LLMs can fixate on their first suggestion. These prompts combat that:
|
||||
|
||||
| Prompt Type | Template | Effect |
|
||||
|-------------|----------|--------|
|
||||
| **Fresh start** | "Ignore any prior ideas. Generate 4 novel approaches to [X]" | Forces diversity |
|
||||
| **Reflection loop** | "Generate 3 options, then critique each, then recommend" | Self-correction (-25% anchoring bias) |
|
||||
| **Quantified trade-offs** | "Rank by [metric1], [metric2], [metric3] with scores 1-10" | Objective comparison |
|
||||
| **Devil's advocate** | "What are the strongest arguments against your recommendation?" | Surface hidden trade-offs |
|
||||
| **Constraint variation** | "Now solve the same problem with [opposite constraint]" | Expand solution space |
|
||||
|
||||
### Example: Anti-Anchoring Prompt
|
||||
|
||||
```
|
||||
I need pagination for a REST API with 1M+ records.
|
||||
|
||||
IMPORTANT: Don't suggest offset-based pagination first.
|
||||
Generate 4 different pagination strategies, including at least one
|
||||
unconventional approach. For each:
|
||||
|
||||
1. How it works (2-3 sentences)
|
||||
2. Best use case
|
||||
3. Worst use case
|
||||
4. Performance at 1M records
|
||||
|
||||
Then recommend one, explaining why it beats the others for my use case.
|
||||
```
|
||||
|
||||
### Reflection Loop Prompt
|
||||
|
||||
```
|
||||
For implementing real-time notifications:
|
||||
|
||||
Phase 1: Generate 3 approaches (WebSockets, SSE, Long Polling)
|
||||
Phase 2: For each, list 2 things that could go wrong in production
|
||||
Phase 3: Based on Phase 2, which approach is most resilient?
|
||||
|
||||
Show your reasoning for each phase.
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## When to Use
|
||||
|
||||
### Use Exploration
|
||||
|
||||
| Scenario | Why |
|
||||
|----------|-----|
|
||||
| Greenfield features | No existing pattern to follow |
|
||||
| Architecture decisions | High impact, hard to reverse |
|
||||
| Multiple valid approaches | Need informed choice |
|
||||
| Unfamiliar domain | Don't know what you don't know |
|
||||
| Team disagreement | Get neutral analysis of options |
|
||||
|
||||
### Skip Exploration
|
||||
|
||||
| Scenario | Why |
|
||||
|----------|-----|
|
||||
| Bug fixes | Solution usually obvious from symptoms |
|
||||
| Single valid approach | No real choice to make |
|
||||
| Time-critical hotfixes | Speed > perfection |
|
||||
| Following existing pattern | Decision already made |
|
||||
| Trivial changes | Overhead not worth it |
|
||||
|
||||
---
|
||||
|
||||
## Integration with Claude Code
|
||||
|
||||
### With /plan Mode
|
||||
|
||||
Exploration happens **before** `/plan`:
|
||||
|
||||
```
|
||||
# Step 1: Explore (no /plan yet)
|
||||
I need to add caching to the API. What are my options?
|
||||
|
||||
# Claude responds with 4 approaches
|
||||
|
||||
# Step 2: Choose
|
||||
Let's go with approach C (edge caching with Cloudflare).
|
||||
|
||||
# Step 3: Plan
|
||||
/plan
|
||||
Implement edge caching using Cloudflare Workers.
|
||||
Follow the patterns in our existing middleware.
|
||||
```
|
||||
|
||||
### With CLAUDE.md
|
||||
|
||||
Add exploration triggers to your project instructions:
|
||||
|
||||
```markdown
|
||||
## Workflow Preferences
|
||||
|
||||
### Before New Features
|
||||
When implementing new features, first explore 3-4 approaches
|
||||
with trade-offs before committing to implementation.
|
||||
Use quantified comparison (1-10 scale) for:
|
||||
- Performance
|
||||
- Maintainability
|
||||
- Time to implement
|
||||
```
|
||||
|
||||
### With TodoWrite
|
||||
|
||||
Track exploration as a task:
|
||||
|
||||
```
|
||||
TodoWrite:
|
||||
- [x] Explore caching approaches (4 options analyzed)
|
||||
- [x] Choose approach: edge caching with Cloudflare
|
||||
- [ ] Implement cache invalidation
|
||||
- [ ] Add cache headers to responses
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Anti-Patterns
|
||||
|
||||
### Premature Anchoring
|
||||
|
||||
```
|
||||
# Wrong
|
||||
"I want to use Redis for caching. How do I set it up?"
|
||||
|
||||
# Right
|
||||
"I need caching for API responses. What are my options?"
|
||||
```
|
||||
|
||||
You've eliminated 90% of solutions before exploring.
|
||||
|
||||
### Shallow Comparison
|
||||
|
||||
```
|
||||
# Wrong
|
||||
"Which is better: Redis or Memcached?"
|
||||
|
||||
# Right
|
||||
"Compare Redis, Memcached, and in-process caching for:
|
||||
- 50MB cache size
|
||||
- Read-heavy workload (95% reads)
|
||||
- Single server deployment
|
||||
Include latency, memory overhead, and operational complexity."
|
||||
```
|
||||
|
||||
### Ignoring Trade-offs
|
||||
|
||||
```
|
||||
# Wrong
|
||||
"Just tell me the best approach"
|
||||
|
||||
# Right
|
||||
"What would I give up by choosing approach A over B?
|
||||
What scenarios would make me regret this choice?"
|
||||
```
|
||||
|
||||
### Analysis Paralysis
|
||||
|
||||
```
|
||||
# Wrong
|
||||
Exploring 10 options for a simple feature
|
||||
|
||||
# Right
|
||||
Set a limit: "Give me your top 3 approaches, no more"
|
||||
Make a decision and iterate if needed
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Example Session
|
||||
|
||||
### Initial Exploration
|
||||
|
||||
```
|
||||
User: I need to implement real-time updates in our React dashboard.
|
||||
The dashboard shows ~50 metrics that update every 5 seconds.
|
||||
Currently we poll, but users complain about stale data.
|
||||
|
||||
Claude: Here are 4 approaches:
|
||||
|
||||
**A) WebSockets**
|
||||
- Full duplex, server can push anytime
|
||||
- Pros: Lowest latency, true real-time
|
||||
- Cons: Connection management, scaling complexity
|
||||
- Best for: Sub-second updates, bidirectional communication
|
||||
|
||||
**B) Server-Sent Events (SSE)**
|
||||
- Unidirectional server → client stream
|
||||
- Pros: Simple, HTTP-based, auto-reconnect
|
||||
- Cons: One-way only, limited browser connections
|
||||
- Best for: Dashboard feeds, notifications
|
||||
|
||||
**C) Long Polling**
|
||||
- Client holds request until data ready
|
||||
- Pros: Works everywhere, simple fallback
|
||||
- Cons: Higher latency, more server load
|
||||
- Best for: Legacy compatibility
|
||||
|
||||
**D) GraphQL Subscriptions**
|
||||
- WebSocket-based with GraphQL schema
|
||||
- Pros: Type-safe, selective updates
|
||||
- Cons: Requires GraphQL infrastructure
|
||||
- Best for: Existing GraphQL APIs
|
||||
|
||||
User: Rank these for our use case (React, 50 metrics, 5s refresh,
|
||||
team has no WebSocket experience).
|
||||
|
||||
Claude: Scores (1-10, higher is better):
|
||||
|
||||
| Approach | Latency | Complexity | Team Fit | Scalability |
|
||||
|----------|---------|------------|----------|-------------|
|
||||
| WebSockets | 10 | 4 | 3 | 7 |
|
||||
| SSE | 8 | 8 | 8 | 8 |
|
||||
| Long Polling | 5 | 9 | 9 | 5 |
|
||||
| GraphQL Subs | 9 | 3 | 2 | 7 |
|
||||
|
||||
**Recommendation: SSE**
|
||||
Best balance of simplicity and real-time capability.
|
||||
Your team can implement it in a day, and it handles your
|
||||
50 metrics × 5s updates easily. Upgrade to WebSockets later
|
||||
if you need bidirectional or sub-second updates.
|
||||
|
||||
User: Let's go with SSE. Now implement it.
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## See Also
|
||||
|
||||
- [plan-driven.md](./plan-driven.md) — For codebase exploration (/plan mode)
|
||||
- [iterative-refinement.md](./iterative-refinement.md) — Refine after choosing approach
|
||||
- [../examples/semantic-anchors/anchor-catalog.md](../../examples/semantic-anchors/anchor-catalog.md) — Precision vocabulary for prompts
|
||||
- [spec-first.md](./spec-first.md) — Define requirements before exploring
|
||||
|
|
@ -13,8 +13,10 @@ Prompt, observe, reprompt until satisfied. The core loop of effective AI-assiste
|
|||
3. [Feedback Patterns](#feedback-patterns)
|
||||
4. [Autonomous Loops](#autonomous-loops)
|
||||
5. [Integration with Claude Code](#integration-with-claude-code)
|
||||
6. [Anti-Patterns](#anti-patterns)
|
||||
7. [See Also](#see-also)
|
||||
6. [Script Generation Workflow](#script-generation-workflow)
|
||||
7. [Iteration Strategies](#iteration-strategies)
|
||||
8. [Anti-Patterns](#anti-patterns)
|
||||
9. [See Also](#see-also)
|
||||
|
||||
---
|
||||
|
||||
|
|
@ -189,6 +191,79 @@ Good progress. Let's checkpoint:
|
|||
|
||||
---
|
||||
|
||||
## Script Generation Workflow
|
||||
|
||||
Script and automation generation delivers the highest ROI for iterative refinement—70-90% time savings in practitioner reports. Scripts are self-contained, testable in isolation, and yield immediate value.
|
||||
|
||||
### The 3-7 Iteration Pattern
|
||||
|
||||
Most production-ready scripts emerge after 3-7 iterations:
|
||||
|
||||
| Iteration | Focus | Prompt Pattern |
|
||||
|-----------|-------|----------------|
|
||||
| 1 | Basic functionality | "Create a script that [goal]" |
|
||||
| 2-3 | Constraints + edge cases | "Add [constraint]. Handle [edge case]." |
|
||||
| 4-5 | Hardening | "Add error handling, logging, input validation" |
|
||||
| 6-7 | Polish | "Optimize for [metric]. Add usage docs." |
|
||||
|
||||
### Example: Kubernetes Pod Manager (PowerShell)
|
||||
|
||||
**Iteration 1 — Basic**
|
||||
```
|
||||
Create a PowerShell function to list pods in a Kubernetes namespace.
|
||||
```
|
||||
|
||||
**Iteration 2 — Add filtering**
|
||||
```
|
||||
Add: filter by label selector and pod status.
|
||||
Show: pod name, status, age, restarts.
|
||||
```
|
||||
|
||||
**Iteration 3 — Add actions**
|
||||
```
|
||||
Add: ability to delete pods matching filter.
|
||||
Require: confirmation before deletion.
|
||||
```
|
||||
|
||||
**Iteration 4 — Error handling**
|
||||
```
|
||||
Handle: kubectl not found, invalid namespace, permission denied.
|
||||
Add: verbose logging with -Verbose flag.
|
||||
```
|
||||
|
||||
**Iteration 5 — Production ready**
|
||||
```
|
||||
Add: dry-run mode, output to JSON for piping, help documentation.
|
||||
Ensure: works on Windows, Linux, macOS.
|
||||
```
|
||||
|
||||
### Common Pitfalls
|
||||
|
||||
| Pitfall | Example | Mitigation |
|
||||
|---------|---------|------------|
|
||||
| Hallucinated commands | `apt-get` on macOS | Specify OS: "Ubuntu 22.04 only" |
|
||||
| Security gaps | No input validation | Always request: "validate all user inputs" |
|
||||
| Over-engineering | Adds unnecessary libs | Request: "minimal dependencies, stdlib preferred" |
|
||||
| Context drift | Forgets requirements after iteration 5 | Checkpoint prompt: "Recap current requirements before next change" |
|
||||
| Platform assumptions | Assumes bash features in sh | Specify: "POSIX-compliant" or "bash 4+" |
|
||||
|
||||
### Script Iteration Template
|
||||
|
||||
```
|
||||
Current script: [paste or reference]
|
||||
|
||||
Iteration goal: [specific improvement]
|
||||
|
||||
Constraints:
|
||||
- Must preserve: [existing behavior to keep]
|
||||
- Must not: [things to avoid]
|
||||
- Target environment: [OS, shell, runtime]
|
||||
|
||||
Success criteria: [how to verify this iteration works]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Iteration Strategies
|
||||
|
||||
### Breadth-First
|
||||
|
|
@ -307,6 +382,7 @@ Perfect. Commit this as "feat: add debounce utility with full TypeScript support
|
|||
|
||||
## See Also
|
||||
|
||||
- [exploration-workflow.md](./exploration-workflow.md) — Explore alternatives before iterating
|
||||
- [tdd-with-claude.md](./tdd-with-claude.md) — TDD is iterative refinement with tests
|
||||
- [plan-driven.md](./plan-driven.md) — Plan before iterating
|
||||
- [../methodologies.md](../methodologies.md) — Iterative Loops methodology
|
||||
|
|
|
|||
|
|
@ -244,6 +244,7 @@ Plans in `.claude/plans/` serve as decision documentation:
|
|||
|
||||
## See Also
|
||||
|
||||
- [exploration-workflow.md](./exploration-workflow.md) — Explore alternatives before planning
|
||||
- [../ultimate-guide.md](../ultimate-guide.md) — Section 2.3 Plan Mode
|
||||
- [tdd-with-claude.md](./tdd-with-claude.md) — Combine with TDD
|
||||
- [spec-first.md](./spec-first.md) — Combine with Spec-First
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue