- Create guide/data-privacy.md with retention policies (5y/30d/0) - Add privacy notice to README.md - Add section 2.6 "Data Flow & Privacy" to ultimate-guide.md - Add Golden Rule #7 to cheatsheet.md (know what's sent) - Add Phase 0.5 Privacy Awareness to onboarding-prompt.md - Add privacy checks to audit-prompt.md - Add PRIVACY CHECK section to audit-scan.sh (human + JSON) - Add privacy reminder to check-claude.sh - Create privacy-warning.sh SessionStart hook Addresses user awareness of Anthropic data retention and opt-out options. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
9 KiB
Data Privacy & Retention Guide
Critical: Everything you share with Claude Code is sent to Anthropic servers. This guide explains what data leaves your machine and how to protect sensitive information.
TL;DR - Retention Summary
| Configuration | Retention Period | Training | How to Enable |
|---|---|---|---|
| Default | 5 years | Yes | (default state) |
| Opt-out | 30 days | No | claude.ai/settings |
| Enterprise (ZDR) | 0 days | No | Enterprise contract |
Immediate action: Disable training data usage to reduce retention from 5 years to 30 days.
1. Understanding the Data Flow
What Leaves Your Machine
When you use Claude Code, the following data is sent to Anthropic:
┌─────────────────────────────────────────────────────────────┐
│ YOUR LOCAL MACHINE │
├─────────────────────────────────────────────────────────────┤
│ • Prompts you type │
│ • Files Claude reads (including .env if not excluded!) │
│ • MCP server results (SQL queries, API responses) │
│ • Bash command outputs │
│ • Error messages and stack traces │
└───────────────────────┬─────────────────────────────────────┘
│
▼ HTTPS
┌─────────────────────────────────────────────────────────────┐
│ ANTHROPIC API │
├─────────────────────────────────────────────────────────────┤
│ • Processes your request │
│ • Stores conversation based on retention policy │
│ • May use data for model training (if not opted out) │
└─────────────────────────────────────────────────────────────┘
What This Means in Practice
| Scenario | Data Sent to Anthropic |
|---|---|
You ask Claude to read src/app.ts |
Full file contents |
You run git status via Claude |
Command output |
MCP executes SELECT * FROM users |
Query results with user data |
Claude reads .env file |
API keys, passwords, secrets |
| Error occurs in your code | Full stack trace with paths |
2. Anthropic Retention Policies
Tier 1: Default (Training Enabled)
- Retention: 5 years
- Usage: Model improvement, training data
- Applies to: Free, Pro, Max plans without opt-out
Tier 2: Training Disabled (Opt-Out)
- Retention: 30 days
- Usage: Safety monitoring, abuse prevention only
- How to enable:
- Go to https://claude.ai/settings/data-privacy-controls
- Disable "Allow model training on your conversations"
- Changes apply immediately
Tier 3: Enterprise API (Zero Data Retention)
- Retention: 0 days (real-time processing only)
- Usage: None - data not stored
- Requires: Enterprise contract with Anthropic
- Use cases: HIPAA, GDPR, PCI-DSS compliance, government contracts
3. Known Risks
Risk 1: Automatic File Reading
Claude Code reads files to understand context. By default, this includes:
.envand.env.localfiles (API keys, passwords)credentials.json,secrets.yaml(service accounts)- SSH keys if in workspace scope
- Database connection strings
Mitigation: Configure excludePatterns (see Section 4).
Risk 2: MCP Database Access
When you configure database MCP servers (Neon, Supabase, PlanetScale):
Your Query: "Show me recent orders"
↓
MCP Executes: SELECT * FROM orders LIMIT 100
↓
Results Sent: 100 rows with customer names, emails, addresses
↓
Stored at Anthropic: According to your retention tier
Mitigation: Never connect production databases. Use dev/staging with anonymized data.
Risk 3: Shell Command Output
Bash commands and their output are included in context:
# This output goes to Anthropic:
$ env | grep API
OPENAI_API_KEY=sk-abc123...
STRIPE_SECRET_KEY=sk_live_...
Mitigation: Use hooks to filter sensitive command outputs.
Risk 4: Documented Community Incidents
| Incident | Source |
|---|---|
Claude reads .env by default |
r/ClaudeAI, GitHub issues |
| DROP TABLE attempts on poorly configured MCP | r/ClaudeAI |
| Credentials exposed via environment variables | GitHub issues |
| Prompt injection via malicious MCP servers | r/programming |
4. Protective Measures
Immediate Actions
4.1 Opt-Out of Training
- Visit https://claude.ai/settings/data-privacy-controls
- Toggle OFF "Allow model training"
- Retention reduces from 5 years to 30 days
4.2 Configure File Exclusions
In .claude/settings.json:
{
"excludePatterns": [
".env",
".env.*",
"**/.env",
"**/.env.*",
"**/credentials*",
"**/secrets*",
"**/*.pem",
"**/*.key",
"**/service-account*.json"
]
}
Or create .claudeignore in project root:
# Secrets
.env
.env.*
*.pem
*.key
credentials.json
secrets/
# Sensitive configs
**/config/production.*
4.3 Use Security Hooks
Create .claude/hooks/PreToolUse.sh:
#!/bin/bash
INPUT=$(cat)
TOOL_NAME=$(echo "$INPUT" | jq -r '.tool.name')
if [[ "$TOOL_NAME" == "Read" ]]; then
FILE_PATH=$(echo "$INPUT" | jq -r '.tool.input.file_path')
# Block reading sensitive files
if [[ "$FILE_PATH" =~ \.env|credentials|secrets|\.pem|\.key ]]; then
echo "BLOCKED: Attempted to read sensitive file: $FILE_PATH" >&2
exit 2 # Block the operation
fi
fi
MCP Best Practices
| Rule | Rationale |
|---|---|
| Never connect production databases | All query results sent to Anthropic |
| Use read-only database users | Prevents DROP/DELETE/UPDATE accidents |
| Anonymize development data | Reduces PII exposure risk |
| Create minimal test datasets | Less data = less risk |
| Audit MCP server sources | Third-party MCPs may have vulnerabilities |
For Teams
| Environment | Recommendation |
|---|---|
| Development | Opt-out + exclusions + anonymized data |
| Staging | Consider Enterprise API if handling real data |
| Production | NEVER connect Claude Code directly |
5. Comparison with Other Tools
| Feature | Claude Code + MCP | Cursor | GitHub Copilot |
|---|---|---|---|
| Data scope sent | Full SQL results, files | Code snippets | Code snippets |
| Production DB access | Yes (via MCP) | Limited | Not designed for |
| Default retention | 5 years | Variable | 30 days |
| Training by default | Yes | Opt-in | Opt-in |
Key difference: MCP creates a unique attack surface because MCP servers are separate processes with independent network/filesystem access.
6. Enterprise Considerations
When to Use Enterprise API (ZDR)
- Handling PII (names, emails, addresses)
- Regulated industries (HIPAA, GDPR, PCI-DSS)
- Client data processing
- Government contracts
- Financial services
Evaluation Checklist
- Data classification policy exists for your organization
- API tier matches data sensitivity requirements
- Team trained on privacy controls
- Incident response plan for potential data exposure
- Legal/compliance review completed
7. Quick Reference
Links
| Resource | URL |
|---|---|
| Privacy settings | https://claude.ai/settings/data-privacy-controls |
| Anthropic usage policy | https://www.anthropic.com/policies |
| Enterprise information | https://www.anthropic.com/enterprise |
| Terms of service | https://www.anthropic.com/legal/consumer-terms |
Commands
# Check current Claude config
claude /config
# Verify exclusions are loaded
claude /status
# Run privacy audit
./examples/scripts/audit-scan.sh
Quick Checklist
- Training opt-out enabled at claude.ai/settings
.env*files in excludePatterns or .claudeignore- No production database connections via MCP
- Security hooks installed for sensitive file access
- Team aware of data flow to Anthropic
Changelog
- 2026-01: Initial version - documenting retention policies and protective measures