Infrastructure: - security/: PII sanitizer with scan/sanitize modes, pre-commit hook, configurable blocklists - telemetry/: GStack-style opt-in usage analytics, local stats viewer, version checker - CONTRIBUTING.md: Privacy-first contributor guidelines with anonymization rules - VERSION: 1.0.0 README updated with Privacy & Security and Telemetry sections.
2.3 KiB
2.3 KiB
Security Sanitizer
Scans and redacts PII / sensitive data from files in this repo.
What It Catches
| Type | Examples |
|---|---|
| user@example.com | |
| PHONE | (555) 123-4567, +1-555-123-4567 |
| SSN | 123-45-6789 |
| API_KEY | sk-xxx, ghp_xxx, op_xxx, Bearer tokens, KEY=value patterns |
| IP_ADDRESS | Public IPv4 addresses (skips localhost/private ranges) |
| URL_CREDENTIALS | https://user:pass@host.com |
| AMOUNT | $1,234.56, $1.2M, $500K |
| COMPANY | Names from blocklist (configurable) |
| PERSON | Names from blocklist + title-prefixed names (Mr./Dr./etc.) |
| CUSTOM | Any regex you add to the config |
Quick Start
# Scan the whole repo (dry run — changes nothing)
python3 security/sanitizer.py --scan --dir . --recursive
# Scan a single file
python3 security/sanitizer.py --scan --file path/to/file.py
# Redact PII in place
python3 security/sanitizer.py --sanitize --file path/to/file.py
# Redact everything recursively
python3 security/sanitizer.py --sanitize --dir . --recursive
Exit codes: 0 = clean, 1 = PII found (useful for CI).
Configuration
Edit security/sanitizer-config.json:
{
"company_blocklist": ["Single Grain", "ClickFlow", "Nextiva"],
"person_blocklist": ["Jane Doe"],
"custom_patterns": [
"ACME-\\d{6}",
{"label": "PROJECT_ID", "pattern": "proj_[A-Za-z0-9]{12}"}
],
"skip_paths": ["node_modules", ".git", "__pycache__", ".env.example"],
"placeholder_format": "bracket"
}
- company_blocklist — company names to always redact
- person_blocklist — person names to always redact
- custom_patterns — additional regex (string or
{label, pattern}object) - skip_paths — directory names to skip during recursive scan
- placeholder_format —
"bracket"for[EMAIL]or"redacted"for[REDACTED]
Pre-Commit Hook
Install to block commits containing PII:
cp security/pre-commit-hook.sh .git/hooks/pre-commit
chmod +x .git/hooks/pre-commit
The hook scans staged files and blocks the commit if anything is detected.
To bypass in emergencies: git commit --no-verify
Supported File Types
.py, .md, .txt, .json, .yaml, .yml, .env
No External Dependencies
Uses only Python standard library (re, json, os, sys, argparse, pathlib).