docs: add Anthropic governance resources

- Add Claude's Constitution (CC0) to data-privacy guide
- Add Petri 2.0 research tool to README
- Update reference.yaml with external research section

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-26 09:31:19 +01:00
commit 6ed7e92e64
parent 257f2ff65d
3 changed files with 64 additions and 0 deletions
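
Review note: the priority order this commit adds to guide/data-privacy.md can be read as a tie-breaking rule. Below is a minimal sketch, assuming a strict ordering in which the higher-priority consideration always wins. That strictness is a simplifying assumption made for illustration. `Priority` and `resolve_conflict` are hypothetical names, not part of this repository or of any Anthropic API.

```python
from enum import IntEnum


class Priority(IntEnum):
    """Hypothetical encoding of the guide's priority order (lower value wins)."""

    BROADLY_SAFE = 1
    BROADLY_ETHICAL = 2
    ANTHROPIC_GUIDELINES = 3
    GENUINELY_HELPFUL = 4


def resolve_conflict(considerations):
    """Return the highest-priority consideration among those in conflict.

    Assumes a strict ordering, which is a simplification for illustration.
    """
    if not considerations:
        raise ValueError("at least one consideration is required")
    return min(considerations)


# Example: a helpfulness goal conflicts with a safety concern.
winner = resolve_conflict([Priority.GENUINELY_HELPFUL, Priority.BROADLY_SAFE])
print(winner.name)  # BROADLY_SAFE
```

The enum values mirror the numbered list in the new section, so `min` is enough to pick the winner.
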
--- a/README.md
+++ b/README.md
@@ -335,6 +335,12 @@ This guide is the result of several months of daily practice with Claude Code. I
 - [zebbern/claude-code-guide](https://github.com/zebbern/claude-code-guide) — Comprehensive reference with security focus
 - [ykdojo/claude-code-tips](https://github.com/ykdojo/claude-code-tips) — Practical productivity techniques

+**External Research Tools**:
+- [Petri 2.0](https://github.com/safety-research/petri) — Open-source AI behavior audit tool (Anthropic Alignment)
+  - 70 scenarios covering collusion, ethical conflicts, and information sensitivity
+  - Eval-awareness mitigations and benchmarks (Claude Opus 4.5, GPT-5.2, Gemini 3 Pro, Grok 4)
+  - [Blog](https://alignment.anthropic.com/2026/petri-v2/)
+
 </details>

 <details>
--- a/guide/data-privacy.md
+++ b/guide/data-privacy.md
@@ -293,7 +293,45 @@ This guide focuses on Claude Code usage—not legal strategy. For IP guidance, c

 ---

+## 9. Claude's Governance & Values
+
+### Constitutional AI Framework
+
+Anthropic published Claude's constitution in January 2026 under a CC0 license (public domain dedication). The document defines the value hierarchy that guides Claude's behavior:
+
+**Priority Order** (used to resolve conflicts):
+
+1. **Broadly safe** - Do not undermine appropriate human oversight and control of AI
+2. **Broadly ethical** - Honesty, harm avoidance, good conduct
+3. **Compliant with Anthropic's guidelines** - Follow Anthropic's more specific guidelines and policies
+4. **Genuinely helpful** - Real utility for users and society
+
+### What This Means for Claude Code Users
+
+| Scenario | Expected Behavior |
+|----------|-------------------|
+| Security-sensitive requests | Prioritizes safety over helpfulness, so responses may be more conservative |
+| Borderline biology/chemistry | May decline or ask for context to assess safety implications |
+| Ethical conflicts | Follows the priority order: safety > ethics > compliance > utility |
+
+### Why This Matters
+
+- **Training data source**: The constitution is used to generate synthetic training examples
+- **Behavior specification**: It is the reference document for telling intended behavior apart from accidental outputs
+- **Audit & governance**: It gives compliance reviews a published, citable statement of intended behavior
+- **Your own agents**: The CC0 license allows reuse and adaptation for custom models
+
+### Resources
+
+- Constitution full text: https://www.anthropic.com/constitution
+- PDF version: https://www-cdn.anthropic.com/.../claudes-constitution.pdf
+- Announcement: https://www.anthropic.com/news/claude-new-constitution
+- Alignment research: https://alignment.anthropic.com/
+
+---
+
 ## Changelog

+- 2026-01: Added Claude's governance & constitutional AI framework section
 - 2026-01: Added intellectual property considerations section
 - 2026-01: Initial version - documenting retention policies and protective measures
--- a/machine-readable/reference.yaml
+++ b/machine-readable/reference.yaml
@@ -277,6 +277,26 @@ deep_dive:
      description: "Real-time monitoring UI for Gas Town and multiclaude (SSE + SQLite)"
      status: "Early preview (Jan 2026, v0.2.0)"
      guide_section: "guide/ai-ecosystem.md:850"
+  # External research & alignment tools
+  external_research:
+    claude_constitution:
+      url: "https://www.anthropic.com/constitution"
+      pdf: "https://www-cdn.anthropic.com/9214f02e82c4489fb6cf45441d448a1ecd1a3aca/claudes-constitution.pdf"
+      announcement: "https://www.anthropic.com/news/claude-new-constitution"
+      description: "Claude's Constitutional AI framework - value hierarchy (safety > ethics > compliance > utility)"
+      license: "CC0 1.0 (public domain)"
+      published: "2026-01-21"
+      guide_section: "guide/data-privacy.md:296"
+    petri_v2:
+      repo: "https://github.com/safety-research/petri"
+      blog: "https://alignment.anthropic.com/2026/petri-v2/"
+      description: "Open-source AI behavior audit tool (Anthropic Alignment Science)"
+      features:
+        - "70 scenarios: collusion, ethics conflicts, info sensitivity"
+        - "Eval-awareness mitigations"
+        - "Benchmarks: Claude Opus 4.5, GPT-5.2, Gemini 3 Pro, Grok 4"
+      published: "2026-01-21"
+      guide_section: "README.md:338"
  # Section 9.18 - Codebase Design for Agent Productivity
  codebase_design_agents: 9976
  codebase_design_source: "https://marmelab.com/blog/2026/01/21/agent-experience.html"
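
Review note: the `guide_section` values added under `external_research` in this hunk use a `path:line` form. Below is a minimal sketch of how a consumer of reference.yaml might split such a value, assuming the line number always follows the last colon. `GuideRef` and `parse_guide_section` are hypothetical helpers for illustration, not code from this repository.

```python
from typing import NamedTuple


class GuideRef(NamedTuple):
    """Hypothetical container for a 'path:line' reference."""

    path: str
    line: int


def parse_guide_section(ref: str) -> GuideRef:
    """Split a 'path:line' reference on its last colon."""
    path, sep, line = ref.rpartition(":")
    if not sep or not path or not line.isdecimal():
        raise ValueError(f"expected 'path:line', got {ref!r}")
    return GuideRef(path, int(line))


# Values copied from the external_research entries in this diff.
for ref in ("guide/data-privacy.md:296", "README.md:338"):
    parsed = parse_guide_section(ref)
    print(parsed.path, parsed.line)
```

Splitting on the last colon rather than the first keeps the helper usable if a path ever contains a colon.
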