diff --git a/skills/dcf-valuation/SKILL.md b/skills/dcf-valuation/SKILL.md
new file mode 100644
index 00000000..31f9ff50
--- /dev/null
+++ b/skills/dcf-valuation/SKILL.md
@@ -0,0 +1,213 @@
+---
+name: DCF Valuation
+description: Perform Discounted Cash Flow (DCF) valuation analysis for public companies. Use when the user asks to value a stock, calculate intrinsic value, fair value, perform DCF analysis, determine if a stock is undervalued or overvalued, or estimate a price target.
+version: 1.1.1
+metadata:
+ emoji: "\U0001F9EE"
+ tags:
+ - finance
+ - valuation
+ - dcf
+userInvocable: true
+disableModelInvocation: false
+---
+
+## Instructions
+
+Perform a rigorous Discounted Cash Flow (DCF) valuation. Follow all steps and show your work. Use external macro context when assumptions are time-sensitive (for example, risk-free rate regime shifts).
+
+### Progress Checklist
+
+```
+DCF Analysis Progress:
+- [ ] Step 1: Gather financial data
+- [ ] Step 2: Calculate historical FCF and growth
+- [ ] Step 3: Estimate WACC
+- [ ] Step 4: Project future cash flows
+- [ ] Step 5: Calculate present value and fair value
+- [ ] Step 6: Sensitivity analysis
+- [ ] Step 7: Validate results
+- [ ] Step 8: Present findings
+```
+
+### Step 1: Gather Financial Data
+
+Use `data` tool with `domain="finance"` for all calls:
+
+1. **Cash Flow History** (5 years):
+ ```
+ action: "get_cash_flow_statements"
+ params: { ticker: "[TICKER]", period: "annual", limit: 5 }
+ ```
+ Extract: `free_cash_flow`, `net_cash_flow_from_operations`, `capital_expenditure`
+ Fallback: FCF = Operating Cash Flow - CapEx
+
+2. **Income Statements** (5 years):
+ ```
+ action: "get_income_statements"
+ params: { ticker: "[TICKER]", period: "annual", limit: 5 }
+ ```
+ Extract: `revenue`, `operating_income`, `net_income`, `income_tax_expense`
+
+3. **Balance Sheet** (latest):
+ ```
+ action: "get_balance_sheets"
+ params: { ticker: "[TICKER]", period: "annual", limit: 1 }
+ ```
+ Extract: `total_debt`, `cash_and_equivalents`, `outstanding_shares`
+
+4. **Financial Metrics** (current):
+ ```
+ action: "get_financial_metrics_snapshot"
+ params: { ticker: "[TICKER]" }
+ ```
+ Extract: `market_cap`, `enterprise_value`, `return_on_invested_capital`, `debt_to_equity`, `free_cash_flow_per_share`
+
+5. **Analyst Estimates**:
+ ```
+ action: "get_analyst_estimates"
+ params: { ticker: "[TICKER]", period: "annual" }
+ ```
+ Extract: Forward EPS estimates for growth validation
+
+6. **Current Price**:
+ ```
+ action: "get_price_snapshot"
+ params: { ticker: "[TICKER]" }
+ ```
+
+7. **Company Facts**:
+ ```
+ action: "get_company_facts"
+ params: { ticker: "[TICKER]" }
+ ```
+ Extract: `sector` → use to determine WACC range from [sector-wacc.md](references/sector-wacc.md)
+
+8. **Recent Event Context**:
+- Pull company-specific headlines with:
+ ```
+ action: "get_news"
+ params: { ticker: "[TICKER]", limit: 10 }
+ ```
+- Use this to flag event risk (guidance reset, litigation, regulation, one-off gains/losses) that may distort near-term FCF extrapolation.
+
+### Step 2: Calculate Historical FCF and Growth
+
+- Compute FCF for each of the last 5 years
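+- Calculate the FCF CAGR over the 5-year window: `(FCF_latest / FCF_earliest)^(1/periods) - 1`, where `periods` is the number of annual intervals between the two data points (4 for 5 annual statements). The formula is undefined when either endpoint is zero or negative; in that case rely on the cross-checks in the next bullet.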
+- Cross-validate with: revenue growth, operating income growth, analyst EPS growth
+- **Cap projected growth at 15%** (sustained higher growth is rare)
+- If FCF is volatile, weight analyst estimates more heavily
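+
+A minimal Python sketch of the CAGR and growth-cap rules above. The function names and the analyst-estimate fallback argument are illustrative assumptions, not part of the `data` tool.
+
+```python
+from typing import Optional
+
+
+def cagr(earliest: float, latest: float, periods: int) -> Optional[float]:
+    """Compound annual growth rate over `periods` annual intervals.
+
+    Returns None when the rate is undefined (non-positive endpoints).
+    """
+    if periods <= 0 or earliest <= 0 or latest <= 0:
+        return None
+    return (latest / earliest) ** (1 / periods) - 1
+
+
+def projected_growth(historical: Optional[float], analyst: float, cap: float = 0.15) -> float:
+    """Fall back to the analyst estimate when CAGR is undefined, then apply the 15% cap."""
+    rate = analyst if historical is None else historical
+    return min(rate, cap)
+```
+
+Five annual statements span four intervals, so `periods` would be 4 in that case.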
+
+### Step 3: Estimate WACC
+
+Use the company's `sector` to look up the base WACC range from [sector-wacc.md](references/sector-wacc.md).
+
+**Calculate WACC:**
+```
+WACC = (E/V) * Re + (D/V) * Rd * (1 - Tax Rate)
+
+Where:
+ E = Market cap (equity value)
+ D = Total debt
+ V = E + D
+ Re = Risk-free rate + Beta * Equity Risk Premium
+ Rd = Cost of debt (estimate from interest expense / total debt)
+ Tax Rate = Effective tax rate from income statements
+```
+
+**Default assumptions:**
+- Risk-free rate: pull latest 10-year Treasury yield using `web_search` (preferred) and cite date/source. Fallback range: ~4.0-4.5%.
+- Equity risk premium: ~5.5%
+- If beta unavailable, use sector average
+
+**Sanity check:** For value-creating companies, WACC typically sits 2-4 percentage points below ROIC.
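+
+The WACC formula above can be sketched in Python as follows. The function and argument names are hypothetical, and `interest_expense` is assumed to be available from the income statement.
+
+```python
+def wacc(
+    market_cap: float,
+    total_debt: float,
+    beta: float,
+    risk_free: float,
+    tax_rate: float,
+    interest_expense: float,
+    equity_risk_premium: float = 0.055,
+) -> float:
+    """WACC = (E/V) * Re + (D/V) * Rd * (1 - tax rate)."""
+    cost_of_equity = risk_free + beta * equity_risk_premium
+    # A debt-free company has no cost of debt to weight
+    cost_of_debt = interest_expense / total_debt if total_debt > 0 else 0.0
+    total_capital = market_cap + total_debt
+    equity_weight = market_cap / total_capital
+    debt_weight = total_debt / total_capital
+    return equity_weight * cost_of_equity + debt_weight * cost_of_debt * (1 - tax_rate)
+```
+
+With E = 800, D = 200, beta = 1.2, a 4% risk-free rate, a 25% tax rate, and interest expense of 10, this gives Re = 10.6%, Rd = 5%, and a WACC of about 9.2%.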
+
+### Step 4: Project Future Cash Flows (Years 1-5)
+
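+- Apply the growth rate with a linear decay: each year uses 5% less of the initial rate (100%, 95%, 90%, 85%, 80%)
+- Compound each year on the prior year's projected FCF, starting from the latest historical FCF (`FCF_0`)
+- Year 1: FCF_0 * (1 + growth_rate)
+- Year 2: FCF_Year1 * (1 + growth_rate * 0.95)
+- Year 3: FCF_Year2 * (1 + growth_rate * 0.90)
+- Year 4: FCF_Year3 * (1 + growth_rate * 0.85)
+- Year 5: FCF_Year4 * (1 + growth_rate * 0.80)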
+
+**Terminal Value** (Gordon Growth Model):
+```
+TV = FCF_Year5 * (1 + g) / (WACC - g)
+Where g = terminal growth rate (2.5% default, GDP proxy); requires WACC > g
+```
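+
+A short Python sketch of the projection and terminal value, assuming each year's growth compounds on the prior year's projected FCF and that the decay factors are 1.00, 0.95, 0.90, 0.85, 0.80. Function names are illustrative.
+
+```python
+from typing import List
+
+DECAY_FACTORS = (1.00, 0.95, 0.90, 0.85, 0.80)
+
+
+def project_fcf(base_fcf: float, growth_rate: float) -> List[float]:
+    """Project five years of FCF with a decaying growth rate."""
+    flows = []
+    fcf = base_fcf
+    for factor in DECAY_FACTORS:
+        fcf *= 1 + growth_rate * factor  # compound on the prior year
+        flows.append(fcf)
+    return flows
+
+
+def terminal_value(final_fcf: float, wacc: float, g: float = 0.025) -> float:
+    """Gordon Growth terminal value; only defined when WACC exceeds g."""
+    if wacc <= g:
+        raise ValueError("WACC must exceed the terminal growth rate")
+    return final_fcf * (1 + g) / (wacc - g)
+```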
+
+### Step 5: Calculate Present Value and Fair Value
+
+```
+PV of each FCF = FCF_t / (1 + WACC)^t
+PV of Terminal Value = TV / (1 + WACC)^5
+
+Enterprise Value = Sum of PV(FCFs) + PV(Terminal Value)
+Net Debt = Total Debt - Cash and Equivalents
+Equity Value = Enterprise Value - Net Debt
+Fair Value per Share = Equity Value / Shares Outstanding
+```
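+
+The discounting and equity bridge above might look like this in Python. It is a sketch with hypothetical names; inputs are assumed to share one currency unit and `shares` one share-count unit.
+
+```python
+from typing import Dict, List
+
+
+def fair_value(
+    flows: List[float],
+    terminal: float,
+    wacc: float,
+    total_debt: float,
+    cash: float,
+    shares: float,
+) -> Dict[str, float]:
+    """Discount projected FCFs and terminal value, then bridge to per-share value."""
+    pv_flows = sum(f / (1 + wacc) ** t for t, f in enumerate(flows, start=1))
+    pv_terminal = terminal / (1 + wacc) ** len(flows)
+    enterprise_value = pv_flows + pv_terminal
+    net_debt = total_debt - cash
+    equity_value = enterprise_value - net_debt
+    return {
+        "pv_flows": pv_flows,
+        "pv_terminal": pv_terminal,
+        "enterprise_value": enterprise_value,
+        "equity_value": equity_value,
+        "fair_value_per_share": equity_value / shares,
+    }
+```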
+
+### Step 6: Sensitivity Analysis
+
+Create a matrix varying two key assumptions:
+
+| | TG 2.0% | TG 2.5% | TG 3.0% |
+|---|---|---|---|
+| **WACC -1%** | $ | $ | $ |
+| **WACC base** | $ | $ | $ |
+| **WACC +1%** | $ | $ | $ |
+
+(TG = Terminal Growth Rate)
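+
+One way to fill the matrix is to rerun the valuation for each WACC and terminal-growth pair. The sketch below is self-contained and assumes the decaying five-year projection from Step 4 with compounding on the prior year; names and row labels are illustrative.
+
+```python
+from typing import Dict
+
+DECAY = (1.00, 0.95, 0.90, 0.85, 0.80)
+TERMINAL_GROWTH = (0.020, 0.025, 0.030)
+WACC_SHIFTS = {"WACC -1%": -0.01, "WACC base": 0.0, "WACC +1%": 0.01}
+
+
+def value_per_share(base_fcf, growth, wacc, g, net_debt, shares) -> float:
+    """Five-year DCF with Gordon Growth terminal value, per share."""
+    fcf, pv = base_fcf, 0.0
+    for year, factor in enumerate(DECAY, start=1):
+        fcf *= 1 + growth * factor
+        pv += fcf / (1 + wacc) ** year
+    terminal = fcf * (1 + g) / (wacc - g)
+    enterprise_value = pv + terminal / (1 + wacc) ** len(DECAY)
+    return (enterprise_value - net_debt) / shares
+
+
+def sensitivity_matrix(base_fcf, growth, base_wacc, net_debt, shares) -> Dict[str, Dict[float, float]]:
+    """Rows are WACC scenarios, columns are terminal growth rates."""
+    return {
+        label: {
+            g: round(value_per_share(base_fcf, growth, base_wacc + shift, g, net_debt, shares), 2)
+            for g in TERMINAL_GROWTH
+        }
+        for label, shift in WACC_SHIFTS.items()
+    }
+```
+
+Fair value should rise as WACC falls and as terminal growth rises; a matrix that does not show this pattern points to an input error.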
+
+### Step 7: Validate Results
+
+Before presenting, check:
+
+1. **EV comparison**: Calculated EV within 30% of reported enterprise_value
+ - If off by >30%, revisit WACC or growth assumptions
+2. **Terminal value ratio**: Should be 50-80% of total EV for mature companies
+ - If >90%, growth rate may be too high
+ - If <40%, near-term projections may be aggressive
+3. **FCF yield check**: Compare fair value FCF yield to current market FCF yield
+
+If validation fails, adjust assumptions and recalculate.
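+
+The first two checks are mechanical and can be sketched as below. The thresholds come from the list above; the function name and warning strings are assumptions for illustration.
+
+```python
+from typing import List
+
+
+def validate_dcf(calculated_ev: float, reported_ev: float, pv_terminal: float) -> List[str]:
+    """Return a list of warnings; an empty list means both checks passed."""
+    warnings = []
+    ev_gap = abs(calculated_ev - reported_ev) / reported_ev
+    if ev_gap > 0.30:
+        warnings.append("EV differs from reported enterprise_value by more than 30%")
+    terminal_share = pv_terminal / calculated_ev
+    if terminal_share > 0.90:
+        warnings.append("Terminal value exceeds 90% of EV: growth rate may be too high")
+    elif terminal_share < 0.40:
+        warnings.append("Terminal value below 40% of EV: near-term projections may be aggressive")
+    return warnings
+```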
+
+### Step 8: Present Findings
+
+Format clearly with:
+
+1. **Executive Summary**
+ - Current price vs. fair value estimate
+ - Upside/downside percentage
+ - Verdict: Undervalued / Fairly Valued / Overvalued
+
+2. **Key Assumptions Table**
+ | Assumption | Value | Source |
+ |---|---|---|
+ | Growth Rate | X% | 5Y CAGR + analyst cross-check |
+ | WACC | X% | Sector range + company adjustments |
+ | Terminal Growth | X% | GDP proxy |
+ | Tax Rate | X% | Effective rate from financials |
+
+3. **Projected FCF Table**
+ | Year | FCF | Growth | PV of FCF |
+ |---|---|---|---|
+
+4. **Valuation Bridge**
+ - PV of projected FCFs
+ - PV of Terminal Value
+ - = Enterprise Value
+ - - Net Debt
+ - = Equity Value
+ - / Shares Outstanding
+ - = **Fair Value per Share**
+
+5. **Sensitivity Matrix** (from Step 6)
+
+6. **Risks & Caveats**
+ - Key risks to the valuation thesis
+ - DCF limitations (sensitive to growth and WACC assumptions)
+ - Company-specific caveats (high debt, cyclicality, early-stage, etc.)
diff --git a/skills/dcf-valuation/references/sector-wacc.md b/skills/dcf-valuation/references/sector-wacc.md
new file mode 100644
index 00000000..5ddc1bd1
--- /dev/null
+++ b/skills/dcf-valuation/references/sector-wacc.md
@@ -0,0 +1,40 @@
+# Sector WACC Reference
+
+Use the company's `sector` from `get_company_facts` to look up the base WACC range below, then adjust for company-specific factors.
+
+## WACC by Sector
+
+| Sector | Typical WACC Range | Notes |
+|--------|-------------------|-------|
+| Communication Services | 8-10% | Mix of stable telecom and growth media |
+| Consumer Discretionary | 8-10% | Cyclical exposure |
+| Consumer Staples | 7-8% | Defensive, stable demand |
+| Energy | 9-11% | Commodity price exposure |
+| Financials | 8-10% | Leverage already in business model |
+| Health Care | 8-10% | Regulatory and pipeline risk |
+| Industrials | 8-9% | Moderate cyclicality |
+| Information Technology | 8-12% | Higher end for high-growth; lower for mature |
+| Materials | 8-10% | Cyclical, commodity exposure |
+| Real Estate | 7-9% | Interest rate sensitivity |
+| Utilities | 6-7% | Regulated, stable cash flows |
+
+## Adjustment Factors
+
+**Add to base WACC:**
+- High debt (D/E > 1.5): +1-2%
+- Small cap (< $2B market cap): +1-2%
+- Emerging markets exposure: +1-3%
+- Concentrated customer base: +0.5-1%
+- Regulatory uncertainty: +0.5-1.5%
+
+**Subtract from base WACC:**
+- Market leader with moat: -0.5-1%
+- Recurring revenue model: -0.5-1%
+- Investment grade credit: -0.5%
+
+## Sanity Checks
+
+- WACC should typically be 2-4 percentage points below ROIC for value-creating companies
+- If WACC > ROIC, the company may be destroying value
+- Typical range for US large-cap: 7-12%
+- Anything below 6% or above 14% warrants extra scrutiny
diff --git a/skills/docx/SKILL.md b/skills/docx/SKILL.md
new file mode 100644
index 00000000..ce1f04e5
--- /dev/null
+++ b/skills/docx/SKILL.md
@@ -0,0 +1,513 @@
+---
+name: Word Document
+description: "Use this skill whenever the user wants to create, read, edit, or manipulate Word documents (.docx files). Triggers include: any mention of \"Word doc\", \"word document\", \".docx\", or requests to produce professional documents with formatting like tables of contents, headings, page numbers, or letterheads. Also use when extracting or reorganizing content from .docx files, inserting or replacing images in documents, performing find-and-replace in Word files, working with tracked changes or comments, or converting content into a polished Word document. If the user asks for a \"report\", \"memo\", \"letter\", \"template\", or similar deliverable as a Word or .docx file, use this skill. Do NOT use for PDFs, spreadsheets, Google Docs, or general coding tasks unrelated to document generation."
+version: 1.0.0
+metadata:
+ emoji: "π"
+ tags:
+ - office
+ - document
+ - docx
+ install:
+ - id: brew-pandoc
+   kind: brew
+   formula: pandoc
+   bins: [pandoc]
+   label: "Install pandoc for text extraction"
+   os: [darwin, linux]
+ - id: brew-libreoffice
+   kind: brew
+   formula: libreoffice
+   bins: [soffice]
+   label: "Install LibreOffice for PDF conversion"
+   os: [darwin]
+ - id: brew-poppler
+   kind: brew
+   formula: poppler
+   bins: [pdftoppm]
+   label: "Install poppler for PDF to image conversion"
+   os: [darwin, linux]
+ - id: npm-docx
+   kind: node
+   formula: docx
+   bins: []
+   label: "Install docx-js for document creation"
+userInvocable: true
+disableModelInvocation: false
+---
+
+# DOCX creation, editing, and analysis
+
+## Overview
+
+A .docx file is a ZIP archive containing XML files.
+
+## Quick Reference
+
+| Task | Approach |
+|------|----------|
+| Read/analyze content | `pandoc` or unpack for raw XML |
+| Create new document | Use `docx-js` - see Creating New Documents below |
+| Edit existing document | Unpack → edit XML → repack - see Editing Existing Documents below |
+
+### Converting .doc to .docx
+
+Legacy `.doc` files must be converted before editing:
+
+```bash
+python scripts/office/soffice.py --headless --convert-to docx document.doc
+```
+
+### Reading Content
+
+```bash
+# Text extraction with tracked changes
+pandoc --track-changes=all document.docx -o output.md
+
+# Raw XML access
+python scripts/office/unpack.py document.docx unpacked/
+```
+
+### Converting to Images
+
+```bash
+python scripts/office/soffice.py --headless --convert-to pdf document.docx
+pdftoppm -jpeg -r 150 document.pdf page
+```
+
+### Accepting Tracked Changes
+
+To produce a clean document with all tracked changes accepted (requires LibreOffice):
+
+```bash
+python scripts/accept_changes.py input.docx output.docx
+```
+
+---
+
+## Creating New Documents
+
+Generate .docx files with JavaScript, then validate. Install: `npm install -g docx`
+
+### Setup
+```javascript
+const fs = require('fs');
+const { Document, Packer, Paragraph, TextRun, Table, TableRow, TableCell, ImageRun,
+ Header, Footer, AlignmentType, PageOrientation, LevelFormat, ExternalHyperlink,
+ TableOfContents, HeadingLevel, BorderStyle, WidthType, ShadingType,
+ VerticalAlign, PageNumber, PageBreak } = require('docx');
+
+const doc = new Document({ sections: [{ children: [/* content */] }] });
+Packer.toBuffer(doc).then(buffer => fs.writeFileSync("doc.docx", buffer));
+```
+
+### Validation
+After creating the file, validate it. If validation fails, unpack, fix the XML, and repack.
+```bash
+python scripts/office/validate.py doc.docx
+```
+
+### Page Size
+
+```javascript
+// CRITICAL: docx-js defaults to A4, not US Letter
+// Always set page size explicitly for consistent results
+sections: [{
+ properties: {
+ page: {
+ size: {
+ width: 12240, // 8.5 inches in DXA
+ height: 15840 // 11 inches in DXA
+ },
+ margin: { top: 1440, right: 1440, bottom: 1440, left: 1440 } // 1 inch margins
+ }
+ },
+ children: [/* content */]
+}]
+```
+
+**Common page sizes (DXA units, 1440 DXA = 1 inch):**
+
+| Paper | Width | Height | Content Width (1" margins) |
+|-------|-------|--------|---------------------------|
+| US Letter | 12,240 | 15,840 | 9,360 |
+| A4 (default) | 11,906 | 16,838 | 9,026 |
+
+**Landscape orientation:** docx-js swaps width/height internally, so pass portrait dimensions and let it handle the swap:
+```javascript
+size: {
+ width: 12240, // Pass SHORT edge as width
+ height: 15840, // Pass LONG edge as height
+ orientation: PageOrientation.LANDSCAPE // docx-js swaps them in the XML
+},
+// Content width = 15840 - left margin - right margin (uses the long edge)
+```
+
+### Styles (Override Built-in Headings)
+
+Use Arial as the default font (universally supported). Keep titles black for readability.
+
+```javascript
+const doc = new Document({
+ styles: {
+ default: { document: { run: { font: "Arial", size: 24 } } }, // 12pt default
+ paragraphStyles: [
+ // IMPORTANT: Use exact IDs to override built-in styles
+ { id: "Heading1", name: "Heading 1", basedOn: "Normal", next: "Normal", quickFormat: true,
+ run: { size: 32, bold: true, font: "Arial" },
+ paragraph: { spacing: { before: 240, after: 240 }, outlineLevel: 0 } }, // outlineLevel required for TOC
+ { id: "Heading2", name: "Heading 2", basedOn: "Normal", next: "Normal", quickFormat: true,
+ run: { size: 28, bold: true, font: "Arial" },
+ paragraph: { spacing: { before: 180, after: 180 }, outlineLevel: 1 } },
+ ]
+ },
+ sections: [{
+ children: [
+ new Paragraph({ heading: HeadingLevel.HEADING_1, children: [new TextRun("Title")] }),
+ ]
+ }]
+});
+```
+
+### Lists (NEVER use unicode bullets)
+
+```javascript
+// WRONG - never manually insert bullet characters
+new Paragraph({ children: [new TextRun("• Item")] }) // BAD
+new Paragraph({ children: [new TextRun("\u2022 Item")] }) // BAD
+
+// CORRECT - use numbering config with LevelFormat.BULLET
+const doc = new Document({
+ numbering: {
+ config: [
+ { reference: "bullets",
+ levels: [{ level: 0, format: LevelFormat.BULLET, text: "\u2022", alignment: AlignmentType.LEFT,
+ style: { paragraph: { indent: { left: 720, hanging: 360 } } } }] },
+ { reference: "numbers",
+ levels: [{ level: 0, format: LevelFormat.DECIMAL, text: "%1.", alignment: AlignmentType.LEFT,
+ style: { paragraph: { indent: { left: 720, hanging: 360 } } } }] },
+ ]
+ },
+ sections: [{
+ children: [
+ new Paragraph({ numbering: { reference: "bullets", level: 0 },
+ children: [new TextRun("Bullet item")] }),
+ new Paragraph({ numbering: { reference: "numbers", level: 0 },
+ children: [new TextRun("Numbered item")] }),
+ ]
+ }]
+});
+
+// Each reference creates INDEPENDENT numbering
+// Same reference = continues (1,2,3 then 4,5,6)
+// Different reference = restarts (1,2,3 then 1,2,3)
+```
+
+### Tables
+
+**CRITICAL: Tables need dual widths** - set both `columnWidths` on the table AND `width` on each cell. Without both, tables render incorrectly on some platforms.
+
+```javascript
+// CRITICAL: Always set table width for consistent rendering
+// CRITICAL: Use ShadingType.CLEAR (not SOLID) to prevent black backgrounds
+const border = { style: BorderStyle.SINGLE, size: 1, color: "CCCCCC" };
+const borders = { top: border, bottom: border, left: border, right: border };
+
+new Table({
+ width: { size: 9360, type: WidthType.DXA }, // Always use DXA (percentages break in Google Docs)
+ columnWidths: [4680, 4680], // Must sum to table width (DXA: 1440 = 1 inch)
+ rows: [
+ new TableRow({
+ children: [
+ new TableCell({
+ borders,
+ width: { size: 4680, type: WidthType.DXA }, // Also set on each cell
+ shading: { fill: "D5E8F0", type: ShadingType.CLEAR }, // CLEAR not SOLID
+ margins: { top: 80, bottom: 80, left: 120, right: 120 }, // Cell padding (internal, not added to width)
+ children: [new Paragraph({ children: [new TextRun("Cell")] })]
+ })
+ ]
+ })
+ ]
+})
+```
+
+**Table width calculation:**
+
+Always use `WidthType.DXA`; `WidthType.PERCENTAGE` breaks in Google Docs.
+
+```javascript
+// Table width = sum of columnWidths = content width
+// US Letter with 1" margins: 12240 - 2880 = 9360 DXA
+width: { size: 9360, type: WidthType.DXA },
+columnWidths: [7000, 2360] // Must sum to table width
+```
+
+**Width rules:**
+- **Always use `WidthType.DXA`**, never `WidthType.PERCENTAGE` (incompatible with Google Docs)
+- Table width must equal the sum of `columnWidths`
+- Cell `width` must match corresponding `columnWidth`
+- Cell `margins` are internal padding - they reduce content area, not add to cell width
+- For full-width tables: use content width (page width minus left and right margins)
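+
+The width arithmetic is independent of docx-js, so it is sketched here in Python to match the helper scripts. The helper names are hypothetical; the resulting integers are what would be passed as `columnWidths` and cell `width` values.
+
+```python
+from typing import List
+
+DXA_PER_INCH = 1440
+
+
+def content_width(page_width: int, left_margin: int = 1440, right_margin: int = 1440) -> int:
+    """Usable width in DXA: page width minus left and right margins."""
+    return page_width - left_margin - right_margin
+
+
+def column_widths(total: int, weights: List[int]) -> List[int]:
+    """Split `total` DXA by integer weights so the widths sum to `total` exactly."""
+    weight_sum = sum(weights)
+    widths = [total * w // weight_sum for w in weights]
+    widths[-1] += total - sum(widths)  # give the rounding remainder to the last column
+    return widths
+```
+
+For US Letter with 1 inch margins, `content_width(12240)` is 9360 and two equal columns come out as 4680 each.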
+
+### Images
+
+```javascript
+// CRITICAL: type parameter is REQUIRED
+new Paragraph({
+ children: [new ImageRun({
+ type: "png", // Required: png, jpg, jpeg, gif, bmp, svg
+ data: fs.readFileSync("image.png"),
+ transformation: { width: 200, height: 150 },
+ altText: { title: "Title", description: "Desc", name: "Name" } // All three required
+ })]
+})
+```
+
+### Page Breaks
+
+```javascript
+// CRITICAL: PageBreak must be inside a Paragraph
+new Paragraph({ children: [new PageBreak()] })
+
+// Or use pageBreakBefore
+new Paragraph({ pageBreakBefore: true, children: [new TextRun("New page")] })
+```
+
+### Table of Contents
+
+```javascript
+// CRITICAL: Headings must use HeadingLevel ONLY - no custom styles
+new TableOfContents("Table of Contents", { hyperlink: true, headingStyleRange: "1-3" })
+```
+
+### Headers/Footers
+
+```javascript
+sections: [{
+ properties: {
+ page: { margin: { top: 1440, right: 1440, bottom: 1440, left: 1440 } } // 1440 = 1 inch
+ },
+ headers: {
+ default: new Header({ children: [new Paragraph({ children: [new TextRun("Header")] })] })
+ },
+ footers: {
+ default: new Footer({ children: [new Paragraph({
+ children: [new TextRun("Page "), new TextRun({ children: [PageNumber.CURRENT] })]
+ })] })
+ },
+ children: [/* content */]
+}]
+```
+
+### Critical Rules for docx-js
+
+- **Set page size explicitly** - docx-js defaults to A4; use US Letter (12240 x 15840 DXA) for US documents
+- **Landscape: pass portrait dimensions** - docx-js swaps width/height internally; pass short edge as `width`, long edge as `height`, and set `orientation: PageOrientation.LANDSCAPE`
+- **Never use `\n`** - use separate Paragraph elements
+- **Never use unicode bullets** - use `LevelFormat.BULLET` with numbering config
+- **PageBreak must be in Paragraph** - standalone creates invalid XML
+- **ImageRun requires `type`** - always specify png/jpg/etc
+- **Always set table `width` with DXA** - never use `WidthType.PERCENTAGE` (breaks in Google Docs)
+- **Tables need dual widths** - `columnWidths` array AND cell `width`, both must match
+- **Table width = sum of columnWidths** - for DXA, ensure they add up exactly
+- **Always add cell margins** - use `margins: { top: 80, bottom: 80, left: 120, right: 120 }` for readable padding
+- **Use `ShadingType.CLEAR`** - never SOLID for table shading
+- **TOC requires HeadingLevel only** - no custom styles on heading paragraphs
+- **Override built-in styles** - use exact IDs: "Heading1", "Heading2", etc.
+- **Include `outlineLevel`** - required for TOC (0 for H1, 1 for H2, etc.)
+
+---
+
+## Editing Existing Documents
+
+**Follow all 3 steps in order.**
+
+### Step 1: Unpack
+```bash
+python scripts/office/unpack.py document.docx unpacked/
+```
+Extracts XML, pretty-prints, merges adjacent runs, and converts smart quotes to XML entities (`&#x201C;` etc.) so they survive editing. Use `--merge-runs false` to skip run merging.
+
+### Step 2: Edit XML
+
+Edit files in `unpacked/word/`. See XML Reference below for patterns.
+
+**Use "Claude" as the author** for tracked changes and comments, unless the user explicitly requests use of a different name.
+
+**Use the Edit tool directly for string replacement. Do not write Python scripts.** Scripts introduce unnecessary complexity. The Edit tool shows exactly what is being replaced.
+
+**CRITICAL: Use smart quotes for new content.** When adding text with apostrophes or quotes, use XML entities to produce smart quotes:
+```xml
+<w:t>Here&#x2019;s a quote: &#x201C;Hello&#x201D;</w:t>
+```
+| Entity | Character |
+|--------|-----------|
+| `&#x2018;` | ‘ (left single) |
+| `&#x2019;` | ’ (right single / apostrophe) |
+| `&#x201C;` | “ (left double) |
+| `&#x201D;` | ” (right double) |
+
+**Adding comments:** Use `comment.py` to handle boilerplate across multiple XML files (text must be pre-escaped XML):
+```bash
+python scripts/comment.py unpacked/ 0 "Comment text with &amp; and &#x2019;"
+python scripts/comment.py unpacked/ 1 "Reply text" --parent 0 # reply to comment 0
+python scripts/comment.py unpacked/ 0 "Text" --author "Custom Author" # custom author name
+```
+Then add markers to document.xml (see Comments in XML Reference).
+
+### Step 3: Pack
+```bash
+python scripts/office/pack.py unpacked/ output.docx --original document.docx
+```
+Validates with auto-repair, condenses XML, and creates DOCX. Use `--validate false` to skip.
+
+**Auto-repair will fix:**
+- `durableId` >= 0x7FFFFFFF (regenerates valid ID)
+- Missing `xml:space="preserve"` on `<w:t>` with whitespace
+
+**Auto-repair won't fix:**
+- Malformed XML, invalid element nesting, missing relationships, schema violations
+
+### Common Pitfalls
+
+- **Replace entire `<w:r>` elements**: When adding tracked changes, replace the whole `<w:r>...</w:r>` block with `<w:del>...</w:del><w:ins>...</w:ins>` as siblings. Don't inject tracked change tags inside a run.
+- **Preserve `<w:rPr>` formatting**: Copy the original run's `<w:rPr>` block into your tracked change runs to maintain bold, font size, etc.
+
+---
+
+## XML Reference
+
+### Schema Compliance
+
+- **Element order in `<w:pPr>`**: `<w:pStyle>`, `<w:numPr>`, `<w:spacing>`, `<w:ind>`, `<w:jc>`, `<w:rPr>` last
+- **Whitespace**: Add `xml:space="preserve"` to `<w:t>` with leading/trailing spaces
+- **RSIDs**: Must be 8-digit hex (e.g., `00AB1234`)
+
+### Tracked Changes
+
+**Insertion:**
+```xml
+<w:ins w:id="1" w:author="Claude" w:date="2025-01-01T00:00:00Z">
+  <w:r><w:t>inserted text</w:t></w:r>
+</w:ins>
+```
+
+**Deletion:**
+```xml
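+<w:del w:id="2" w:author="Claude" w:date="2025-01-01T00:00:00Z">
+  <w:r><w:delText>deleted text</w:delText></w:r>
+</w:del>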
+```
+
+**Inside `<w:del>`**: Use `<w:delText>` instead of `<w:t>`, and `<w:delInstrText>` instead of `<w:instrText>`.
+
+**Minimal edits** - only mark what changes:
+```xml
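+<!-- Change "30 days" to "60 days" -->
+<w:r><w:t xml:space="preserve">The term is </w:t></w:r>
+<w:del w:id="1" w:author="Claude" w:date="2025-01-01T00:00:00Z">
+  <w:r><w:delText>30</w:delText></w:r>
+</w:del>
+<w:ins w:id="2" w:author="Claude" w:date="2025-01-01T00:00:00Z">
+  <w:r><w:t>60</w:t></w:r>
+</w:ins>
+<w:r><w:t xml:space="preserve"> days.</w:t></w:r>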
+```
+
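+**Deleting entire paragraphs/list items** - when removing ALL content from a paragraph, also mark the paragraph mark as deleted so it merges with the next paragraph. Add `<w:del/>` inside `<w:pPr><w:rPr>`:
+```xml
+<w:p>
+  <w:pPr>
+    <w:numPr>...</w:numPr>
+    <w:rPr>
+      <w:del w:id="1" w:author="Claude" w:date="2025-01-01T00:00:00Z"/>
+    </w:rPr>
+  </w:pPr>
+  <w:del w:id="2" w:author="Claude" w:date="2025-01-01T00:00:00Z">
+    <w:r><w:delText>Entire paragraph content being deleted...</w:delText></w:r>
+  </w:del>
+</w:p>
+```
+Without the `<w:del/>` in `<w:pPr><w:rPr>`, accepting changes leaves an empty paragraph/list item.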
+
+**Rejecting another author's insertion** - nest deletion inside their insertion:
+```xml
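+<w:ins w:author="Jane" w:id="5">
+  <w:del w:author="Claude" w:id="10">
+    <w:r><w:delText>their inserted text</w:delText></w:r>
+  </w:del>
+</w:ins>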
+```
+
+**Restoring another author's deletion** - add insertion after (don't modify their deletion):
+```xml
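+<w:del w:author="Jane" w:id="5">
+  <w:r><w:delText>deleted text</w:delText></w:r>
+</w:del>
+<w:ins w:author="Claude" w:id="10">
+  <w:r><w:t>deleted text</w:t></w:r>
+</w:ins>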
+```
+
+### Comments
+
+After running `comment.py` (see Step 2), add markers to document.xml. For replies, use `--parent` flag and nest markers inside the parent's.
+
+**CRITICAL: `<w:commentRangeStart>` and `<w:commentRangeEnd>` are siblings of `<w:r>`, never inside `<w:r>`.**
+
+```xml
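+<!-- Comment markers are direct children of w:p, never inside w:r -->
+<w:commentRangeStart w:id="0"/>
+<w:del w:id="1" w:author="Claude" w:date="2025-01-01T00:00:00Z">
+  <w:r><w:delText>deleted</w:delText></w:r>
+</w:del>
+<w:r><w:t xml:space="preserve"> more text</w:t></w:r>
+<w:commentRangeEnd w:id="0"/>
+<w:r><w:rPr><w:rStyle w:val="CommentReference"/></w:rPr><w:commentReference w:id="0"/></w:r>
+
+<!-- Comment 0 with reply 1 nested inside -->
+<w:commentRangeStart w:id="0"/>
+  <w:commentRangeStart w:id="1"/>
+  <w:r><w:t>text</w:t></w:r>
+  <w:commentRangeEnd w:id="1"/>
+<w:commentRangeEnd w:id="0"/>
+<w:r><w:rPr><w:rStyle w:val="CommentReference"/></w:rPr><w:commentReference w:id="0"/></w:r>
+<w:r><w:rPr><w:rStyle w:val="CommentReference"/></w:rPr><w:commentReference w:id="1"/></w:r>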
+```
+
+### Images
+
+1. Add image file to `word/media/`
+2. Add relationship to `word/_rels/document.xml.rels`:
+```xml
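+<Relationship Id="rId5" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image1.png"/>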
+```
+3. Add content type to `[Content_Types].xml`:
+```xml
+<Default Extension="png" ContentType="image/png"/>
+```
+4. Reference in document.xml:
+```xml
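+<w:drawing>
+  <wp:inline>
+    <wp:extent cx="914400" cy="914400"/>  <!-- EMUs: 914400 = 1 inch -->
+    <a:graphic>
+      <a:graphicData uri="http://schemas.openxmlformats.org/drawingml/2006/picture">
+        <pic:pic>
+          <pic:blipFill><a:blip r:embed="rId5"/></pic:blipFill>
+        </pic:pic>
+      </a:graphicData>
+    </a:graphic>
+  </wp:inline>
+</w:drawing>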
+```
+
+---
+
+## Dependencies
+
+- **pandoc**: Text extraction
+- **docx**: `npm install -g docx` (new documents)
+- **LibreOffice**: PDF conversion (auto-configured for sandboxed environments via `scripts/office/soffice.py`)
+- **Poppler**: `pdftoppm` for images
diff --git a/skills/earnings-analysis/SKILL.md b/skills/earnings-analysis/SKILL.md
new file mode 100644
index 00000000..450e1f06
--- /dev/null
+++ b/skills/earnings-analysis/SKILL.md
@@ -0,0 +1,463 @@
+---
+name: Earnings Analysis
+description: >-
+ Analyze a company's financial statements (income statement, balance sheet,
+ cash flow statement) to assess financial health, earnings quality, and
+ competitive advantage. Use when the user asks to read/analyze financial
+ statements, check earnings quality, assess financial health, evaluate
+ profitability trends, or screen for competitive moats.
+version: 1.0.0
+metadata:
+ emoji: "\U0001F4D1"
+ requires:
+   env:
+   - FINANCIAL_DATASETS_API_KEY
+ tags:
+ - finance
+ - earnings
+ - analysis
+ - statements
+ - buffett
+userInvocable: true
+disableModelInvocation: false
+---
+
+## Instructions
+
+You are performing a structured financial statement analysis. Follow all steps in order and show your work. Output language must match the user's input language.
+
+**IMPORTANT: This analysis requires BOTH structured data AND external context.** You MUST use `web_search` to gather earnings call insights, industry context, and explanations for data anomalies. An analysis based only on API data without any web research is incomplete. Expect to make 3-6 web searches throughout the analysis.
+
+### Progress Checklist
+
+```
+Earnings Analysis Progress:
+- [ ] Step 1: Gather financial data
+- [ ] Step 2: Income statement analysis
+- [ ] Step 3: Balance sheet analysis
+- [ ] Step 4: Cash flow statement analysis
+- [ ] Step 5: Buffett competitive advantage scoring
+- [ ] Step 6: Quality of earnings assessment
+- [ ] Step 7: SEC filing qualitative analysis
+- [ ] Step 8: Peer comparison (if requested)
+- [ ] Step 9: Present findings
+```
+
+### Step 1: Gather Financial Data
+
+Use `data` tool with `domain="finance"` for all structured data calls.
+
+#### 1a. Structured Data
+
+1. **Annual financial statements** (5 years):
+ ```
+ action: "get_all_financial_statements"
+ params: { ticker: "[TICKER]", period: "annual", limit: 5 }
+ ```
+ This returns income statements, balance sheets, and cash flow statements together.
+
+2. **Quarterly financial statements** (last 4 quarters):
+ ```
+ action: "get_all_financial_statements"
+ params: { ticker: "[TICKER]", period: "quarterly", limit: 4 }
+ ```
+
+3. **Current financial metrics**:
+ ```
+ action: "get_financial_metrics_snapshot"
+ params: { ticker: "[TICKER]" }
+ ```
+
+4. **Company facts**:
+ ```
+ action: "get_company_facts"
+ params: { ticker: "[TICKER]" }
+ ```
+ Extract: `sector`, `industry` (needed for benchmark comparisons in later steps).
+
+5. **Current stock price**:
+ ```
+ action: "get_price_snapshot"
+ params: { ticker: "[TICKER]" }
+ ```
+
+6. **Recent news**:
+ ```
+ action: "get_news"
+ params: { ticker: "[TICKER]", limit: 10 }
+ ```
+ Scan headlines for material events (earnings surprises, guidance changes, M&A, restructuring).
+
+#### 1b. External Context (Web Search): MANDATORY
+
+You MUST run the two REQUIRED web searches below after gathering structured data. These are not optional.
+
+1. **Latest earnings call highlights** (REQUIRED):
+ ```
+ web_search("[COMPANY] latest earnings call highlights key takeaways [CURRENT_YEAR]")
+ ```
+ Extract: management guidance, segment commentary, strategic priorities, forward outlook.
+ This provides the "why" behind the numbers that structured data cannot explain.
+
+2. **Industry/macro backdrop** (REQUIRED):
+ ```
+ web_search("[INDUSTRY] industry outlook trends [CURRENT_YEAR]")
+ ```
+ Extract: industry growth rate, tailwinds/headwinds, regulatory changes, competitive dynamics.
+ This is needed to assess whether the company's performance is company-specific or industry-wide.
+
+3. **Company-specific events** (conditional: run if news headlines or data show a material event):
+ ```
+ web_search("[COMPANY] [EVENT_KEYWORD] impact analysis")
+ ```
+ Examples: acquisition, restructuring, product launch, lawsuit, management change.
+
+**Checkpoint:** Before proceeding to Step 2, verify that you have completed at least 2 web searches above. If you have not, go back and run them now.
+
+### Step 2: Income Statement Analysis
+
+Analyze the income statement across all 5 annual periods. Calculate and present:
+
+1. **Revenue trend**:
+ - Year-over-year growth rate for each year
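+ - CAGR over the 5-year window: `(Revenue_latest / Revenue_earliest)^(1/periods) - 1`, where `periods` is the number of annual intervals between the two data points (4 for 5 annual statements)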
+ - Flag any years with revenue decline
+
+2. **Margin analysis** (calculate for each year, show the trend):
+ - Gross Margin = Gross Profit / Revenue
+ - Operating Margin = Operating Income / Revenue
+ - Net Margin = Net Income / Revenue
+
+3. **Margin benchmarks** (from [financial-ratios-benchmarks.md](references/financial-ratios-benchmarks.md)):
+ - Compare each margin to sector benchmarks
+ - Flag margins that are significantly above or below sector range
+
+4. **EPS analysis**:
+ - EPS trend over 5 years
+ - EPS growth consistency (note any years of decline)
+
+5. **Expense structure**:
+ - Cost of revenue as % of revenue (trend)
+ - SG&A as % of revenue (trend)
+ - R&D as % of revenue (trend, if applicable)
+ - Flag any expense category growing faster than revenue
+
+6. **Contextual explanation** (REQUIRED: use web search results from Step 1b):
+ - For each significant trend or inflection point in the data above, provide a **why** explanation using the earnings call and industry context gathered in Step 1b.
+ - If revenue growth changed direction significantly (acceleration or deceleration > 10pp), run an additional search:
+ `web_search("[COMPANY] revenue [growth/decline] reason [YEAR]")`
+ - If margins shifted by more than 5pp year-over-year, run an additional search:
+ `web_search("[COMPANY] margin [expansion/compression] [YEAR]")`
+ - **Do not present a data table without narrative.** Every major trend must have a "why" attached, citing the source (earnings call, industry report, or company announcement).
+
+Present as a table:
+
+| Metric | Year 1 | Year 2 | Year 3 | Year 4 | Year 5 | 5Y CAGR |
+|--------|--------|--------|--------|--------|--------|---------|
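+
+The margin and growth figures for the table above can be computed with a few small helpers. This is a sketch with illustrative names; series are assumed to be ordered oldest first, and growth differences are in decimal form (0.10 = 10pp).
+
+```python
+from typing import Dict, List
+
+
+def margins(revenue: float, gross_profit: float, operating_income: float, net_income: float) -> Dict[str, float]:
+    """Gross, operating, and net margin for one period."""
+    return {
+        "gross": gross_profit / revenue,
+        "operating": operating_income / revenue,
+        "net": net_income / revenue,
+    }
+
+
+def yoy_growth(series: List[float]) -> List[float]:
+    """Year-over-year growth for each consecutive pair (oldest first)."""
+    return [(cur - prev) / prev for prev, cur in zip(series, series[1:])]
+
+
+def growth_inflections(growth: List[float], threshold: float = 0.10) -> List[int]:
+    """Indices where growth accelerated or decelerated by more than `threshold`."""
+    return [i for i in range(1, len(growth)) if abs(growth[i] - growth[i - 1]) > threshold]
+```
+
+Any index returned by `growth_inflections` marks a year that needs a sourced "why" explanation.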
+
+### Step 3: Balance Sheet Analysis
+
+Analyze the balance sheet across all 5 annual periods:
+
+1. **Liquidity**:
+ - Current Ratio = Current Assets / Current Liabilities
+ - Quick Ratio = (Current Assets - Inventory) / Current Liabilities
+ - Cash and equivalents trend
+
+2. **Leverage**:
+ - Cash vs. Total Debt (short-term + long-term debt)
+ - Debt-to-Equity = Total Liabilities / Total Shareholders' Equity
+ - Interest Coverage = Operating Income / Interest Expense
+ - Debt payoff capacity = Total Debt / Net Income (in years)
+
+3. **Asset quality**:
+ - Receivables Turnover = Revenue / Accounts Receivable
+ - Inventory Turnover = Cost of Revenue / Inventory (if applicable)
+ - Goodwill as % of Total Assets (flag if > 30%)
+
+4. **Equity structure**:
+ - Retained earnings: year-over-year changes (growing?)
+ - Preferred stock: present or absent?
+ - Treasury stock: present? growing? (indicates buybacks)
+
+5. **Working capital trend**:
+ - Net Working Capital = Current Assets - Current Liabilities
+ - Direction of change over 5 years
+
+6. **Contextual explanation** (use web search results from Step 1b + additional searches as needed):
+ - Explain major balance sheet changes using earnings call context from Step 1b.
+ - If total debt changed significantly (> 30% YoY), you MUST search for the reason:
+ `web_search("[COMPANY] debt [issuance/repayment] [YEAR]")`
+ - If goodwill jumped, you MUST search for acquisition context:
+ `web_search("[COMPANY] acquisition [YEAR]")`
+ - If treasury stock changed materially, confirm the buyback program details:
+ `web_search("[COMPANY] share buyback program")`
+
+Compare key ratios to sector benchmarks from [financial-ratios-benchmarks.md](references/financial-ratios-benchmarks.md).
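+
+A compact Python sketch of the liquidity, leverage, and goodwill ratios defined in this step. The dictionary keys are hypothetical field names, not confirmed API fields; map them to whatever the statements actually return.
+
+```python
+from typing import Dict, Optional
+
+
+def safe_div(numerator: float, denominator: float) -> Optional[float]:
+    """Return None instead of raising when the denominator is zero."""
+    return numerator / denominator if denominator else None
+
+
+def balance_sheet_ratios(bs: Dict[str, float], income: Dict[str, float]) -> Dict[str, object]:
+    """Ratios for one annual period, using the formulas from this step."""
+    goodwill_share = safe_div(bs["goodwill"], bs["total_assets"])
+    return {
+        "current_ratio": safe_div(bs["current_assets"], bs["current_liabilities"]),
+        "quick_ratio": safe_div(bs["current_assets"] - bs["inventory"], bs["current_liabilities"]),
+        "debt_to_equity": safe_div(bs["total_liabilities"], bs["shareholders_equity"]),
+        "interest_coverage": safe_div(income["operating_income"], income["interest_expense"]),
+        "debt_payoff_years": safe_div(bs["total_debt"], income["net_income"]),
+        "goodwill_share": goodwill_share,
+        "goodwill_flag": goodwill_share is not None and goodwill_share > 0.30,
+    }
+```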
+
+### Step 4: Cash Flow Statement Analysis
+
+Analyze cash flow statements across all 5 annual periods:
+
+1. **Operating cash flow quality**:
+ - OCF vs. Net Income ratio for each year
+ - Target: OCF/NI > 1.0 (cash earnings exceed accrual earnings)
+ - Trend direction
+
+2. **Free cash flow**:
+ - FCF = Operating Cash Flow - Capital Expenditure
+ - FCF Margin = FCF / Revenue
+ - 5-year FCF trend and CAGR
+
+3. **Capital intensity**:
+ - CapEx / Revenue ratio
+ - CapEx / Net Income ratio (Buffett benchmark: < 25% excellent, < 50% acceptable)
+ - Is CapEx growing faster than revenue? (potential red flag)
+
+4. **Cash flow composition**:
+ - Net cash from operating activities (should be consistently positive)
+ - Net cash from investing activities (negative = investing in growth)
+ - Net cash from financing activities (pattern: debt vs. equity funded?)
+
+5. **Shareholder returns**:
+ - Dividends paid (from financing activities)
+ - Share buybacks / treasury stock repurchase
+ - Total payout ratio = (Dividends + Buybacks) / Net Income
+ - Is the company returning cash while maintaining growth?
+
+6. **Contextual explanation** (use web search results from Step 1b + additional searches as needed):
+ - Explain cash flow patterns using earnings call context from Step 1b.
+ - If CapEx spiked significantly in a particular year, you MUST search for what was built:
+ `web_search("[COMPANY] capital expenditure investment [YEAR]")`
+ - If FCF diverged sharply from net income, search for restructuring or working capital events.
+
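A minimal Python sketch of the Step 4 metrics follows. The function names and argument names are assumptions for illustration, and the inputs are invented. Note that 5 annual periods give 4 growth intervals for the CAGR.

```python
def cagr(first, last, periods):
    """Compound annual growth rate over `periods` intervals.

    Returns None when the rate is undefined (non-positive endpoints).
    """
    if first is None or last is None or first <= 0 or last <= 0 or periods <= 0:
        return None
    return (last / first) ** (1 / periods) - 1


def cash_flow_metrics(ocf, capex, revenue, net_income, dividends=0.0, buybacks=0.0):
    """Compute FCF, margins, capital intensity, and payout ratio for one year."""
    # Cash flow statements often report outflows as negative numbers.
    capex = abs(capex)
    returned = abs(dividends) + abs(buybacks)
    fcf = ocf - capex
    return {
        "fcf": fcf,
        "fcf_margin": fcf / revenue if revenue else None,
        "ocf_to_ni": ocf / net_income if net_income else None,
        "capex_to_revenue": capex / revenue if revenue else None,
        "capex_to_ni": capex / net_income if net_income else None,
        "payout_ratio": returned / net_income if net_income else None,
    }


# Invented figures, for illustration only.
metrics = cash_flow_metrics(
    ocf=150, capex=-30, revenue=1000, net_income=100, dividends=-20, buybacks=-30
)
fcf_growth = cagr(first=100, last=200, periods=4)  # 5 annual periods -> 4 intervals
```

When net income is negative, ratios such as OCF/NI and CapEx/NI are not meaningful; a fuller implementation would probably flag those years rather than report the raw quotient.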
+Present a summary table:
+
+| Metric | Year 1 | Year 2 | Year 3 | Year 4 | Year 5 |
+|--------|--------|--------|--------|--------|--------|
+
+### Step 5: Buffett Competitive Advantage Scoring
+
+Apply the scoring framework from [buffett-checklist.md](references/buffett-checklist.md).
+
+For each of the 13 criteria across 4 categories:
+1. Calculate the metric value from the data gathered in Steps 1-4
+2. Determine the score based on the threshold table
+3. Note the sector-specific caveats (Financials, Utilities, REITs, Growth-stage)
+
+Present the full scorecard table and the overall rating (Excellent / Good / Average / Weak).
+
+### Step 6: Quality of Earnings Assessment
+
+Assess whether reported earnings are backed by real cash and sustainable operations:
+
+1. **Accrual ratio**:
+ - Formula: (Net Income - Operating Cash Flow) / Total Assets
+ - Interpretation: Lower is better. High positive values suggest earnings are driven by accruals rather than cash.
+ - Red flag threshold: > 10%
+
+2. **Revenue recognition quality**:
+ - Compare Accounts Receivable growth rate vs. Revenue growth rate
+ - If AR grows significantly faster than revenue → potential aggressive revenue recognition
+ - Red flag threshold: AR growth > Revenue growth + 5 percentage points
+
+3. **Inventory quality** (if applicable):
+ - Compare Inventory growth rate vs. Cost of Revenue growth rate
+ - Rising inventory vs. flat/declining COGS → potential obsolescence risk
+ - Red flag threshold: Inventory growth > COGS growth + 10 percentage points
+
+4. **One-time items**:
+ - Identify significant non-recurring charges or gains in the income statement
+ - Calculate adjusted net income excluding one-time items
+ - Compare adjusted vs. reported margins
+
+5. **Deferred revenue trend** (if applicable):
+ - Growing deferred revenue is a positive signal (future revenue already contracted)
+ - Declining deferred revenue may signal weakening demand pipeline
+
+6. **External validation** (web search):
+ - If any red flags were triggered above, search for corroborating or mitigating context:
+ `web_search("[COMPANY] accounting concerns OR restatement OR SEC inquiry")`
+ - Check for auditor changes (can signal accounting issues):
+ `web_search("[COMPANY] auditor change OR audit opinion")`
+ - Only run these searches if quantitative red flags exist. Do not search proactively for every company.
+
+Summarize quality of earnings as: **High** / **Moderate** / **Low** with supporting evidence.
+
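The quantitative red-flag thresholds in Step 6 can be expressed as a short check. This is a sketch under stated assumptions: the function names, argument names, and flag labels are hypothetical, and the figures are invented. Only the thresholds (10%, +5 points, +10 points) come from the step above.

```python
def growth(prev, curr):
    """Year-over-year growth rate; None when the prior value is zero or missing."""
    if not prev:
        return None
    return (curr - prev) / abs(prev)


def earnings_quality_flags(net_income, ocf, total_assets,
                           ar_prev, ar_curr, rev_prev, rev_curr,
                           inv_prev=None, inv_curr=None,
                           cogs_prev=None, cogs_curr=None):
    """Return (accrual_ratio, list of triggered red-flag labels)."""
    flags = []

    # Accrual ratio = (Net Income - Operating Cash Flow) / Total Assets
    accrual_ratio = (net_income - ocf) / total_assets
    if accrual_ratio > 0.10:
        flags.append("accrual_ratio")

    # Receivables growing more than 5 points faster than revenue
    ar_g, rev_g = growth(ar_prev, ar_curr), growth(rev_prev, rev_curr)
    if ar_g is not None and rev_g is not None and ar_g > rev_g + 0.05:
        flags.append("receivables")

    # Inventory growing more than 10 points faster than COGS (if applicable)
    if inv_prev is not None and cogs_prev is not None:
        inv_g, cogs_g = growth(inv_prev, inv_curr), growth(cogs_prev, cogs_curr)
        if inv_g is not None and cogs_g is not None and inv_g > cogs_g + 0.10:
            flags.append("inventory")

    return accrual_ratio, flags


# Invented figures: AR grows 30% while revenue grows 10%.
accrual, flags = earnings_quality_flags(
    net_income=100, ocf=60, total_assets=1000,
    ar_prev=100, ar_curr=130, rev_prev=1000, rev_curr=1100,
    inv_prev=200, inv_curr=210, cogs_prev=600, cogs_curr=630,
)
```

An empty flag list corresponds to the case where the external validation searches in item 6 are skipped.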
+### Step 7: SEC Filing Qualitative Analysis
+
+Pull and analyze the most recent annual or quarterly filing:
+
+1. **Get filing list**:
+ ```
+ action: "get_filings"
+ params: { ticker: "[TICKER]", filing_type: "10-K", limit: 1 }
+ ```
+ If the 10-K is not recent enough, also pull the latest 10-Q:
+ ```
+ action: "get_filings"
+ params: { ticker: "[TICKER]", filing_type: "10-Q", limit: 1 }
+ ```
+
+2. **Read MD&A section** (Management's Discussion and Analysis):
+ ```
+ action: "get_filing_items"
+ params: { ticker: "[TICKER]", filing_type: "10-K", item: "7" }
+ ```
+ For 10-Q, MD&A is item "2":
+ ```
+ action: "get_filing_items"
+ params: { ticker: "[TICKER]", filing_type: "10-Q", item: "2" }
+ ```
+
+3. **Read Risk Factors**:
+ ```
+ action: "get_filing_items"
+ params: { ticker: "[TICKER]", filing_type: "10-K", item: "1A" }
+ ```
+
+4. **Extract and analyze**:
+ - Management's explanation of revenue and margin trends
+ - Forward-looking statements and guidance
+ - Key risk factors that could impact financial health
+ - Any disclosures about accounting policy changes
+ - Cross-validate: Does management narrative align with the quantitative data from Steps 2-4?
+ - Flag contradictions between management tone and actual numbers
+
+5. **Supplement with earnings call transcript** (REQUIRED: web search/fetch):
+ You MUST search for and incorporate the most recent earnings call. This is critical for understanding management's forward-looking view.
+ - Search for the transcript:
+ `web_search("[COMPANY] [QUARTER] [YEAR] earnings call transcript")`
+ - If a transcript URL is found, use `web_fetch` to read key sections (CEO/CFO prepared remarks, Q&A highlights).
+ - Extract: forward guidance, segment-level commentary, management tone on competitive position, key analyst concerns.
+ - Cross-reference earnings call statements with MD&A disclosures and flag any inconsistencies.
+
+6. **Summarize key insights**:
+ - What management says about the business trajectory
+ - Material risks not visible in the numbers alone
+ - Any changes in risk factors vs. prior filings (if noticeable)
+ - Key analyst questions and management responses from earnings call (if available)
+
+### Step 8: Peer Comparison (Conditional)
+
+**Execute this step only when the user explicitly requests peer comparison or industry benchmarking.**
+
+1. **Identify peers**:
+ - Use the `sector` and `industry` from `get_company_facts`
+ - Select 2-3 publicly traded competitors in the same industry
+ - If the user specifies peers, use those instead
+
+2. **Pull peer data** (for each peer):
+ ```
+ action: "get_financial_metrics_snapshot"
+ params: { ticker: "[PEER_TICKER]" }
+ ```
+ ```
+ action: "get_income_statements"
+ params: { ticker: "[PEER_TICKER]", period: "annual", limit: 1 }
+ ```
+ ```
+ action: "get_balance_sheets"
+ params: { ticker: "[PEER_TICKER]", period: "annual", limit: 1 }
+ ```
+
+3. **Comparative table**:
+
+ | Metric | [TARGET] | [PEER 1] | [PEER 2] | [PEER 3] | Sector Avg |
+ |--------|----------|----------|----------|----------|------------|
+ | Revenue Growth (YoY) | | | | | |
+ | Gross Margin | | | | | |
+ | Net Margin | | | | | |
+ | ROE | | | | | |
+ | D/E Ratio | | | | | |
+ | FCF Margin | | | | | |
+ | P/E Ratio | | | | | |
+
+4. **Competitive position assessment**:
+ - Where does the target company rank among peers on each metric?
+ - Identify clear advantages and disadvantages relative to peers
+ - Note if the target trades at a premium or discount to peers and whether it's justified
+
+### Step 9: Present Findings
+
+Compile the full analysis into a structured report. Follow this exact structure:
+
+#### 1. Executive Summary
+- Company name, ticker, sector, current price
+- One-paragraph thesis: Is this a financially healthy company with a durable competitive advantage?
+- Financial health rating from Buffett scorecard (Excellent / Good / Average / Weak)
+- Earnings quality assessment (High / Moderate / Low)
+
+#### 2. Financial Health Scorecard
+- Full Buffett checklist scorecard table from Step 5
+- Total score and rating
+
+#### 3. Trend Dashboard
+- 5-year key metrics trend table from Steps 2-4:
+
+| Metric | Y1 | Y2 | Y3 | Y4 | Y5 | Trend |
+|--------|----|----|----|----|----|----|
+| Revenue | | | | | | arrow |
+| Gross Margin | | | | | | arrow |
+| Net Margin | | | | | | arrow |
+| ROE | | | | | | arrow |
+| D/E Ratio | | | | | | arrow |
+| FCF | | | | | | arrow |
+| OCF/NI | | | | | | arrow |
+| CapEx/NI | | | | | | arrow |
+
+Use directional indicators in the Trend column.
+
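One way to derive the Trend column is to compare the first and last available values of each metric. The sketch below is an assumption, not a rule from the skill: the function name, the 2% relative tolerance for "flat", and the choice of arrow characters are all illustrative.

```python
def trend_indicator(values, tolerance=0.02):
    """Return an arrow for the change between the first and last non-missing value.

    `tolerance` is the relative change treated as flat (an assumed 2% here).
    """
    points = [v for v in values if v is not None]
    if len(points) < 2 or points[0] == 0:
        return "n/a"
    change = (points[-1] - points[0]) / abs(points[0])
    if change > tolerance:
        return "↑"
    if change < -tolerance:
        return "↓"
    return "→"


# Invented series, for illustration only.
revenue_trend = trend_indicator([100, 110, 120, 130, 150])
margin_trend = trend_indicator([0.30, 0.29, 0.31, 0.30, 0.30])
```

An endpoint comparison ignores the path in between, so a metric that spiked and reverted would read as flat; the narrative sections should still describe such swings.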
+#### 4. Quality of Earnings
+- Summary from Step 6 with key metrics and assessment
+
+#### 5. Key Strengths & Red Flags
+- **Strengths**: List 3-5 financial strengths with supporting data
+- **Red Flags**: List any warning signs discovered during analysis. If none, state "No material red flags identified."
+
+Common red flags to watch for:
+- Revenue growth but declining margins
+- Net income growing but OCF declining
+- AR growing faster than revenue
+- Inventory building up vs. flat COGS
+- Rising debt with declining interest coverage
+- Retained earnings declining
+- Large goodwill relative to total assets
+- CapEx consistently > 50% of net income
+- Management tone in MD&A contradicts financial data
+
+#### 6. SEC Filing Insights
+- Key findings from Step 7
+- Management's outlook and material risks
+
+#### 7. Peer Comparison (if Step 8 was executed)
+- Comparative table and competitive position assessment
+
+### Guardrails
+
+- Always state the date range of financial data used.
+- If any data is missing or unavailable, explicitly note it and adjust the analysis scope.
+- Do not present calculated ratios with false precision; round to one decimal place.
+- Clearly distinguish between facts (from data) and interpretive conclusions.
+- The Buffett scorecard is a screening framework, not a buy/sell recommendation. State this in the output.
+- For non-US companies or companies not filing with the SEC, skip Step 7 and note the limitation.
+- Output language must match the user's input language (Chinese input → Chinese output, English input → English output).
+
+### Web Search Requirements
+
+**Minimum mandatory searches (you MUST perform these):**
+1. Earnings call highlights (Step 1b): for management's own explanation of results
+2. Industry outlook (Step 1b): for macro/sector context
+3. Earnings call transcript (Step 7): for forward guidance and analyst Q&A
+
+**Additional searches (trigger when data shows anomalies):**
+- Revenue or margin inflection points (Steps 2-4)
+- Major debt changes or acquisitions (Step 3)
+- CapEx spikes (Step 4)
+- Quality-of-earnings red flags (Step 6)
+
+**Search principles:**
+- **Source quality**: Prefer primary sources (SEC filings, company press releases, earnings call transcripts) over secondary sources (analyst blogs, news aggregators).
+- **Cite with dates**: Always include source name and date when referencing external information.
+- **Separate fact from opinion**: Label analyst or media commentary as external opinion, not fact.
+- **Total budget**: Expect 3-8 web searches per analysis. Fewer than 3 means you are likely missing critical context.
diff --git a/skills/earnings-analysis/references/buffett-checklist.md b/skills/earnings-analysis/references/buffett-checklist.md
new file mode 100644
index 00000000..ebab5bae
--- /dev/null
+++ b/skills/earnings-analysis/references/buffett-checklist.md
@@ -0,0 +1,99 @@
+# Buffett Competitive Advantage Checklist
+
+Score each criterion and calculate a total. Use this to assess whether a company has a durable competitive advantage (economic moat).
+
+## Scoring System
+
+Total: 100 points across 4 categories (25 points each).
+
+### Category 1: Profitability (25 points)
+
+| # | Criterion | Excellent | Good | Weak |
+|---|-----------|-----------|------|------|
+| 1 | **Gross Margin** | > 40% → **10 pts** | 30-40% → **6 pts** | < 30% → **2 pts** |
+| 2 | **Net Margin** | > 20% → **10 pts** | 10-20% → **6 pts** | < 10% → **2 pts** |
+| 3 | **Return on Equity (ROE)** | > 15% → **5 pts** | 10-15% → **3 pts** | < 10% → **1 pt** |
+
+How to calculate:
+- Gross Margin = Gross Profit / Revenue
+- Net Margin = Net Income / Revenue
+- ROE = Net Income / Total Shareholders' Equity
+- Use the most recent annual figures; cross-check with 5-year average
+
+### Category 2: Balance Sheet Health (25 points)
+
+| # | Criterion | Pass | Partial | Fail |
+|---|-----------|------|---------|------|
+| 4 | **Cash > Total Debt** | Yes → **8 pts** | Cash > 50% of Debt → **4 pts** | Cash < 50% of Debt → **1 pt** |
+| 5 | **Debt-to-Equity Ratio** | < 0.8 → **7 pts** | 0.8-1.5 → **4 pts** | > 1.5 → **1 pt** |
+| 6 | **No Preferred Stock** | None → **5 pts** | N/A | Has Preferred → **0 pts** |
+| 7 | **Retained Earnings Growth** | Growing 5 consecutive years → **5 pts** | Growing 3-4 years → **3 pts** | Declining or flat → **1 pt** |
+
+How to calculate:
+- Cash = Cash and Cash Equivalents + Short-term Investments
+- Total Debt = Short-term Debt + Long-term Debt
+- D/E = Total Liabilities / Total Shareholders' Equity
+- Retained Earnings: Compare year-over-year from balance sheets
+
+Special note on D/E:
+- Exclude operating lease liabilities from "debt" for this assessment (they are contractual obligations, not financial debt)
+- If treasury stock is large, it reduces equity and inflates D/E; note this in the analysis
+
+### Category 3: Cash Flow Quality (25 points)
+
+| # | Criterion | Excellent | Good | Weak |
+|---|-----------|-----------|------|------|
+| 8 | **CapEx / Net Income** | < 25% → **10 pts** | 25-50% → **6 pts** | > 50% → **2 pts** |
+| 9 | **Operating CF > Net Income** | OCF/NI > 1.0 → **8 pts** | OCF/NI = 0.8-1.0 → **4 pts** | OCF/NI < 0.8 → **1 pt** |
+| 10 | **Shareholder Returns** | Buybacks + Dividends → **7 pts** | Dividends only → **4 pts** | Neither → **1 pt** |
+
+How to calculate:
+- CapEx: Capital Expenditure from cash flow statement (use absolute value)
+- Operating CF: Net Cash from Operating Activities
+- Buybacks: Check if Treasury Stock increased year-over-year, or look at "repurchase of common stock" in financing activities
+- Dividends: Look at "dividends paid" in financing activities
+
+Note on CapEx:
+- One-time large CapEx (e.g., new factory, data center buildout) should be noted but not penalized if the 5-year average CapEx/NI is still within range
+- Asset-light businesses (software, services) naturally score well here
+
+### Category 4: Consistency (25 points)
+
+| # | Criterion | Excellent | Good | Weak |
+|---|-----------|-----------|------|------|
+| 11 | **Revenue Growth Streak** | 5+ consecutive years growing → **10 pts** | 3-4 years → **6 pts** | < 3 years → **2 pts** |
+| 12 | **Net Income Growth Streak** | 5+ consecutive years growing → **10 pts** | 3-4 years → **6 pts** | < 3 years → **2 pts** |
+| 13 | **Recession Resilience** | Profitable through last recession → **5 pts** | Revenue dip < 10% → **3 pts** | Significant losses → **1 pt** |
+
+How to assess:
+- Revenue/NI growth: Check year-over-year changes for the last 5 years
+- Recession resilience: Check 2020 (COVID) and 2022 (rate hikes) performance. For older data, check 2008-2009 if available.
+- A single flat year in an otherwise consistent growth streak can be scored as "Good"
+
+## Score Interpretation
+
+| Total Score | Rating | Interpretation |
+|-------------|--------|----------------|
+| 80-100 | **Excellent** | Strong durable competitive advantage. Consistent profitability, fortress balance sheet, capital-light operations. Classic Buffett-style investment candidate. |
+| 60-79 | **Good** | Solid business with some competitive advantages. May have minor weaknesses in one category. Worth deeper investigation. |
+| 40-59 | **Average** | Mediocre competitive position. Multiple areas of concern. Higher risk of margin erosion or competitive disruption. |
+| < 40 | **Weak** | No clear competitive advantage. High debt, inconsistent earnings, or capital-intensive operations. Not a typical Buffett investment. |
+
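The threshold tables and the score interpretation above can be sketched in Python as follows. Only Category 1 is shown. The function names are hypothetical, and the handling of exact boundary values (for example, a gross margin of exactly 40% scoring as "Good") is an assumption read from the ">" and "30-40%" wording of the table.

```python
def tier_score(value, excellent_above, good_from, points):
    """Map a metric to (excellent, good, weak) points using table thresholds."""
    excellent, good, weak = points
    if value > excellent_above:
        return excellent
    if value >= good_from:
        return good
    return weak


def profitability_score(gross_margin, net_margin, roe):
    """Category 1 (max 25 points). Inputs are fractions, e.g. 0.432 for 43.2%."""
    return (
        tier_score(gross_margin, 0.40, 0.30, (10, 6, 2))
        + tier_score(net_margin, 0.20, 0.10, (10, 6, 2))
        + tier_score(roe, 0.15, 0.10, (5, 3, 1))
    )


def rating(total_score):
    """Map the 0-100 total to the rating bands in the Score Interpretation table."""
    if total_score >= 80:
        return "Excellent"
    if total_score >= 60:
        return "Good"
    if total_score >= 40:
        return "Average"
    return "Weak"


# Margins taken from the Output Format example; the ROE value is invented.
category_1 = profitability_score(gross_margin=0.432, net_margin=0.251, roe=0.12)
```

The sector-specific substitutions below (for example, net interest margin for Financials) would replace individual `tier_score` calls rather than change the rating bands.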
+## Sector-Specific Caveats
+
+- **Financials**: Skip gross margin (criterion 1). Use net interest margin > 3% as substitute for 10 pts. D/E ratio thresholds don't apply; use Tier 1 Capital Ratio > 10% for 7 pts instead.
+- **Utilities**: Naturally capital-intensive (CapEx criterion will score low). Offset by checking regulated return stability. If regulated ROE is consistently 9-11%, award 6 pts for criterion 8.
+- **REITs**: Required to distribute at least 90% of taxable income as dividends, so retained earnings won't grow. Skip criterion 7; award 5 pts if FFO per share grows consistently instead.
+- **Growth-stage Tech**: May not yet have 5 years of profitability. Score consistency based on revenue growth and gross margin expansion trajectory. Note that the overall score may be artificially low.
+
+## Output Format
+
+Present the scorecard as a table:
+
+| # | Criterion | Value | Score | Max |
+|---|-----------|-------|-------|-----|
+| 1 | Gross Margin | 43.2% | 10 | 10 |
+| 2 | Net Margin | 25.1% | 10 | 10 |
+| ... | ... | ... | ... | ... |
+| | **Total** | | **XX** | **100** |
+| | **Rating** | | **Excellent/Good/Average/Weak** | |
diff --git a/skills/earnings-analysis/references/financial-ratios-benchmarks.md b/skills/earnings-analysis/references/financial-ratios-benchmarks.md
new file mode 100644
index 00000000..5e0b537e
--- /dev/null
+++ b/skills/earnings-analysis/references/financial-ratios-benchmarks.md
@@ -0,0 +1,70 @@
+# Financial Ratios Benchmarks by Sector
+
+Use the company's `sector` from `get_company_facts` to look up benchmark ranges below. Compare the company's ratios against these benchmarks and note deviations.
+
+## Profitability Benchmarks
+
+| Sector | Gross Margin | Operating Margin | Net Margin | ROE | ROA |
+|--------|-------------|-----------------|------------|-----|-----|
+| Communication Services | 50-60% | 15-25% | 10-18% | 12-20% | 5-10% |
+| Consumer Discretionary | 35-50% | 8-15% | 5-10% | 15-25% | 5-10% |
+| Consumer Staples | 35-45% | 12-18% | 8-12% | 20-30% | 8-12% |
+| Energy | 30-50% | 10-20% | 5-15% | 10-20% | 5-10% |
+| Financials | N/A | 25-35% | 15-25% | 10-15% | 1-2% |
+| Health Care | 55-70% | 15-25% | 10-20% | 15-25% | 8-12% |
+| Industrials | 25-35% | 10-15% | 6-10% | 15-20% | 5-8% |
+| Information Technology | 55-70% | 20-30% | 15-25% | 20-35% | 10-15% |
+| Materials | 25-35% | 10-18% | 5-12% | 10-18% | 5-8% |
+| Real Estate | 55-70% | 25-40% | 15-30% | 5-10% | 2-5% |
+| Utilities | 35-50% | 15-25% | 8-15% | 8-12% | 3-5% |
+
+## Balance Sheet Benchmarks
+
+| Sector | Current Ratio | Quick Ratio | D/E Ratio | Interest Coverage |
+|--------|--------------|-------------|-----------|-------------------|
+| Communication Services | 1.0-1.5 | 0.8-1.2 | 0.8-1.5 | 4-8x |
+| Consumer Discretionary | 1.2-2.0 | 0.8-1.5 | 0.5-1.2 | 5-10x |
+| Consumer Staples | 1.0-1.5 | 0.6-1.0 | 0.5-1.0 | 8-15x |
+| Energy | 1.0-1.5 | 0.8-1.2 | 0.3-0.8 | 5-10x |
+| Financials | N/A | N/A | 2.0-8.0 | N/A |
+| Health Care | 1.5-2.5 | 1.2-2.0 | 0.3-0.8 | 8-15x |
+| Industrials | 1.2-2.0 | 0.8-1.5 | 0.5-1.0 | 6-12x |
+| Information Technology | 2.0-3.5 | 1.5-3.0 | 0.2-0.6 | 15-30x |
+| Materials | 1.5-2.5 | 1.0-1.5 | 0.4-0.8 | 6-12x |
+| Real Estate | 1.0-1.5 | 0.5-1.0 | 0.8-1.5 | 3-5x |
+| Utilities | 0.8-1.2 | 0.5-0.8 | 1.0-2.0 | 3-5x |
+
+## Cash Flow Benchmarks
+
+| Sector | FCF Margin | CapEx/Revenue | Op. CF / Net Income |
+|--------|-----------|---------------|---------------------|
+| Communication Services | 10-20% | 10-20% | 1.2-1.8x |
+| Consumer Discretionary | 5-12% | 3-8% | 1.1-1.5x |
+| Consumer Staples | 8-15% | 3-6% | 1.2-1.5x |
+| Energy | 5-15% | 15-30% | 1.5-2.5x |
+| Financials | N/A | 1-3% | N/A |
+| Health Care | 15-25% | 3-8% | 1.2-1.8x |
+| Industrials | 5-12% | 3-8% | 1.2-1.6x |
+| Information Technology | 20-35% | 3-10% | 1.2-1.8x |
+| Materials | 5-12% | 5-12% | 1.3-2.0x |
+| Real Estate | 15-30% | 5-15% | 1.5-3.0x |
+| Utilities | 5-10% | 15-25% | 2.0-3.5x |
+
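Comparing a company ratio against a sector range can be sketched as a simple lookup. The dictionary below copies three net margin rows from the Profitability Benchmarks table; the names `NET_MARGIN_BENCHMARKS` and `compare_to_benchmark` are assumptions for illustration, and the company values are invented.

```python
# Net margin ranges (low, high) as fractions, copied from the
# Profitability Benchmarks table. Only three sectors are shown.
NET_MARGIN_BENCHMARKS = {
    "Information Technology": (0.15, 0.25),
    "Consumer Staples": (0.08, 0.12),
    "Utilities": (0.08, 0.15),
}


def compare_to_benchmark(value, low, high):
    """Classify a ratio as below, within, or above the sector range."""
    if value < low:
        return "below"
    if value > high:
        return "above"
    return "within"


def net_margin_position(sector, net_margin):
    """Return the position versus the sector range, or None for unknown sectors."""
    bounds = NET_MARGIN_BENCHMARKS.get(sector)
    if bounds is None:
        return None
    return compare_to_benchmark(net_margin, *bounds)


# Invented company figures, for illustration only.
tech_position = net_margin_position("Information Technology", 0.251)
staples_position = net_margin_position("Consumer Staples", 0.10)
```

Sectors marked N/A in the tables (for example, gross margin for Financials) are best left out of the lookup so that the comparison returns `None` instead of a misleading label.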
+## Usage Notes
+
+- **Financials sector**: Gross margin and current/quick ratios are not meaningful for banks and insurers. Use net interest margin and capital adequacy ratios instead.
+- **Real Estate**: High depreciation makes net margin less useful. Focus on Funds From Operations (FFO).
+- **Growth-stage companies**: May have negative margins. Compare against growth-stage peers rather than mature sector benchmarks.
+- **Cyclical sectors** (Energy, Materials, Industrials): Use cycle-average margins (5-7 years) rather than single-year comparisons.
+- **Post-M&A**: Goodwill and amortization may distort margins for 1-2 years after acquisitions. Note any large acquisitions.
+
+## Buffett's Rules of Thumb (Quick Reference)
+
+| Metric | Excellent | Good | Weak |
+|--------|-----------|------|------|
+| Gross Margin | > 40% | 30-40% | < 30% |
+| Net Margin | > 20% | 10-20% | < 10% |
+| ROE | > 15% | 10-15% | < 10% |
+| D/E Ratio | < 0.5 | 0.5-0.8 | > 0.8 |
+| CapEx / Net Income | < 25% | 25-50% | > 50% |
+| Debt Payoff (years) | < 2 | 2-4 | > 4 |
diff --git a/skills/finance-research/SKILL.md b/skills/finance-research/SKILL.md
new file mode 100644
index 00000000..1aae38dc
--- /dev/null
+++ b/skills/finance-research/SKILL.md
@@ -0,0 +1,171 @@
+---
+name: Finance Research
+description: Conduct analyst-grade financial research across primary and secondary markets using structured financial data plus macro and public-information cross-checks.
+version: 1.1.1
+metadata:
+ emoji: "\U0001F4CA"
+ tags:
+ - finance
+ - research
+ - stocks
+ - data
+ - macro
+ - sentiment
+userInvocable: true
+disableModelInvocation: false
+---
+
+## Instructions
+
+You are conducting financial research to an analyst-grade standard. Tool usage is a dynamic decision. Do not force tool combinations. Choose tools based on evidence sufficiency for the specific question.
+
+### Available Data Actions
+
+#### Price Data
+- `get_price_snapshot`: Current stock price. Params: `{ ticker }`
+- `get_prices`: Historical OHLCV prices. Params: `{ ticker, start_date, end_date, interval?, interval_multiplier? }`
+ - interval: "day" (default), "week", "month", "year"
+- `get_crypto_price_snapshot`: Current crypto price. Params: `{ ticker }` (e.g. "BTC-USD")
+- `get_crypto_prices`: Historical crypto prices. Same params as `get_prices`.
+- `get_available_crypto_tickers`: List available crypto tickers. Params: `{}`
+
+#### Financial Statements
+All share params: `{ ticker, period, limit?, report_period_gt?, report_period_gte?, report_period_lt?, report_period_lte? }`
+- period: "annual", "quarterly", or "ttm"
+- Dates in YYYY-MM-DD format
+
+Actions:
+- `get_income_statements`: Revenue, expenses, net income, EPS
+- `get_balance_sheets`: Assets, liabilities, equity, debt, cash
+- `get_cash_flow_statements`: Operating, investing, financing cash flows, FCF
+- `get_all_financial_statements`: All three at once (more efficient when you need multiple)
+
+#### Metrics & Estimates
+- `get_financial_metrics_snapshot`: Current key ratios (P/E, market cap, margins, etc.). Params: `{ ticker }`
+- `get_financial_metrics`: Historical metrics. Params: `{ ticker, period?, limit?, report_period*? }`
+- `get_analyst_estimates`: EPS and revenue estimates. Params: `{ ticker, period? }`
+
+#### Company Info
+- `get_company_facts`: Sector, industry, employees, exchange, website. Params: `{ ticker }`
+- `get_news`: Recent company news articles. Params: `{ ticker, start_date?, end_date?, limit? }`
+- `get_insider_trades`: Insider buying/selling (SEC Form 4). Params: `{ ticker, limit?, filing_date*? }`
+- `get_segmented_revenues`: Revenue by segment/geography. Params: `{ ticker, period, limit? }`
+
+#### SEC Filings
+- `get_filings`: List filings metadata. Params: `{ ticker, filing_type?, limit? }`
+- `get_filing_items`: Read filing sections. Params: `{ ticker, filing_type, accession_number?, item? }`
+
+### Evidence Sufficiency Gate (Internal Decision)
+
+Before deep analysis, make an internal evidence decision. Do not output a technical decision block by default.
+
+If the user explicitly asks for methodology or reasoning transparency, provide a concise plain-language explanation of your research approach.
+
+Decision policy:
+
+- Start with `data_only` when structured data can support the requested conclusion.
+- Escalate to `hybrid` when the task is event-driven, time-sensitive, or requires causal explanation not visible in structured data alone.
+- Use `web_first` only when the task is mainly document/news/policy driven (common in pre-IPO without stable ticker coverage).
+- If a tool is unavailable, continue with available tools and explicitly downgrade confidence.
+
+### Core Analysis Framework
+
+1. **Scope & Market Type**
+- Identify if this is primary market (IPO, pre-IPO, follow-on, placement) or secondary market (listed stock/sector/index).
+- State region and analysis horizon (event-driven, 3-6 months, 1-3 years).
+
+2. **Core Company Data (Structured)**
+- Start with: `get_price_snapshot`, `get_company_facts`, `get_financial_metrics_snapshot`.
+- Pull statements (`get_all_financial_statements`) and estimates as needed.
+
+3. **Macro & Policy Context (Conditional)**
+- Use `web_search` / `web_fetch` only if required by your internal evidence decision.
+- If used, prefer high-signal primary sources (central bank, regulator, official releases).
+- For time-sensitive conclusions, include source dates explicitly.
+
+4. **News & Sentiment Context (Conditional)**
+- Use `get_news` for company-linked coverage when available.
+- Add web cross-checks only when event validation materially affects the conclusion.
+
+5. **Synthesis & Decision**
+- Separate **facts**, **inference**, and **assumptions**.
+- Build bull/base/bear scenarios with explicit trigger conditions.
+- Provide confidence level and explain the main uncertainty drivers.
+
+### Primary Market (一级市场) Workflow
+
+When asked about IPOs, pre-IPO, or new issuance:
+
+1. **Deal Basics**
+- Identify issuer, listing venue, offering structure (primary/secondary shares), expected timeline.
+- Determine whether a reliable ticker exists in current data coverage.
+
+2. **Filing/Prospectus Review**
+- Prefer official documents (e.g., S-1/F-1/prospectus) via `web_search` + `web_fetch`.
+- Extract: use of proceeds, customer concentration, related-party transactions, share classes, lock-up, dilution risks.
+
+Primary-market capability boundary:
+- If `ticker` is available and filings are retrievable, run hybrid analysis (structured + document evidence).
+- If `ticker` is unavailable or structured filing fields are limited, run web-led analysis and clearly label it as partial-coverage with reduced confidence.
+
+3. **Valuation & Comparable Set**
+- Build peer set from listed comps (secondary market tickers) and compare growth, margin, and valuation multiples.
+- Flag gaps between issuer narrative and peer reality.
+
+4. **Deal Risk Map**
+- Highlight red flags: weak FCF quality, aggressive non-GAAP adjustments, concentrated revenue, regulatory overhang.
+- Provide post-listing watch items: lock-up expiry, first earnings, guidance revisions.
+
+### Secondary Market (二级市场) Workflow
+
+When asked about listed equities:
+
+1. **Trend & Positioning**
+- Pull 1y price history (`get_prices`) and identify regime (uptrend/range/downtrend) with volatility context.
+
+2. **Fundamentals**
+- Analyze growth quality (revenue vs FCF), margin durability, leverage, and capital allocation.
+
+3. **Valuation**
+- Compare current multiples to historical bands and peers (when peer data is available).
+- Connect valuation premium/discount to expected growth and risk profile.
+
+4. **Catalysts & Risks**
+- Earnings, guidance, product cycle, policy changes, rates/FX/commodity sensitivity, insider activity.
+
+### Output Standard
+
+Always include:
+
+1. **Executive Summary** (thesis + stance + confidence)
+2. **Evidence Table** with columns:
+- Signal
+- Direction (Bull/Bear/Neutral)
+- Why it matters
+- Source
+- Date
+3. **Scenario Table** (bull/base/bear with probabilities or relative weights)
+4. **Key Monitoring Triggers** (what would invalidate current thesis)
+
+### Guardrails
+
+- Always state data cutoff dates.
+- If data is missing, explicitly mark it and show the impact on confidence.
+- Do not present assumptions as facts.
+- For event-driven conclusions, if you skip web validation, explicitly explain why structured evidence is still sufficient.
+
+
+### Example: Secondary Market Analysis
+
+For "Analyze Apple's investment outlook":
+
+1. `data(domain="finance", action="get_price_snapshot", params={ticker: "AAPL"})`
+2. `data(domain="finance", action="get_company_facts", params={ticker: "AAPL"})`
+3. `data(domain="finance", action="get_all_financial_statements", params={ticker: "AAPL", period: "annual", limit: 3})`
+4. `data(domain="finance", action="get_financial_metrics", params={ticker: "AAPL", period: "quarterly", limit: 8})`
+5. `data(domain="finance", action="get_analyst_estimates", params={ticker: "AAPL", period: "annual"})`
+6. `data(domain="finance", action="get_news", params={ticker: "AAPL", limit: 10})`
+7. `web_search(query="latest Fed policy decision impact on US mega-cap tech valuations")`
+8. `web_search(query="Apple supply chain or regulatory news latest quarter")`
+
+Then synthesize fundamental trend, macro regime, and event sentiment into a scenario-based conclusion.
diff --git a/skills/pdf/SKILL.md b/skills/pdf/SKILL.md
new file mode 100644
index 00000000..c35eb4ee
--- /dev/null
+++ b/skills/pdf/SKILL.md
@@ -0,0 +1,335 @@
+---
+name: PDF Processing
+description: Use this skill whenever the user wants to do anything with PDF files. This includes reading or extracting text/tables from PDFs, combining or merging multiple PDFs into one, splitting PDFs apart, rotating pages, adding watermarks, creating new PDFs, filling PDF forms, encrypting/decrypting PDFs, extracting images, and OCR on scanned PDFs to make them searchable. If the user mentions a .pdf file or asks to produce one, use this skill.
+version: 1.0.0
+metadata:
+  emoji: "π"
+  tags:
+    - office
+    - document
+    - pdf
+  install:
+    - id: brew-poppler
+      kind: brew
+      formula: poppler
+      bins: [pdftoppm, pdftotext, pdfimages]
+      label: "Install poppler for PDF text/image extraction"
+      os: [darwin, linux]
+    - id: brew-qpdf
+      kind: brew
+      formula: qpdf
+      bins: [qpdf]
+      label: "Install qpdf for advanced PDF manipulation"
+      os: [darwin, linux]
+userInvocable: true
+disableModelInvocation: false
+---
+
+# PDF Processing Guide
+
+## Overview
+
+This guide covers essential PDF processing operations using Python libraries and command-line tools. For advanced features, JavaScript libraries, and detailed examples, see reference.md. If you need to fill out a PDF form, read forms.md and follow its instructions.
+
+## Quick Start
+
+```python
+from pypdf import PdfReader, PdfWriter
+
+# Read a PDF
+reader = PdfReader("document.pdf")
+print(f"Pages: {len(reader.pages)}")
+
+# Extract text
+text = ""
+for page in reader.pages:
+    text += page.extract_text()
+```
+
+## Python Libraries
+
+### pypdf - Basic Operations
+
+#### Merge PDFs
+```python
+from pypdf import PdfWriter, PdfReader
+
+writer = PdfWriter()
+for pdf_file in ["doc1.pdf", "doc2.pdf", "doc3.pdf"]:
+    reader = PdfReader(pdf_file)
+    for page in reader.pages:
+        writer.add_page(page)
+
+with open("merged.pdf", "wb") as output:
+    writer.write(output)
+```
+
+#### Split PDF
+```python
+from pypdf import PdfReader, PdfWriter
+
+reader = PdfReader("input.pdf")
+for i, page in enumerate(reader.pages):
+    writer = PdfWriter()  # One single-page writer per output file
+    writer.add_page(page)
+    with open(f"page_{i+1}.pdf", "wb") as output:
+        writer.write(output)
+```
+
+#### Extract Metadata
+```python
+from pypdf import PdfReader
+
+reader = PdfReader("document.pdf")
+meta = reader.metadata
+print(f"Title: {meta.title}")
+print(f"Author: {meta.author}")
+print(f"Subject: {meta.subject}")
+print(f"Creator: {meta.creator}")
+```
+
+#### Rotate Pages
+```python
+from pypdf import PdfReader, PdfWriter
+
+reader = PdfReader("input.pdf")
+writer = PdfWriter()
+
+# Only the first page is rotated and written to the output
+page = reader.pages[0]
+page.rotate(90) # Rotate 90 degrees clockwise
+writer.add_page(page)
+
+with open("rotated.pdf", "wb") as output:
+    writer.write(output)
+```
+
+### pdfplumber - Text and Table Extraction
+
+#### Extract Text with Layout
+```python
+import pdfplumber
+
+with pdfplumber.open("document.pdf") as pdf:
+    for page in pdf.pages:
+        text = page.extract_text()
+        print(text)
+```
+
+#### Extract Tables
+```python
+import pdfplumber
+
+with pdfplumber.open("document.pdf") as pdf:
+    for i, page in enumerate(pdf.pages):
+        tables = page.extract_tables()
+        for j, table in enumerate(tables):
+            print(f"Table {j+1} on page {i+1}:")
+            for row in table:
+                print(row)
+```
+
+#### Advanced Table Extraction
+```python
+import pandas as pd
+import pdfplumber
+
+with pdfplumber.open("document.pdf") as pdf:
+    all_tables = []
+    for page in pdf.pages:
+        tables = page.extract_tables()
+        for table in tables:
+            if table:  # Check if table is not empty
+                # Treat the first row as the header
+                df = pd.DataFrame(table[1:], columns=table[0])
+                all_tables.append(df)
+
+# Combine all tables
+if all_tables:
+    combined_df = pd.concat(all_tables, ignore_index=True)
+    combined_df.to_excel("extracted_tables.xlsx", index=False)
+```
+
+### reportlab - Create PDFs
+
+#### Basic PDF Creation
+```python
+from reportlab.lib.pagesizes import letter
+from reportlab.pdfgen import canvas
+
+c = canvas.Canvas("hello.pdf", pagesize=letter)
+width, height = letter
+
+# Add text
+c.drawString(100, height - 100, "Hello World!")
+c.drawString(100, height - 120, "This is a PDF created with reportlab")
+
+# Add a line
+c.line(100, height - 140, 400, height - 140)
+
+# Save
+c.save()
+```
+
+#### Create PDF with Multiple Pages
+```python
+from reportlab.lib.pagesizes import letter
+from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, PageBreak
+from reportlab.lib.styles import getSampleStyleSheet
+
+doc = SimpleDocTemplate("report.pdf", pagesize=letter)
+styles = getSampleStyleSheet()
+story = []
+
+# Add content
+title = Paragraph("Report Title", styles['Title'])
+story.append(title)
+story.append(Spacer(1, 12))
+
+body = Paragraph("This is the body of the report. " * 20, styles['Normal'])
+story.append(body)
+story.append(PageBreak())
+
+# Page 2
+story.append(Paragraph("Page 2", styles['Heading1']))
+story.append(Paragraph("Content for page 2", styles['Normal']))
+
+# Build PDF
+doc.build(story)
+```
+
+#### Subscripts and Superscripts
+
+**IMPORTANT**: Never use Unicode subscript/superscript characters in ReportLab PDFs. The built-in fonts do not include these glyphs, causing them to render as solid black boxes.
+
+Instead, use ReportLab's XML markup tags in Paragraph objects:
+```python
+from reportlab.platypus import Paragraph
+from reportlab.lib.styles import getSampleStyleSheet
+
+styles = getSampleStyleSheet()
+
+# Subscripts: use <sub> tag
+chemical = Paragraph("H<sub>2</sub>O", styles['Normal'])
+
+# Superscripts: use <super> tag
+squared = Paragraph("x<super>2</super> + y<super>2</super>", styles['Normal'])
+```
+
+For canvas-drawn text (not Paragraph objects), manually adjust the font size and position rather than using Unicode subscripts/superscripts.
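+
+When the input text already contains Unicode subscript or superscript digits, one option is to rewrite them as markup before building the Paragraph. The following is a minimal sketch in plain Python; `to_reportlab_markup` is a hypothetical helper, not part of ReportLab, and it covers digits only. It does not escape `&` or `<` in the input, so it assumes the text contains no other markup-sensitive characters.
+
+```python
+# Hypothetical helper (not a ReportLab API): rewrite Unicode subscript and
+# superscript digits as <sub>/<super> markup for use in a Paragraph.
+SUBSCRIPTS = {chr(0x2080 + d): str(d) for d in range(10)}  # U+2080..U+2089
+SUPERSCRIPTS = {
+    "\u2070": "0", "\u00b9": "1", "\u00b2": "2", "\u00b3": "3", "\u2074": "4",
+    "\u2075": "5", "\u2076": "6", "\u2077": "7", "\u2078": "8", "\u2079": "9",
+}
+
+
+def to_reportlab_markup(text: str) -> str:
+    """Replace runs of Unicode sub/superscript digits with markup tags."""
+    out = []
+    open_tag = None  # "sub", "super", or None when outside a run
+    for ch in text:
+        if ch in SUBSCRIPTS:
+            tag, plain = "sub", SUBSCRIPTS[ch]
+        elif ch in SUPERSCRIPTS:
+            tag, plain = "super", SUPERSCRIPTS[ch]
+        else:
+            tag, plain = None, ch
+        if tag != open_tag:
+            # Close the previous run (if any) and open the new one (if any)
+            if open_tag:
+                out.append(f"</{open_tag}>")
+            if tag:
+                out.append(f"<{tag}>")
+            open_tag = tag
+        out.append(plain)
+    if open_tag:
+        out.append(f"</{open_tag}>")
+    return "".join(out)
+```
+
+The returned string can then be passed to `Paragraph(...)` in place of the raw text.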
+
+## Command-Line Tools
+
+### pdftotext (poppler-utils)
+```bash
+# Extract text
+pdftotext input.pdf output.txt
+
+# Extract text preserving layout
+pdftotext -layout input.pdf output.txt
+
+# Extract specific pages
+pdftotext -f 1 -l 5 input.pdf output.txt # Pages 1-5
+```
+
+### qpdf
+```bash
+# Merge PDFs
+qpdf --empty --pages file1.pdf file2.pdf -- merged.pdf
+
+# Split pages
+qpdf input.pdf --pages . 1-5 -- pages1-5.pdf
+qpdf input.pdf --pages . 6-10 -- pages6-10.pdf
+
+# Rotate pages
+qpdf input.pdf output.pdf --rotate=+90:1 # Rotate page 1 by 90 degrees
+
+# Remove password
+qpdf --password=mypassword --decrypt encrypted.pdf decrypted.pdf
+```
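+
+For longer documents, the split commands above can be generated rather than typed by hand. This is a sketch only: `qpdf_split_commands` is a hypothetical helper, the output names follow the `pages1-5.pdf` pattern used above, and the total page count is assumed to be known already (for example from `len(reader.pages)` in pypdf).
+
+```python
+def qpdf_split_commands(input_pdf, total_pages, chunk_size):
+    """Build one qpdf argument list per chunk of consecutive pages."""
+    commands = []
+    for start in range(1, total_pages + 1, chunk_size):
+        end = min(start + chunk_size - 1, total_pages)
+        output = f"pages{start}-{end}.pdf"
+        # Same shape as: qpdf input.pdf --pages . 1-5 -- pages1-5.pdf
+        commands.append(
+            ["qpdf", input_pdf, "--pages", ".", f"{start}-{end}", "--", output]
+        )
+    return commands
+```
+
+Each returned list can be passed to `subprocess.run(cmd, check=True)` to execute it.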
+
+### pdftk (if available)
+```bash
+# Merge
+pdftk file1.pdf file2.pdf cat output merged.pdf
+
+# Split
+pdftk input.pdf burst
+
+# Rotate
+pdftk input.pdf rotate 1east output rotated.pdf
+```
+
+## Common Tasks
+
+### Extract Text from Scanned PDFs
+```python
+# Requires: pip install pytesseract pdf2image
+import pytesseract
+from pdf2image import convert_from_path
+
+# Convert PDF to images
+images = convert_from_path('scanned.pdf')
+
+# OCR each page
+text = ""
+for i, image in enumerate(images):
+    text += f"Page {i+1}:\n"
+    text += pytesseract.image_to_string(image)
+    text += "\n\n"
+
+print(text)
+```
+
+### Add Watermark
+```python
+from pypdf import PdfReader, PdfWriter
+
+# Create watermark (or load existing)
+watermark = PdfReader("watermark.pdf").pages[0]
+
+# Apply to all pages
+reader = PdfReader("document.pdf")
+writer = PdfWriter()
+
+for page in reader.pages:
+    page.merge_page(watermark)
+    writer.add_page(page)
+
+with open("watermarked.pdf", "wb") as output:
+    writer.write(output)
+```
+
+### Extract Images
+```bash
+# Using pdfimages (poppler-utils)
+pdfimages -j input.pdf output_prefix
+
+# With -j, JPEG-encoded images are written as output_prefix-000.jpg, output_prefix-001.jpg, etc.;
+# images stored in other formats are written as .ppm or .pbm files.
+```
+
+### Password Protection
+```python
+from pypdf import PdfReader, PdfWriter
+
+reader = PdfReader("input.pdf")
+writer = PdfWriter()
+
+for page in reader.pages:
+    writer.add_page(page)
+
+# Add password
+writer.encrypt("userpassword", "ownerpassword")
+
+with open("encrypted.pdf", "wb") as output:
+    writer.write(output)
+```
+
+## Quick Reference
+
+| Task | Best Tool | Command/Code |
+|------|-----------|--------------|
+| Merge PDFs | pypdf | `writer.add_page(page)` |
+| Split PDFs | pypdf | One page per file |
+| Extract text | pdfplumber | `page.extract_text()` |
+| Extract tables | pdfplumber | `page.extract_tables()` |
+| Create PDFs | reportlab | Canvas or Platypus |
+| Command line merge | qpdf | `qpdf --empty --pages ...` |
+| OCR scanned PDFs | pytesseract | Convert to image first |
+| Fill PDF forms | pdf-lib or pypdf (see forms.md) | See forms.md |
+
+## Next Steps
+
+- For advanced pypdfium2 usage, see reference.md
+- For JavaScript libraries (pdf-lib), see reference.md
+- If you need to fill out a PDF form, follow the instructions in forms.md
+- For troubleshooting guides, see reference.md
diff --git a/skills/pdf/forms.md b/skills/pdf/forms.md
new file mode 100644
index 00000000..2dece27c
--- /dev/null
+++ b/skills/pdf/forms.md
@@ -0,0 +1,294 @@
+**CRITICAL: You MUST complete these steps in order. Do not skip ahead to writing code.**
+
+If you need to fill out a PDF form, first check to see if the PDF has fillable form fields. Run this script from this file's directory:
+ `python scripts/extract_form_field_info.py --check `, then, depending on the result, go to either the "Fillable fields" or the "Non-fillable fields" section and follow the instructions there.
+
+# Fillable fields
+If the PDF has fillable form fields:
+- Run this script from this file's directory: `python scripts/extract_form_field_info.py `. It will create a JSON file with a list of fields in this format:
+```
+[
+ {
+ "field_id": (unique ID for the field),
+ "page": (page number, 1-based),
+ "rect": ([left, bottom, right, top] bounding box in PDF coordinates, y=0 is the bottom of the page),
+ "type": ("text", "checkbox", "radio_group", or "choice"),
+ },
+ // Checkboxes have "checked_value" and "unchecked_value" properties:
+ {
+ "field_id": (unique ID for the field),
+ "page": (page number, 1-based),
+ "type": "checkbox",
+ "checked_value": (Set the field to this value to check the checkbox),
+ "unchecked_value": (Set the field to this value to uncheck the checkbox),
+ },
+ // Radio groups have a "radio_options" list with the possible choices.
+ {
+ "field_id": (unique ID for the field),
+ "page": (page number, 1-based),
+ "type": "radio_group",
+ "radio_options": [
+ {
+ "value": (set the field to this value to select this radio option),
+ "rect": (bounding box for the radio button for this option)
+ },
+ // Other radio options
+ ]
+ },
+ // Multiple choice fields have a "choice_options" list with the possible choices:
+ {
+ "field_id": (unique ID for the field),
+ "page": (page number, 1-based),
+ "type": "choice",
+ "choice_options": [
+ {
+ "value": (set the field to this value to select this option),
+ "text": (display text of the option)
+ },
+ // Other choice options
+ ],
+ }
+]
+```
+- Convert the PDF to PNGs (one image for each page) with this script (run from this file's directory):
+`python scripts/convert_pdf_to_images.py `
+Then analyze the images to determine the purpose of each form field (make sure to convert the bounding box PDF coordinates to image coordinates).
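+
+The coordinate conversion mentioned above can be sketched as follows. PDF rectangles are `[left, bottom, right, top]` with y=0 at the bottom of the page, while image coordinates normally place y=0 at the top. `pdf_rect_to_image_box` is a hypothetical helper; it assumes the page origin is at (0, 0), the page is not rotated, and the page size in PDF units and the image size in pixels are both known.
+
+```python
+def pdf_rect_to_image_box(rect, page_width, page_height, image_width, image_height):
+    """Convert a PDF rect to (left, top, right, bottom) in image pixels."""
+    left, bottom, right, top = rect
+    x_scale = image_width / page_width
+    y_scale = image_height / page_height
+    # Flip the y axis: the PDF "top" edge is closest to the image's y=0
+    return (
+        left * x_scale,
+        (page_height - top) * y_scale,
+        right * x_scale,
+        (page_height - bottom) * y_scale,
+    )
+```
+
+The result can be used to crop or highlight the region of a page image that corresponds to a given field.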
+- Create a `field_values.json` file in this format with the values to be entered for each field:
+```
+[
+ {
+ "field_id": "last_name", // Must match the field_id from `extract_form_field_info.py`
+ "description": "The user's last name",
+ "page": 1, // Must match the "page" value in field_info.json
+ "value": "Simpson"
+ },
+ {
+ "field_id": "Checkbox12",
+ "description": "Checkbox to be checked if the user is 18 or over",
+ "page": 1,
+ "value": "/On" // If this is a checkbox, use its "checked_value" value to check it. If it's a radio button group, use one of the "value" values in "radio_options".
+ },
+ // more fields
+]
+```
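+
+Before running the fill script, it can help to cross-check `field_values.json` against the extracted field list. The sketch below uses only the keys described in the two formats above; `validate_field_values` is a hypothetical helper and is independent of whatever checks `fill_fillable_fields.py` itself performs.
+
+```python
+def validate_field_values(field_info, field_values):
+    """Return a list of problems; an empty list means no mismatch was found."""
+    info_by_id = {f["field_id"]: f for f in field_info}
+    problems = []
+    for entry in field_values:
+        fid = entry["field_id"]
+        info = info_by_id.get(fid)
+        if info is None:
+            problems.append(f"{fid}: unknown field_id")
+            continue
+        if entry["page"] != info["page"]:
+            problems.append(f"{fid}: page {entry['page']} != {info['page']}")
+        # Work out which values are allowed for constrained field types
+        if info["type"] == "checkbox":
+            allowed = {info["checked_value"], info["unchecked_value"]}
+        elif info["type"] == "radio_group":
+            allowed = {opt["value"] for opt in info["radio_options"]}
+        elif info["type"] == "choice":
+            allowed = {opt["value"] for opt in info["choice_options"]}
+        else:
+            allowed = None  # text fields accept any value
+        if allowed is not None and entry["value"] not in allowed:
+            problems.append(f"{fid}: value {entry['value']!r} not in {sorted(allowed)}")
+    return problems
+```
+
+Both arguments are the parsed JSON lists, for example loaded with `json.load`.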
+- Run the `fill_fillable_fields.py` script from this file's directory to create a filled-in PDF:
+`python scripts/fill_fillable_fields.py