Add 4 new skill categories: revenue-intelligence, conversion-ops, podcast-ops, team-ops
New skills (8 total):
- revenue-intelligence: Gong Insight Pipeline, Revenue Attribution Mapper, Client Report Generator
- conversion-ops: CRO Audit, Survey-to-Lead-Magnet Engine
- podcast-ops: Podcast-to-Everything Pipeline
- team-ops: Elon Algorithm (Team Performance Audit), Meeting-to-Action Extractor

Also adds .gitignore for __pycache__
parent b2c11a65aa
commit 36d6ed83e7
23 changed files with 8472 additions and 4 deletions
1
.gitignore
vendored
Normal file
|
|
@ -0,0 +1 @@
|
|||
__pycache__/
|
||||
28
README.md
|
|
@ -16,6 +16,10 @@ These aren't prompts. They're complete workflows — scripts, scoring algorithms
|
|||
| [**Outbound Engine**](./outbound-engine/) | ICP definition to emails in inbox, fully automated | Cold Outbound Optimizer, Lead Pipeline, Competitive Monitor |
| [**SEO Ops**](./seo-ops/) | Find the keywords your competitors missed | Content Attack Briefs, GSC Optimizer, Trend Scout |
| [**Finance Ops**](./finance-ops/) | Your AI CFO that finds hidden costs in 30 minutes | CFO Briefing, Cost Estimate, Scenario Modeler |
| [**Revenue Intelligence**](./revenue-intelligence/) | Prove content ROI and turn sales calls into strategy | Gong Insight Pipeline, Revenue Attribution, Client Report Generator |
| [**Conversion Ops**](./conversion-ops/) | Score any landing page and turn survey data into lead magnets | CRO Audit, Survey-to-Lead-Magnet Engine |
| [**Podcast Ops**](./podcast-ops/) | One episode → 20+ content pieces across every platform | Podcast-to-Everything Pipeline, Content Calendar |
| [**Team Ops**](./team-ops/) | Ruthless performance audits and meeting intelligence | Elon Algorithm, Meeting-to-Action Extractor |
|
||||
|
||||
---
|
||||
|
||||
|
|
@ -108,11 +112,27 @@ ai-marketing-skills/
|
|||
│ ├── gsc_client.py
|
||||
│ ├── trend_scout.py
|
||||
│ └── ...
|
||||
└── finance-ops/ ← Financial analysis
|
||||
├── finance-ops/ ← Financial analysis
|
||||
│ ├── SKILL.md
|
||||
│ ├── scripts/
|
||||
│ ├── references/ ← Metrics, rates, ROI models
|
||||
│ └── ...
|
||||
├── revenue-intelligence/ ← Sales call insights + attribution
|
||||
│ ├── SKILL.md
|
||||
│ ├── gong_insight_pipeline.py
|
||||
│ ├── revenue_attribution.py
|
||||
│ └── client_report_generator.py
|
||||
├── conversion-ops/ ← CRO + lead magnet generation
|
||||
│ ├── SKILL.md
|
||||
│ ├── cro_audit.py
|
||||
│ └── survey_lead_magnet.py
|
||||
├── podcast-ops/ ← Podcast → content factory
|
||||
│ ├── SKILL.md
|
||||
│ └── podcast_pipeline.py
|
||||
└── team-ops/ ← Performance audits + meeting intel
|
||||
├── SKILL.md
|
||||
├── scripts/
|
||||
├── references/ ← Metrics, rates, ROI models
|
||||
└── ...
|
||||
├── team_performance_audit.py
|
||||
└── meeting_action_extractor.py
|
||||
```
|
||||
|
||||
---
|
||||
|
|
|
|||
193
conversion-ops/README.md
Normal file
|
|
@ -0,0 +1,193 @@
|
|||
# AI Conversion Ops
|
||||
|
||||
**Turn landing pages into conversion machines. Turn survey data into lead magnets.**
|
||||
|
||||
An AI-powered conversion optimization suite that replaces manual CRO audits and survey analysis. These tools score your landing pages across 8 proven conversion dimensions and transform raw survey responses into segmented lead magnet strategies — all without API keys or headless browsers.
|
||||
|
||||
## What's Inside
|
||||
|
||||
### 🎯 CRO Audit Tool
|
||||
Fetches any landing page URL and runs it through a comprehensive conversion heuristics engine. Scores across 8 dimensions, compares against industry benchmarks, and generates specific fix recommendations with before/after suggestions.
|
||||
|
||||
**What it finds:**
|
||||
- Weak or missing headlines that fail the 5-second test
|
||||
- CTAs that blend in instead of standing out
|
||||
- Missing social proof that kills trust
|
||||
- Forms with too much friction
|
||||
- Mobile responsiveness gaps
|
||||
- Page weight and speed red flags
|
||||
- Missing trust signals and urgency elements
|
||||
|
||||
### 📊 Survey-to-Lead-Magnet Engine
|
||||
Ingests survey response CSVs, clusters respondents by pain point themes, ranks segments by size and commercial potential, and auto-generates complete lead magnet briefs for each segment.
|
||||
|
||||
**What it produces:**
|
||||
- Pain point clusters from free-text survey responses
|
||||
- Segments ranked by commercial opportunity
|
||||
- Complete lead magnet briefs (title, format, hook, outline, CTA)
|
||||
- Viral potential and conversion potential scores
|
||||
- Prioritized implementation roadmap
|
||||
|
||||
## Quick Start
|
||||
|
||||
### 1. Install dependencies
|
||||
|
||||
```bash
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
### 2. Run a CRO audit
|
||||
|
||||
```bash
|
||||
# Single page
|
||||
python cro_audit.py --url https://yoursite.com/landing-page
|
||||
|
||||
# With industry benchmarks
|
||||
python cro_audit.py --url https://yoursite.com/landing-page --industry saas
|
||||
|
||||
# Batch mode
|
||||
python cro_audit.py --file urls.txt --industry ecommerce --output results.json
|
||||
```
|
||||
|
||||
### 3. Generate lead magnets from survey data
|
||||
|
||||
```bash
|
||||
# Basic analysis
|
||||
python survey_lead_magnet.py --csv survey_responses.csv
|
||||
|
||||
# Specify pain point columns
|
||||
python survey_lead_magnet.py --csv survey.csv --pain-columns "biggest_challenge" "what_keeps_you_up"
|
||||
|
||||
# Top 3 segments with JSON output
|
||||
python survey_lead_magnet.py --csv survey.csv --top-segments 3 --json
|
||||
```
|
||||
|
||||
## CRO Scoring Model
|
||||
|
||||
Every page is scored across **8 dimensions** (each 0–100):
|
||||
|
||||
| Dimension | What It Measures | Weight |
|-----------|-----------------|--------|
| Headline Clarity | Value prop visible in <5 seconds | 15% |
| CTA Visibility | Prominent, contrasting, above fold | 20% |
| Social Proof | Testimonials, logos, numbers, case studies | 15% |
| Urgency | Scarcity, deadlines, limited availability | 5% |
| Trust Signals | Security badges, guarantees, certifications | 10% |
| Form Friction | Field count, form complexity, required fields | 15% |
| Mobile Responsiveness | Viewport meta, responsive patterns | 10% |
| Page Speed Indicators | Image optimization, script count, resource size | 10% |
|
||||
|
||||
**Overall CRO Score** = Weighted average → letter grade (A+ through F).
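
As a rough sketch of how the weighted average might be computed: the weights below mirror the scoring table, while `overall_score`, `letter_grade`, and the grade cutoffs are hypothetical names and values chosen for illustration (this README does not state the actual thresholds).

```python
# Weights copied from the scoring table; they sum to 1.0.
WEIGHTS = {
    "headline_clarity": 0.15,
    "cta_visibility": 0.20,
    "social_proof": 0.15,
    "urgency": 0.05,
    "trust_signals": 0.10,
    "form_friction": 0.15,
    "mobile_responsiveness": 0.10,
    "page_speed_indicators": 0.10,
}

# Assumed cutoffs for illustration only; the tool's real thresholds may differ.
GRADE_CUTOFFS = [(90, "A+"), (80, "A"), (70, "B"), (60, "C"), (50, "D")]


def overall_score(scores: dict) -> float:
    """Weighted average of per-dimension scores (each 0-100)."""
    return round(sum(WEIGHTS[name] * scores[name] for name in WEIGHTS), 1)


def letter_grade(score: float) -> str:
    """Map a 0-100 score to a letter using the assumed cutoffs."""
    for cutoff, grade in GRADE_CUTOFFS:
        if score >= cutoff:
            return grade
    return "F"


example = {
    "headline_clarity": 80, "cta_visibility": 70, "social_proof": 60,
    "urgency": 40, "trust_signals": 50, "form_friction": 90,
    "mobile_responsiveness": 100, "page_speed_indicators": 60,
}
score = overall_score(example)
print(score, letter_grade(score))
```

With these example scores the weighted average works out to 71.5, which the assumed cutoffs map to a B. CTA Visibility carries the largest weight, so a weak CTA drags the overall score down more than a weak score on any other single dimension.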
|
||||
|
||||
### Industry Benchmarks
|
||||
|
||||
Benchmarks are calibrated per industry:
|
||||
|
||||
| Industry | Avg CRO Score | Top Quartile |
|----------|--------------|--------------|
| SaaS | 62 | 78+ |
| E-commerce | 58 | 74+ |
| Agency | 55 | 72+ |
| Finance | 60 | 76+ |
| Healthcare | 52 | 68+ |
| Education | 54 | 70+ |
| B2B | 56 | 73+ |
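
To make the benchmark comparison concrete, here is a minimal sketch. The numbers come from the table above; the function name `compare_to_benchmark` and the three position labels are assumptions, not the tool's actual output format.

```python
# Benchmark values copied from the table above.
BENCHMARKS = {
    "saas": {"avg": 62, "top_quartile": 78},
    "ecommerce": {"avg": 58, "top_quartile": 74},
    "agency": {"avg": 55, "top_quartile": 72},
    "finance": {"avg": 60, "top_quartile": 76},
    "healthcare": {"avg": 52, "top_quartile": 68},
    "education": {"avg": 54, "top_quartile": 70},
    "b2b": {"avg": 56, "top_quartile": 73},
}


def compare_to_benchmark(score: float, industry: str) -> dict:
    """Describe where a score sits relative to its industry benchmark."""
    bench = BENCHMARKS[industry]
    if score >= bench["top_quartile"]:
        position = "top quartile"
    elif score >= bench["avg"]:
        position = "above average"
    else:
        position = "below average"
    return {
        "industry": industry,
        "delta_vs_avg": round(score - bench["avg"], 1),
        "position": position,
    }


print(compare_to_benchmark(71.5, "saas"))
```

The same score can land in different positions depending on the industry: 71.5 is above average for SaaS but would clear the top-quartile bar for Healthcare.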
|
||||
|
||||
## Survey Segmentation
|
||||
|
||||
The lead magnet engine uses keyword frequency analysis and TF-IDF clustering to group survey responses:
|
||||
|
||||
1. **Text preprocessing** — Normalize, tokenize, remove stopwords
|
||||
2. **Theme extraction** — TF-IDF vectorization of pain point responses
|
||||
3. **Clustering** — Group similar responses into pain segments
|
||||
4. **Ranking** — Score segments by size × commercial signal strength
|
||||
5. **Brief generation** — Create lead magnet briefs targeting each cluster
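
Steps 1 through 4 can be sketched with the standard library alone. This is an illustration under stated assumptions, not the engine's actual implementation: the stopword list, the greedy seed-based clustering, the `threshold` value, and all function names are hypothetical, and ranking here uses segment size only (the commercial-signal factor and brief generation are left out).

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list; a real run would use a fuller one.
STOPWORDS = {"the", "a", "an", "to", "of", "and", "is", "my", "our",
             "we", "i", "for", "in", "on", "it", "too"}


def tokenize(response: str) -> list:
    """Step 1: lowercase, split into words, drop stopwords."""
    return [w for w in re.findall(r"[a-z']+", response.lower()) if w not in STOPWORDS]


def tfidf_vectors(docs: list) -> list:
    """Step 2: one sparse TF-IDF vector (term -> weight) per tokenized response."""
    doc_freq = Counter(term for doc in docs for term in set(doc))
    n = len(docs)
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster(responses: list, threshold: float = 0.2) -> list:
    """Steps 3-4: greedy clustering, then rank clusters by size (largest first)."""
    vectors = tfidf_vectors([tokenize(r) for r in responses])
    clusters = []  # each cluster is a list of response indices; the first index is its seed
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return sorted(clusters, key=len, reverse=True)


responses = [
    "Not enough leads from our website",
    "Our website does not generate enough leads",
    "Reporting takes too long every month",
    "Monthly reporting takes too long",
    "Budget approvals are slow",
]
for members in cluster(responses):
    print([responses[i] for i in members])
```

Greedy seeding keeps the sketch short and dependency-free. It is order-sensitive, so a library clusterer (k-means or agglomerative, for example) would likely give more stable segments on real survey data.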
|
||||
|
||||
### Lead Magnet Formats
|
||||
|
||||
The engine recommends the best format per segment:
|
||||
|
||||
- **Guide** — Deep educational content for complex problems
|
||||
- **Checklist** — Actionable steps for process-oriented pain points
|
||||
- **Template** — Fill-in-the-blank tools for recurring tasks
|
||||
- **Calculator** — Interactive tools for quantifiable decisions
|
||||
- **Swipe File** — Example collections for creative/copy challenges
|
||||
|
||||
## Architecture
|
||||
|
||||
```
|
||||
┌──────────────────────────────────────────────────┐
|
||||
│ CRO Audit │
|
||||
│ ┌──────────┐ ┌──────────┐ ┌──────────────────┐ │
|
||||
│ │ HTML │ │ 8-Dim │ │ Industry │ │
|
||||
│ │ Fetcher │ │ Scorer │ │ Benchmarks │ │
|
||||
│ └────┬─────┘ └────┬─────┘ └────────┬─────────┘ │
|
||||
│ └─────────────┼────────────────┘ │
|
||||
│ ▼ │
|
||||
│ ┌──────────────────────────────────────────────┐ │
|
||||
│ │ Weighted Score + Priority Fixes │ │
|
||||
│ │ Before/After · Letter Grade · Benchmarks │ │
|
||||
│ └──────────────────────────────────────────────┘ │
|
||||
└──────────────────────────────────────────────────┘
|
||||
|
||||
┌──────────────────────────────────────────────────┐
|
||||
│ Survey-to-Lead-Magnet Engine │
|
||||
│ ┌──────────┐ ┌──────────┐ ┌──────────────────┐ │
|
||||
│ │ CSV │ │ TF-IDF │ │ Pain Point │ │
|
||||
│ │ Ingest │ │ Cluster │ │ Ranking │ │
|
||||
│ └────┬─────┘ └────┬─────┘ └────────┬─────────┘ │
|
||||
│ └─────────────┼────────────────┘ │
|
||||
│ ▼ │
|
||||
│ ┌──────────────────────────────────────────────┐ │
|
||||
│ │ Lead Magnet Briefs + Scoring Matrix │ │
|
||||
│ │ Title · Format · Hook · Outline · CTA │ │
|
||||
│ └──────────────────────────────────────────────┘ │
|
||||
└──────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
## Configuration
|
||||
|
||||
No API keys required. All analysis runs locally; the only network traffic is the CRO audit fetching the pages you point it at.
|
||||
|
||||
| Variable | Required | Description |
|----------|----------|-------------|
| `USER_AGENT` | No | Custom user agent for fetching pages |
| `REQUEST_TIMEOUT` | No | HTTP request timeout in seconds (default: 15) |
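
A minimal sketch of reading `REQUEST_TIMEOUT` defensively. The variable name and the default of 15 come from the table above; the helper `read_timeout` and its fallback on invalid or non-positive values are assumptions and not confirmed behavior of the shipped scripts.

```python
import os

DEFAULT_TIMEOUT = 15  # seconds, matching the documented default


def read_timeout(env=os.environ) -> int:
    """Return REQUEST_TIMEOUT as a positive int, falling back to the default."""
    raw = env.get("REQUEST_TIMEOUT", "")
    try:
        value = int(raw)
    except ValueError:
        # Unset or non-numeric values fall back instead of crashing (assumed policy).
        return DEFAULT_TIMEOUT
    return value if value > 0 else DEFAULT_TIMEOUT


print(read_timeout({"REQUEST_TIMEOUT": "30"}))
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.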
|
||||
|
||||
## Using as a Claude Code Skill
|
||||
|
||||
Add this to your `.claude/agents/` directory and use the `SKILL.md` for Claude Code integration. The skill enables Claude to:
|
||||
|
||||
1. Audit landing pages for conversion issues on demand
|
||||
2. Score pages against industry benchmarks
|
||||
3. Generate lead magnet strategies from survey data
|
||||
4. Run batch CRO audits across multiple URLs
|
||||
|
||||
## File Structure
|
||||
|
||||
```
|
||||
conversion-ops/
|
||||
├── README.md # This file
|
||||
├── SKILL.md # Claude Code agent skill definition
|
||||
├── cro_audit.py # Landing page CRO scoring engine
|
||||
├── survey_lead_magnet.py # Survey segmentation + lead magnet generator
|
||||
└── requirements.txt # Python dependencies
|
||||
```
|
||||
|
||||
## License
|
||||
|
||||
MIT
|
||||
|
||||
---
|
||||
|
||||
<div align="center">
|
||||
|
||||
**🧠 [Want these built and managed for you? →](https://singlebrain.com/?utm_source=github&utm_medium=skill_repo&utm_campaign=ai_marketing_skills)**
|
||||
|
||||
*This is how we build agents at [Single Brain](https://singlebrain.com/?utm_source=github&utm_medium=skill_repo&utm_campaign=ai_marketing_skills) for our clients.*
|
||||
|
||||
[Single Grain](https://www.singlegrain.com/?utm_source=github&utm_medium=skill_repo&utm_campaign=ai_marketing_skills) · our marketing agency
|
||||
|
||||
📬 **[Level up your marketing with 14,000+ marketers and founders →](https://levelingup.beehiiv.com/subscribe)** *(free)*
|
||||
|
||||
</div>
|
||||
116
conversion-ops/SKILL.md
Normal file
|
|
@ -0,0 +1,116 @@
|
|||
# AI Conversion Ops
|
||||
|
||||
AI-powered conversion rate optimization: landing page audits, CRO scoring, survey segmentation, and lead magnet generation.
|
||||
|
||||
## When to Use
|
||||
|
||||
- User asks for a landing page audit or CRO analysis
|
||||
- User wants to score a page across conversion dimensions
|
||||
- User needs to identify conversion bottlenecks on a URL
|
||||
- User has survey data and wants to segment respondents by pain point
|
||||
- User wants lead magnet ideas generated from survey responses
|
||||
- User needs batch CRO analysis across multiple URLs
|
||||
|
||||
## Tools
|
||||
|
||||
### CRO Audit (`cro_audit.py`)
|
||||
|
||||
Fetches a landing page and scores it across 8 conversion dimensions. No headless browser needed.
|
||||
|
||||
```bash
|
||||
# Single URL audit
|
||||
python cro_audit.py --url https://example.com/landing-page
|
||||
|
||||
# Batch mode — multiple URLs
|
||||
python cro_audit.py --urls https://example.com/page1 https://example.com/page2
|
||||
|
||||
# URLs from a file (one per line)
|
||||
python cro_audit.py --file urls.txt
|
||||
|
||||
# Specify industry for benchmark comparison
|
||||
python cro_audit.py --url https://example.com --industry saas
|
||||
|
||||
# JSON output
|
||||
python cro_audit.py --url https://example.com --json
|
||||
|
||||
# Save report to file
|
||||
python cro_audit.py --url https://example.com --output report.json
|
||||
```
|
||||
|
||||
**Scoring dimensions (each 0–100):**
|
||||
1. **Headline Clarity** — Is the value prop obvious in <5 seconds?
|
||||
2. **CTA Visibility** — Are CTAs prominent, contrasting, above the fold?
|
||||
3. **Social Proof** — Testimonials, logos, case studies, numbers?
|
||||
4. **Urgency** — Scarcity, deadlines, limited offers?
|
||||
5. **Trust Signals** — Security badges, guarantees, privacy, certifications?
|
||||
6. **Form Friction** — How many fields? Is the form intimidating?
|
||||
7. **Mobile Responsiveness** — Viewport meta, responsive patterns, touch targets?
|
||||
8. **Page Speed Indicators** — Image optimization, script count, resource size?
|
||||
|
||||
**Overall CRO Score** = Weighted average across all 8 dimensions.
|
||||
|
||||
**Output includes:**
|
||||
- Per-dimension score with specific findings
|
||||
- Priority fixes ranked by impact
|
||||
- Before/after suggestions for each issue
|
||||
- Industry benchmark comparison
|
||||
- Overall letter grade (A+ through F)
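
This document does not spell out how "ranked by impact" is computed. One plausible heuristic, sketched here purely as an assumption, is to order dimensions by weighted shortfall: the dimension weight times the distance from a perfect score. The function name `rank_fixes` is hypothetical, and the weights in the example are taken from the scoring table in this skill's README.

```python
def rank_fixes(scores: dict, weights: dict) -> list:
    """Order dimensions by weighted shortfall: weight * (100 - score)."""
    impact = {name: weights[name] * (100 - scores[name]) for name in scores}
    return sorted(impact, key=impact.get, reverse=True)


weights = {"cta_visibility": 0.20, "social_proof": 0.15, "urgency": 0.05}
scores = {"cta_visibility": 70, "social_proof": 40, "urgency": 20}
print(rank_fixes(scores, weights))
```

Under this heuristic a low Urgency score ranks last despite being the weakest dimension, because its 5% weight limits how much fixing it can move the overall score.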
|
||||
|
||||
**Supported industries:** `saas`, `ecommerce`, `agency`, `finance`, `healthcare`, `education`, `b2b`, `general`
|
||||
|
||||
### Survey-to-Lead-Magnet Engine (`survey_lead_magnet.py`)
|
||||
|
||||
Ingests survey CSV data, clusters respondents by pain point, and generates lead magnet briefs for each segment.
|
||||
|
||||
```bash
|
||||
# Basic usage — analyze survey CSV
|
||||
python survey_lead_magnet.py --csv survey_responses.csv
|
||||
|
||||
# Specify which columns contain pain points / challenges
|
||||
python survey_lead_magnet.py --csv survey.csv --pain-columns "biggest_challenge" "top_frustration"
|
||||
|
||||
# Limit number of segments
|
||||
python survey_lead_magnet.py --csv survey.csv --top-segments 5
|
||||
|
||||
# JSON output
|
||||
python survey_lead_magnet.py --csv survey.csv --json
|
||||
|
||||
# Save output
|
||||
python survey_lead_magnet.py --csv survey.csv --output lead_magnets.json
|
||||
```
|
||||
|
||||
**What it produces:**
|
||||
- Pain point clusters with respondent counts
|
||||
- Segments ranked by size and commercial potential
|
||||
- For each top segment, a lead magnet brief:
|
||||
- Title, format (guide/checklist/template/calculator), hook
|
||||
- Content outline (5–7 sections)
|
||||
- Target CTA and distribution channel
|
||||
- Viral potential score + conversion potential score
|
||||
- Prioritized implementation roadmap
|
||||
|
||||
**CSV format:** Questions as column headers, one respondent per row. Works with any survey tool export (Typeform, Google Forms, SurveyMonkey, etc.)
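
For reference, a file in that shape can be read with the standard `csv` module. The sketch below is illustrative only: `load_pain_points` is a hypothetical helper, and the column names simply reuse the ones from the example commands above.

```python
import csv
import io


def load_pain_points(csv_file, pain_columns: list) -> list:
    """Collect non-empty free-text answers from the chosen columns."""
    answers = []
    for row in csv.DictReader(csv_file):  # header row supplies the column names
        for column in pain_columns:
            value = (row.get(column) or "").strip()
            if value:
                answers.append(value)
    return answers


# Tiny in-memory export standing in for a real survey file.
sample = io.StringIO(
    "email,biggest_challenge,top_frustration\n"
    "a@example.com,Not enough leads,\n"
    "b@example.com,Reporting takes too long,Manual data entry\n"
)
print(load_pain_points(sample, ["biggest_challenge", "top_frustration"]))
```

Blank answers are skipped, so respondents who left a pain point question empty do not create empty entries downstream.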
|
||||
|
||||
## Configuration
|
||||
|
||||
No API keys required. Both tools analyze data locally; `cro_audit.py` only makes HTTP requests to fetch the pages being audited.
|
||||
|
||||
Optional environment variables:
|
||||
|
||||
| Variable | Required | Description |
|----------|----------|-------------|
| `USER_AGENT` | No | Custom user agent for page fetching (default provided) |
| `REQUEST_TIMEOUT` | No | HTTP timeout in seconds (default: 15) |
|
||||
|
||||
## Recommended Workflow
|
||||
|
||||
1. **Weekly:** Run `cro_audit.py` on your top landing pages to track CRO scores over time
|
||||
2. **Post-survey:** Run `survey_lead_magnet.py` to turn survey data into content strategy
|
||||
3. **Pre-launch:** Audit new landing pages before driving paid traffic
|
||||
4. **Monthly:** Batch audit competitor landing pages to benchmark against
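
Tracking scores week over week could look roughly like this. It assumes each saved run has been loaded as a list of objects with `url` and `overall_score` keys; that layout and the helper name `score_changes` are hypothetical, so check the actual `--output` file before relying on it.

```python
def score_changes(previous: list, current: list) -> dict:
    """Map each URL present in both runs to its change in overall_score."""
    before = {report["url"]: report["overall_score"] for report in previous}
    return {
        report["url"]: round(report["overall_score"] - before[report["url"]], 1)
        for report in current
        if report["url"] in before
    }


last_week = [{"url": "https://example.com/a", "overall_score": 61.5}]
this_week = [
    {"url": "https://example.com/a", "overall_score": 68.0},
    {"url": "https://example.com/b", "overall_score": 50.0},
]
print(score_changes(last_week, this_week))
```

URLs that appear in only one run are left out, so newly added pages do not show up as spurious gains.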
|
||||
|
||||
## Dependencies
|
||||
|
||||
```bash
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
946
conversion-ops/cro_audit.py
Normal file
|
|
@ -0,0 +1,946 @@
|
|||
#!/usr/bin/env python3
|
||||
"""
|
||||
AI CRO Audit Tool
|
||||
==================
|
||||
Fetches a landing page URL, analyzes its HTML structure, and scores it across
|
||||
8 conversion dimensions. Outputs a structured report with specific fix
|
||||
recommendations and industry benchmark comparisons.
|
||||
|
||||
No headless browser required — uses requests + BeautifulSoup.
|
||||
|
||||
Usage:
|
||||
python cro_audit.py --url https://example.com/landing-page
|
||||
python cro_audit.py --urls https://example.com/page1 https://example.com/page2
|
||||
python cro_audit.py --file urls.txt --industry saas
|
||||
python cro_audit.py --url https://example.com --json --output report.json
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import os
|
||||
import re
|
||||
import sys
|
||||
from dataclasses import dataclass, field, asdict
|
||||
from typing import Optional
|
||||
from urllib.parse import urlparse
|
||||
|
||||
import requests
|
||||
from bs4 import BeautifulSoup, Comment
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Constants
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
DEFAULT_UA = (
|
||||
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
|
||||
"AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
|
||||
)
|
||||
REQUEST_TIMEOUT = int(os.getenv("REQUEST_TIMEOUT", "15"))
|
||||
USER_AGENT = os.getenv("USER_AGENT", DEFAULT_UA)
|
||||
|
||||
# Dimension weights for overall score
|
||||
DIMENSION_WEIGHTS = {
|
||||
"headline_clarity": 0.15,
|
||||
"cta_visibility": 0.20,
|
||||
"social_proof": 0.15,
|
||||
"urgency": 0.05,
|
||||
"trust_signals": 0.10,
|
||||
"form_friction": 0.15,
|
||||
"mobile_responsiveness": 0.10,
|
||||
"page_speed_indicators": 0.10,
|
||||
}
|
||||
|
||||
# Industry benchmarks: {industry: {avg, top_quartile}}
|
||||
INDUSTRY_BENCHMARKS = {
|
||||
"saas": {"avg": 62, "top_quartile": 78},
|
||||
"ecommerce": {"avg": 58, "top_quartile": 74},
|
||||
"agency": {"avg": 55, "top_quartile": 72},
|
||||
"finance": {"avg": 60, "top_quartile": 76},
|
||||
"healthcare": {"avg": 52, "top_quartile": 68},
|
||||
"education": {"avg": 54, "top_quartile": 70},
|
||||
"b2b": {"avg": 56, "top_quartile": 73},
|
||||
"general": {"avg": 56, "top_quartile": 72},
|
||||
}
|
||||
|
||||
# CTA keyword patterns
|
||||
CTA_PATTERNS = re.compile(
|
||||
r"\b(get started|sign up|start free|try free|book a? ?demo|schedule|"
|
||||
r"download|buy now|add to cart|subscribe|join|register|request|"
|
||||
r"claim|grab|unlock|access|learn more|contact us|talk to|"
|
||||
r"start now|begin|enroll|apply now|shop now|order now)\b",
|
||||
re.IGNORECASE,
|
||||
)
|
||||
|
||||
# Social proof patterns
|
||||
SOCIAL_PROOF_PATTERNS = re.compile(
    # Inner alternation is non-capturing so findall() returns plain strings, not tuples.
    r"\b(testimonial|review|rating|stars?|customers?|clients?|"
    r"companies|trusted by|used by|loved by|join \d|"
    r"case stud|success stor|\d+\s*\+?\s*(?:users?|customers?|clients?|companies|businesses)|"
    r"as seen|featured in|featured on|logo|partner)\b",
|
||||
re.IGNORECASE,
|
||||
)
|
||||
|
||||
# Urgency patterns
|
||||
URGENCY_PATTERNS = re.compile(
    # "few (...)" uses a non-capturing group so findall() returns plain strings, not tuples.
    r"\b(limited time|act now|hurry|expires?|deadline|only \d|"
    r"last chance|don'?t miss|ending soon|today only|"
    r"while supplies|few (?:left|remaining|spots)|countdown|"
    r"offer ends|sale ends|hours left|minutes left|spots? left|"
    r"exclusive|one-time|flash sale|clearance)\b",
|
||||
re.IGNORECASE,
|
||||
)
|
||||
|
||||
# Trust signal patterns
|
||||
TRUST_PATTERNS = re.compile(
|
||||
r"\b(ssl|secure|encrypt|privacy|guarantee|money.?back|"
|
||||
r"refund|no.?risk|free trial|cancel any ?time|"
|
||||
r"gdpr|hipaa|soc.?2|iso|pci|complian|certif|"
|
||||
r"bbb|accredit|verified|badge|shield|lock|"
|
||||
r"norton|mcafee|trustpilot|stripe|paypal)\b",
|
||||
re.IGNORECASE,
|
||||
)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Data classes
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@dataclass
|
||||
class DimensionScore:
|
||||
name: str
|
||||
score: int # 0-100
|
||||
findings: list = field(default_factory=list)
|
||||
recommendations: list = field(default_factory=list)
|
||||
|
||||
|
||||
@dataclass
|
||||
class CROReport:
|
||||
url: str
|
||||
overall_score: float = 0.0
|
||||
letter_grade: str = ""
|
||||
dimensions: dict = field(default_factory=dict)
|
||||
priority_fixes: list = field(default_factory=list)
|
||||
benchmark_comparison: dict = field(default_factory=dict)
|
||||
fetch_error: Optional[str] = None
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Fetcher
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def fetch_page(url: str) -> tuple[Optional[str], Optional[str]]:
|
||||
"""Fetch page HTML. Returns (html, error)."""
|
||||
try:
|
||||
resp = requests.get(
|
||||
url,
|
||||
headers={"User-Agent": USER_AGENT},
|
||||
timeout=REQUEST_TIMEOUT,
|
||||
allow_redirects=True,
|
||||
)
|
||||
resp.raise_for_status()
|
||||
return resp.text, None
|
||||
except requests.RequestException as e:
|
||||
return None, str(e)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Dimension Scorers
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def score_headline_clarity(soup: BeautifulSoup, text: str) -> DimensionScore:
|
||||
"""Score headline clarity — is the value prop obvious in <5 seconds?"""
|
||||
dim = DimensionScore(name="Headline Clarity", score=50, findings=[], recommendations=[])
|
||||
|
||||
h1_tags = soup.find_all("h1")
|
||||
h2_tags = soup.find_all("h2")
|
||||
|
||||
# Check H1 exists
|
||||
if not h1_tags:
|
||||
dim.score -= 30
|
||||
dim.findings.append("No H1 tag found on the page")
|
||||
dim.recommendations.append("Add a clear H1 headline that states your primary value proposition")
|
||||
else:
|
||||
h1_text = h1_tags[0].get_text(strip=True)
|
||||
dim.findings.append(f"H1 found: \"{h1_text[:80]}{'...' if len(h1_text) > 80 else ''}\"")
|
||||
|
||||
# Length check
|
||||
word_count = len(h1_text.split())
|
||||
if word_count < 3:
|
||||
dim.score -= 10
|
||||
dim.findings.append(f"H1 is very short ({word_count} words) — may lack specificity")
|
||||
dim.recommendations.append("Expand headline to include a specific benefit or outcome")
|
||||
elif word_count > 15:
|
||||
dim.score -= 10
|
||||
dim.findings.append(f"H1 is long ({word_count} words) — may lose attention")
|
||||
dim.recommendations.append("Shorten headline to 6-12 words for maximum clarity")
|
||||
else:
|
||||
dim.score += 15
|
||||
|
||||
# Check for benefit/outcome language
|
||||
benefit_words = re.compile(
|
||||
r"\b(grow|increase|boost|save|reduce|eliminate|transform|"
|
||||
r"automate|simplify|faster|better|easier|free|revenue|"
|
||||
r"profit|leads|sales|customers|results|roi)\b",
|
||||
re.IGNORECASE,
|
||||
)
|
||||
if benefit_words.search(h1_text):
|
||||
dim.score += 15
|
||||
dim.findings.append("Headline contains benefit-oriented language")
|
||||
else:
|
||||
dim.recommendations.append("Include a specific benefit or outcome in the headline (e.g., 'Get 2x more leads')")
|
||||
|
||||
# Multiple H1s is bad
|
||||
if len(h1_tags) > 1:
|
||||
dim.score -= 10
|
||||
dim.findings.append(f"Multiple H1 tags found ({len(h1_tags)}) — confuses hierarchy")
|
||||
dim.recommendations.append("Use only one H1 tag per page for clear message hierarchy")
|
||||
|
||||
# Check for supporting subheadline
|
||||
if h1_tags and h2_tags:
|
||||
        # Any H2 on the page counts as a supporting subheadline (proximity to the H1 is not checked)
|
||||
dim.score += 10
|
||||
dim.findings.append("Supporting subheadline (H2) found")
|
||||
elif h1_tags:
|
||||
dim.recommendations.append("Add a subheadline (H2) that elaborates on the H1 value proposition")
|
||||
|
||||
# Check hero section has text content
|
||||
hero_selectors = ["[class*='hero']", "[class*='banner']", "[class*='jumbotron']", "header"]
|
||||
has_hero = False
|
||||
for sel in hero_selectors:
|
||||
hero = soup.select_one(sel)
|
||||
if hero and len(hero.get_text(strip=True)) > 20:
|
||||
has_hero = True
|
||||
dim.score += 10
|
||||
dim.findings.append("Hero/banner section detected with content")
|
||||
break
|
||||
if not has_hero:
|
||||
dim.recommendations.append("Consider adding a prominent hero section with headline + subheadline + CTA")
|
||||
|
||||
dim.score = max(0, min(100, dim.score))
|
||||
return dim
|
||||
|
||||
|
||||
def score_cta_visibility(soup: BeautifulSoup, text: str) -> DimensionScore:
|
||||
"""Score CTA visibility — are CTAs prominent, contrasting, above the fold?"""
|
||||
dim = DimensionScore(name="CTA Visibility", score=40, findings=[], recommendations=[])
|
||||
|
||||
# Find buttons and links with CTA text
|
||||
buttons = soup.find_all(["button", "a"])
|
||||
cta_elements = []
|
||||
for btn in buttons:
|
||||
btn_text = btn.get_text(strip=True)
|
||||
if CTA_PATTERNS.search(btn_text):
|
||||
cta_elements.append(btn)
|
||||
|
||||
if not cta_elements:
|
||||
dim.score -= 25
|
||||
dim.findings.append("No recognizable CTA buttons/links found")
|
||||
dim.recommendations.append(
|
||||
"Add clear call-to-action buttons with action-oriented text "
|
||||
"(e.g., 'Get Started Free', 'Book a Demo')"
|
||||
)
|
||||
else:
|
||||
dim.score += 15
|
||||
cta_texts = [el.get_text(strip=True)[:50] for el in cta_elements[:5]]
|
||||
dim.findings.append(f"Found {len(cta_elements)} CTA element(s): {', '.join(cta_texts)}")
|
||||
|
||||
# Check for styled buttons (class contains btn/button/cta)
|
||||
styled_ctas = [
|
||||
el for el in cta_elements
|
||||
if el.get("class") and any(
|
||||
c for c in el.get("class", [])
|
||||
if re.search(r"btn|button|cta", c, re.IGNORECASE)
|
||||
)
|
||||
]
|
||||
if styled_ctas:
|
||||
dim.score += 10
|
||||
dim.findings.append(f"{len(styled_ctas)} CTA(s) have button styling classes")
|
||||
else:
|
||||
dim.recommendations.append("Style CTAs as prominent buttons with contrasting colors")
|
||||
|
||||
# Check for inline styles with background color (contrasting)
|
||||
for el in cta_elements[:3]:
|
||||
style = el.get("style", "")
|
||||
if "background" in style.lower() or "color" in style.lower():
|
||||
dim.score += 5
|
||||
break
|
||||
|
||||
# Check if CTA appears early in the HTML (proxy for above-the-fold)
|
||||
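    # Measure position against the soup's own serialization so the CTA markup
    # can be located reliably (str(tag) may not match the raw HTML or page text).
    serialized = str(soup)
    page_length = len(serialized)
    if cta_elements:
        first_cta_pos = serialized.find(str(cta_elements[0]))
        if 0 <= first_cta_pos < page_length * 0.3: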
|
||||
dim.score += 15
|
||||
dim.findings.append("First CTA appears in the top 30% of page HTML (likely above fold)")
|
||||
elif first_cta_pos > page_length * 0.6:
|
||||
dim.score -= 10
|
||||
dim.findings.append("First CTA appears late in the page — likely below the fold")
|
||||
dim.recommendations.append("Move primary CTA above the fold so visitors see it without scrolling")
|
||||
|
||||
# Check for multiple CTAs (reinforcement)
|
||||
if len(cta_elements) >= 2:
|
||||
dim.score += 10
|
||||
dim.findings.append("Multiple CTAs found — good reinforcement throughout page")
|
||||
elif len(cta_elements) == 1:
|
||||
dim.recommendations.append("Add a second CTA further down the page to catch scrollers")
|
||||
|
||||
    # Check for a CTA inside the nav (whether the nav is actually sticky/fixed is not verified)
|
||||
nav = soup.find("nav")
|
||||
if nav:
|
||||
nav_ctas = [el for el in nav.find_all(["button", "a"]) if CTA_PATTERNS.search(el.get_text(strip=True))]
|
||||
if nav_ctas:
|
||||
dim.score += 10
|
||||
dim.findings.append("Navigation bar contains a CTA — always visible during scroll")
|
||||
|
||||
dim.score = max(0, min(100, dim.score))
|
||||
return dim
|
||||
|
||||
|
||||
def score_social_proof(soup: BeautifulSoup, text: str) -> DimensionScore:
|
||||
"""Score social proof presence — testimonials, logos, case studies, numbers."""
|
||||
dim = DimensionScore(name="Social Proof", score=30, findings=[], recommendations=[])
|
||||
|
||||
# Check for social proof text patterns
|
||||
matches = SOCIAL_PROOF_PATTERNS.findall(text)
|
||||
if matches:
|
||||
unique = set(m.lower() for m in matches)
|
||||
dim.score += min(25, len(unique) * 5)
|
||||
dim.findings.append(f"Social proof signals found: {', '.join(list(unique)[:8])}")
|
||||
|
||||
# Check for testimonial-like structures
|
||||
blockquotes = soup.find_all("blockquote")
|
||||
testimonial_divs = soup.select(
|
||||
"[class*='testimonial'], [class*='review'], [class*='quote'], "
|
||||
"[class*='feedback'], [class*='client'], [id*='testimonial']"
|
||||
)
|
||||
if blockquotes or testimonial_divs:
|
||||
count = len(blockquotes) + len(testimonial_divs)
|
||||
dim.score += 15
|
||||
dim.findings.append(f"Testimonial/quote elements found ({count})")
|
||||
else:
|
||||
dim.recommendations.append("Add customer testimonials with real names, titles, and photos")
|
||||
|
||||
# Check for logo bars / trust logos
|
||||
logo_sections = soup.select(
|
||||
"[class*='logo'], [class*='partner'], [class*='client'], "
|
||||
"[class*='brand'], [class*='trust'], [class*='company']"
|
||||
)
|
||||
img_tags = soup.find_all("img")
|
||||
logo_imgs = [
|
||||
img for img in img_tags
|
||||
if img.get("alt") and re.search(r"logo|client|partner|brand", img.get("alt", ""), re.IGNORECASE)
|
||||
]
|
||||
if logo_sections or logo_imgs:
|
||||
dim.score += 15
|
||||
count = max(len(logo_sections), len(logo_imgs))
|
||||
dim.findings.append(f"Client/partner logo elements detected ({count})")
|
||||
else:
|
||||
dim.recommendations.append("Add a logo bar showing recognizable client/partner brands")
|
||||
|
||||
# Check for specific numbers (e.g., "10,000+ customers")
|
||||
    # Non-capturing group: findall() then returns the whole match (e.g. "10,000+ customers").
    number_proof = re.findall(
        r"\d[\d,]*\s*\+?\s*(?:users?|customers?|clients?|companies|businesses|downloads?|reviews?|ratings?)",
        text, re.IGNORECASE,
|
||||
)
|
||||
if number_proof:
|
||||
dim.score += 10
|
||||
dim.findings.append(f"Quantified social proof: {', '.join(number_proof[:3])}")
|
||||
else:
|
||||
dim.recommendations.append("Add specific numbers (e.g., '10,000+ customers') to quantify trust")
|
||||
|
||||
# Star ratings
|
||||
star_elements = soup.select("[class*='star'], [class*='rating']")
|
||||
if star_elements:
|
||||
dim.score += 5
|
||||
dim.findings.append("Star/rating elements detected")
|
||||
|
||||
if not matches and not blockquotes and not testimonial_divs and not logo_sections:
|
||||
dim.recommendations.append(
|
||||
"Social proof is critically missing. Add at minimum: 1 testimonial, "
|
||||
"a client logo bar, and a quantified metric (e.g., '500+ companies trust us')"
|
||||
)
|
||||
|
||||
dim.score = max(0, min(100, dim.score))
|
||||
return dim
|
||||
|
||||
|
||||
def score_urgency(soup: BeautifulSoup, text: str) -> DimensionScore:
|
||||
"""Score urgency/scarcity elements."""
|
||||
dim = DimensionScore(name="Urgency", score=40, findings=[], recommendations=[])
|
||||
|
||||
matches = URGENCY_PATTERNS.findall(text)
|
||||
if matches:
|
||||
unique = set(m.lower() for m in matches)
|
||||
dim.score += min(35, len(unique) * 10)
|
||||
dim.findings.append(f"Urgency signals found: {', '.join(list(unique)[:5])}")
|
||||
else:
|
||||
dim.findings.append("No urgency/scarcity elements detected")
|
||||
dim.recommendations.append(
|
||||
"Consider adding subtle urgency elements: limited-time offers, "
|
||||
"countdown timers, or limited availability messaging"
|
||||
)
|
||||
|
||||
# Countdown timer elements
|
||||
countdown = soup.select("[class*='countdown'], [class*='timer'], [id*='countdown']")
|
||||
if countdown:
|
||||
dim.score += 15
|
||||
dim.findings.append("Countdown timer element detected")
|
||||
|
||||
# Note: urgency isn't always appropriate — score is less punitive
|
||||
if not matches and not countdown:
|
||||
dim.score = max(dim.score, 35) # Floor at 35: lacking urgency is okay for many pages (a no-op while the base score is 40)
|
||||
|
||||
dim.score = max(0, min(100, dim.score))
|
||||
return dim
|
||||
|
||||
|
||||
def score_trust_signals(soup: BeautifulSoup, text: str) -> DimensionScore:
|
||||
"""Score trust signals — security, guarantees, compliance badges."""
|
||||
dim = DimensionScore(name="Trust Signals", score=35, findings=[], recommendations=[])
|
||||
|
||||
matches = TRUST_PATTERNS.findall(text)
|
||||
if matches:
|
||||
unique = set(m.lower() for m in matches)
|
||||
dim.score += min(30, len(unique) * 8)
|
||||
dim.findings.append(f"Trust signals found: {', '.join(sorted(unique)[:6])}")
|
||||
|
||||
# Privacy policy link
|
||||
privacy_links = [
|
||||
a for a in soup.find_all("a")
|
||||
if re.search(r"privacy|terms|policy", a.get_text(strip=True), re.IGNORECASE)
|
||||
]
|
||||
if privacy_links:
|
||||
dim.score += 10
|
||||
dim.findings.append("Privacy policy / terms links found")
|
||||
else:
|
||||
dim.recommendations.append("Add visible links to privacy policy and terms of service")
|
||||
|
||||
# Guarantee language
|
||||
guarantee = re.search(
|
||||
r"(money.?back|satisfaction|guarantee|risk.?free|no.?risk|full refund)",
|
||||
text, re.IGNORECASE,
|
||||
)
|
||||
if guarantee:
|
||||
dim.score += 15
|
||||
dim.findings.append(f"Guarantee messaging found: '{guarantee.group()}'")
|
||||
else:
|
||||
dim.recommendations.append("Add a guarantee or risk-reversal statement near the CTA")
|
||||
|
||||
# No HTTPS check is performed here: this scorer only sees the parsed HTML, not the request URL
|
||||
# Security badge images
|
||||
security_imgs = [
|
||||
img for img in soup.find_all("img")
|
||||
if img.get("alt") and re.search(
|
||||
r"secure|ssl|badge|trust|verified|norton|mcafee",
|
||||
img.get("alt", ""), re.IGNORECASE,
|
||||
)
|
||||
]
|
||||
if security_imgs:
|
||||
dim.score += 10
|
||||
dim.findings.append(f"Security/trust badge images found ({len(security_imgs)})")
|
||||
else:
|
||||
dim.recommendations.append("Add trust badges (security seals, payment icons, compliance logos) near forms/CTAs")
|
||||
|
||||
dim.score = max(0, min(100, dim.score))
|
||||
return dim
|
||||
|
||||
|
||||
def score_form_friction(soup: BeautifulSoup, text: str) -> DimensionScore:
|
||||
"""Score form friction — fewer fields = less friction."""
|
||||
dim = DimensionScore(name="Form Friction", score=60, findings=[], recommendations=[])
|
||||
|
||||
forms = soup.find_all("form")
|
||||
if not forms:
|
||||
# No form could be good (simple CTA) or bad (no conversion mechanism)
|
||||
cta_links = [a for a in soup.find_all("a") if CTA_PATTERNS.search(a.get_text(strip=True))]
|
||||
if cta_links:
|
||||
dim.score = 75
|
||||
dim.findings.append("No form found — page uses link-based CTAs (low friction)")
|
||||
else:
|
||||
dim.score = 50
|
||||
dim.findings.append("No form or clear conversion mechanism found")
|
||||
dim.recommendations.append("Add a form or clear CTA link for lead capture")
|
||||
return dim
|
||||
|
||||
# Analyze the primary form (first one)
|
||||
form = forms[0]
|
||||
inputs = form.find_all(["input", "select", "textarea"])
|
||||
visible_inputs = [
|
||||
inp for inp in inputs
|
||||
if inp.get("type", "text") not in ("hidden", "submit", "button", "reset", "image")
|
||||
]
|
||||
field_count = len(visible_inputs)
|
||||
|
||||
dim.findings.append(f"Form found with {field_count} visible field(s)")
|
||||
|
||||
if field_count <= 2:
|
||||
dim.score = 90
|
||||
dim.findings.append("Minimal form — very low friction")
|
||||
elif field_count <= 4:
|
||||
dim.score = 75
|
||||
dim.findings.append("Moderate form length — acceptable friction")
|
||||
elif field_count <= 6:
|
||||
dim.score = 55
|
||||
dim.findings.append("Form has 5-6 fields — consider reducing")
|
||||
dim.recommendations.append("Reduce form to essential fields only (name + email minimum). Every extra field drops conversion ~7%")
|
||||
elif field_count <= 10:
|
||||
dim.score = 35
|
||||
dim.findings.append(f"Long form ({field_count} fields) — high friction")
|
||||
dim.recommendations.append("Split into a multi-step form or reduce to 3-4 essential fields")
|
||||
else:
|
||||
dim.score = 15
|
||||
dim.findings.append(f"Very long form ({field_count} fields) — extreme friction")
|
||||
dim.recommendations.append("This form is too long. Use progressive profiling: capture email first, ask for details later")
|
||||
|
||||
# Check for required field indicators
|
||||
required_fields = [inp for inp in visible_inputs if inp.get("required") is not None]
|
||||
if required_fields:
|
||||
dim.findings.append(f"{len(required_fields)} required fields marked")
|
||||
|
||||
# Check for phone number field (high friction)
|
||||
phone_fields = [
|
||||
inp for inp in visible_inputs
|
||||
if re.search(r"phone|tel|mobile", inp.get("name", "") + inp.get("type", ""), re.IGNORECASE)
|
||||
]
|
||||
if phone_fields:
|
||||
dim.score -= 10
|
||||
dim.findings.append("Phone number field detected — high-friction field")
|
||||
dim.recommendations.append("Remove phone number field unless absolutely necessary. It's the #1 form abandonment cause")
|
||||
|
||||
# Check for clear submit button text
|
||||
submit_btns = form.find_all(["button", "input"], attrs={"type": ["submit", "button"]})
|
||||
if submit_btns:
|
||||
btn_text = submit_btns[0].get_text(strip=True) or submit_btns[0].get("value", "")
|
||||
if btn_text.lower() in ("submit", "send", "go"):
|
||||
dim.score -= 5
|
||||
dim.findings.append(f"Generic submit button text: '{btn_text}'")
|
||||
dim.recommendations.append(f"Change '{btn_text}' to a benefit-oriented CTA (e.g., 'Get My Free Audit')")
|
||||
elif btn_text:
|
||||
dim.findings.append(f"Submit button text: '{btn_text}'")
|
||||
|
||||
# Multiple forms
|
||||
if len(forms) > 2:
|
||||
dim.score -= 5
|
||||
dim.findings.append(f"Multiple forms on page ({len(forms)}) — may confuse visitors")
|
||||
|
||||
dim.score = max(0, min(100, dim.score))
|
||||
return dim
|
||||
|
||||
|
||||
def score_mobile_responsiveness(soup: BeautifulSoup, text: str) -> DimensionScore:
|
||||
"""Score mobile responsiveness signals from HTML/meta tags."""
|
||||
dim = DimensionScore(name="Mobile Responsiveness", score=40, findings=[], recommendations=[])
|
||||
|
||||
# Viewport meta tag
|
||||
viewport = soup.find("meta", attrs={"name": "viewport"})
|
||||
if viewport:
|
||||
content = viewport.get("content", "")
|
||||
dim.score += 25
|
||||
dim.findings.append(f"Viewport meta tag found: {content[:60]}")
|
||||
if "width=device-width" in content:
|
||||
dim.score += 10
|
||||
dim.findings.append("Viewport set to device-width — good")
|
||||
else:
|
||||
dim.score -= 20
|
||||
dim.findings.append("No viewport meta tag — page likely not mobile-optimized")
|
||||
dim.recommendations.append("Add <meta name='viewport' content='width=device-width, initial-scale=1'>")
|
||||
|
||||
# Responsive CSS indicators
|
||||
style_tags = soup.find_all("style")
|
||||
# External stylesheets (<link rel="stylesheet">) are not fetched, so only inline <style> CSS is inspected
|
||||
all_css = " ".join(tag.string or "" for tag in style_tags)
|
||||
|
||||
if "@media" in all_css:
|
||||
dim.score += 10
|
||||
dim.findings.append("Media queries found in inline CSS — responsive design present")
|
||||
|
||||
# Check for responsive framework classes
|
||||
responsive_classes = re.search(
|
||||
r"(col-(?:xs|sm|md|lg|xl)|container-fluid|row|grid|flex|"
|
||||
r"sm:|md:|lg:|xl:|responsive|mobile)",
|
||||
str(soup),
|
||||
re.IGNORECASE,
|
||||
)
|
||||
if responsive_classes:
|
||||
dim.score += 10
|
||||
dim.findings.append("Responsive framework classes detected (grid/flex/breakpoint)")
|
||||
|
||||
# Touch-friendly: check for reasonable tap target sizing
|
||||
small_links = soup.find_all("a")
|
||||
inline_styled_small = [
|
||||
a for a in small_links
|
||||
if a.get("style") and re.search(r"font-size:\s*(\d+)(?:\.\d+)?px", a.get("style", ""))
and int(re.search(r"font-size:\s*(\d+)(?:\.\d+)?px", a.get("style", "")).group(1)) < 12 # px only: em/rem values are not pixel sizes
|
||||
]
|
||||
if inline_styled_small:
|
||||
dim.score -= 5
|
||||
dim.recommendations.append("Some links have very small font sizes — ensure tap targets are at least 44x44px")
|
||||
|
||||
# AMP or mobile-specific meta
|
||||
amp = soup.find("html", attrs={"amp": True}) or soup.find("html", attrs={"⚡": True})
|
||||
if amp:
|
||||
dim.score += 5
|
||||
dim.findings.append("AMP page detected")
|
||||
|
||||
dim.score = max(0, min(100, dim.score))
|
||||
return dim
|
||||
|
||||
|
||||
def score_page_speed_indicators(soup: BeautifulSoup, html: str) -> DimensionScore:
|
||||
"""Score page speed indicators from HTML analysis (not actual load time)."""
|
||||
dim = DimensionScore(name="Page Speed Indicators", score=60, findings=[], recommendations=[])
|
||||
|
||||
# Page size
|
||||
page_size_kb = len(html.encode("utf-8")) / 1024
|
||||
dim.findings.append(f"HTML size: {page_size_kb:.0f} KB")
|
||||
if page_size_kb > 200:
|
||||
dim.score -= 15
|
||||
dim.recommendations.append(f"HTML is {page_size_kb:.0f} KB — consider reducing inline content/styles")
|
||||
elif page_size_kb > 100:
|
||||
dim.score -= 5
|
||||
|
||||
# Count images
|
||||
images = soup.find_all("img")
|
||||
dim.findings.append(f"Images found: {len(images)}")
|
||||
if len(images) > 20:
|
||||
dim.score -= 10
|
||||
dim.recommendations.append(f"Page has {len(images)} images — consider lazy loading or reducing image count")
|
||||
elif len(images) > 10:
|
||||
dim.score -= 5
|
||||
|
||||
# Check for lazy loading
|
||||
lazy_images = [img for img in images if img.get("loading") == "lazy"]
|
||||
if images and lazy_images:
|
||||
pct = len(lazy_images) / len(images) * 100
|
||||
dim.score += 10
|
||||
dim.findings.append(f"Lazy loading: {len(lazy_images)}/{len(images)} images ({pct:.0f}%)")
|
||||
elif len(images) > 5:
|
||||
dim.recommendations.append("Add loading='lazy' to below-fold images")
|
||||
|
||||
# Check for modern image formats
|
||||
modern_imgs = [
|
||||
img for img in images
|
||||
if img.get("src") and re.search(r"\.(webp|avif)", img.get("src", ""), re.IGNORECASE)
|
||||
]
|
||||
if modern_imgs:
|
||||
dim.score += 5
|
||||
dim.findings.append(f"Modern image formats (WebP/AVIF) detected: {len(modern_imgs)}")
|
||||
elif images:
|
||||
dim.recommendations.append("Convert images to WebP format for 25-35% size reduction")
|
||||
|
||||
# Count external scripts
|
||||
scripts = soup.find_all("script", src=True)
|
||||
dim.findings.append(f"External scripts: {len(scripts)}")
|
||||
if len(scripts) > 15:
|
||||
dim.score -= 15
|
||||
dim.recommendations.append(f"Page loads {len(scripts)} external scripts — audit and remove unnecessary ones")
|
||||
elif len(scripts) > 8:
|
||||
dim.score -= 5
|
||||
dim.recommendations.append("Consider deferring or async-loading non-critical scripts")
|
||||
|
||||
# Check for defer/async on scripts
|
||||
deferred = [s for s in scripts if s.get("defer") is not None or s.get("async") is not None]
|
||||
if scripts and deferred:
|
||||
pct = len(deferred) / len(scripts) * 100
|
||||
dim.findings.append(f"Deferred/async scripts: {len(deferred)}/{len(scripts)} ({pct:.0f}%)")
|
||||
dim.score += 5
|
||||
|
||||
# Count external stylesheets
|
||||
stylesheets = soup.find_all("link", rel="stylesheet")
|
||||
if len(stylesheets) > 5:
|
||||
dim.score -= 5
|
||||
dim.recommendations.append(f"Page loads {len(stylesheets)} stylesheets — consider consolidating")
|
||||
|
||||
# Inline CSS bloat
|
||||
inline_styles = soup.find_all("style")
|
||||
inline_css_size = sum(len(s.string or "") for s in inline_styles)
|
||||
if inline_css_size > 50000:
|
||||
dim.score -= 10
|
||||
dim.recommendations.append(f"Inline CSS is {inline_css_size / 1024:.0f} KB — move to external stylesheet and cache")
|
||||
|
||||
# Preconnect/preload hints
|
||||
preconnects = soup.find_all("link", rel=["preconnect", "preload", "dns-prefetch"])
|
||||
if preconnects:
|
||||
dim.score += 5
|
||||
dim.findings.append(f"Resource hints found: {len(preconnects)} preconnect/preload/dns-prefetch")
|
||||
|
||||
dim.score = max(0, min(100, dim.score))
|
||||
return dim
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Report Builder
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def compute_letter_grade(score: float) -> str:
|
||||
if score >= 95:
|
||||
return "A+"
|
||||
elif score >= 90:
|
||||
return "A"
|
||||
elif score >= 85:
|
||||
return "A-"
|
||||
elif score >= 80:
|
||||
return "B+"
|
||||
elif score >= 75:
|
||||
return "B"
|
||||
elif score >= 70:
|
||||
return "B-"
|
||||
elif score >= 65:
|
||||
return "C+"
|
||||
elif score >= 60:
|
||||
return "C"
|
||||
elif score >= 55:
|
||||
return "C-"
|
||||
elif score >= 50:
|
||||
return "D+"
|
||||
elif score >= 45:
|
||||
return "D"
|
||||
elif score >= 40:
|
||||
return "D-"
|
||||
else:
|
||||
return "F"
|
||||
|
||||
|
||||
def build_report(url: str, html: str, industry: str = "general") -> CROReport:
|
||||
"""Run all scorers and build the CRO report."""
|
||||
soup = BeautifulSoup(html, "lxml")
|
||||
|
||||
# Extract visible text (strip scripts, styles, comments)
|
||||
for element in soup(["script", "style"]):
|
||||
element.decompose()
|
||||
for comment in soup.find_all(string=lambda t: isinstance(t, Comment)):
|
||||
comment.extract()
|
||||
visible_text = soup.get_text(separator=" ", strip=True)
|
||||
|
||||
# Re-parse original for structural analysis
|
||||
soup = BeautifulSoup(html, "lxml")
|
||||
|
||||
scorers = {
|
||||
"headline_clarity": score_headline_clarity,
|
||||
"cta_visibility": score_cta_visibility,
|
||||
"social_proof": score_social_proof,
|
||||
"urgency": score_urgency,
|
||||
"trust_signals": score_trust_signals,
|
||||
"form_friction": score_form_friction,
|
||||
"mobile_responsiveness": score_mobile_responsiveness,
|
||||
"page_speed_indicators": lambda s, t: score_page_speed_indicators(s, html),
|
||||
}
|
||||
|
||||
dimensions = {}
|
||||
for key, scorer in scorers.items():
|
||||
dimensions[key] = scorer(soup, visible_text)
|
||||
|
||||
# Compute weighted overall score
|
||||
overall = sum(
|
||||
dimensions[key].score * DIMENSION_WEIGHTS[key]
|
||||
for key in DIMENSION_WEIGHTS
|
||||
)
|
||||
|
||||
# Build priority fixes (sorted by potential impact)
|
||||
priority_fixes = []
|
||||
for key, weight in sorted(DIMENSION_WEIGHTS.items(), key=lambda x: -x[1]):
|
||||
dim = dimensions[key]
|
||||
if dim.recommendations:
|
||||
impact = "HIGH" if weight >= 0.15 else ("MEDIUM" if weight >= 0.10 else "LOW")
|
||||
for rec in dim.recommendations:
|
||||
priority_fixes.append({
|
||||
"dimension": dim.name,
|
||||
"impact": impact,
|
||||
"current_score": dim.score,
|
||||
"fix": rec,
|
||||
})
|
||||
|
||||
# Sort: HIGH first, then by lowest current score
|
||||
impact_order = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}
|
||||
priority_fixes.sort(key=lambda x: (impact_order[x["impact"]], x["current_score"]))
|
||||
|
||||
# Benchmark comparison
|
||||
bench = INDUSTRY_BENCHMARKS.get(industry, INDUSTRY_BENCHMARKS["general"])
|
||||
benchmark_comparison = {
|
||||
"industry": industry,
|
||||
"your_score": round(overall, 1),
|
||||
"industry_avg": bench["avg"],
|
||||
"top_quartile": bench["top_quartile"],
|
||||
"vs_avg": round(overall - bench["avg"], 1),
|
||||
"vs_top": round(overall - bench["top_quartile"], 1),
|
||||
}
|
||||
|
||||
return CROReport(
|
||||
url=url,
|
||||
overall_score=round(overall, 1),
|
||||
letter_grade=compute_letter_grade(round(overall, 1)), # grade the same rounded value that is reported
|
||||
dimensions={k: asdict(v) for k, v in dimensions.items()},
|
||||
priority_fixes=priority_fixes,
|
||||
benchmark_comparison=benchmark_comparison,
|
||||
)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Output Formatters
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def format_report_text(report: CROReport) -> str:
|
||||
"""Format report as human-readable text."""
|
||||
lines = []
|
||||
lines.append("=" * 70)
|
||||
lines.append(" CRO AUDIT REPORT")
|
||||
lines.append(f" {report.url}")
|
||||
lines.append("=" * 70)
|
||||
lines.append("")
|
||||
|
||||
if report.fetch_error:
|
||||
lines.append(f" ❌ FETCH ERROR: {report.fetch_error}")
|
||||
lines.append("")
|
||||
return "\n".join(lines)
|
||||
|
||||
# Overall score
|
||||
lines.append(f" OVERALL CRO SCORE: {report.overall_score}/100 ({report.letter_grade})")
|
||||
lines.append("")
|
||||
|
||||
# Benchmark comparison
|
||||
bc = report.benchmark_comparison
|
||||
indicator = "↑" if bc["vs_avg"] >= 0 else "↓"
|
||||
lines.append(f" Industry: {bc['industry'].upper()}")
|
||||
lines.append(f" vs. Industry Avg ({bc['industry_avg']}): {indicator} {abs(bc['vs_avg'])} points")
|
||||
top_ind = "↑" if bc["vs_top"] >= 0 else "↓"
|
||||
lines.append(f" vs. Top Quartile ({bc['top_quartile']}): {top_ind} {abs(bc['vs_top'])} points")
|
||||
lines.append("")
|
||||
|
||||
# Dimension scores
|
||||
lines.append("-" * 70)
|
||||
lines.append(" DIMENSION SCORES")
|
||||
lines.append("-" * 70)
|
||||
|
||||
for key in DIMENSION_WEIGHTS:
|
||||
dim = report.dimensions[key]
|
||||
bar_filled = int(dim["score"] / 5)
|
||||
bar = "█" * bar_filled + "░" * (20 - bar_filled)
|
||||
lines.append(f" {dim['name']:<25} {bar} {dim['score']:>3}/100")
|
||||
|
||||
for finding in dim["findings"]:
|
||||
lines.append(f" • {finding}")
|
||||
|
||||
if dim["recommendations"]:
|
||||
for rec in dim["recommendations"]:
|
||||
lines.append(f" ⚠ FIX: {rec}")
|
||||
lines.append("")
|
||||
|
||||
# Priority fixes
|
||||
if report.priority_fixes:
|
||||
lines.append("-" * 70)
|
||||
lines.append(" PRIORITY FIXES (ranked by impact)")
|
||||
lines.append("-" * 70)
|
||||
for i, fix in enumerate(report.priority_fixes[:10], 1):
|
||||
icon = {"HIGH": "🔴", "MEDIUM": "🟡", "LOW": "🟢"}[fix["impact"]]
|
||||
lines.append(f" {i}. {icon} [{fix['impact']}] {fix['dimension']} (score: {fix['current_score']})")
|
||||
lines.append(f" → {fix['fix']}")
|
||||
lines.append("")
|
||||
|
||||
lines.append("=" * 70)
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Main
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def audit_url(url: str, industry: str = "general") -> CROReport:
|
||||
"""Audit a single URL and return the report."""
|
||||
# Normalize URL
|
||||
if not url.startswith(("http://", "https://")):
|
||||
url = "https://" + url
|
||||
|
||||
html, error = fetch_page(url)
|
||||
if error:
|
||||
report = CROReport(url=url, fetch_error=error)
|
||||
return report
|
||||
|
||||
return build_report(url, html, industry)
|
||||
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser(
|
||||
description="AI CRO Audit — Score landing pages across 8 conversion dimensions",
|
||||
formatter_class=argparse.RawDescriptionHelpFormatter,
|
||||
epilog="""
|
||||
Examples:
|
||||
python cro_audit.py --url https://example.com/landing-page
|
||||
python cro_audit.py --urls https://a.com https://b.com --industry saas
|
||||
python cro_audit.py --file urls.txt --json --output results.json
|
||||
""",
|
||||
)
|
||||
group = parser.add_mutually_exclusive_group(required=True)
|
||||
group.add_argument("--url", help="Single URL to audit")
|
||||
group.add_argument("--urls", nargs="+", help="Multiple URLs to audit")
|
||||
group.add_argument("--file", help="File with URLs (one per line)")
|
||||
|
||||
parser.add_argument(
|
||||
"--industry",
|
||||
choices=list(INDUSTRY_BENCHMARKS.keys()),
|
||||
default="general",
|
||||
help="Industry for benchmark comparison (default: general)",
|
||||
)
|
||||
parser.add_argument("--json", action="store_true", help="Output as JSON")
|
||||
parser.add_argument("--output", help="Save report to file")
|
||||
|
||||
args = parser.parse_args()
|
||||
|
||||
# Collect URLs
|
||||
urls = []
|
||||
if args.url:
|
||||
urls = [args.url]
|
||||
elif args.urls:
|
||||
urls = args.urls
|
||||
elif args.file:
|
||||
try:
|
||||
with open(args.file) as f:
|
||||
urls = [line.strip() for line in f if line.strip() and not line.strip().startswith("#")]
|
||||
except FileNotFoundError:
|
||||
print(f"Error: File not found: {args.file}", file=sys.stderr)
|
||||
sys.exit(1)
|
||||
|
||||
if not urls:
|
||||
print("Error: No URLs provided", file=sys.stderr)
|
||||
sys.exit(1)
|
||||
|
||||
# Run audits
|
||||
reports = []
|
||||
for url in urls:
|
||||
print(f"Auditing: {url}...", file=sys.stderr)
|
||||
report = audit_url(url, args.industry)
|
||||
reports.append(report)
|
||||
|
||||
# Output
|
||||
if args.json:
|
||||
output = json.dumps(
|
||||
[asdict(r) for r in reports] if len(reports) > 1 else asdict(reports[0]),
|
||||
indent=2,
|
||||
)
|
||||
if args.output:
|
||||
with open(args.output, "w", encoding="utf-8") as f:
|
||||
f.write(output)
|
||||
print(f"Report saved to {args.output}", file=sys.stderr)
|
||||
else:
|
||||
print(output)
|
||||
else:
|
||||
text_output = "\n\n".join(format_report_text(r) for r in reports)
|
||||
if args.output:
|
||||
with open(args.output, "w", encoding="utf-8") as f: # report contains non-ASCII symbols (bars, arrows, emoji)
|
||||
f.write(text_output)
|
||||
print(f"Report saved to {args.output}", file=sys.stderr)
|
||||
else:
|
||||
print(text_output)
|
||||
|
||||
# Summary for batch mode
|
||||
if len(reports) > 1:
|
||||
print("\n" + "=" * 70, file=sys.stderr)
|
||||
print(" BATCH SUMMARY", file=sys.stderr)
|
||||
print("=" * 70, file=sys.stderr)
|
||||
for r in sorted(reports, key=lambda x: x.overall_score, reverse=True):
|
||||
status = "✅" if not r.fetch_error else "❌"
|
||||
score = f"{r.overall_score} ({r.letter_grade})" if not r.fetch_error else "FAILED"
|
||||
print(f" {status} {score:>12} {r.url}", file=sys.stderr)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
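The overall score in `build_report` is a weighted sum of the eight dimension scores, and each recommended fix is bucketed by the weight of its dimension. The sketch below reproduces only that arithmetic. The values in `EXAMPLE_WEIGHTS` are hypothetical placeholders (the script's real `DIMENSION_WEIGHTS` is defined outside this excerpt), and `weighted_overall` and `impact_label` are illustrative helper names rather than functions from `cro_audit.py`.

```python
from typing import Dict

# Hypothetical weights for illustration only; the real DIMENSION_WEIGHTS
# lives elsewhere in cro_audit.py and may differ.
EXAMPLE_WEIGHTS: Dict[str, float] = {
    "headline_clarity": 0.20,
    "cta_visibility": 0.20,
    "social_proof": 0.15,
    "trust_signals": 0.15,
    "form_friction": 0.10,
    "mobile_responsiveness": 0.10,
    "urgency": 0.05,
    "page_speed_indicators": 0.05,
}


def weighted_overall(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of per-dimension scores (each score is 0-100)."""
    return sum(scores[key] * weight for key, weight in weights.items())


def impact_label(weight: float) -> str:
    """Bucket a dimension by weight, using the thresholds seen in build_report."""
    if weight >= 0.15:
        return "HIGH"
    if weight >= 0.10:
        return "MEDIUM"
    return "LOW"


# Every dimension scores 50 except form friction, which scores 90.
example_scores = {key: 50 for key in EXAMPLE_WEIGHTS}
example_scores["form_friction"] = 90

overall = weighted_overall(example_scores, EXAMPLE_WEIGHTS)
print(round(overall, 1))
for name, weight in EXAMPLE_WEIGHTS.items():
    print(f"{name:<24} weight={weight:.2f} impact={impact_label(weight)}")
```

With these placeholder weights, raising one 0.10-weight dimension from 50 to 90 moves the overall score by only 4 points (to 54.0), which is why the report sorts fixes by weight bucket first and by current score second.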
6
conversion-ops/requirements.txt
Normal file
|
|
@ -0,0 +1,6 @@
|
|||
requests>=2.31.0
|
||||
beautifulsoup4>=4.12.0
|
||||
lxml>=4.9.0
|
||||
scikit-learn>=1.3.0
|
||||
pandas>=2.0.0
|
||||
numpy>=1.24.0
|
||||
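The six `>=` pins above are minimum versions. As a rough way to confirm an environment satisfies them before running the scripts, the sketch below compares installed versions against the pins using only the standard library. The helper names (`parse_pin`, `numeric_prefix`, `meets`, `check`) are hypothetical, and the comparison deliberately ignores pre-release and local version suffixes, so treat it as a sanity check rather than a full version-specifier implementation.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# Copied from conversion-ops/requirements.txt
REQUIREMENTS = [
    "requests>=2.31.0",
    "beautifulsoup4>=4.12.0",
    "lxml>=4.9.0",
    "scikit-learn>=1.3.0",
    "pandas>=2.0.0",
    "numpy>=1.24.0",
]


def parse_pin(line: str) -> Tuple[str, Tuple[int, ...]]:
    """Split 'name>=X.Y.Z' into (name, (X, Y, Z))."""
    name, _, minimum = line.partition(">=")
    return name.strip(), tuple(int(part) for part in minimum.strip().split("."))


def numeric_prefix(version: str) -> Tuple[int, ...]:
    """Leading numeric components of a version string; stops at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets(installed: Tuple[int, ...], minimum: Tuple[int, ...]) -> bool:
    """Compare versions after zero-padding both to the same length."""
    width = max(len(installed), len(minimum))
    pad = lambda v: v + (0,) * (width - len(v))
    return pad(installed) >= pad(minimum)


def check(line: str) -> Optional[bool]:
    """True if the pin is satisfied, False if the install is older, None if not installed."""
    name, minimum = parse_pin(line)
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return None
    return meets(numeric_prefix(installed), minimum)


if __name__ == "__main__":
    for line in REQUIREMENTS:
        print(f"{line:<28} -> {check(line)}")
```

In practice `pip install -r conversion-ops/requirements.txt` resolves these pins directly; the sketch is only useful for diagnosing an existing environment.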
794
conversion-ops/survey_lead_magnet.py
Normal file
|
|
@ -0,0 +1,794 @@
|
|||
#!/usr/bin/env python3
|
||||
"""
|
||||
Survey-to-Lead-Magnet Engine
|
||||
==============================
|
||||
Takes survey response data (CSV), segments respondents by pain point clusters,
|
||||
ranks segments by size and commercial potential, and auto-generates lead magnet
|
||||
briefs targeting each segment.
|
||||
|
||||
Usage:
|
||||
python survey_lead_magnet.py --csv survey_responses.csv
|
||||
python survey_lead_magnet.py --csv survey.csv --pain-columns "biggest_challenge" "top_frustration"
|
||||
python survey_lead_magnet.py --csv survey.csv --top-segments 5 --json
|
||||
python survey_lead_magnet.py --csv survey.csv --output lead_magnets.json
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import csv
|
||||
import json
|
||||
import os
|
||||
import re
|
||||
import sys
|
||||
from collections import Counter
|
||||
from dataclasses import dataclass, field, asdict
|
||||
from typing import Optional
|
||||
|
||||
import numpy as np
|
||||
import pandas as pd
|
||||
from sklearn.feature_extraction.text import TfidfVectorizer
|
||||
from sklearn.cluster import KMeans
|
||||
from sklearn.metrics import silhouette_score
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Constants
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
# Columns that likely contain pain point / challenge responses
|
||||
PAIN_COLUMN_PATTERNS = re.compile(
|
||||
r"(challenge|pain|frustrat|struggle|problem|difficult|obstacle|"
|
||||
r"barrier|concern|issue|blocker|worry|fear|hard|tough|"
|
||||
r"biggest|main|top|primary|key|major|worst)",
|
||||
re.IGNORECASE,
|
||||
)
|
||||
|
||||
# Words that signal commercial intent / buying readiness
|
||||
COMMERCIAL_SIGNALS = re.compile(
|
||||
r"\b(budget|cost|price|invest|spend|pay|afford|roi|revenue|"
|
||||
r"software|tool|platform|solution|vendor|agency|consultant|"
|
||||
r"hire|outsource|automate|scale|grow|implement|upgrade|"
|
||||
r"need|want|looking for|searching|evaluating|considering)\b",
|
||||
re.IGNORECASE,
|
||||
)
|
||||
|
||||
# Lead magnet format heuristics
|
||||
FORMAT_KEYWORDS = {
|
||||
"guide": ["understand", "learn", "how", "why", "strategy", "approach", "framework", "concept", "complex"],
|
||||
"checklist": ["process", "steps", "workflow", "setup", "launch", "implement", "execute", "routine", "daily"],
|
||||
"template": ["create", "write", "build", "design", "plan", "proposal", "email", "message", "document"],
|
||||
"calculator": ["cost", "budget", "roi", "numbers", "forecast", "estimate", "pricing", "revenue", "metrics"],
|
||||
"swipe_file": ["examples", "inspiration", "copy", "ads", "headlines", "subject lines", "creative", "ideas"],
|
||||
}
|
||||
|
||||
# Stopwords for clustering (extend sklearn's default)
|
||||
EXTRA_STOPWORDS = [
|
||||
"really", "just", "like", "thing", "things", "lot", "also",
|
||||
"get", "getting", "got", "know", "dont", "don't", "can't",
|
||||
"want", "need", "think", "feel", "make", "much", "many",
|
||||
"very", "would", "could", "should", "way", "able",
|
||||
"one", "two", "first", "new", "good", "bad", "hard",
|
||||
"well", "time", "still", "even", "right", "going",
|
||||
]
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Data Classes
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@dataclass
|
||||
class PainSegment:
|
||||
segment_id: int
|
||||
theme: str
|
||||
top_keywords: list
|
||||
respondent_count: int
|
||||
respondent_pct: float
|
||||
commercial_score: float # 0-100
|
||||
sample_responses: list
|
||||
representative_quotes: list
|
||||
|
||||
|
||||
@dataclass
|
||||
class LeadMagnetBrief:
|
||||
segment_id: int
|
||||
segment_theme: str
|
||||
title: str
|
||||
format: str # guide, checklist, template, calculator, swipe_file
|
||||
hook: str
|
||||
outline: list
|
||||
target_cta: str
|
||||
distribution_channel: str
|
||||
viral_potential: int # 0-100
|
||||
conversion_potential: int # 0-100
|
||||
combined_score: float
|
||||
implementation_notes: str
|
||||
|
||||
|
||||
@dataclass
|
||||
class AnalysisResult:
|
||||
total_respondents: int
|
||||
columns_analyzed: list
|
||||
segments: list
|
||||
lead_magnets: list
|
||||
implementation_roadmap: list
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Data Ingestion
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def load_survey_data(csv_path: str) -> pd.DataFrame:
|
||||
"""Load survey CSV. Tries multiple encodings."""
|
||||
for encoding in ["utf-8", "utf-8-sig", "cp1252", "latin-1"]: # latin-1 accepts any byte sequence, so it must come last
|
||||
try:
|
||||
df = pd.read_csv(csv_path, encoding=encoding)
|
||||
return df
|
||||
except (UnicodeDecodeError, pd.errors.ParserError):
|
||||
continue
|
||||
raise ValueError(f"Could not read CSV file: {csv_path}")
|
||||
|
||||
|
||||
def detect_pain_columns(df: pd.DataFrame) -> list:
|
||||
"""Auto-detect columns that likely contain pain point / challenge data."""
|
||||
pain_cols = []
|
||||
for col in df.columns:
|
||||
if PAIN_COLUMN_PATTERNS.search(col):
|
||||
pain_cols.append(col)
|
||||
|
||||
# If no pattern matches, look for open-text columns (long average text)
|
||||
if not pain_cols:
|
||||
for col in df.columns:
|
||||
if df[col].dtype == object:
|
||||
avg_len = df[col].dropna().astype(str).str.len().mean()
|
||||
if avg_len > 30: # likely free-text responses
|
||||
pain_cols.append(col)
|
||||
|
||||
return pain_cols
|
||||
|
||||
|
||||
def extract_responses(df: pd.DataFrame, pain_columns: list) -> list:
|
||||
"""Extract and combine text responses from pain columns."""
|
||||
responses = []
|
||||
for _, row in df.iterrows():
|
||||
parts = []
|
||||
for col in pain_columns:
|
||||
val = row.get(col)
|
||||
if pd.notna(val) and str(val).strip():
|
||||
parts.append(str(val).strip())
|
||||
combined = " ".join(parts)
|
||||
if combined:
|
||||
responses.append(combined)
|
||||
return responses
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Clustering
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def preprocess_text(text: str) -> str:
|
||||
"""Clean and normalize text for clustering."""
|
||||
text = text.lower()
|
||||
text = re.sub(r"[^a-z\s]", " ", text)
|
||||
text = re.sub(r"\s+", " ", text).strip()
|
||||
return text
|
||||
|
||||
|
||||
def cluster_responses(responses: list, n_clusters: Optional[int] = None) -> tuple:
|
||||
"""
|
||||
Cluster responses using TF-IDF + KMeans.
|
||||
Returns (labels, vectorizer, tfidf_matrix, n_clusters).
|
||||
"""
|
||||
if len(responses) < 5:
|
||||
# Too few responses — treat as single cluster
|
||||
return [0] * len(responses), None, None, 1
|
||||
|
||||
cleaned = [preprocess_text(r) for r in responses]
|
||||
|
||||
# Build TF-IDF matrix
|
||||
stop_words = list(TfidfVectorizer(stop_words="english").get_stop_words()) + EXTRA_STOPWORDS
|
||||
vectorizer = TfidfVectorizer(
|
||||
max_features=500,
|
||||
stop_words=stop_words,
|
||||
min_df=2 if len(responses) > 20 else 1,
|
||||
max_df=0.85,
|
||||
ngram_range=(1, 2),
|
||||
)
|
||||
|
||||
try:
|
||||
tfidf_matrix = vectorizer.fit_transform(cleaned)
|
||||
except ValueError:
|
||||
# All responses too similar or empty after preprocessing
|
||||
return [0] * len(responses), None, None, 1
|
||||
|
||||
# Auto-determine cluster count if not specified
|
||||
if n_clusters is None:
|
||||
max_k = min(10, len(responses) // 3, tfidf_matrix.shape[0] - 1)
|
||||
max_k = max(2, max_k)
|
||||
|
||||
best_k = 3
|
||||
best_score = -1
|
||||
|
||||
for k in range(2, max_k + 1):
|
||||
try:
|
||||
km = KMeans(n_clusters=k, random_state=42, n_init=10)
|
||||
labels = km.fit_predict(tfidf_matrix)
|
||||
score = silhouette_score(tfidf_matrix, labels)
|
||||
if score > best_score:
|
||||
best_score = score
|
||||
best_k = k
|
||||
except ValueError:
|
||||
continue
|
||||
|
||||
n_clusters = best_k
|
||||
|
||||
km = KMeans(n_clusters=n_clusters, random_state=42, n_init=10)
|
||||
labels = km.fit_predict(tfidf_matrix)
|
||||
|
||||
return labels, vectorizer, tfidf_matrix, n_clusters
|
||||
|
||||
|
||||
def extract_cluster_keywords(
|
||||
vectorizer: TfidfVectorizer,
|
||||
tfidf_matrix,
|
||||
labels: list,
|
||||
cluster_id: int,
|
||||
top_n: int = 8,
|
||||
) -> list:
|
||||
"""Get top keywords for a specific cluster."""
|
||||
if vectorizer is None:
|
||||
return ["general"]
|
||||
|
||||
mask = np.array(labels) == cluster_id
|
||||
cluster_matrix = tfidf_matrix[mask]
|
||||
|
||||
if cluster_matrix.shape[0] == 0:
|
||||
return []
|
||||
|
||||
mean_tfidf = np.asarray(cluster_matrix.mean(axis=0)).ravel() # works for both sparse matrices and sparse arrays
|
||||
feature_names = vectorizer.get_feature_names_out()
|
||||
top_indices = mean_tfidf.argsort()[-top_n:][::-1]
|
||||
|
||||
return [feature_names[i] for i in top_indices if mean_tfidf[i] > 0]
|
||||
|
||||
|
||||
def generate_theme_label(keywords: list) -> str:
|
||||
"""Generate a human-readable theme label from top keywords."""
|
||||
if not keywords:
|
||||
return "General Challenges"
|
||||
|
||||
# Take top 2-3 keywords and create a label
|
||||
top = keywords[:3]
|
||||
# Capitalize and join
|
||||
theme = " & ".join(word.replace("_", " ").title() for word in top)
|
||||
return theme
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Scoring
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def score_commercial_potential(responses: list) -> float:
|
||||
"""Score how commercially valuable a segment is (0-100)."""
|
||||
if not responses:
|
||||
return 0
|
||||
|
||||
total_signals = 0
|
||||
for resp in responses:
|
||||
matches = COMMERCIAL_SIGNALS.findall(resp)
|
||||
total_signals += len(matches)
|
||||
|
||||
# Normalize: avg signals per response, scaled to 0-100
|
||||
avg_signals = total_signals / len(responses)
|
||||
score = min(100, avg_signals * 25) # 4+ avg signals = 100
|
||||
return round(score, 1)
|
||||
|
||||
|
||||
def recommend_format(keywords: list, responses: list) -> str:
|
||||
"""Recommend the best lead magnet format based on pain cluster."""
|
||||
combined_text = " ".join(keywords) + " " + " ".join(responses[:10])
|
||||
combined_lower = combined_text.lower()
|
||||
|
||||
scores = {}
|
||||
for fmt, trigger_words in FORMAT_KEYWORDS.items():
|
||||
score = sum(1 for word in trigger_words if word in combined_lower)
|
||||
scores[fmt] = score
|
||||
|
||||
best = max(scores, key=scores.get)
|
||||
if scores[best] == 0:
|
||||
return "guide" # default
|
||||
return best
|
||||
|
||||
|
||||
def score_viral_potential(title: str, fmt: str, segment_size_pct: float) -> int:
|
||||
"""Score how likely a lead magnet is to be shared (0-100)."""
|
||||
score = 30 # baseline
|
||||
|
||||
# Larger segments = more sharing potential
|
||||
score += min(25, segment_size_pct * 1.5)
|
||||
|
||||
# Templates and checklists are more shareable
|
||||
format_boost = {
|
||||
"template": 15,
|
||||
"checklist": 12,
|
||||
"swipe_file": 18,
|
||||
"calculator": 10,
|
||||
"guide": 5,
|
||||
}
|
||||
score += format_boost.get(fmt, 0)
|
||||
|
||||
# Titles with numbers or specific outcomes
|
||||
if re.search(r"\d+", title):
|
||||
score += 10
|
||||
if re.search(r"(ultimate|complete|definitive|proven|secret)", title, re.IGNORECASE):
|
||||
score += 5
|
||||
|
||||
return min(100, int(score))
|
||||
|
||||
|
||||
def score_conversion_potential(commercial_score: float, segment_size_pct: float, fmt: str) -> int:
|
||||
"""Score how likely a lead magnet is to convert to leads/customers (0-100)."""
|
||||
score = 20 # baseline
|
||||
|
||||
# Commercial intent is the strongest signal
|
||||
score += commercial_score * 0.4
|
||||
|
||||
# Segment size matters but with diminishing returns
|
||||
score += min(15, segment_size_pct * 0.8)
|
||||
|
||||
# Some formats convert better
|
||||
conversion_boost = {
|
||||
"calculator": 15,
|
||||
"template": 12,
|
||||
"checklist": 10,
|
||||
"guide": 5,
|
||||
"swipe_file": 8,
|
||||
}
|
||||
score += conversion_boost.get(fmt, 0)
|
||||
|
||||
return min(100, int(score))
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Lead Magnet Brief Generator
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
FORMAT_LABELS = {
|
||||
"guide": "Comprehensive Guide",
|
||||
"checklist": "Actionable Checklist",
|
||||
"template": "Ready-to-Use Template",
|
||||
"calculator": "Interactive Calculator",
|
||||
"swipe_file": "Swipe File Collection",
|
||||
}
|
||||
|
||||
|
||||
def generate_title(theme: str, fmt: str, keywords: list) -> str:
|
||||
"""Generate a lead magnet title."""
|
||||
templates = {
|
||||
"guide": [
|
||||
f"The Complete Guide to {theme}",
|
||||
f"How to Solve {theme}: A Step-by-Step Guide",
|
||||
f"{theme} Mastery: Everything You Need to Know",
|
||||
],
|
||||
"checklist": [
|
||||
f"The {theme} Checklist: {min(15, 5 + len(keywords))} Steps to Success",
|
||||
f"Your {theme} Pre-Launch Checklist",
|
||||
f"{theme}: The Essential Checklist",
|
||||
],
|
||||
"template": [
|
||||
f"{theme} Template Pack: Copy, Customize, Launch",
|
||||
f"The {theme} Template That Saves 10+ Hours/Week",
|
||||
f"Plug-and-Play {theme} Templates",
|
||||
],
|
||||
"calculator": [
|
||||
f"{theme} Calculator: Know Your Numbers in 5 Minutes",
|
||||
f"The {theme} ROI Calculator",
|
||||
f"Calculate Your {theme} Score",
|
||||
],
|
||||
"swipe_file": [
|
||||
f"50+ {theme} Examples That Actually Work",
|
||||
f"The {theme} Swipe File: Steal These Ideas",
|
||||
f"Best-in-Class {theme} Examples (Curated Collection)",
|
||||
],
|
||||
}
|
||||
|
||||
options = templates.get(fmt, templates["guide"])
|
||||
return options[0]
|
||||
|
||||
|
||||
def generate_hook(theme: str, keywords: list, sample_responses: list) -> str:
|
||||
"""Generate a compelling hook for the lead magnet."""
|
||||
# Extract a pain point from sample responses for the hook
|
||||
pain_phrase = ""
|
||||
if sample_responses:
|
||||
# Find the most representative short phrase
|
||||
for resp in sample_responses[:5]:
|
||||
if 20 < len(resp) < 150:
|
||||
pain_phrase = resp
|
||||
break
|
||||
|
||||
if pain_phrase:
|
||||
return (
|
||||
f"If you've ever thought \"{pain_phrase[:80]}{'...' if len(pain_phrase) > 80 else ''}\" "
|
||||
f"— this is for you. We analyzed hundreds of responses and found the exact "
|
||||
f"patterns that separate those who overcome {keywords[0] if keywords else 'this challenge'} "
|
||||
f"from those who stay stuck."
|
||||
)
|
||||
else:
|
||||
return (
|
||||
f"Most teams waste months trying to figure out {theme.lower()} on their own. "
|
||||
f"This resource distills proven strategies into actionable steps you can "
|
||||
f"implement today."
|
||||
)
|
||||
|
||||
|
||||
def generate_outline(theme: str, fmt: str, keywords: list) -> list:
|
||||
"""Generate a content outline for the lead magnet."""
|
||||
sections = [f"Section 1: Why {theme} Matters Now (The Landscape)"]
|
||||
|
||||
if fmt == "guide":
|
||||
sections.extend([
|
||||
f"Section 2: The Core Framework for {keywords[0].title() if keywords else 'Success'}",
|
||||
f"Section 3: Common Mistakes (And How to Avoid Them)",
|
||||
f"Section 4: Step-by-Step Implementation Plan",
|
||||
f"Section 5: Tools & Resources You'll Need",
|
||||
f"Section 6: Case Studies — What Good Looks Like",
|
||||
f"Section 7: Quick-Start Action Plan",
|
||||
])
|
||||
elif fmt == "checklist":
|
||||
sections.extend([
|
||||
f"Section 2: Pre-Work — What to Have Ready",
|
||||
f"Section 3: Phase 1 — Foundation ({keywords[0].title() if keywords else 'Setup'})",
|
||||
f"Section 4: Phase 2 — Execution ({keywords[1].title() if len(keywords) > 1 else 'Build'})",
|
||||
f"Section 5: Phase 3 — Optimization & Measurement",
|
||||
f"Section 6: Common Gotchas to Watch For",
|
||||
])
|
||||
elif fmt == "template":
|
||||
sections.extend([
|
||||
f"Section 2: How to Use This Template",
|
||||
f"Section 3: Template A — {keywords[0].title() if keywords else 'Standard'} Version",
|
||||
f"Section 4: Template B — Advanced Version",
|
||||
f"Section 5: Customization Guide",
|
||||
f"Section 6: Real Examples (Filled-In Templates)",
|
||||
])
|
||||
elif fmt == "calculator":
|
||||
sections.extend([
|
||||
f"Section 2: Key Metrics You Need to Track",
|
||||
f"Section 3: Input Your Numbers",
|
||||
f"Section 4: Understanding Your Results",
|
||||
f"Section 5: Benchmarks — How You Compare",
|
||||
f"Section 6: Action Steps Based on Your Score",
|
||||
])
|
||||
elif fmt == "swipe_file":
|
||||
sections.extend([
|
||||
f"Section 2: What Makes These Examples Work",
|
||||
f"Section 3: Category A — {keywords[0].title() if keywords else 'Top Performers'}",
|
||||
f"Section 4: Category B — {keywords[1].title() if len(keywords) > 1 else 'Rising Stars'}",
|
||||
f"Section 5: How to Adapt These for Your Business",
|
||||
f"Section 6: Blank Templates to Get Started",
|
||||
])
|
||||
|
||||
return sections
|
||||
|
||||
|
||||
def generate_cta(fmt: str, theme: str) -> str:
|
||||
"""Generate the target CTA for the lead magnet landing page."""
|
||||
ctas = {
|
||||
"guide": f"Download the Free {theme} Guide",
|
||||
"checklist": f"Get Your Free {theme} Checklist",
|
||||
"template": f"Grab the Free {theme} Templates",
|
||||
"calculator": f"Try the Free {theme} Calculator",
|
||||
"swipe_file": f"Download {theme} Swipe File",
|
||||
}
|
||||
return ctas.get(fmt, f"Get Free {theme} Resource")
|
||||
|
||||
|
||||
def recommend_distribution(fmt: str, segment_size_pct: float) -> str:
|
||||
"""Recommend primary distribution channel."""
|
||||
if segment_size_pct > 25:
|
||||
return "Homepage popup + dedicated landing page + paid social"
|
||||
elif segment_size_pct > 15:
|
||||
return "Blog content upgrade + email nurture sequence"
|
||||
elif segment_size_pct > 8:
|
||||
return "Targeted blog posts + LinkedIn organic"
|
||||
else:
|
||||
return "Niche community posts + targeted email segment"
|
||||
|
||||
|
||||
def build_lead_magnet_brief(segment: PainSegment) -> LeadMagnetBrief:
|
||||
"""Generate a complete lead magnet brief for a pain segment."""
|
||||
fmt = recommend_format(segment.top_keywords, segment.sample_responses)
|
||||
title = generate_title(segment.theme, fmt, segment.top_keywords)
|
||||
hook = generate_hook(segment.theme, segment.top_keywords, segment.sample_responses)
|
||||
outline = generate_outline(segment.theme, fmt, segment.top_keywords)
|
||||
cta = generate_cta(fmt, segment.theme)
|
||||
channel = recommend_distribution(fmt, segment.respondent_pct)
|
||||
|
||||
viral = score_viral_potential(title, fmt, segment.respondent_pct)
|
||||
conversion = score_conversion_potential(
|
||||
segment.commercial_score, segment.respondent_pct, fmt,
|
||||
)
|
||||
combined = (viral * 0.4 + conversion * 0.6)
|
||||
|
||||
impl_notes = (
|
||||
f"Target segment: {segment.respondent_count} respondents ({segment.respondent_pct:.1f}% of total). "
|
||||
f"Commercial intent score: {segment.commercial_score}/100. "
|
||||
f"Recommended format: {FORMAT_LABELS.get(fmt, fmt)}. "
|
||||
f"Estimated production time: {'1-2 days' if fmt in ('checklist', 'template') else '3-5 days'}."
|
||||
)
|
||||
|
||||
return LeadMagnetBrief(
|
||||
segment_id=segment.segment_id,
|
||||
segment_theme=segment.theme,
|
||||
title=title,
|
||||
format=FORMAT_LABELS.get(fmt, fmt),
|
||||
hook=hook,
|
||||
outline=outline,
|
||||
target_cta=cta,
|
||||
distribution_channel=channel,
|
||||
viral_potential=viral,
|
||||
conversion_potential=conversion,
|
||||
combined_score=round(combined, 1),
|
||||
implementation_notes=impl_notes,
|
||||
)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Analysis Pipeline
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def analyze_survey(
|
||||
csv_path: str,
|
||||
pain_columns: Optional[list] = None,
|
||||
top_segments: int = 5,
|
||||
) -> AnalysisResult:
|
||||
"""Full analysis pipeline: load → cluster → score → generate briefs."""
|
||||
|
||||
# Load data
|
||||
df = load_survey_data(csv_path)
|
||||
total_respondents = len(df)
|
||||
|
||||
# Detect or use specified pain columns
|
||||
if pain_columns:
|
||||
# Validate columns exist
|
||||
missing = [c for c in pain_columns if c not in df.columns]
|
||||
if missing:
|
||||
# Try fuzzy match
|
||||
actual_cols = []
|
||||
for pc in pain_columns:
|
||||
matches = [c for c in df.columns if pc.lower() in c.lower()]
|
||||
if matches:
|
||||
actual_cols.append(matches[0])
|
||||
else:
|
||||
raise ValueError(f"Column not found: '{pc}'. Available: {list(df.columns)}")
|
||||
pain_columns = actual_cols
|
||||
else:
|
||||
pain_columns = detect_pain_columns(df)
|
||||
if not pain_columns:
|
||||
raise ValueError(
|
||||
"Could not auto-detect pain point columns. "
|
||||
"Use --pain-columns to specify which columns contain challenge/pain responses.\n"
|
||||
f"Available columns: {list(df.columns)}"
|
||||
)
|
||||
|
||||
print(f"Analyzing columns: {pain_columns}", file=sys.stderr)
|
||||
|
||||
# Extract responses
|
||||
responses = extract_responses(df, pain_columns)
|
||||
if not responses:
|
||||
raise ValueError("No non-empty responses found in the specified columns")
|
||||
|
||||
print(f"Found {len(responses)} responses from {total_respondents} respondents", file=sys.stderr)
|
||||
|
||||
# Cluster
|
||||
labels, vectorizer, tfidf_matrix, n_clusters = cluster_responses(
|
||||
responses, n_clusters=max(1, min(top_segments, len(responses) // 2)) if len(responses) < 30 else None,
|
||||
)
|
||||
|
||||
# Build segments
|
||||
segments = []
|
||||
for cluster_id in range(n_clusters):
|
||||
mask = [i for i, l in enumerate(labels) if l == cluster_id]
|
||||
cluster_responses_list = [responses[i] for i in mask]
|
||||
|
||||
keywords = extract_cluster_keywords(vectorizer, tfidf_matrix, labels, cluster_id)
|
||||
theme = generate_theme_label(keywords)
|
||||
commercial = score_commercial_potential(cluster_responses_list)
|
||||
|
||||
# Pick representative quotes (medium length, most representative)
|
||||
quotes = sorted(
|
||||
cluster_responses_list,
|
||||
key=lambda r: abs(len(r) - 80), # prefer ~80 char responses
|
||||
)[:3]
|
||||
|
||||
segment = PainSegment(
|
||||
segment_id=cluster_id + 1,
|
||||
theme=theme,
|
||||
top_keywords=keywords,
|
||||
respondent_count=len(mask),
|
||||
respondent_pct=round(len(mask) / len(responses) * 100, 1),
|
||||
commercial_score=commercial,
|
||||
sample_responses=cluster_responses_list[:5],
|
||||
representative_quotes=quotes,
|
||||
)
|
||||
segments.append(segment)
|
||||
|
||||
# Sort by size × (commercial score + 10), so zero-signal segments still rank by size
|
||||
segments.sort(key=lambda s: s.respondent_count * (s.commercial_score + 10), reverse=True)
|
||||
|
||||
# Limit to top N
|
||||
segments = segments[:top_segments]
|
||||
|
||||
# Re-number after sorting
|
||||
for i, seg in enumerate(segments):
|
||||
seg.segment_id = i + 1
|
||||
|
||||
# Generate lead magnet briefs
|
||||
lead_magnets = []
|
||||
for seg in segments:
|
||||
brief = build_lead_magnet_brief(seg)
|
||||
lead_magnets.append(brief)
|
||||
|
||||
# Sort briefs by combined score
|
||||
lead_magnets.sort(key=lambda b: b.combined_score, reverse=True)
|
||||
|
||||
# Implementation roadmap
|
||||
roadmap = []
|
||||
for i, lm in enumerate(lead_magnets, 1):
|
||||
roadmap.append({
|
||||
"priority": i,
|
||||
"title": lm.title,
|
||||
"format": lm.format,
|
||||
"segment_size": f"{lm.segment_theme} ({segments[lm.segment_id - 1].respondent_pct:.1f}%)",
|
||||
"combined_score": lm.combined_score,
|
||||
"estimated_effort": "1-2 days" if "Checklist" in lm.format or "Template" in lm.format else "3-5 days",
|
||||
})
|
||||
|
||||
return AnalysisResult(
|
||||
total_respondents=total_respondents,
|
||||
columns_analyzed=pain_columns,
|
||||
segments=[asdict(s) for s in segments],
|
||||
lead_magnets=[asdict(lm) for lm in lead_magnets],
|
||||
implementation_roadmap=roadmap,
|
||||
)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Output Formatters
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def format_analysis_text(result: AnalysisResult) -> str:
|
||||
"""Format analysis as human-readable text."""
|
||||
lines = []
|
||||
lines.append("=" * 70)
|
||||
lines.append(" SURVEY-TO-LEAD-MAGNET ANALYSIS")
|
||||
lines.append("=" * 70)
|
||||
lines.append("")
|
||||
lines.append(f" Total respondents: {result.total_respondents}")
|
||||
lines.append(f" Columns analyzed: {', '.join(result.columns_analyzed)}")
|
||||
lines.append(f" Segments identified: {len(result.segments)}")
|
||||
lines.append("")
|
||||
|
||||
# Segments
|
||||
lines.append("-" * 70)
|
||||
lines.append(" PAIN POINT SEGMENTS (ranked by opportunity)")
|
||||
lines.append("-" * 70)
|
||||
|
||||
for seg in result.segments:
|
||||
lines.append("")
|
||||
lines.append(f" Segment #{seg['segment_id']}: {seg['theme']}")
|
||||
lines.append(f" Respondents: {seg['respondent_count']} ({seg['respondent_pct']}%)")
|
||||
lines.append(f" Commercial Score: {seg['commercial_score']}/100")
|
||||
lines.append(f" Top Keywords: {', '.join(seg['top_keywords'][:5])}")
|
||||
lines.append("")
|
||||
lines.append(" Representative Quotes:")
|
||||
for q in seg["representative_quotes"]:
|
||||
lines.append(f" \"{q[:100]}{'...' if len(q) > 100 else ''}\"")
|
||||
lines.append("")
|
||||
|
||||
# Lead Magnet Briefs
|
||||
lines.append("=" * 70)
|
||||
lines.append(" LEAD MAGNET BRIEFS (ranked by combined score)")
|
||||
lines.append("=" * 70)
|
||||
|
||||
for lm in result.lead_magnets:
|
||||
lines.append("")
|
||||
lines.append(f" 📦 {lm['title']}")
|
||||
lines.append(f" Format: {lm['format']}")
|
||||
lines.append(f" Segment: {lm['segment_theme']}")
|
||||
lines.append(f" Viral Potential: {lm['viral_potential']}/100 | Conversion Potential: {lm['conversion_potential']}/100")
|
||||
lines.append(f" Combined Score: {lm['combined_score']}/100")
|
||||
lines.append("")
|
||||
lines.append(f" Hook: {lm['hook'][:200]}{'...' if len(lm['hook']) > 200 else ''}")
|
||||
lines.append("")
|
||||
lines.append(" Outline:")
|
||||
for section in lm["outline"]:
|
||||
lines.append(f" • {section}")
|
||||
lines.append("")
|
||||
lines.append(f" CTA: {lm['target_cta']}")
|
||||
lines.append(f" Distribution: {lm['distribution_channel']}")
|
||||
lines.append(f" Notes: {lm['implementation_notes']}")
|
||||
lines.append("")
|
||||
lines.append(" " + "-" * 50)
|
||||
|
||||
# Roadmap
|
||||
lines.append("")
|
||||
lines.append("=" * 70)
|
||||
lines.append(" IMPLEMENTATION ROADMAP")
|
||||
lines.append("=" * 70)
|
||||
lines.append("")
|
||||
|
||||
for item in result.implementation_roadmap:
|
||||
lines.append(f" #{item['priority']} [{item['estimated_effort']}] {item['title']}")
|
||||
lines.append(f" Format: {item['format']} | Segment: {item['segment_size']} | Score: {item['combined_score']}")
|
||||
lines.append("")
|
||||
|
||||
lines.append("=" * 70)
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Main
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser(
|
||||
description="Survey-to-Lead-Magnet Engine — Turn survey data into targeted lead magnet briefs",
|
||||
formatter_class=argparse.RawDescriptionHelpFormatter,
|
||||
epilog="""
|
||||
Examples:
|
||||
python survey_lead_magnet.py --csv survey_responses.csv
|
||||
python survey_lead_magnet.py --csv survey.csv --pain-columns "biggest_challenge" "frustrations"
|
||||
python survey_lead_magnet.py --csv survey.csv --top-segments 3 --json --output briefs.json
|
||||
|
||||
CSV Format:
|
||||
Questions as column headers, one respondent per row.
|
||||
Works with exports from Typeform, Google Forms, SurveyMonkey, etc.
|
||||
""",
|
||||
)
|
||||
parser.add_argument("--csv", required=True, help="Path to survey responses CSV")
|
||||
parser.add_argument(
|
||||
"--pain-columns", nargs="+",
|
||||
help="Column names containing pain point / challenge responses (auto-detected if not specified)",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--top-segments", type=int, default=5,
|
||||
help="Number of top segments to analyze (default: 5)",
|
||||
)
|
||||
parser.add_argument("--json", action="store_true", help="Output as JSON")
|
||||
parser.add_argument("--output", help="Save output to file")
|
||||
|
||||
args = parser.parse_args()
|
||||
|
||||
if not os.path.exists(args.csv):
|
||||
print(f"Error: File not found: {args.csv}", file=sys.stderr)
|
||||
sys.exit(1)
|
||||
|
||||
try:
|
||||
result = analyze_survey(
|
||||
csv_path=args.csv,
|
||||
pain_columns=args.pain_columns,
|
||||
top_segments=args.top_segments,
|
||||
)
|
||||
except ValueError as e:
|
||||
print(f"Error: {e}", file=sys.stderr)
|
||||
sys.exit(1)
|
||||
|
||||
# Output
|
||||
if args.json:
|
||||
output = json.dumps(asdict(result), indent=2, default=str)
|
||||
if args.output:
|
||||
with open(args.output, "w", encoding="utf-8") as f:
|
||||
f.write(output)
|
||||
print(f"Output saved to {args.output}", file=sys.stderr)
|
||||
else:
|
||||
print(output)
|
||||
else:
|
||||
text_output = format_analysis_text(result)
|
||||
if args.output:
|
||||
with open(args.output, "w", encoding="utf-8") as f:  # report contains non-ASCII (emoji, bullets)
|
||||
f.write(text_output)
|
||||
print(f"Output saved to {args.output}", file=sys.stderr)
|
||||
else:
|
||||
print(text_output)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
5
podcast-ops/.env.example
Normal file
|
|
@ -0,0 +1,5 @@
|
|||
# Required: OpenAI API key (used for Whisper transcription)
OPENAI_API_KEY=sk-...

# Required: Anthropic API key (used for content generation via Claude)
ANTHROPIC_API_KEY=sk-ant-...
162
podcast-ops/README.md
Normal file
|
|
@ -0,0 +1,162 @@
|
|||
# AI Podcast Ops

**One podcast episode in, 15-20 content pieces out. Scored, deduplicated, and scheduled.**

Most podcast teams publish an episode and maybe pull one audiogram. This pipeline treats every episode as a content mine — extracting narrative arcs, quotable moments, controversial takes, data points, and stories, then generating platform-native content for every channel with viral scoring and deduplication.

## What's Inside

### 🎙️ Podcast-to-Everything Pipeline (`podcast_pipeline.py`)
End-to-end pipeline that ingests podcast episodes (via RSS feed or raw transcript) and produces a full cross-platform content calendar.

**Ingest modes:**
- RSS feed → auto-download + Whisper transcription
- Raw transcript file (text, SRT, VTT)
- Batch mode: process last N episodes from a feed

**Content generated per episode:**
- 3-5 short-form video clip suggestions (with timestamps + hooks)
- 2-3 Twitter/X thread outlines
- 1 LinkedIn article draft
- 1 newsletter section
- 3-5 quote cards (text overlays for social)
- 1 blog post outline with SEO keywords
- 1 YouTube Shorts/TikTok script

**Intelligence layer:**
- Editorial Brain: LLM-powered extraction of 7 content atom types
- Viral scoring: Novelty × Controversy × Utility (0-100)
- Dedup engine: semantic similarity check against last N days of output
- Calendar generator: auto-schedules by platform best practices

### 📋 SKILL.md
Claude Code skill file. Drop into your project and ask: *"Turn this podcast episode into a content calendar"* — it handles the rest.

## Quick Start

```bash
# 1. Install dependencies
pip install -r requirements.txt

# 2. Set up environment
cp .env.example .env
# Edit .env with your API keys (OPENAI_API_KEY, ANTHROPIC_API_KEY)

# 3. Process latest episode from your podcast RSS
python podcast_pipeline.py --rss "https://feeds.example.com/podcast.xml"

# 4. Or process a local transcript
python podcast_pipeline.py --transcript episode-42.txt

# 5. Batch process last 5 episodes
python podcast_pipeline.py --batch "https://feeds.example.com/podcast.xml" --episodes 5

# 6. Generate weekly content calendar
python podcast_pipeline.py --calendar

# 7. Only keep high-scoring content
python podcast_pipeline.py --rss "https://feeds.example.com/podcast.xml" --min-score 80
```

## Configuration

### Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| `OPENAI_API_KEY` | Yes | OpenAI API key (Whisper transcription) |
| `ANTHROPIC_API_KEY` | Yes | Anthropic API key (content generation) |
| `OPENAI_LLM_KEY` | Optional | Separate OpenAI key for GPT-based generation |

### CLI Options

| Flag | Description | Default |
|------|-------------|---------|
| `--rss <url>` | Process latest episode from RSS feed | — |
| `--transcript <file>` | Process a local transcript file | — |
| `--batch <url>` | Batch process from RSS feed | — |
| `--episodes <n>` | Number of episodes for batch mode | 5 |
| `--calendar` | Generate weekly calendar from outputs | — |
| `--dedup-days <n>` | Days of history for dedup check | 30 |
| `--min-score <n>` | Minimum viral score to include | 0 |
| `--output-dir <path>` | Output directory | `./output` |

## Output Structure

```
output/
├── episodes/
│   ├── 2024-01-15-episode-title/
│   │   ├── transcript.txt          # Clean transcript
│   │   ├── atoms.json              # Extracted content atoms
│   │   ├── content_pieces.json     # All generated content
│   │   └── calendar.json           # Scheduled calendar
│   └── ...
├── calendar/
│   └── week-2024-W03.json          # Aggregated weekly calendar
├── content_history.json            # Dedup tracking (hashes + embeddings)
└── pipeline_log.json               # Run history and performance stats
```
||||
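The per-episode directory names in the tree above follow a date-plus-slug pattern. A minimal sketch of how such a name could be derived; `episode_dir_name` and `episode_dir` are hypothetical helpers for illustration, not functions taken from `podcast_pipeline.py`, and the slug rules are an assumption:

```python
import re
from datetime import date
from pathlib import Path


def episode_dir_name(published: date, title: str) -> str:
    """Build a 'YYYY-MM-DD-slug' directory name (assumed naming scheme)."""
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{published.isoformat()}-{slug or 'untitled'}"


def episode_dir(output_dir: str, published: date, title: str) -> Path:
    """Path of the per-episode folder under <output_dir>/episodes/."""
    return Path(output_dir) / "episodes" / episode_dir_name(published, title)
```

Keeping the date first means a plain directory listing sorts episodes chronologically.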

## How It Works

```
RSS Feed / Transcript
         │
         ▼
┌─────────────────┐
│  1. INGEST      │  Download audio → Whisper → clean transcript
│                 │  OR read transcript file directly
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  2. EXTRACT     │  Editorial Brain: find narrative arcs, quotes,
│                 │  controversial takes, data points, stories,
│                 │  frameworks, predictions
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  3. GENERATE    │  For each atom → platform-native content:
│                 │  clips, threads, articles, newsletter,
│                 │  quote cards, blog outlines, short scripts
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  4. SCORE       │  Viral potential: novelty × controversy × utility
│                 │  Filter below threshold
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  5. DEDUP       │  Semantic similarity vs last N days
│                 │  Remove overlaps, flag near-dupes
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  6. SCHEDULE    │  Calendar generation with platform-specific
│                 │  timing rules and content mix optimization
└─────────────────┘
```
||||
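Step 5 in the diagram compares each new piece against recent output. The sketch below illustrates the idea with bag-of-words cosine similarity as a stand-in for the embedding-based check the README describes. The function names, the history record shape (`date` and `text` keys), and the 0.8 threshold are all assumptions made for illustration; only the 30-day window matches the documented `--dedup-days` default.

```python
import math
import re
from collections import Counter
from datetime import date, timedelta


def _vector(text: str) -> Counter:
    """Lowercased token counts; a crude substitute for a real embedding."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def is_duplicate(
    candidate: str,
    history: list[dict],
    today: date,
    dedup_days: int = 30,      # mirrors the --dedup-days default
    threshold: float = 0.8,    # assumed cut-off, not from the source
) -> bool:
    """True if the candidate is too similar to anything in the dedup window."""
    cutoff = today - timedelta(days=dedup_days)
    cand = _vector(candidate)
    for item in history:
        if item["date"] < cutoff:
            continue  # outside the window, ignore
        if cosine(cand, _vector(item["text"])) >= threshold:
            return True
    return False
```

Token overlap misses paraphrases, which is presumably why the pipeline stores embeddings in `content_history.json`; swapping `_vector` for an embedding call would leave the windowing logic unchanged.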

## Viral Scoring

Every generated piece is scored on three dimensions:

| Dimension | Weight | What It Measures |
|-----------|--------|-----------------|
| Novelty | 40% | Is this new or surprising? |
| Controversy | 30% | Will people argue about this? |
| Utility | 30% | Can someone use this immediately? |

**Thresholds:** 80+ = priority publish, 60-79 = solid fill, 40-59 = gap filler, <40 = cut
||||
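The weights and thresholds above can be read as a weighted sum. A minimal sketch under that assumption: the README writes the score as "novelty × controversy × utility" but lists percentage weights, so this example treats each dimension as a 0-100 rating combined with the 40/30/30 weights. `viral_score` and `tier` are hypothetical names, not the pipeline's actual functions.

```python
# Weights from the Viral Scoring table; combining them as a weighted sum is an assumption.
WEIGHTS = {"novelty": 0.4, "controversy": 0.3, "utility": 0.3}


def viral_score(novelty: float, controversy: float, utility: float) -> float:
    """Weighted blend of three 0-100 ratings, clamped to 0-100."""
    score = (
        WEIGHTS["novelty"] * novelty
        + WEIGHTS["controversy"] * controversy
        + WEIGHTS["utility"] * utility
    )
    return round(max(0.0, min(100.0, score)), 1)


def tier(score: float) -> str:
    """Map a score onto the README's publishing thresholds."""
    if score >= 80:
        return "priority publish"
    if score >= 60:
        return "solid fill"
    if score >= 40:
        return "gap filler"
    return "cut"
```

For example, ratings of 90, 80, and 70 give 0.4·90 + 0.3·80 + 0.3·70 = 81, which lands in the "priority publish" tier.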

## Integration with Other Skills

- **Content Ops / Expert Panel** — Run generated content through the expert panel for quality gating before publish
- **SEO Ops** — Feed blog outlines to the SEO pipeline for keyword validation
- **Outbound Engine** — Use podcast insights as personalization hooks in outbound sequences
- **Growth Engine** — A/B test different content formats from the same episode atoms
302
podcast-ops/SKILL.md
Normal file
|
|
@ -0,0 +1,302 @@
|
|||
---
name: podcast-pipeline
description: >-
  Podcast-to-Everything content pipeline. Takes a podcast RSS feed or raw
  transcript and generates a full cross-platform content calendar: short-form
  video clips, Twitter/X threads, LinkedIn articles, newsletter sections, quote
  cards, blog outlines with SEO keywords, and YouTube Shorts/TikTok scripts.
  Scores each piece by viral potential (novelty × controversy × utility) and
  deduplicates against recent output. Use when asked to: "repurpose this podcast",
  "turn this episode into content", "podcast content calendar", "extract clips
  from this episode", "podcast to social", "content from RSS feed", "batch
  process episodes", or any request to turn podcast/audio content into a
  multi-platform content plan.
---

# Podcast-to-Everything Pipeline

Turns podcast episodes into a full content calendar across every platform.
One episode in, 15-20 content pieces out — scored, deduplicated, and scheduled.

---

## Step 1: Ingest — Get the Transcript

Determine the input source and obtain a clean transcript.

### Option A: RSS Feed (`--rss <url>`)
1. Fetch the RSS feed XML
2. Extract the latest episode's audio URL (or use `--episodes N` for batch)
3. Download the audio file
4. Transcribe via OpenAI Whisper API (with timestamps)
5. Store transcript with episode metadata (title, date, description, duration)
||||
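Steps 1 and 2 of Option A can be sketched with the standard library alone. This is an illustration, not the pipeline's code: `latest_episodes` is a hypothetical helper, and it assumes an RSS 2.0 feed that lists items newest first and carries the audio in an `<enclosure url="...">` element. Downloading and Whisper transcription are left out.

```python
import xml.etree.ElementTree as ET


def latest_episodes(feed_xml: str, n: int = 1) -> list[dict]:
    """Return title, publish date, and audio URL for the first n feed items.

    Assumes RSS 2.0 with items ordered newest first; items without an
    <enclosure> (for example, text-only posts) are skipped.
    """
    root = ET.fromstring(feed_xml)
    episodes = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is None or not enclosure.get("url"):
            continue
        episodes.append(
            {
                "title": (item.findtext("title") or "").strip(),
                "published": (item.findtext("pubDate") or "").strip(),
                "audio_url": enclosure.get("url"),
            }
        )
        if len(episodes) == n:
            break
    return episodes
```

Calling `latest_episodes(xml_text)` yields the single latest episode, while `latest_episodes(xml_text, n=5)` matches the batch case. Feeds that are not sorted newest first would need an explicit sort on the parsed `pubDate`.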

### Option B: Raw Transcript (`--transcript <file>`)
1. Read the transcript file (plain text, SRT, or VTT)
2. Parse timestamps if present
3. Extract episode metadata from filename or prompt user
||||
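Timestamp parsing for SRT and VTT input can share one routine, since both formats mark cues with a `start --> end` line. A minimal sketch, assuming well-formed cue blocks separated by blank lines; `parse_cues` is a hypothetical name, and styling tags or VTT cue settings are not handled:

```python
import re

# One timestamp: optional hours, then MM:SS, then ',' (SRT) or '.' (VTT) and milliseconds.
_TS = r"(?:(\d+):)?(\d{2}):(\d{2})[,.](\d{3})"
_CUE = re.compile(_TS + r"\s*-->\s*" + _TS)


def _seconds(h, m, s, ms) -> float:
    """Convert captured timestamp parts to seconds; hours may be missing."""
    return int(h or 0) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_cues(text: str) -> list[dict]:
    """Parse SRT or VTT text into [{'start', 'end', 'text'}, ...]."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = _CUE.search(line)
            if not match:
                continue  # cue index, WEBVTT header, or note line
            parts = match.groups()
            body = " ".join(ln.strip() for ln in lines[i + 1:] if ln.strip())
            if body:
                cues.append(
                    {
                        "start": _seconds(*parts[:4]),
                        "end": _seconds(*parts[4:]),
                        "text": body,
                    }
                )
            break
    return cues
```

Plain-text transcripts produce an empty list, which a caller can use as the signal that no timestamps are present.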

### Option C: Batch Mode (`--batch <rss_url> --episodes N`)
1. Fetch RSS feed
2. Extract the last N episodes
3. Process each through the full pipeline
4. Deduplicate across all episodes in the batch

### Transcript cleanup
- Remove filler words (um, uh, like, you know) for written content
- Preserve original with timestamps for video clip suggestions
- Split into logical segments by topic shift
||||
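Filler removal is easy to overdo, because "like" and "you know" are also ordinary words. The sketch below is deliberately conservative and is an assumption about how cleanup might work, not the pipeline's implementation: "um" and "uh" are always dropped, while "like" and "you know" are removed only when set off by commas. `clean_for_writing` is a hypothetical name.

```python
import re

# Always-safe fillers: "um", "umm", "uh", "uhh", with an optional trailing comma.
_HARD_FILLERS = re.compile(r"\b(?:um+|uh+)\b,?\s*", re.IGNORECASE)
# Ambiguous fillers: only removed in the parenthetical form ", like," / ", you know,".
_SOFT_FILLERS = re.compile(r",\s*(?:like|you know),\s*", re.IGNORECASE)


def clean_for_writing(text: str) -> str:
    """Return a copy of the text with filler words removed.

    The input string is not modified, so the original (with timestamps)
    stays available for video clip suggestions.
    """
    cleaned = _HARD_FILLERS.sub("", text)
    cleaned = _SOFT_FILLERS.sub(" ", cleaned)
    cleaned = re.sub(r"\s{2,}", " ", cleaned)          # collapse doubled spaces
    cleaned = re.sub(r"\s+([,.!?])", r"\1", cleaned)   # no space before punctuation
    return cleaned.strip()
```

A sentence such as "I like this approach" passes through unchanged, because "like" is not comma-delimited there.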

---

## Step 2: Editorial Brain — Deep Analysis

Feed the full transcript to the LLM with this extraction framework:

### Extract these content atoms:

1. **Narrative Arcs** — Complete story segments with setup → tension → resolution.
   Tag with start/end timestamps.

2. **Quotable Moments** — Punchy, shareable statements. One-liners that stand alone.
   Must pass the "would someone screenshot this?" test.

3. **Controversial Takes** — Opinions that go against conventional wisdom.
   The stuff that makes people reply "hard disagree" or "finally someone said it."

4. **Data Points** — Specific numbers, percentages, dollar amounts, timeframes.
   Concrete proof points that add credibility.

5. **Stories** — Personal anecdotes, case studies, client examples.
   Must have a character, a problem, and an outcome.

6. **Frameworks** — Step-by-step processes, mental models, decision matrices.
   Anything structured that people would save or bookmark.

7. **Predictions** — Forward-looking claims about trends, markets, technology.
   Hot takes about where things are going.

### Output format per atom:
```
- Type: [narrative_arc | quote | controversial_take | data_point | story | framework | prediction]
- Content: [extracted text]
- Timestamp: [start - end, if available]
- Context: [what was being discussed]
- Viral Score: [0-100, see Step 4]
- Suggested platforms: [where this atom works best]
```
||||
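The per-atom format maps naturally onto a small record type, which makes it easy to reject malformed LLM output before generation starts. A minimal sketch: the `ContentAtom` class and its field names are assumptions derived from the template above, not the schema actually written to `atoms.json`.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional

# The seven atom types listed in the extraction framework.
ATOM_TYPES = {
    "narrative_arc", "quote", "controversial_take",
    "data_point", "story", "framework", "prediction",
}


@dataclass
class ContentAtom:
    type: str
    content: str
    context: str
    viral_score: int
    timestamp: Optional[str] = None          # "start - end", if available
    suggested_platforms: list = field(default_factory=list)

    def __post_init__(self) -> None:
        # Fail fast on values the template does not allow.
        if self.type not in ATOM_TYPES:
            raise ValueError(f"Unknown atom type: {self.type!r}")
        if not 0 <= self.viral_score <= 100:
            raise ValueError(f"viral_score out of range: {self.viral_score}")
```

`asdict(atom)` gives a plain dictionary suitable for `json.dumps`.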
|
||||
---
|
||||
|
||||
## Step 3: Content Generation — One Episode, Many Pieces
|
||||
|
||||
For each episode, generate ALL of these from the extracted atoms:
|
||||
|
||||
### 3a. Short-Form Video Clips (3-5 per episode)
|
||||
```
|
||||
- Hook: [First 3 seconds — pattern interrupt or bold claim]
|
||||
- Clip segment: [Timestamp range from transcript]
|
||||
- Caption overlay: [Text for the screen]
|
||||
- Platform: [YouTube Shorts / TikTok / Instagram Reels]
|
||||
- Why it works: [What makes this clippable]
|
||||
```
|
||||
Prioritize: controversial takes > stories with payoffs > surprising data points
|
||||
|
||||
### 3b. Twitter/X Threads (2-3 per episode)
|
||||
```
|
||||
- Thread hook (tweet 1): [Curiosity gap or bold opener]
|
||||
- Thread body (5-10 tweets): [Each tweet is one complete thought]
|
||||
- Thread closer: [CTA — follow, reply, retweet trigger]
|
||||
- Source atoms: [Which content atoms feed this thread]
|
||||
```
|
||||
Rules: No tweet over 280 chars. Each tweet must stand alone. Use data points as proof.
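
The length rule is easy to check mechanically before a thread is queued. This is a minimal sketch, assuming each thread arrives as a list of strings; `check_thread` is a hypothetical helper, and `len()` counts Unicode code points, which may differ from the platform's own counting rules for links and some characters.

```python
MAX_TWEET_CHARS = 280  # limit stated in the rules above


def check_thread(tweets):
    """Return (position, length) for every tweet that exceeds the limit.

    Positions are 1-based so they match how a thread is usually read.
    """
    return [
        (position, len(text))
        for position, text in enumerate(tweets, start=1)
        if len(text) > MAX_TWEET_CHARS
    ]


too_long = check_thread(["a" * 280, "b" * 281])
```

An empty result means every tweet fits; anything else points at the tweets that need trimming.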
|
||||
|
||||
### 3c. LinkedIn Article Draft (1 per episode)
|
||||
```
|
||||
- Headline: [Specific, benefit-driven]
|
||||
- Hook paragraph: [Before the "see more" fold — must earn the click]
|
||||
- Body: [3-5 sections with headers, 800-1200 words]
|
||||
- CTA: [Engagement driver — question, not link]
|
||||
- Hashtags: [3-5 relevant, not spammy]
|
||||
```
|
||||
Voice: Professional but not corporate. First-person. Story-driven.
|
||||
|
||||
### 3d. Newsletter Section (1 per episode)
|
||||
```
|
||||
- Section headline: [Scannable, specific]
|
||||
- TL;DR: [One sentence, the core insight]
|
||||
- Body: [3-5 bullet points, each with a takeaway]
|
||||
- Pull quote: [The most shareable line from the episode]
|
||||
- Link: [Back to full episode]
|
||||
```
|
||||
|
||||
### 3e. Quote Cards (3-5 per episode)
|
||||
```
|
||||
- Quote text: [Max 20 words — must work as text overlay]
|
||||
- Attribution: [Speaker name]
|
||||
- Background suggestion: [Color/mood that matches the tone]
|
||||
- Platform sizing: [1080x1080 for IG, 1200x675 for Twitter, 1080x1920 for Stories]
|
||||
```
|
||||
|
||||
### 3f. Blog Post Outline (1 per episode)
|
||||
```
|
||||
- Title: [SEO-optimized, includes primary keyword]
|
||||
- Primary keyword: [Search volume + difficulty estimate]
|
||||
- Secondary keywords: [3-5 related terms]
|
||||
- Meta description: [155 chars max]
|
||||
- H2 sections: [5-7, each maps to a content atom]
|
||||
- Internal linking opportunities: [Topics that connect to existing content]
|
||||
- Estimated word count: [1500-2500]
|
||||
```
|
||||
|
||||
### 3g. YouTube Shorts / TikTok Script (1 per episode)
|
||||
```
|
||||
- HOOK (0-3s): [Pattern interrupt — question, bold claim, or visual]
|
||||
- SETUP (3-15s): [Context — why should they care]
|
||||
- PAYOFF (15-45s): [The insight, data, or story resolution]
|
||||
- CTA (45-60s): [Follow, comment prompt, or part 2 tease]
|
||||
- On-screen text: [Key phrases to overlay]
|
||||
- B-roll suggestions: [Visual ideas if not talking-head]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 4: Content Scoring — Viral Potential
|
||||
|
||||
Score every generated piece on three dimensions (each 0-100):
|
||||
|
||||
| Dimension | What It Measures | Signals |
|-----------|-----------------|---------|
| **Novelty** | Is this new or surprising? | Contrarian takes, unexpected data, first-to-say |
| **Controversy** | Will people argue about this? | Strong opinions, challenges norms, picks a side |
| **Utility** | Can someone use this immediately? | Frameworks, how-tos, templates, specific numbers |
|
||||
|
||||
**Viral Score = (Novelty × 0.4) + (Controversy × 0.3) + (Utility × 0.3)**
|
||||
|
||||
### Score thresholds:
|
||||
- **80+** → Priority publish. Schedule for peak engagement windows.
|
||||
- **60-79** → Solid content. Fill the calendar.
|
||||
- **40-59** → Filler. Use only if calendar has gaps.
|
||||
- **Below 40** → Cut it. Not worth the publish slot.
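
The formula and thresholds above can be sketched in a few lines. The function names `viral_score` and `publish_tier` are hypothetical, and treating each threshold as a lower bound (so a fractional score such as 79.5 lands in the 60-79 tier) is an assumption; the pipeline script may round or bucket differently.

```python
# Weights from the Viral Score formula above.
WEIGHTS = {"novelty": 0.4, "controversy": 0.3, "utility": 0.3}


def viral_score(novelty, controversy, utility):
    """Weighted sum of the three 0-100 dimension scores."""
    scores = {"novelty": novelty, "controversy": controversy, "utility": utility}
    for name, value in scores.items():
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")
    return round(sum(WEIGHTS[name] * value for name, value in scores.items()), 1)


def publish_tier(score):
    """Map a viral score onto the four tiers listed above."""
    if score >= 80:
        return "priority"
    if score >= 60:
        return "solid"
    if score >= 40:
        return "filler"
    return "cut"


score = viral_score(novelty=90, controversy=70, utility=80)
tier = publish_tier(score)
```

Because novelty carries the largest weight, a piece that is only useful and uncontroversial tops out well below the priority tier unless it also says something new.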
|
||||
|
||||
---
|
||||
|
||||
## Step 5: Dedup Engine
|
||||
|
||||
Before finalizing, check all generated content against:
|
||||
1. **This batch** — No two pieces should cover the same angle
|
||||
2. **Recent history** — Compare against last N days of output (default: 30)
|
||||
3. **Similarity threshold** — Flag any pair with >70% semantic overlap
|
||||
|
||||
### Dedup rules:
|
||||
- If two pieces overlap >70%: keep the higher-scored one, cut the other
|
||||
- If a piece overlaps with recently published content: flag with ⚠️ and suggest a differentiation angle
|
||||
- Track all published content hashes in `output/content_history.json`
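
A rough sketch of the batch rule follows. It is not the pipeline's implementation: `difflib.SequenceMatcher` measures character-level similarity and stands in here for a real semantic comparison (embeddings or an LLM judgment), and the helper names and the `content` / `viral_score` dict keys are assumptions.

```python
import hashlib
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.70  # the >70% overlap rule above


def content_hash(text):
    """Stable hash of whitespace- and case-normalized text, for history tracking."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def similarity(a, b):
    """Lexical similarity in [0, 1]; a stand-in for semantic overlap."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def dedup_batch(pieces):
    """Keep the higher-scored piece from any pair above the threshold."""
    kept = []
    # Visit pieces from highest to lowest score so winners are kept first.
    for piece in sorted(pieces, key=lambda p: p["viral_score"], reverse=True):
        if all(
            similarity(piece["content"], other["content"]) <= SIMILARITY_THRESHOLD
            for other in kept
        ):
            kept.append(piece)
    return kept


batch = [
    {"content": "Cold email is dead. Here is what replaced it.", "viral_score": 70},
    {"content": "Cold email is dead. Here's what replaced it!", "viral_score": 85},
    {"content": "Three pricing mistakes every agency makes", "viral_score": 60},
]
kept = dedup_batch(batch)
```

The same `similarity` check can be run against entries loaded from `output/content_history.json` to produce the ⚠️ flag for recently published overlaps.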
|
||||
|
||||
---
|
||||
|
||||
## Step 6: Calendar Generation (`--calendar`)
|
||||
|
||||
Assemble scored, deduplicated content into a weekly publish calendar.
|
||||
|
||||
### Scheduling rules:
|
||||
- **Twitter/X:** 1-2 per day, peak hours (8-10am, 12-1pm, 5-7pm ET)
|
||||
- **LinkedIn:** 1 per day max, Tuesday-Thursday mornings
|
||||
- **YouTube Shorts/TikTok:** 1 per day, evenings
|
||||
- **Newsletter:** Weekly, same day each week
|
||||
- **Blog:** 1-2 per week
|
||||
- **Quote cards:** Intersperse on low-content days
|
||||
|
||||
### Calendar output format:
|
||||
```json
{
  "week_of": "2024-01-15",
  "episode_source": "Episode Title - Guest Name",
  "content_pieces": [
    {
      "date": "2024-01-15",
      "time": "09:00 ET",
      "platform": "twitter",
      "type": "thread",
      "content": "...",
      "viral_score": 85,
      "status": "draft"
    }
  ],
  "total_pieces": 18,
  "avg_viral_score": 72,
  "coverage": {
    "twitter": 6,
    "linkedin": 3,
    "youtube_shorts": 3,
    "newsletter": 1,
    "blog": 1,
    "quote_cards": 4
  }
}
```
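
The summary fields at the bottom of the calendar can be derived from the `content_pieces` array rather than maintained by hand. This is a minimal sketch, assuming each piece carries the `platform` and `viral_score` keys shown above; `summarize_calendar` is a hypothetical helper and rounding the average to a whole number is an assumption.

```python
from collections import Counter


def summarize_calendar(pieces):
    """Compute total_pieces, avg_viral_score and coverage from content pieces."""
    coverage = Counter(piece["platform"] for piece in pieces)
    if pieces:
        average = round(sum(p["viral_score"] for p in pieces) / len(pieces))
    else:
        average = 0  # avoid dividing by zero on an empty week
    return {
        "total_pieces": len(pieces),
        "avg_viral_score": average,
        "coverage": dict(coverage),
    }


summary = summarize_calendar([
    {"platform": "twitter", "viral_score": 85},
    {"platform": "twitter", "viral_score": 65},
    {"platform": "blog", "viral_score": 60},
])
```

Deriving these values keeps `total_pieces` equal to the sum of the `coverage` counts, as it is in the sample (6 + 3 + 3 + 1 + 1 + 4 = 18).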
|
||||
|
||||
---
|
||||
|
||||
## Step 7: Output
|
||||
|
||||
All output goes to `output/` directory:
|
||||
|
||||
```
|
||||
output/
|
||||
├── episodes/
|
||||
│ ├── YYYY-MM-DD-episode-slug/
|
||||
│ │ ├── transcript.txt
|
||||
│ │ ├── atoms.json # Extracted content atoms
|
||||
│ │ ├── content_pieces.json # All generated content
|
||||
│ │ └── calendar.json # Scheduled calendar
|
||||
│ └── ...
|
||||
├── calendar/
|
||||
│ └── week-YYYY-WNN.json # Aggregated weekly calendar
|
||||
├── content_history.json # Dedup tracking
|
||||
└── pipeline_log.json # Run history and stats
|
||||
```
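
The naming scheme in the tree can be produced with `pathlib`. The sketch below is an assumption about how the paths are built, not the script's own code: the helper names are hypothetical, and reading `WNN` as the zero-padded ISO week number is a guess.

```python
from datetime import date
from pathlib import Path


def episode_dir(output_root, published, slug):
    """output/episodes/YYYY-MM-DD-episode-slug/"""
    return Path(output_root) / "episodes" / f"{published.isoformat()}-{slug}"


def weekly_calendar_path(output_root, day):
    """output/calendar/week-YYYY-WNN.json, using the ISO year and week of `day`."""
    iso_year, iso_week, _weekday = day.isocalendar()
    return Path(output_root) / "calendar" / f"week-{iso_year}-W{iso_week:02d}.json"


episode_path = episode_dir("output", date(2024, 1, 15), "growth-loops")
calendar_path = weekly_calendar_path("output", date(2024, 1, 15))
```

Using the ISO year alongside the ISO week avoids mislabeled files in the first and last days of a calendar year, when the two can differ.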
|
||||
|
||||
---
|
||||
|
||||
## CLI Reference
|
||||
|
||||
```bash
# Process latest episode from RSS feed
python podcast_pipeline.py --rss "https://feeds.example.com/podcast.xml"

# Process a local transcript
python podcast_pipeline.py --transcript episode-42.txt

# Batch process last 5 episodes
python podcast_pipeline.py --batch "https://feeds.example.com/podcast.xml" --episodes 5

# Generate weekly calendar from existing outputs
python podcast_pipeline.py --calendar

# Process with custom dedup window
python podcast_pipeline.py --rss "https://feeds.example.com/podcast.xml" --dedup-days 60

# Process and only keep 80+ viral score content
python podcast_pipeline.py --rss "https://feeds.example.com/podcast.xml" --min-score 80
```
|
||||
|
||||
---
|
||||
|
||||
## Environment Variables
|
||||
|
||||
| Variable | Required | Description |
|----------|----------|-------------|
| `OPENAI_API_KEY` | Yes (for Whisper) | OpenAI API key for audio transcription |
| `ANTHROPIC_API_KEY` | Yes (for generation) | Anthropic API key for content generation |
| `OPENAI_LLM_KEY` | Optional | Separate OpenAI key if using GPT for generation instead |
|
||||
|
||||
---
|
||||
|
||||
## Reference Files
|
||||
|
||||
| File | Purpose |
|------|---------|
| `podcast_pipeline.py` | Main pipeline script |
| `requirements.txt` | Python dependencies |
| `README.md` | Setup and usage guide |
|
||||
1069
podcast-ops/podcast_pipeline.py
Normal file
File diff suppressed because it is too large
7
podcast-ops/requirements.txt
Normal file
|
|
@ -0,0 +1,7 @@
|
|||
anthropic>=0.40.0
openai>=1.50.0
feedparser>=6.0.0
requests>=2.31.0
python-dateutil>=2.8.0
python-slugify>=8.0.0
tqdm>=4.66.0
|
||||
329
revenue-intelligence/README.md
Normal file
|
|
@ -0,0 +1,329 @@
|
|||
# 📊 AI Revenue Intelligence
|
||||
|
||||
> **Prove content ROI, extract call intelligence, and generate client reports — automatically.**
|
||||
|
||||
An AI-powered revenue intelligence suite that connects the dots between sales calls, content performance, and closed deals. These tools pull from Gong, GA4, HubSpot, and Ahrefs to answer the questions every marketing team hates: "What content actually drove revenue?" and "What are prospects really saying on calls?"
|
||||
|
||||
Built in production at [Single Grain](https://www.singlegrain.com/?utm_source=github&utm_medium=skill_repo&utm_campaign=ai_marketing_skills). Now open-sourced for any revenue-focused marketing team.
|
||||
|
||||
---
|
||||
|
||||
## Architecture
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────┐
|
||||
│ DATA SOURCES │
|
||||
│ │
|
||||
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────────────┐ │
|
||||
│ │ Gong │ │ GA4 │ │ HubSpot │ │ Ahrefs │ │
|
||||
│ │ (calls) │ │(traffic) │ │ (deals) │ │ (SEO) │ │
|
||||
│ └─────┬────┘ └─────┬────┘ └─────┬────┘ └────────┬─────────┘ │
|
||||
└────────┼──────────────┼─────────────┼────────────────┼─────────────┘
|
||||
│ │ │ │
|
||||
▼ │ │ │
|
||||
┌──────────────────┐ │ │ │
|
||||
│ Gong-to-Insight │ │ │ │
|
||||
│ Pipeline │ │ │ │
|
||||
│ │ │ │ │
|
||||
│ • Objections │ │ │ │
|
||||
│ • Buying signals │ │ │ │
|
||||
│ • Competitors │ │ │ │
|
||||
│ • Content topics │ │ │ │
|
||||
│ • Follow-ups │ │ │ │
|
||||
└──────┬───────────┘ │ │ │
|
||||
│ ▼ ▼ │
|
||||
│ ┌───────────────────────────┐ │
|
||||
│ │ Revenue Attribution │ │
|
||||
│ │ Mapper │ │
|
||||
│ │ │ │
|
||||
│ │ • First-touch / linear / │ │
|
||||
│ │ time-decay attribution │ │
|
||||
│ │ • Content ROI by type │ │
|
||||
│ │ • CPA calculations │ │
|
||||
│ │ • Content gap analysis │ │
|
||||
│ └───────────┬───────────────┘ │
|
||||
│ │ │
|
||||
▼ ▼ ▼
|
||||
┌──────────────────────────────────────────────────────────────┐
|
||||
│ Client Report Generator │
|
||||
│ │
|
||||
│ Executive Summary + Traffic + Pipeline + SEO + Call Quality │
|
||||
│ Anomaly Detection + Period-over-Period Comparison │
|
||||
│ → Markdown or JSON output │
|
||||
└──────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Tools
|
||||
|
||||
### 1. 🎙️ Gong-to-Insight Pipeline (`gong_insight_pipeline.py`)
|
||||
|
||||
Turns sales call transcripts into structured intelligence. Works with the Gong API or plain `.txt` transcript files.
|
||||
|
||||
**What it extracts:**
|
||||
- **Objections** — categorized as pricing, timing, competition, authority, or need
|
||||
- **Buying signals** — budget confirmed, timeline mentioned, decision maker engaged, champion identified
|
||||
- **Competitive mentions** — which competitors were named and in what sentiment (positive/negative/neutral)
|
||||
- **Pricing discussions** — dollar amounts, pricing model questions, ROI concerns
|
||||
- **Content topics** — recurring objection patterns that should become blog posts, case studies, or battle cards
|
||||
- **Follow-up drafts** — personalized outbound suggestions based on what happened on the call
|
||||
|
||||
```bash
|
||||
# Analyze a transcript file
|
||||
python gong_insight_pipeline.py --file transcript.txt
|
||||
|
||||
# Analyze a directory of transcripts
|
||||
python gong_insight_pipeline.py --dir ./transcripts/ --content-topics
|
||||
|
||||
# Pull from Gong API (last 7 days)
|
||||
python gong_insight_pipeline.py --gong --days 7
|
||||
|
||||
# Full output with follow-ups
|
||||
python gong_insight_pipeline.py --file call.txt --follow-ups --output insights.json
|
||||
|
||||
# Example output:
|
||||
# ============================================================
|
||||
# Call: discovery-call-acme
|
||||
# Temperature: WARM
|
||||
# ============================================================
|
||||
#
|
||||
# 🚫 Objections (3):
|
||||
# pricing: 2
|
||||
# timing: 1
|
||||
# → [pricing] "That's a bit more than we budgeted for this quarter"
|
||||
# → [pricing] "Can you do a smaller pilot first?"
|
||||
# → [timing] "We're in the middle of a platform migration"
|
||||
#
|
||||
# ✅ Buying Signals (2):
|
||||
# budget_confirmed: 1
|
||||
# champion_identified: 1
|
||||
#
|
||||
# ⚔️ Competitors: HubSpot, Drift
|
||||
#
|
||||
# 💰 Pricing discussed: Yes (3 mentions)
|
||||
```
|
||||
|
||||
### 2. 💰 Revenue Attribution Mapper (`revenue_attribution.py`)
|
||||
|
||||
The "prove content ROI" tool. Maps blog posts, videos, podcasts, and webinars to actual closed deals using first-touch, linear, or time-decay attribution models.
|
||||
|
||||
**What it produces:**
|
||||
- Content-to-revenue mapping showing exactly which pieces drove pipeline
|
||||
- Attribution across three models (pick the one that fits your sales motion)
|
||||
- Cost-per-acquisition by content type (blog vs. video vs. webinar vs. podcast)
|
||||
- Content gap analysis (which funnel stages have no content working?)
|
||||
- Top performers ranked by attributed revenue
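
To make the difference between the three models concrete, here is a minimal sketch of how one deal's revenue might be split across its touchpoints. It is illustrative only: the function name `attribute`, the `(content_id, days_before_close)` input shape, and the 7-day half-life are assumptions, not the logic or parameters of `revenue_attribution.py`.

```python
def attribute(deal_amount, touchpoints, model="linear", half_life_days=7.0):
    """Split deal_amount across touchpoints; returns {content_id: revenue}.

    touchpoints: list of (content_id, days_before_close) in any order.
    """
    if not touchpoints:
        return {}
    # Earliest touch first (largest number of days before close).
    ordered = sorted(touchpoints, key=lambda t: t[1], reverse=True)
    if model == "first-touch":
        weights = [1.0] + [0.0] * (len(ordered) - 1)
    elif model == "linear":
        weights = [1.0] * len(ordered)
    elif model == "time-decay":
        # A touch loses half its weight every half_life_days before close.
        weights = [0.5 ** (days / half_life_days) for _, days in ordered]
    else:
        raise ValueError(f"unknown model: {model!r}")
    total = sum(weights)
    credit = {}
    for (content_id, _), weight in zip(ordered, weights):
        credit[content_id] = credit.get(content_id, 0.0) + deal_amount * weight / total
    return credit


touches = [("blog/a", 30), ("webinar/b", 14), ("case-study/c", 0)]
first = attribute(100_000, touches, model="first-touch")
linear = attribute(100_000, touches, model="linear")
decay = attribute(100_000, touches, model="time-decay")
```

First-touch gives the whole deal to the earliest piece, linear splits it evenly, and time-decay shifts credit toward the content seen closest to the close date. That is why the choice of model should follow the length and shape of the sales cycle.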
|
||||
|
||||
```bash
|
||||
# Full attribution report (linear model)
|
||||
python revenue_attribution.py --report
|
||||
|
||||
# Time-decay model (more credit to recent touchpoints)
|
||||
python revenue_attribution.py --report --model time-decay
|
||||
|
||||
# Content gaps (which funnel stages are uncovered?)
|
||||
python revenue_attribution.py --gaps
|
||||
|
||||
# CPA by content type
|
||||
python revenue_attribution.py --cpa --costs content_costs.json
|
||||
|
||||
# Example output:
|
||||
# ======================================================================
|
||||
# CONTENT REVENUE ATTRIBUTION REPORT
|
||||
# Model: linear
|
||||
# ======================================================================
|
||||
#
|
||||
# 📊 Summary
|
||||
# Total Revenue: $984,000
|
||||
# Total Deals: 5
|
||||
# Avg Deal Size: $196,800
|
||||
# Content w/ Attribution: 13
|
||||
# Avg Touchpoints/Deal: 4.4
|
||||
#
|
||||
# 📈 Revenue by Content Type
|
||||
# Type Revenue Sessions Pieces Avg/Piece
|
||||
# --------------------------------------------------------
|
||||
# landing_page $211,200 1,800 1 $211,200
|
||||
# blog $298,560 17,000 6 $49,760
|
||||
# case_study $156,000 2,090 2 $78,000
|
||||
# ...
|
||||
```
|
||||
|
||||
### 3. 📋 Multi-Source Client Report Generator (`client_report_generator.py`)
|
||||
|
||||
Pulls from all four data sources (GA4, HubSpot, Ahrefs, Gong) and generates a unified, client-ready BI report with an auto-generated executive summary and optional anomaly detection.
|
||||
|
||||
**What it includes:**
|
||||
- **Executive summary** — auto-generated highlights, concerns, and recommendations
|
||||
- **Traffic** — sessions, users, conversions, channel breakdown, top pages (GA4)
|
||||
- **Pipeline** — deals created/won/lost, revenue, win rate, avg cycle (HubSpot)
|
||||
- **SEO** — domain rating, rankings, backlinks, organic traffic (Ahrefs)
|
||||
- **Call quality** — talk ratio, call duration, next-steps rate, top topics (Gong)
|
||||
- **Anomaly detection** — flags unusual changes with severity levels
|
||||
- **Period comparison** — month-over-month, quarter-over-quarter, or year-over-year
|
||||
|
||||
```bash
|
||||
# Console summary
|
||||
python client_report_generator.py --client "Acme Corp"
|
||||
|
||||
# Full markdown report
|
||||
python client_report_generator.py --client "Acme Corp" --format markdown --output report.md
|
||||
|
||||
# JSON for dashboards/slides
|
||||
python client_report_generator.py --client "Acme Corp" --format json --anomalies
|
||||
|
||||
# Skip sources you don't use
|
||||
python client_report_generator.py --client "Acme Corp" --skip gong,ahrefs
|
||||
|
||||
# Example output:
|
||||
# ======================================================================
|
||||
# Acme Corp - Performance Report
|
||||
# 2025-03-01 to 2025-03-31
|
||||
# ======================================================================
|
||||
#
|
||||
# 🟢 Overall: Strong
|
||||
#
|
||||
# ✅ Highlights:
|
||||
# • Traffic up 8.1% (45,200 sessions)
|
||||
# • Conversions up 14.8% (342 total)
|
||||
# • Win rate at 60.0% (12 won)
|
||||
# • $1,440,000 revenue closed
|
||||
#
|
||||
# ⚠️ Concerns:
|
||||
# • Reps talking too much (54.2% talk ratio)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Quick Start
|
||||
|
||||
### 1. Install dependencies
|
||||
|
||||
```bash
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
### 2. Configure environment
|
||||
|
||||
```bash
|
||||
cp .env.example .env
|
||||
# Edit .env with your API keys
|
||||
```
|
||||
|
||||
### 3. Test with sample data
|
||||
|
||||
All tools ship with built-in sample data and fall back gracefully when API keys aren't configured. Try them out of the box:
|
||||
|
||||
```bash
# Analyze a transcript
echo "Prospect: That's more than we budgeted for this quarter.
Rep: I understand. What range were you expecting?
Prospect: We were looking at HubSpot too, they quoted us around 50k.
Rep: Makes sense. Our ROI calculator shows 3x return in year one." > sample.txt

python gong_insight_pipeline.py --file sample.txt --follow-ups

# Run attribution report (uses sample data without API keys)
python revenue_attribution.py --report --gaps

# Generate client report (uses sample data without API keys)
python client_report_generator.py --client "Demo Corp" --anomalies
```
|
||||
|
||||
### 4. Connect real APIs
|
||||
|
||||
Set these environment variables to connect live data:
|
||||
|
||||
```bash
# Gong
export GONG_API_KEY="your-gong-api-key"

# GA4
export GA4_PROPERTY_ID="123456789"
export GA4_CREDENTIALS_JSON="/path/to/service-account.json"

# HubSpot
export HUBSPOT_API_KEY="your-hubspot-private-app-token"

# Ahrefs
export AHREFS_TOKEN="your-ahrefs-api-token"
```
|
||||
|
||||
---
|
||||
|
||||
## Configuration
|
||||
|
||||
| Variable | Required By | Description |
|----------|-------------|-------------|
| `GONG_API_KEY` | Gong Pipeline, Client Report | Gong API access key |
| `GONG_API_BASE_URL` | Gong Pipeline, Client Report | Gong API URL (default: `https://api.gong.io/v2`) |
| `GA4_PROPERTY_ID` | Attribution, Client Report | GA4 property ID |
| `GA4_CREDENTIALS_JSON` | Attribution, Client Report | Path to GA4 service account JSON |
| `HUBSPOT_API_KEY` | Attribution, Client Report | HubSpot private app token |
| `AHREFS_TOKEN` | Client Report | Ahrefs API token |
| `YOUR_DOMAIN` | Client Report | Your root domain for Ahrefs data |
| `OUTPUT_DIR` | All | Output directory (default: `./output`) |
|
||||
|
||||
---
|
||||
|
||||
## Customization
|
||||
|
||||
### Objection Patterns
|
||||
Edit `OBJECTION_PATTERNS` in `gong_insight_pipeline.py` to match your industry's objection language.
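
The real structure of `OBJECTION_PATTERNS` is not shown here, so the following is only a guess at the general idea: a mapping from objection category to regular expressions, applied line by line. The name `EXAMPLE_OBJECTION_PATTERNS`, the patterns themselves, and `classify_objections` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical patterns; replace with phrases your prospects actually use.
EXAMPLE_OBJECTION_PATTERNS = {
    "pricing": [r"\bbudget(ed)?\b", r"\btoo expensive\b"],
    "timing": [r"\bnext quarter\b", r"\bin the middle of\b"],
    "competition": [r"\balready using\b", r"\blooking at \w+ too\b"],
    "authority": [r"\bneed to (check|run it) (with|by)\b"],
    "need": [r"\bnot sure we need\b", r"\bdon't see the value\b"],
}


def classify_objections(lines):
    """Count how many transcript lines match each objection category."""
    counts = Counter()
    for line in lines:
        for category, patterns in EXAMPLE_OBJECTION_PATTERNS.items():
            # One hit per category per line, however many patterns match.
            if any(re.search(p, line, re.IGNORECASE) for p in patterns):
                counts[category] += 1
    return counts


counts = classify_objections([
    "That's a bit more than we budgeted for this quarter",
    "We're in the middle of a platform migration",
    "We were looking at HubSpot too, they quoted us around 50k.",
])
```

Keyword patterns like these are cheap and transparent but miss paraphrases, so it is worth reviewing a sample of real calls whenever the list changes.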
|
||||
|
||||
### Competitor List
|
||||
Edit `KNOWN_COMPETITORS` in `gong_insight_pipeline.py` with your actual competitive landscape.
|
||||
|
||||
### Content Type Classification
|
||||
Edit `CONTENT_TYPE_PATTERNS` in `revenue_attribution.py` to match your site's URL structure.
|
||||
|
||||
### Anomaly Thresholds
|
||||
Pass custom thresholds to `detect_anomalies()` in `client_report_generator.py`:
|
||||
```python
|
||||
thresholds = {"warning": 0.15, "critical": 0.30} # 15% = warning, 30% = critical
|
||||
```
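
How thresholds of this shape translate into severity levels can be sketched as follows. This is an assumption about the general approach, not the body of `detect_anomalies()`: `classify_change` is a hypothetical helper, and comparing the absolute relative change with `>=` is a guess.

```python
DEFAULT_THRESHOLDS = {"warning": 0.15, "critical": 0.30}


def classify_change(current, previous, thresholds=None):
    """Return "critical", "warning", or None for a period-over-period change."""
    thresholds = thresholds or DEFAULT_THRESHOLDS
    if previous == 0:
        return None  # relative change is undefined without a baseline
    # Spikes and drops are treated alike: only the size of the change matters.
    magnitude = abs((current - previous) / previous)
    if magnitude >= thresholds["critical"]:
        return "critical"
    if magnitude >= thresholds["warning"]:
        return "warning"
    return None


severity_up = classify_change(current=120, previous=100)    # 20% increase
severity_down = classify_change(current=60, previous=100)   # 40% drop
```

Lower thresholds surface more anomalies at the cost of more noise, so noisy metrics such as daily sessions usually warrant looser settings than stable ones such as win rate.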
|
||||
|
||||
---
|
||||
|
||||
## How They Work Together
|
||||
|
||||
1. **Weekly**: Run `gong_insight_pipeline.py` on recent calls → extract objections and buying signals
|
||||
2. **Monthly**: Run `revenue_attribution.py` → see which content drove deals
|
||||
3. **Monthly**: Run `client_report_generator.py` → deliver unified report to clients or leadership
|
||||
4. **Quarterly**: Use Gong content topics + attribution gaps to plan next quarter's content
|
||||
|
||||
The insight loop:
|
||||
- Gong reveals what prospects ask about → creates content topics
|
||||
- Content gets published → drives traffic (GA4)
|
||||
- Traffic converts to pipeline → deals close (HubSpot)
|
||||
- Attribution mapper proves which content worked → invest more in winners
|
||||
- Repeat
|
||||
|
||||
---
|
||||
|
||||
## File Structure
|
||||
|
||||
```
|
||||
revenue-intelligence/
|
||||
├── README.md # This file
|
||||
├── SKILL.md # Claude Code agent skill definition
|
||||
├── requirements.txt # Python dependencies
|
||||
├── gong_insight_pipeline.py # Call transcript → structured insights
|
||||
├── revenue_attribution.py # Content → revenue mapping
|
||||
└── client_report_generator.py # Multi-source client BI reports
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
<div align="center">
|
||||
|
||||
**🧠 [Want these built and managed for you? →](https://singlebrain.com/?utm_source=github&utm_medium=skill_repo&utm_campaign=ai_marketing_skills)**
|
||||
|
||||
*This is how we build agents at [Single Brain](https://singlebrain.com/?utm_source=github&utm_medium=skill_repo&utm_campaign=ai_marketing_skills) for our clients.*
|
||||
|
||||
[Single Grain](https://www.singlegrain.com/?utm_source=github&utm_medium=skill_repo&utm_campaign=ai_marketing_skills) · our marketing agency
|
||||
|
||||
📬 **[Level up your marketing with 14,000+ marketers and founders →](https://levelingup.beehiiv.com/subscribe)** *(free)*
|
||||
|
||||
</div>
|
||||
172
revenue-intelligence/SKILL.md
Normal file
|
|
@ -0,0 +1,172 @@
|
|||
# AI Revenue Intelligence
|
||||
|
||||
AI-powered revenue intelligence: sales call insight extraction, content-to-revenue attribution, and multi-source client reporting.
|
||||
|
||||
## When to Use
|
||||
|
||||
- User wants to extract insights from Gong sales call transcripts
|
||||
- User needs to identify objections, buying signals, or competitive mentions in calls
|
||||
- User wants to prove content ROI by mapping content to closed deals
|
||||
- User needs revenue attribution across first-touch and multi-touch models
|
||||
- User wants to generate a unified client report from GA4 + HubSpot + Ahrefs + Gong
|
||||
- User asks about content gaps in the buyer journey
|
||||
- User needs anomaly detection across marketing metrics
|
||||
|
||||
## Tools
|
||||
|
||||
### Gong-to-Insight Pipeline (`gong_insight_pipeline.py`)
|
||||
|
||||
Extracts structured intelligence from sales call transcripts. Works with Gong API or plain transcript files.
|
||||
|
||||
```bash
|
||||
# Analyze a single transcript file
|
||||
python gong_insight_pipeline.py --file transcript.txt
|
||||
|
||||
# Analyze multiple transcript files
|
||||
python gong_insight_pipeline.py --dir ./transcripts/
|
||||
|
||||
# Pull recent calls from Gong API (last 7 days)
|
||||
python gong_insight_pipeline.py --gong --days 7
|
||||
|
||||
# Pull specific call by ID
|
||||
python gong_insight_pipeline.py --gong --call-id abc123
|
||||
|
||||
# Output as JSON file
|
||||
python gong_insight_pipeline.py --file transcript.txt --output insights.json
|
||||
|
||||
# Generate content topics from recurring objections
|
||||
python gong_insight_pipeline.py --dir ./transcripts/ --content-topics
|
||||
|
||||
# Generate follow-up suggestions for outbound sequences
|
||||
python gong_insight_pipeline.py --file transcript.txt --follow-ups
|
||||
```
|
||||
|
||||
**What it extracts:**
|
||||
- Objections (categorized: pricing, timing, competition, authority, need)
|
||||
- Buying signals (budget confirmed, timeline mentioned, decision maker engaged, champion identified)
|
||||
- Competitive mentions (who was mentioned, context: positive/negative/neutral)
|
||||
- Pricing discussions (anchors, pushback, willingness indicators)
|
||||
- Content topic suggestions from recurring objection patterns
|
||||
- Personalized follow-up drafts based on call context
|
||||
|
||||
**Output:** Structured JSON to stdout or file. Each call produces an `insights` object with `objections`, `buying_signals`, `competitive_mentions`, `pricing_discussions`, `content_topics`, and `follow_ups` arrays.
|
||||
|
||||
### Revenue Attribution Mapper (`revenue_attribution.py`)
|
||||
|
||||
Maps content pieces to pipeline and closed revenue. Proves content ROI with first-touch and multi-touch attribution.
|
||||
|
||||
```bash
|
||||
# Run full attribution report (GA4 + HubSpot)
|
||||
python revenue_attribution.py --report
|
||||
|
||||
# First-touch attribution only
|
||||
python revenue_attribution.py --report --model first-touch
|
||||
|
||||
# Multi-touch (linear) attribution
|
||||
python revenue_attribution.py --report --model linear
|
||||
|
||||
# Time-decay attribution
|
||||
python revenue_attribution.py --report --model time-decay
|
||||
|
||||
# Filter by date range
|
||||
python revenue_attribution.py --report --start 2025-01-01 --end 2025-03-31
|
||||
|
||||
# Calculate cost-per-acquisition by content type
|
||||
python revenue_attribution.py --cpa --costs content_costs.json
|
||||
|
||||
# Identify content gaps in the buyer journey
|
||||
python revenue_attribution.py --gaps
|
||||
|
||||
# Output as JSON
|
||||
python revenue_attribution.py --report --json --output attribution.json
|
||||
```
|
||||
|
||||
**What it produces:**
|
||||
- Content-to-revenue mapping (which blog posts, videos, podcasts drove deals)
|
||||
- First-touch, linear, and time-decay attribution models
|
||||
- Cost-per-acquisition by content type (blog, video, podcast, webinar)
|
||||
- Content ROI report with revenue per piece
|
||||
- Content gap analysis (funnel stages with no attribution)
|
||||
- Top-performing content ranked by attributed revenue
|
||||
|
||||
**Data sources:** GA4 (page paths, sessions, conversions) + HubSpot (deals, touchpoints, close dates)
|
||||
|
||||
### Multi-Source Client Report Generator (`client_report_generator.py`)
|
||||
|
||||
Generates unified client-ready BI reports from GA4, HubSpot, Ahrefs, and Gong.
|
||||
|
||||
```bash
|
||||
# Generate full client report
|
||||
python client_report_generator.py --client "Acme Corp"
|
||||
|
||||
# Specify date range
|
||||
python client_report_generator.py --client "Acme Corp" --start 2025-03-01 --end 2025-03-31
|
||||
|
||||
# Output as markdown
|
||||
python client_report_generator.py --client "Acme Corp" --format markdown --output report.md
|
||||
|
||||
# Output as JSON (for rendering in slides/dashboards)
|
||||
python client_report_generator.py --client "Acme Corp" --format json --output report.json
|
||||
|
||||
# Skip specific data sources
|
||||
python client_report_generator.py --client "Acme Corp" --skip gong
|
||||
python client_report_generator.py --client "Acme Corp" --skip ahrefs,gong
|
||||
|
||||
# Enable anomaly detection
|
||||
python client_report_generator.py --client "Acme Corp" --anomalies
|
||||
|
||||
# Compare to previous period
|
||||
python client_report_generator.py --client "Acme Corp" --compare previous-month
|
||||
```
|
||||
|
||||
**What it produces:**
|
||||
- Executive summary with key metrics and period-over-period changes
|
||||
- Traffic section: sessions, users, top pages, channel breakdown (GA4)
|
||||
- Pipeline section: deals created, moved, closed, revenue (HubSpot)
|
||||
- SEO section: keyword rankings, backlinks, domain rating changes (Ahrefs)
|
||||
- Call quality section: talk ratios, objection frequency, win rates (Gong)
|
||||
- Anomaly flags: unusual spikes/drops with severity and context
|
||||
- Output as structured markdown or JSON
|
||||
|
||||
## Configuration
|
||||
|
||||
All scripts read from environment variables. Copy `.env.example` to `.env` and fill in your values.
|
||||
|
||||
### Required Environment Variables
|
||||
|
||||
| Variable | Used By | Description |
|----------|---------|-------------|
| `GONG_API_KEY` | Gong Pipeline, Client Report | Gong API access key |
| `GONG_API_BASE_URL` | Gong Pipeline, Client Report | Gong API base URL |
| `HUBSPOT_API_KEY` | Attribution, Client Report | HubSpot private app token |
| `GA4_PROPERTY_ID` | Attribution, Client Report | GA4 property ID |
| `GA4_CREDENTIALS_JSON` | Attribution, Client Report | Path to GA4 service account JSON |
|
||||
|
||||
### Optional Environment Variables
|
||||
|
||||
| Variable | Used By | Description |
|----------|---------|-------------|
| `AHREFS_TOKEN` | Client Report | Ahrefs API token |
| `OUTPUT_DIR` | All | Directory for output files (default: `./output`) |
|
||||
|
||||
## Data Flow
|
||||
|
||||
```
|
||||
Gong Transcripts → Insight Pipeline → Objections, Signals, Competitors → Content Topics + Follow-ups
|
||||
GA4 + HubSpot → Attribution Mapper → Content ROI, CPA, Gap Analysis → Revenue Proof
|
||||
GA4 + HubSpot + Ahrefs + Gong → Client Report → Executive Summary + Anomalies → Client Deliverable
|
||||
```
|
||||
|
||||
## Recommended Workflow
|
||||
|
||||
1. **Weekly:** Run `gong_insight_pipeline.py --gong --days 7` to extract call intelligence
|
||||
2. **Monthly:** Run `revenue_attribution.py --report` to prove content ROI
|
||||
3. **Monthly:** Run `client_report_generator.py` for each client deliverable
|
||||
4. **Quarterly:** Run `revenue_attribution.py --gaps` to find content gaps
|
||||
5. **Ongoing:** Feed Gong insight follow-ups into outbound sequences
|
||||
|
||||
## Dependencies
|
||||
|
||||
```bash
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
1011
revenue-intelligence/client_report_generator.py
Normal file
File diff suppressed because it is too large
705
revenue-intelligence/gong_insight_pipeline.py
Normal file
|
|
@ -0,0 +1,705 @@
|
|||
#!/usr/bin/env python3
|
||||
"""
|
||||
Gong-to-Insight Pipeline
|
||||
|
||||
Extracts structured intelligence from sales call transcripts:
|
||||
- Objections (pricing, timing, competition, authority, need)
|
||||
- Buying signals (budget, timeline, decision maker, champion)
|
||||
- Competitive mentions (who, context)
|
||||
- Pricing discussions
|
||||
- Content topic suggestions from recurring patterns
|
||||
- Personalized follow-up drafts
|
||||
|
||||
Works with Gong API or plain transcript files.
|
||||
|
||||
Usage:
|
||||
python gong_insight_pipeline.py --file transcript.txt
|
||||
python gong_insight_pipeline.py --dir ./transcripts/
|
||||
python gong_insight_pipeline.py --gong --days 7
|
||||
python gong_insight_pipeline.py --file transcript.txt --content-topics --follow-ups
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import os
|
||||
import re
|
||||
import sys
|
||||
from collections import Counter, defaultdict
|
||||
from datetime import datetime, timedelta
|
||||
from pathlib import Path
|
||||
from typing import Optional
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Gong API client
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
# To use the Gong API:
|
||||
# 1. Set GONG_API_KEY (your Gong access key)
|
||||
# 2. Set GONG_API_BASE_URL (default: https://api.gong.io/v2)
|
||||
# 3. Generate API credentials in Gong > Settings > API
|
||||
|
||||
GONG_API_KEY = os.environ.get("GONG_API_KEY", "")
|
||||
GONG_API_BASE_URL = os.environ.get("GONG_API_BASE_URL", "https://api.gong.io/v2")
|
||||
|
||||
|
||||
def _gong_headers() -> dict:
|
||||
"""Build authorization headers for Gong API."""
|
||||
if not GONG_API_KEY:
|
||||
print("ERROR: GONG_API_KEY not set. Export it or pass --file/--dir instead.", file=sys.stderr)
|
||||
sys.exit(1)
|
||||
return {
|
||||
"Authorization": f"Bearer {GONG_API_KEY}",
|
||||
"Content-Type": "application/json",
|
||||
}
|
||||
|
||||
|
||||
def fetch_calls_from_gong(days: int = 7, call_id: Optional[str] = None) -> list[dict]:
    """
    Fetch call transcripts from Gong API.

    Returns list of dicts: [{"id": ..., "title": ..., "transcript": ..., "participants": [...]}]

    NOTE: This uses the Gong v2 API. You need:
    - API credentials with 'api:calls:read:transcript' scope
    - Calls must be processed (transcription complete)
    """
    try:
        import requests
    except ImportError:
        print("ERROR: 'requests' required for Gong API. Run: pip install requests", file=sys.stderr)
        sys.exit(1)

    headers = _gong_headers()
    calls = []

    if call_id:
        # Fetch a specific call
        # Step 1: Get call metadata
        resp = requests.get(f"{GONG_API_BASE_URL}/calls/{call_id}", headers=headers)
        resp.raise_for_status()
        call_data = resp.json()

        # Step 2: Get transcript
        transcript_resp = requests.post(
            f"{GONG_API_BASE_URL}/calls/transcript",
            headers=headers,
            json={"filter": {"callIds": [call_id]}},
        )
        transcript_resp.raise_for_status()
        transcript_data = transcript_resp.json()

        transcript_text = _assemble_transcript(transcript_data.get("callTranscripts", []))
        calls.append({
            "id": call_id,
            "title": call_data.get("metaData", {}).get("title", "Unknown"),
            "transcript": transcript_text,
            "participants": [p.get("name", "") for p in call_data.get("parties", [])],
        })
    else:
        # Fetch recent calls
        from_dt = (datetime.utcnow() - timedelta(days=days)).strftime("%Y-%m-%dT%H:%M:%SZ")
        to_dt = datetime.utcnow().strftime("%Y-%m-%dT%H:%M:%SZ")

        # Step 1: List calls in date range.
        # The listing endpoint is a GET with query parameters; POST /calls is
        # Gong's endpoint for adding a new call, so it must not be used here.
        list_resp = requests.get(
            f"{GONG_API_BASE_URL}/calls",
            headers=headers,
            params={"fromDateTime": from_dt, "toDateTime": to_dt},
        )
        list_resp.raise_for_status()
        call_list = list_resp.json().get("calls", [])

        if not call_list:
            print(f"No calls found in the last {days} days.", file=sys.stderr)
            return []

        call_ids = [c["id"] for c in call_list]

        # Step 2: Batch fetch transcripts (Gong supports up to 100 per request)
        transcripts_by_id = {}
        for batch_start in range(0, len(call_ids), 100):
            batch = call_ids[batch_start : batch_start + 100]
            transcript_resp = requests.post(
                f"{GONG_API_BASE_URL}/calls/transcript",
                headers=headers,
                json={"filter": {"callIds": batch}},
            )
            transcript_resp.raise_for_status()
            for ct in transcript_resp.json().get("callTranscripts", []):
                # Same "Speaker: text" formatting as the single-call path
                transcripts_by_id[ct.get("callId")] = _assemble_transcript([ct])

        # Step 3: Attach transcripts to call metadata, skipping calls without one
        for c in call_list:
            if c["id"] in transcripts_by_id:
                calls.append({
                    "id": c["id"],
                    "title": c.get("title", "Unknown"),
                    "transcript": transcripts_by_id[c["id"]],
                    "participants": [p.get("name", "") for p in c.get("parties", [])],
                })

    return calls
|
||||
|
||||
|
||||
def _assemble_transcript(call_transcripts: list) -> str:
|
||||
"""Assemble transcript text from Gong API response format."""
|
||||
lines = []
|
||||
for ct in call_transcripts:
|
||||
for segment in ct.get("transcript", []):
|
||||
speaker = segment.get("speakerName", "Unknown")
|
||||
text = " ".join(s.get("text", "") for s in segment.get("sentences", []))
|
||||
lines.append(f"{speaker}: {text}")
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Transcript analysis engine
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
# Objection patterns — maps regex patterns to objection categories
|
||||
OBJECTION_PATTERNS = {
|
||||
"pricing": [
|
||||
r"(?i)(too expensive|over budget|can't afford|cost(s)? too|cheaper|lower price|discount|pricing is|budget.*tight|price.*high|expensive)",
|
||||
r"(?i)(what('s| is) the (price|cost|pricing)|how much (does|will|would)|investment.*significant)",
|
||||
r"(?i)(need to.*justify.*cost|hard to.*justify|roi.*unclear|not sure.*worth)",
|
||||
],
|
||||
"timing": [
|
||||
r"(?i)(not the right time|bad timing|next quarter|next year|revisit.*later|too soon|not ready|circle back|table this)",
|
||||
r"(?i)(busy.*right now|other priorities|roadmap.*full|backlog|bandwidth|tied up)",
|
||||
r"(?i)(maybe (in|after) (q[1-4]|january|february|march|april|may|june|july|august|september|october|november|december))",
|
||||
],
|
||||
"competition": [
|
||||
r"(?i)(already (using|working with|have)|current (vendor|provider|partner|agency)|locked in|contract.*with|compared to|vs\.?\s)",
|
||||
r"(?i)(what makes you different|why.*switch|competitor|alternative|other option|looking at.*other)",
|
||||
],
|
||||
"authority": [
|
||||
r"(?i)(need to (talk to|run.*by|check with|get approval|ask) (my|the|our))",
|
||||
r"(?i)(not my (decision|call)|someone else|boss|manager|board|committee|stakeholder.*approve)",
|
||||
r"(?i)(decision.*committee|buying committee|multiple stakeholders|procurement)",
|
||||
],
|
||||
"need": [
|
||||
r"(?i)(don't (need|see the need|think we need)|not a priority|we're (fine|good|okay) (with|as)|status quo)",
|
||||
r"(?i)(what problem.*solve|why would we|not sure.*fit|doesn't apply|not relevant)",
|
||||
r"(?i)(happy with.*current|no pain|working well enough)",
|
||||
],
|
||||
}
|
||||
|
||||
# Buying signal patterns
|
||||
BUYING_SIGNAL_PATTERNS = {
|
||||
"budget_confirmed": [
|
||||
r"(?i)(budget.*approved|have.*budget|allocated.*budget|budget (is|of) \$|earmarked|set aside.*for)",
|
||||
r"(?i)(can.*invest|willing to (spend|invest|pay)|comfortable with.*price)",
|
||||
],
|
||||
"timeline_mentioned": [
|
||||
r"(?i)(want.*by (q[1-4]|end of|january|february|march|april|may|june|july|august|september|october|november|december))",
|
||||
r"(?i)(need.*live by|launch.*by|deadline|go.?live|start (date|asap|immediately|next week|this month))",
|
||||
r"(?i)(sooner.*better|asap|urgent|time.?sensitive|quickly)",
|
||||
],
|
||||
"decision_maker_engaged": [
|
||||
r"(?i)(ceo|cmo|cfo|cto|vp|vice president|chief|director|head of|svp|evp).*(?:join|call|meeting|asked me)",
|
||||
r"(?i)(brought.*my (boss|manager|ceo|cmo)|loop(ed|ing) in|invited.*leadership)",
|
||||
r"(?i)(decision maker|final say|sign.*off|authorize)",
|
||||
],
|
||||
"champion_identified": [
|
||||
r"(?i)(love (this|it|what)|really (like|impressed|excited)|sold on|big fan|advocate)",
|
||||
r"(?i)(push.*internally|sell.*internally|convince.*team|champion|sponsor|rally|get.*buy.?in)",
|
||||
r"(?i)(exactly what we need|this solves|perfect fit|game.?changer)",
|
||||
],
|
||||
"next_steps_agreed": [
|
||||
r"(?i)(next step|follow.?up|send.*proposal|schedule.*demo|set up.*call|let's (do|move|proceed))",
|
||||
r"(?i)(send.*contract|nda|msa|sow|statement of work|proposal|agreement)",
|
||||
],
|
||||
}
|
||||
|
||||
# Competitive mention patterns — extend with your actual competitors
|
||||
KNOWN_COMPETITORS = [
|
||||
# Add your competitors here. These are common B2B marketing/agency competitors as examples.
|
||||
"HubSpot", "Marketo", "Salesforce", "Drift", "6sense", "Demandbase",
|
||||
"ZoomInfo", "Apollo", "Outreach", "Salesloft", "Gartner", "Forrester",
|
||||
"WebFX", "Wpromote", "Tinuiti", "Power Digital", "Directive",
|
||||
]
|
||||
|
||||
PRICING_DISCUSSION_PATTERNS = [
|
||||
r"(?i)\$[\d,]+(\.\d{2})?(\s*(k|K|thousand|million|per month|/mo|/month|annually|per year))?",
|
||||
r"(?i)(pricing (model|structure|tier|plan)|pay.*per|subscription|retainer|flat fee|hourly rate)",
|
||||
r"(?i)(proposal|quote|estimate|ballpark|range|starting at|minimum.*engagement)",
|
||||
r"(?i)(roi|return on investment|payback|break.?even|cost.*benefit)",
|
||||
]
|
||||
|
||||
|
||||
def analyze_transcript(text: str, source_id: str = "unknown") -> dict:
|
||||
"""
|
||||
Analyze a single transcript and return structured insights.
|
||||
|
||||
Returns dict with: objections, buying_signals, competitive_mentions,
|
||||
pricing_discussions, raw_quotes
|
||||
"""
|
||||
lines = text.strip().split("\n")
|
||||
insights = {
|
||||
"source_id": source_id,
|
||||
"analyzed_at": datetime.utcnow().isoformat() + "Z",
|
||||
"objections": [],
|
||||
"buying_signals": [],
|
||||
"competitive_mentions": [],
|
||||
"pricing_discussions": [],
|
||||
}
|
||||
|
||||
for i, line in enumerate(lines):
|
||||
context_window = " ".join(lines[max(0, i - 1) : min(len(lines), i + 2)])
|
||||
|
||||
# --- Objections ---
|
||||
for category, patterns in OBJECTION_PATTERNS.items():
|
||||
for pattern in patterns:
|
||||
match = re.search(pattern, line)
|
||||
if match:
|
||||
insights["objections"].append({
|
||||
"category": category,
|
||||
"quote": line.strip(),
|
||||
"match": match.group(),
|
||||
"line_number": i + 1,
|
||||
"context": context_window.strip(),
|
||||
})
|
||||
break # One match per category per line
|
||||
|
||||
# --- Buying Signals ---
|
||||
for signal_type, patterns in BUYING_SIGNAL_PATTERNS.items():
|
||||
for pattern in patterns:
|
||||
match = re.search(pattern, line)
|
||||
if match:
|
||||
insights["buying_signals"].append({
|
||||
"type": signal_type,
|
||||
"quote": line.strip(),
|
||||
"match": match.group(),
|
||||
"line_number": i + 1,
|
||||
})
|
||||
break
|
||||
|
||||
# --- Competitive Mentions ---
|
||||
for competitor in KNOWN_COMPETITORS:
|
||||
if re.search(r"\b" + re.escape(competitor) + r"\b", line, re.IGNORECASE):
|
||||
# Determine context sentiment (basic heuristic)
|
||||
sentiment = "neutral"
|
||||
neg_words = ["problem", "issue", "bad", "worse", "hate", "frustrat", "limit", "lack", "miss", "fail", "leaving", "switch"]
|
||||
pos_words = ["good", "great", "love", "like", "happy", "better", "best", "strong"]
|
||||
line_lower = line.lower()
|
||||
if any(w in line_lower for w in neg_words):
|
||||
sentiment = "negative"
|
||||
elif any(w in line_lower for w in pos_words):
|
||||
sentiment = "positive"
|
||||
|
||||
insights["competitive_mentions"].append({
|
||||
"competitor": competitor,
|
||||
"context_sentiment": sentiment,
|
||||
"quote": line.strip(),
|
||||
"line_number": i + 1,
|
||||
})
|
||||
|
||||
# --- Pricing Discussions ---
|
||||
for pattern in PRICING_DISCUSSION_PATTERNS:
|
||||
match = re.search(pattern, line)
|
||||
if match:
|
||||
insights["pricing_discussions"].append({
|
||||
"quote": line.strip(),
|
||||
"match": match.group(),
|
||||
"line_number": i + 1,
|
||||
})
|
||||
break
|
||||
|
||||
# Deduplicate (same quote can match multiple patterns)
|
||||
insights["objections"] = _dedupe_by_line(insights["objections"])
|
||||
insights["buying_signals"] = _dedupe_by_line(insights["buying_signals"])
|
||||
insights["competitive_mentions"] = _dedupe_by_line(insights["competitive_mentions"])
|
||||
insights["pricing_discussions"] = _dedupe_by_line(insights["pricing_discussions"])
|
||||
|
||||
# Summary stats
|
||||
insights["summary"] = {
|
||||
"total_objections": len(insights["objections"]),
|
||||
"objection_categories": dict(Counter(o["category"] for o in insights["objections"])),
|
||||
"total_buying_signals": len(insights["buying_signals"]),
|
||||
"signal_types": dict(Counter(s["type"] for s in insights["buying_signals"])),
|
||||
"competitors_mentioned": list(set(c["competitor"] for c in insights["competitive_mentions"])),
|
||||
"has_pricing_discussion": len(insights["pricing_discussions"]) > 0,
|
||||
"deal_temperature": _score_deal_temperature(insights),
|
||||
}
|
||||
|
||||
return insights
|
||||
|
||||
|
||||
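def _dedupe_by_line(items: list) -> list:
    """Remove duplicate entries for the same line number and label.

    The label is the item's category, type, or competitor, so two different
    objection categories (or two competitors) found on one line are both kept.
    Items without a label (pricing discussions) are deduplicated by line only.
    """
    seen = set()
    deduped = []
    for item in items:
        label = item.get("category") or item.get("type") or item.get("competitor")
        key = (item.get("line_number", id(item)), label)
        if key not in seen:
            seen.add(key)
            deduped.append(item)
    return deduped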
|
||||
|
||||
|
||||
def _score_deal_temperature(insights: dict) -> str:
    """
    Score deal temperature based on signals vs objections.
    Returns: hot, warm, cool, cold
    """
    # Weight per buying-signal type; unlisted types count as 1
    weights = {
        "budget_confirmed": 3,
        "decision_maker_engaged": 3,
        "timeline_mentioned": 2,
        "champion_identified": 2,
        "next_steps_agreed": 2,
    }
    # Penalty per objection category; unlisted categories count as -1
    penalties = {
        "need": -3,  # No need = worst signal
        "authority": -1,
        "timing": -1,
        "pricing": -1,
        "competition": -2,
    }

    # Weighted scoring
    score = 0
    for sig in insights["buying_signals"]:
        score += weights.get(sig["type"], 1)
    for obj in insights["objections"]:
        score += penalties.get(obj["category"], -1)

    if score >= 6:
        return "hot"
    elif score >= 3:
        return "warm"
    elif score >= 0:
        return "cool"
    else:
        return "cold"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Content topic generator
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def generate_content_topics(all_insights: list[dict]) -> list[dict]:
|
||||
"""
|
||||
Analyze recurring objections across multiple calls to suggest content topics.
|
||||
Returns list of content topic suggestions.
|
||||
"""
|
||||
objection_quotes = defaultdict(list)
|
||||
for insight in all_insights:
|
||||
for obj in insight.get("objections", []):
|
||||
objection_quotes[obj["category"]].append(obj["quote"])
|
||||
|
||||
topics = []
|
||||
|
||||
# Map objection categories to content strategies
|
||||
content_strategies = {
|
||||
"pricing": {
|
||||
"topic_template": "ROI Calculator: How {product} Pays for Itself in {timeframe}",
|
||||
"content_types": ["blog post", "interactive calculator", "case study"],
|
||||
"angle": "Address pricing objections with concrete ROI proof",
|
||||
},
|
||||
"timing": {
|
||||
"topic_template": "The Cost of Waiting: What Happens When You Delay {solution}",
|
||||
"content_types": ["blog post", "email sequence", "one-pager"],
|
||||
"angle": "Create urgency with cost-of-inaction framing",
|
||||
},
|
||||
"competition": {
|
||||
"topic_template": "{product} vs {competitor}: Honest Comparison for {use_case}",
|
||||
"content_types": ["comparison page", "blog post", "battle card"],
|
||||
"angle": "Win competitive deals with transparent comparison content",
|
||||
},
|
||||
"authority": {
|
||||
"topic_template": "How to Build the Business Case for {product} (Template Included)",
|
||||
"content_types": ["template", "guide", "executive summary"],
|
||||
"angle": "Arm your champion with materials to sell internally",
|
||||
},
|
||||
"need": {
|
||||
"topic_template": "Why Top {role}s Are Prioritizing {category} in {year}",
|
||||
"content_types": ["thought leadership", "industry report", "webinar"],
|
||||
"angle": "Build awareness and urgency around the problem",
|
||||
},
|
||||
}
|
||||
|
||||
for category, quotes in objection_quotes.items():
|
||||
count = len(quotes)
|
||||
if count == 0:
|
||||
continue
|
||||
|
||||
strategy = content_strategies.get(category, {})
|
||||
topics.append({
|
||||
"category": category,
|
||||
"frequency": count,
|
||||
"sample_quotes": quotes[:3], # Top 3 examples
|
||||
"suggested_topic": strategy.get("topic_template", f"Content addressing {category} objections"),
|
||||
"recommended_content_types": strategy.get("content_types", ["blog post"]),
|
||||
"strategic_angle": strategy.get("angle", ""),
|
||||
"priority": "high" if count >= 5 else "medium" if count >= 2 else "low",
|
||||
})
|
||||
|
||||
topics.sort(key=lambda t: t["frequency"], reverse=True)
|
||||
return topics
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Follow-up generator
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def generate_follow_ups(insights: dict) -> list[dict]:
|
||||
"""
|
||||
Generate personalized follow-up suggestions based on call insights.
|
||||
"""
|
||||
follow_ups = []
|
||||
|
||||
# Address top objections
|
||||
for obj in insights.get("objections", [])[:3]:
|
||||
templates = {
|
||||
"pricing": {
|
||||
"subject": "Quick thought on the investment discussion",
|
||||
"body": "Following up on our pricing conversation. I put together a quick ROI model based on what you shared about {context}. The numbers suggest a {x}x return in the first year. Want me to walk through it?",
|
||||
"asset": "ROI calculator or case study with similar company metrics",
|
||||
},
|
||||
"timing": {
|
||||
"subject": "Timing + what others in your position did",
|
||||
"body": "I hear you on timing. Quick data point: companies that started in a similar position to yours saw {metric} within the first 90 days. Happy to share the case study if helpful.",
|
||||
"asset": "Quick-win case study showing fast time-to-value",
|
||||
},
|
||||
"competition": {
|
||||
"subject": "Honest take on {competitor} vs us",
|
||||
"body": "You mentioned you're also looking at {competitor}. Totally fair. Here's where we genuinely win and where they might be a better fit. I'd rather you make the right call than the easy one.",
|
||||
"asset": "Competitive battle card or comparison one-pager",
|
||||
},
|
||||
"authority": {
|
||||
"subject": "Materials for your team's review",
|
||||
"body": "I know you need to loop in {stakeholder}. I put together a one-page executive summary that hits the points they'll care about most: ROI, timeline, and risk. Want me to send it over?",
|
||||
"asset": "Executive summary one-pager, tailored to stakeholder concerns",
|
||||
},
|
||||
"need": {
|
||||
"subject": "Something that might change the calculus",
|
||||
"body": "I appreciated the honest pushback on whether this is a priority right now. One thing I didn't get to share: {relevant_insight}. Might be worth a 10-minute follow-up if you're open to it.",
|
||||
"asset": "Industry report or benchmark data showing peer adoption",
|
||||
},
|
||||
}
|
||||
|
||||
template = templates.get(obj["category"], {})
|
||||
follow_ups.append({
|
||||
"type": "objection_response",
|
||||
"objection_category": obj["category"],
|
||||
"trigger_quote": obj["quote"],
|
||||
"suggested_subject": template.get("subject", f"Following up on {obj['category']} discussion"),
|
||||
"suggested_body": template.get("body", "Following up on our conversation..."),
|
||||
"recommended_asset": template.get("asset", ""),
|
||||
"timing": "Send within 24 hours of call",
|
||||
})
|
||||
|
||||
# Capitalize on buying signals
|
||||
for sig in insights.get("buying_signals", [])[:2]:
|
||||
if sig["type"] == "champion_identified":
|
||||
follow_ups.append({
|
||||
"type": "champion_enablement",
|
||||
"signal": sig["quote"],
|
||||
"suggested_subject": "Ammo for your internal pitch",
|
||||
"suggested_body": "You clearly get the value here. I want to make sure you have everything you need to bring the team along. Here's a deck you can customize + the key metrics that usually close the deal internally.",
|
||||
"recommended_asset": "Internal pitch deck template + metrics cheat sheet",
|
||||
"timing": "Send within 12 hours",
|
||||
})
|
||||
elif sig["type"] == "next_steps_agreed":
|
||||
follow_ups.append({
|
||||
"type": "momentum_keeper",
|
||||
"signal": sig["quote"],
|
||||
"suggested_subject": "Recap + next steps locked in",
|
||||
"suggested_body": "Great call. Here's what we agreed on: {next_steps}. I'll have {deliverable} ready by {date}. Let me know if anything changes on your end.",
|
||||
"recommended_asset": "Meeting summary with action items",
|
||||
"timing": "Send within 2 hours of call",
|
||||
})
|
||||
|
||||
return follow_ups
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# File I/O
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def load_transcript_file(filepath: str) -> dict:
|
||||
"""Load a transcript from a text file."""
|
||||
path = Path(filepath)
|
||||
if not path.exists():
|
||||
print(f"ERROR: File not found: {filepath}", file=sys.stderr)
|
||||
sys.exit(1)
|
||||
text = path.read_text(encoding="utf-8")
|
||||
return {"id": path.stem, "title": path.stem, "transcript": text, "participants": []}
|
||||
|
||||
|
||||
def load_transcript_dir(dirpath: str) -> list[dict]:
|
||||
"""Load all .txt transcript files from a directory."""
|
||||
path = Path(dirpath)
|
||||
if not path.is_dir():
|
||||
print(f"ERROR: Directory not found: {dirpath}", file=sys.stderr)
|
||||
sys.exit(1)
|
||||
files = sorted(path.glob("*.txt"))
|
||||
if not files:
|
||||
print(f"WARNING: No .txt files found in {dirpath}", file=sys.stderr)
|
||||
return []
|
||||
return [load_transcript_file(str(f)) for f in files]
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Output
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def print_summary(insights: dict) -> None:
|
||||
"""Print a human-readable summary of insights."""
|
||||
s = insights["summary"]
|
||||
print(f"\n{'='*60}")
|
||||
print(f" Call: {insights['source_id']}")
|
||||
print(f" Temperature: {s['deal_temperature'].upper()}")
|
||||
print(f"{'='*60}")
|
||||
|
||||
if s["total_objections"]:
|
||||
print(f"\n 🚫 Objections ({s['total_objections']}):")
|
||||
for cat, count in sorted(s["objection_categories"].items(), key=lambda x: -x[1]):
|
||||
print(f" {cat}: {count}")
|
||||
for obj in insights["objections"][:3]:
|
||||
print(f" → [{obj['category']}] \"{obj['quote'][:80]}...\"" if len(obj['quote']) > 80 else f" → [{obj['category']}] \"{obj['quote']}\"")
|
||||
|
||||
if s["total_buying_signals"]:
|
||||
print(f"\n ✅ Buying Signals ({s['total_buying_signals']}):")
|
||||
for sig_type, count in sorted(s["signal_types"].items(), key=lambda x: -x[1]):
|
||||
print(f" {sig_type}: {count}")
|
||||
|
||||
if s["competitors_mentioned"]:
|
||||
print(f"\n ⚔️ Competitors: {', '.join(s['competitors_mentioned'])}")
|
||||
|
||||
if s["has_pricing_discussion"]:
|
||||
print(f"\n 💰 Pricing discussed: Yes ({len(insights['pricing_discussions'])} mentions)")
|
||||
|
||||
print()
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Main
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser(
|
||||
description="Extract structured insights from sales call transcripts.",
|
||||
formatter_class=argparse.RawDescriptionHelpFormatter,
|
||||
epilog="""
|
||||
Examples:
|
||||
%(prog)s --file transcript.txt
|
||||
%(prog)s --dir ./transcripts/ --content-topics
|
||||
%(prog)s --gong --days 7 --follow-ups
|
||||
%(prog)s --file call.txt --output insights.json
|
||||
""",
|
||||
)
|
||||
|
||||
# Input sources (mutually exclusive)
|
||||
source = parser.add_mutually_exclusive_group(required=True)
|
||||
source.add_argument("--file", help="Path to a single transcript file (.txt)")
|
||||
source.add_argument("--dir", help="Path to directory of transcript files (.txt)")
|
||||
source.add_argument("--gong", action="store_true", help="Pull transcripts from Gong API")
|
||||
|
||||
# Gong options
|
||||
parser.add_argument("--days", type=int, default=7, help="Days of history to pull from Gong (default: 7)")
|
||||
parser.add_argument("--call-id", help="Specific Gong call ID to analyze")
|
||||
|
||||
# Output options
|
||||
parser.add_argument("--output", "-o", help="Write JSON output to file")
|
||||
parser.add_argument("--json", action="store_true", help="Output raw JSON to stdout")
|
||||
parser.add_argument("--content-topics", action="store_true", help="Generate content topics from recurring objections")
|
||||
parser.add_argument("--follow-ups", action="store_true", help="Generate follow-up suggestions")
|
||||
|
||||
args = parser.parse_args()
|
||||
|
||||
# Load transcripts
|
||||
calls = []
|
||||
if args.file:
|
||||
calls = [load_transcript_file(args.file)]
|
||||
elif args.dir:
|
||||
calls = load_transcript_dir(args.dir)
|
||||
elif args.gong:
|
||||
calls = fetch_calls_from_gong(days=args.days, call_id=args.call_id)
|
||||
|
||||
if not calls:
|
||||
print("No transcripts to analyze.", file=sys.stderr)
|
||||
sys.exit(1)
|
||||
|
||||
# Analyze
|
||||
all_insights = []
|
||||
for call in calls:
|
||||
insights = analyze_transcript(call["transcript"], source_id=call.get("id", "unknown"))
|
||||
insights["title"] = call.get("title", "")
|
||||
all_insights.append(insights)
|
||||
|
||||
if not args.json:
|
||||
print_summary(insights)
|
||||
|
||||
# Content topics
|
||||
content_topics = []
|
||||
if args.content_topics and len(all_insights) > 0:
|
||||
content_topics = generate_content_topics(all_insights)
|
||||
if not args.json:
|
||||
print(f"\n{'='*60}")
|
||||
print(" 📝 Content Topics from Recurring Objections")
|
||||
print(f"{'='*60}")
|
||||
for topic in content_topics:
|
||||
print(f"\n [{topic['priority'].upper()}] {topic['category']} (mentioned {topic['frequency']}x)")
|
||||
print(f" Topic: {topic['suggested_topic']}")
|
||||
print(f" Types: {', '.join(topic['recommended_content_types'])}")
|
||||
print(f" Angle: {topic['strategic_angle']}")
|
||||
|
||||
# Follow-ups
|
||||
all_follow_ups = []
|
||||
if args.follow_ups:
|
||||
for insights in all_insights:
|
||||
follow_ups = generate_follow_ups(insights)
|
||||
all_follow_ups.extend(follow_ups)
|
||||
if not args.json:
|
||||
print(f"\n{'='*60}")
|
||||
print(f" 📧 Follow-up Suggestions for: {insights['source_id']}")
|
||||
print(f"{'='*60}")
|
||||
for fu in follow_ups:
|
||||
print(f"\n Type: {fu['type']}")
|
||||
print(f" Subject: {fu['suggested_subject']}")
|
||||
print(f" Timing: {fu['timing']}")
|
||||
if fu.get("recommended_asset"):
|
||||
print(f" Asset: {fu['recommended_asset']}")
|
||||
|
||||
# Build output
|
||||
output = {
|
||||
"analyzed_at": datetime.utcnow().isoformat() + "Z",
|
||||
"total_calls": len(all_insights),
|
||||
"calls": all_insights,
|
||||
}
|
||||
if content_topics:
|
||||
output["content_topics"] = content_topics
|
||||
if all_follow_ups:
|
||||
output["follow_ups"] = all_follow_ups
|
||||
|
||||
# Aggregate stats
|
||||
output["aggregate"] = {
|
||||
"total_objections": sum(i["summary"]["total_objections"] for i in all_insights),
|
||||
"total_buying_signals": sum(i["summary"]["total_buying_signals"] for i in all_insights),
|
||||
"all_competitors": list(set(c for i in all_insights for c in i["summary"]["competitors_mentioned"])),
|
||||
"temperature_distribution": dict(Counter(i["summary"]["deal_temperature"] for i in all_insights)),
|
||||
}
|
||||
|
||||
# Output
|
||||
if args.json:
|
||||
print(json.dumps(output, indent=2))
|
||||
|
||||
if args.output:
|
||||
out_path = Path(args.output)
|
||||
out_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
out_path.write_text(json.dumps(output, indent=2))
|
||||
if not args.json:
|
||||
print(f"\n✅ Output written to {args.output}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
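The weighted scoring in `_score_deal_temperature` above is easy to trace by hand. The following is a minimal standalone sketch, not part of the commit: the weights, penalties, and thresholds are copied from that function, while `temperature` and the two constant names are hypothetical and used only for illustration.

```python
# Weights, penalties, and thresholds copied from _score_deal_temperature.
SIGNAL_WEIGHTS = {
    "budget_confirmed": 3,
    "decision_maker_engaged": 3,
    "timeline_mentioned": 2,
    "champion_identified": 2,
    "next_steps_agreed": 2,
}
OBJECTION_PENALTIES = {
    "need": -3,
    "competition": -2,
    "authority": -1,
    "timing": -1,
    "pricing": -1,
}


def temperature(signal_types, objection_categories):
    """Hypothetical helper: return (label, score) for lists of labels."""
    # Unlisted signal types count as +1, unlisted objection categories as -1.
    score = sum(SIGNAL_WEIGHTS.get(s, 1) for s in signal_types)
    score += sum(OBJECTION_PENALTIES.get(o, -1) for o in objection_categories)
    if score >= 6:
        label = "hot"
    elif score >= 3:
        label = "warm"
    elif score >= 0:
        label = "cool"
    else:
        label = "cold"
    return label, score


# 3 + 2 - 1 = 4
print(temperature(["budget_confirmed", "next_steps_agreed"], ["pricing"]))  # ('warm', 4)
# 2 - 3 - 1 = -2
print(temperature(["champion_identified"], ["need", "timing"]))  # ('cold', -2)
# Nothing detected at all
print(temperature([], []))  # ('cool', 0)
```

A transcript in which no pattern matched scores 0 and is labelled "cool", so "cool" covers both balanced calls and calls where nothing was detected.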
3
revenue-intelligence/requirements.txt
Normal file
|
|
@ -0,0 +1,3 @@
|
|||
requests>=2.28.0
|
||||
google-analytics-data>=0.18.0
|
||||
google-auth>=2.22.0
|
||||
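Both scripts import these packages lazily and fall back or exit when one is missing, so it can help to check the pins before running the API-backed modes. The following is a rough sketch under stated assumptions: `parse_version` and `check_requirements` are hypothetical helpers (not part of the commit), the version comparison is a naive numeric one that ignores pre-release suffixes, and only `name>=version` lines are handled.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_version(text: str) -> tuple:
    """Naive parse: '2.28.0' -> (2, 28, 0). Pads to three parts, drops suffixes."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple((parts + [0, 0, 0])[:3])


def check_requirements(lines: list) -> dict:
    """Map each distribution name to 'ok', 'missing', or a 'too old' message."""
    report = {}
    for line in lines:
        name, _, minimum = line.strip().partition(">=")
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        if parse_version(installed) >= parse_version(minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report


if __name__ == "__main__":
    pins = ["requests>=2.28.0", "google-analytics-data>=0.18.0", "google-auth>=2.22.0"]
    for dist, status in check_requirements(pins).items():
        print(f"{dist}: {status}")
```

For strict PEP 440 comparison, a dedicated version-parsing library would be the safer choice; the tuple comparison here is only meant to catch obviously outdated installs.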
797
revenue-intelligence/revenue_attribution.py
Normal file
|
|
@ -0,0 +1,797 @@
|
|||
#!/usr/bin/env python3
|
||||
"""
|
||||
Revenue Attribution Mapper
|
||||
|
||||
Connects content pieces to pipeline and closed deals. Proves content ROI.
|
||||
Maps blog posts, videos, podcasts to first-touch and multi-touch attribution
|
||||
using GA4 + HubSpot deal data.
|
||||
|
||||
Usage:
|
||||
python revenue_attribution.py --report
|
||||
python revenue_attribution.py --report --model linear
|
||||
python revenue_attribution.py --cpa --costs content_costs.json
|
||||
python revenue_attribution.py --gaps
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import os
|
||||
import sys
|
||||
from collections import defaultdict
|
||||
from datetime import datetime, timedelta
|
||||
from pathlib import Path
|
||||
from typing import Optional
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# API Configuration
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
# HubSpot: Set HUBSPOT_API_KEY to your private app token
|
||||
# Required scopes: crm.objects.deals.read, crm.objects.contacts.read
|
||||
HUBSPOT_API_KEY = os.environ.get("HUBSPOT_API_KEY", "")
|
||||
HUBSPOT_BASE_URL = "https://api.hubapi.com"
|
||||
|
||||
# GA4: Set GA4_PROPERTY_ID and GA4_CREDENTIALS_JSON
|
||||
# GA4_CREDENTIALS_JSON should point to a service account JSON file
|
||||
# Required: Google Analytics Data API (v1beta) enabled
|
||||
GA4_PROPERTY_ID = os.environ.get("GA4_PROPERTY_ID", "")
|
||||
GA4_CREDENTIALS_JSON = os.environ.get("GA4_CREDENTIALS_JSON", "")
|
||||
|
||||
OUTPUT_DIR = os.environ.get("OUTPUT_DIR", "./output")
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Content type classification
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
CONTENT_TYPE_PATTERNS = {
|
||||
"blog": ["/blog/", "/posts/", "/article/", "/insights/"],
|
||||
"video": ["/video/", "/youtube/", "/watch/", "/webinar-recording/"],
|
||||
"podcast": ["/podcast/", "/episode/", "/listen/"],
|
||||
"webinar": ["/webinar/", "/live/", "/register/"],
|
||||
"case_study": ["/case-study/", "/case-studies/", "/success-story/", "/customer-story/"],
|
||||
"landing_page": ["/lp/", "/landing/", "/offer/", "/download/"],
|
||||
"tool": ["/tool/", "/calculator/", "/grader/", "/analyzer/"],
|
||||
"comparison": ["/vs/", "/compare/", "/alternative/", "/versus/"],
|
||||
}
|
||||
|
||||
# Funnel stage classification
|
||||
FUNNEL_STAGE_PATTERNS = {
|
||||
"awareness": ["/blog/", "/posts/", "/article/", "/podcast/", "/video/"],
|
||||
"consideration": ["/case-study/", "/webinar/", "/guide/", "/comparison/", "/vs/"],
|
||||
"decision": ["/pricing/", "/demo/", "/contact/", "/trial/", "/start/", "/lp/"],
|
||||
}
|
||||
|
||||
|
||||
def classify_content_type(url: str) -> str:
|
||||
"""Classify a URL into a content type."""
|
||||
url_lower = url.lower()
|
||||
for content_type, patterns in CONTENT_TYPE_PATTERNS.items():
|
||||
if any(p in url_lower for p in patterns):
|
||||
return content_type
|
||||
return "other"
|
||||
|
||||
|
||||
def classify_funnel_stage(url: str) -> str:
|
||||
"""Classify a URL into a funnel stage."""
|
||||
url_lower = url.lower()
|
||||
for stage, patterns in FUNNEL_STAGE_PATTERNS.items():
|
||||
if any(p in url_lower for p in patterns):
|
||||
return stage
|
||||
return "unknown"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# GA4 Data Client
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def fetch_ga4_page_data(start_date: str, end_date: str) -> list[dict]:
|
||||
"""
|
||||
Fetch page-level session and conversion data from GA4.
|
||||
|
||||
Returns list of dicts:
|
||||
[{"page_path": "/blog/foo", "channel": "Organic Search", "sessions": 1234, "users": 900, "conversions": 5}]
|
||||
|
||||
NOTE: Requires google-analytics-data library.
|
||||
pip install google-analytics-data
|
||||
|
||||
Setup:
|
||||
1. Create a service account in Google Cloud Console
|
||||
2. Enable the Google Analytics Data API
|
||||
3. Add the service account email as a viewer on your GA4 property
|
||||
4. Download the JSON key file and set GA4_CREDENTIALS_JSON env var
|
||||
"""
|
||||
if not GA4_PROPERTY_ID or not GA4_CREDENTIALS_JSON:
|
||||
print("WARNING: GA4_PROPERTY_ID or GA4_CREDENTIALS_JSON not set. Using sample data.", file=sys.stderr)
|
||||
return _sample_ga4_data()
|
||||
|
||||
try:
|
||||
from google.analytics.data_v1beta import BetaAnalyticsDataClient
|
||||
from google.analytics.data_v1beta.types import (
|
||||
DateRange,
|
||||
Dimension,
|
||||
Metric,
|
||||
RunReportRequest,
|
||||
)
|
||||
|
||||
client = BetaAnalyticsDataClient.from_service_account_json(GA4_CREDENTIALS_JSON)
|
||||
|
||||
request = RunReportRequest(
|
||||
property=f"properties/{GA4_PROPERTY_ID}",
|
||||
dimensions=[
|
||||
Dimension(name="pagePath"),
|
||||
Dimension(name="sessionDefaultChannelGroup"),
|
||||
],
|
||||
metrics=[
|
||||
Metric(name="sessions"),
|
||||
Metric(name="totalUsers"),
|
||||
Metric(name="conversions"),
|
||||
],
|
||||
date_ranges=[DateRange(start_date=start_date, end_date=end_date)],
|
||||
)
|
||||
|
||||
response = client.run_report(request)
|
||||
|
||||
results = []
|
||||
for row in response.rows:
|
||||
results.append({
|
||||
"page_path": row.dimension_values[0].value,
|
||||
"channel": row.dimension_values[1].value,
|
||||
"sessions": int(row.metric_values[0].value),
|
||||
"users": int(row.metric_values[1].value),
|
||||
"conversions": int(row.metric_values[2].value),
|
||||
})
|
||||
|
||||
return results
|
||||
|
||||
except ImportError:
|
||||
print("WARNING: google-analytics-data not installed. Using sample data.", file=sys.stderr)
|
||||
return _sample_ga4_data()
|
||||
except Exception as e:
|
||||
print(f"WARNING: GA4 API error: {e}. Using sample data.", file=sys.stderr)
|
||||
return _sample_ga4_data()
|
||||
|
||||
|
||||
def _sample_ga4_data() -> list[dict]:
|
||||
"""Sample GA4 data for testing/demo purposes."""
|
||||
return [
|
||||
{"page_path": "/blog/seo-strategy-2025", "channel": "Organic Search", "sessions": 4200, "users": 3800, "conversions": 12},
|
||||
{"page_path": "/blog/content-marketing-roi", "channel": "Organic Search", "sessions": 3100, "users": 2900, "conversions": 8},
|
||||
{"page_path": "/blog/ai-marketing-tools", "channel": "Organic Search", "sessions": 5600, "users": 5100, "conversions": 15},
|
||||
{"page_path": "/case-study/saas-company-3x-pipeline", "channel": "Direct", "sessions": 890, "users": 820, "conversions": 9},
|
||||
{"page_path": "/case-study/ecommerce-seo-growth", "channel": "Organic Search", "sessions": 1200, "users": 1100, "conversions": 7},
|
||||
{"page_path": "/podcast/episode-42-growth-loops", "channel": "Social", "sessions": 2300, "users": 2100, "conversions": 3},
|
||||
{"page_path": "/webinar/ai-ops-for-marketers", "channel": "Email", "sessions": 650, "users": 600, "conversions": 11},
|
||||
{"page_path": "/video/youtube-seo-masterclass", "channel": "Social", "sessions": 8900, "users": 8200, "conversions": 6},
|
||||
{"page_path": "/blog/paid-media-benchmarks", "channel": "Organic Search", "sessions": 2700, "users": 2500, "conversions": 4},
|
||||
{"page_path": "/lp/free-seo-audit", "channel": "Paid Search", "sessions": 1800, "users": 1700, "conversions": 22},
|
||||
{"page_path": "/pricing", "channel": "Direct", "sessions": 3200, "users": 2900, "conversions": 18},
|
||||
{"page_path": "/blog/b2b-lead-generation", "channel": "Organic Search", "sessions": 3400, "users": 3100, "conversions": 5},
|
||||
{"page_path": "/vs/hubspot-alternative", "channel": "Organic Search", "sessions": 1500, "users": 1400, "conversions": 10},
|
||||
]
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# HubSpot Deal Data
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def fetch_hubspot_deals(start_date: str, end_date: str) -> list[dict]:
|
||||
"""
|
||||
Fetch closed-won deals from HubSpot with touchpoint history.
|
||||
|
||||
Returns list of dicts:
|
||||
[{
|
||||
"deal_id": "123",
|
||||
"deal_name": "Acme Corp",
|
||||
"amount": 50000,
|
||||
"close_date": "2025-03-15",
|
||||
"touchpoints": [
|
||||
{"url": "/blog/seo-strategy", "timestamp": "2025-01-10", "type": "first_touch"},
|
||||
{"url": "/case-study/saas", "timestamp": "2025-02-20", "type": "page_view"},
|
||||
{"url": "/pricing", "timestamp": "2025-03-01", "type": "page_view"},
|
||||
]
|
||||
}]
|
||||
|
||||
NOTE: Requires requests library.
|
||||
Touchpoints come from HubSpot's contact timeline / page views.
|
||||
You need a private app with crm.objects.deals.read + crm.objects.contacts.read scopes.
|
||||
"""
|
||||
if not HUBSPOT_API_KEY:
|
||||
print("WARNING: HUBSPOT_API_KEY not set. Using sample data.", file=sys.stderr)
|
||||
return _sample_hubspot_deals()
|
||||
|
||||
try:
|
||||
import requests
|
||||
|
||||
headers = {"Authorization": f"Bearer {HUBSPOT_API_KEY}"}
|
||||
|
||||
# Fetch closed-won deals in date range
|
||||
# Using the search API for better filtering
|
||||
search_body = {
|
||||
"filterGroups": [{
|
||||
"filters": [
|
||||
{"propertyName": "dealstage", "operator": "EQ", "value": "closedwon"},
|
||||
{"propertyName": "closedate", "operator": "GTE", "value": f"{start_date}T00:00:00Z"},
|
||||
{"propertyName": "closedate", "operator": "LTE", "value": f"{end_date}T23:59:59Z"},
|
||||
]
|
||||
}],
|
||||
"properties": ["dealname", "amount", "closedate", "dealstage"],
|
||||
"limit": 100,
|
||||
}
|
||||
|
||||
resp = requests.post(
|
||||
f"{HUBSPOT_BASE_URL}/crm/v3/objects/deals/search",
|
||||
headers=headers,
|
||||
json=search_body,
|
||||
)
|
||||
resp.raise_for_status()
|
||||
deals_data = resp.json().get("results", [])
|
||||
|
||||
deals = []
|
||||
for deal in deals_data:
|
||||
props = deal.get("properties", {})
|
||||
deal_id = deal["id"]
|
||||
|
||||
# Get associated contacts
|
||||
assoc_resp = requests.get(
|
||||
f"{HUBSPOT_BASE_URL}/crm/v3/objects/deals/{deal_id}/associations/contacts",
|
||||
headers=headers,
|
||||
)
|
||||
contact_ids = [r["id"] for r in assoc_resp.json().get("results", [])] if assoc_resp.ok else []
|
||||
|
||||
# Get page views for each contact (from engagement timeline)
|
||||
touchpoints = []
|
||||
for cid in contact_ids[:5]: # Limit to avoid rate limits
|
||||
# Fetch contact's page views from the timeline API
|
||||
timeline_resp = requests.get(
|
||||
f"{HUBSPOT_BASE_URL}/crm/v3/objects/contacts/{cid}/engagements",
|
||||
headers=headers,
|
||||
params={"limit": 50},
|
||||
)
|
||||
if timeline_resp.ok:
|
||||
for eng in timeline_resp.json().get("results", []):
|
||||
# Extract page view URLs from engagement metadata
|
||||
metadata = eng.get("properties", {})
|
||||
if metadata.get("hs_page_url"):
|
||||
touchpoints.append({
|
||||
"url": metadata["hs_page_url"],
|
||||
"timestamp": metadata.get("hs_timestamp", ""),
|
||||
"type": "page_view",
|
||||
})
|
||||
|
||||
# Mark first and last touch
|
||||
if touchpoints:
|
||||
touchpoints.sort(key=lambda t: t["timestamp"])
|
||||
touchpoints[0]["type"] = "first_touch"
|
||||
touchpoints[-1]["type"] = "last_touch"
|
||||
|
||||
deals.append({
|
||||
"deal_id": deal_id,
|
||||
"deal_name": props.get("dealname", "Unknown"),
|
||||
"amount": float(props.get("amount", 0) or 0),
|
||||
"close_date": (props.get("closedate") or "")[:10],
|
||||
"touchpoints": touchpoints,
|
||||
})
|
||||
|
||||
return deals
|
||||
|
||||
except ImportError:
|
||||
print("WARNING: requests not installed. Using sample data.", file=sys.stderr)
|
||||
return _sample_hubspot_deals()
|
||||
except Exception as e:
|
||||
print(f"WARNING: HubSpot API error: {e}. Using sample data.", file=sys.stderr)
|
||||
return _sample_hubspot_deals()
|
||||
|
||||
|
||||
def _sample_hubspot_deals() -> list[dict]:
|
||||
"""Sample HubSpot deal data for testing/demo."""
|
||||
return [
|
||||
{
|
||||
"deal_id": "deal_001",
|
||||
"deal_name": "Acme Corp - SEO Retainer",
|
||||
"amount": 120000,
|
||||
"close_date": "2025-03-15",
|
||||
"touchpoints": [
|
||||
{"url": "/blog/seo-strategy-2025", "timestamp": "2025-01-05", "type": "first_touch"},
|
||||
{"url": "/blog/content-marketing-roi", "timestamp": "2025-01-22", "type": "page_view"},
|
||||
{"url": "/case-study/saas-company-3x-pipeline", "timestamp": "2025-02-10", "type": "page_view"},
|
||||
{"url": "/pricing", "timestamp": "2025-02-28", "type": "page_view"},
|
||||
{"url": "/lp/free-seo-audit", "timestamp": "2025-03-05", "type": "last_touch"},
|
||||
],
|
||||
},
|
||||
{
|
||||
"deal_id": "deal_002",
|
||||
"deal_name": "TechStart Inc - Full Service",
|
||||
"amount": 240000,
|
||||
"close_date": "2025-02-20",
|
||||
"touchpoints": [
|
||||
{"url": "/blog/ai-marketing-tools", "timestamp": "2024-12-01", "type": "first_touch"},
|
||||
{"url": "/podcast/episode-42-growth-loops", "timestamp": "2024-12-15", "type": "page_view"},
|
||||
{"url": "/webinar/ai-ops-for-marketers", "timestamp": "2025-01-10", "type": "page_view"},
|
||||
{"url": "/vs/hubspot-alternative", "timestamp": "2025-01-25", "type": "page_view"},
|
||||
{"url": "/pricing", "timestamp": "2025-02-10", "type": "last_touch"},
|
||||
],
|
||||
},
|
||||
{
|
||||
"deal_id": "deal_003",
|
||||
"deal_name": "GrowthCo - Content Marketing",
|
||||
"amount": 84000,
|
||||
"close_date": "2025-03-01",
|
||||
"touchpoints": [
|
||||
{"url": "/blog/content-marketing-roi", "timestamp": "2025-01-15", "type": "first_touch"},
|
||||
{"url": "/case-study/ecommerce-seo-growth", "timestamp": "2025-02-01", "type": "page_view"},
|
||||
{"url": "/pricing", "timestamp": "2025-02-20", "type": "last_touch"},
|
||||
],
|
||||
},
|
||||
{
|
||||
"deal_id": "deal_004",
|
||||
"deal_name": "SaaS Corp - Paid Media",
|
||||
"amount": 180000,
|
||||
"close_date": "2025-01-30",
|
||||
"touchpoints": [
|
||||
{"url": "/video/youtube-seo-masterclass", "timestamp": "2024-11-15", "type": "first_touch"},
|
||||
{"url": "/blog/paid-media-benchmarks", "timestamp": "2024-12-10", "type": "page_view"},
|
||||
{"url": "/blog/b2b-lead-generation", "timestamp": "2025-01-05", "type": "page_view"},
|
||||
{"url": "/lp/free-seo-audit", "timestamp": "2025-01-20", "type": "last_touch"},
|
||||
],
|
||||
},
|
||||
{
|
||||
"deal_id": "deal_005",
|
||||
"deal_name": "Enterprise Ltd - SEO + Content",
|
||||
"amount": 360000,
|
||||
"close_date": "2025-03-20",
|
||||
"touchpoints": [
|
||||
{"url": "/blog/seo-strategy-2025", "timestamp": "2024-12-20", "type": "first_touch"},
|
||||
{"url": "/blog/ai-marketing-tools", "timestamp": "2025-01-08", "type": "page_view"},
|
||||
{"url": "/case-study/saas-company-3x-pipeline", "timestamp": "2025-01-25", "type": "page_view"},
|
||||
{"url": "/webinar/ai-ops-for-marketers", "timestamp": "2025-02-05", "type": "page_view"},
|
||||
{"url": "/pricing", "timestamp": "2025-03-01", "type": "page_view"},
|
||||
{"url": "/lp/free-seo-audit", "timestamp": "2025-03-10", "type": "last_touch"},
|
||||
],
|
||||
},
|
||||
]
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Attribution Models
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def first_touch_attribution(deals: list[dict]) -> dict[str, float]:
|
||||
"""100% credit to the first touchpoint."""
|
||||
attribution = defaultdict(float)
|
||||
for deal in deals:
|
||||
tps = deal.get("touchpoints", [])
|
||||
if tps:
|
||||
first = tps[0]
|
||||
attribution[first["url"]] += deal["amount"]
|
||||
return dict(attribution)
|
||||
|
||||
|
||||
def last_touch_attribution(deals: list[dict]) -> dict[str, float]:
|
||||
"""100% credit to the last touchpoint."""
|
||||
attribution = defaultdict(float)
|
||||
for deal in deals:
|
||||
tps = deal.get("touchpoints", [])
|
||||
if tps:
|
||||
last = tps[-1]
|
||||
attribution[last["url"]] += deal["amount"]
|
||||
return dict(attribution)
|
||||
|
||||
|
||||
def linear_attribution(deals: list[dict]) -> dict[str, float]:
|
||||
"""Equal credit to all touchpoints."""
|
||||
attribution = defaultdict(float)
|
||||
for deal in deals:
|
||||
tps = deal.get("touchpoints", [])
|
||||
if tps:
|
||||
credit = deal["amount"] / len(tps)
|
||||
for tp in tps:
|
||||
attribution[tp["url"]] += credit
|
||||
return dict(attribution)
|
||||
|
||||
|
||||
def time_decay_attribution(deals: list[dict], half_life_days: int = 7) -> dict[str, float]:
|
||||
"""
|
||||
More credit to touchpoints closer to close date.
|
||||
Uses exponential decay with configurable half-life.
|
||||
"""
|
||||
import math
|
||||
|
||||
attribution = defaultdict(float)
|
||||
for deal in deals:
|
||||
tps = deal.get("touchpoints", [])
|
||||
close_date = deal.get("close_date", "")
|
||||
if not tps or not close_date:
|
||||
continue
|
||||
|
||||
try:
|
||||
close_dt = datetime.strptime(close_date, "%Y-%m-%d")
|
||||
except ValueError:
|
||||
continue
|
||||
|
||||
# Calculate decay weights
|
||||
weights = []
|
||||
for tp in tps:
|
||||
try:
|
||||
tp_dt = datetime.strptime(tp["timestamp"][:10], "%Y-%m-%d")
|
||||
days_before = max(0, (close_dt - tp_dt).days)  # touches logged after close count as day 0
|
||||
weight = math.pow(0.5, days_before / half_life_days)
|
||||
weights.append(weight)
|
||||
except (ValueError, KeyError):
|
||||
weights.append(0.1)
|
||||
|
||||
total_weight = sum(weights) or 1
|
||||
for tp, weight in zip(tps, weights):
|
||||
attribution[tp["url"]] += deal["amount"] * (weight / total_weight)
|
||||
|
||||
return dict(attribution)
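To make the decay weighting concrete, here is a small worked sketch. `decay_shares` is a hypothetical helper written for this example, not part of the module. It applies the same formula, `0.5 ** (days_before / half_life_days)`, normalizes the weights so they sum to 1, and clamps negative day counts to zero.

```python
from datetime import date


def decay_shares(touch_dates: list[date], close: date, half_life_days: float = 7) -> list[float]:
    """Return each touch's normalized share of credit (shares sum to 1)."""
    if not touch_dates:
        return []
    # weight = 0.5 ** (days before close / half-life); days are clamped at 0
    # so a touch logged after the close date cannot outweigh one on the close date
    weights = [0.5 ** (max(0, (close - d).days) / half_life_days) for d in touch_dates]
    total = sum(weights)
    return [w / total for w in weights]


close = date(2025, 3, 15)
touches = [date(2025, 3, 1), date(2025, 3, 8), date(2025, 3, 15)]  # 14, 7, 0 days before close
print([round(s, 3) for s in decay_shares(touches, close)])  # [0.143, 0.286, 0.571]
```

With a 7-day half-life, a touch 70 days before close carries `0.5 ** 10`, roughly 0.1% of the weight of a close-day touch. For sales cycles that run several months, a larger `half_life_days` may give early content a more meaningful share.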
|
||||
|
||||
|
||||
ATTRIBUTION_MODELS = {
|
||||
"first-touch": first_touch_attribution,
|
||||
"last-touch": last_touch_attribution,
|
||||
"linear": linear_attribution,
|
||||
"time-decay": time_decay_attribution,
|
||||
}
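A quick way to see how the position-based models differ is to run them on a single deal. The sketch below is illustrative only: `split_credit` is a hypothetical, compact re-statement of the first-touch, last-touch, and linear rules, and it assumes the touchpoint list is already in chronological order.

```python
def split_credit(amount: float, urls: list[str], model: str) -> dict[str, float]:
    """Split one deal's amount across its chronologically ordered touchpoint URLs."""
    if not urls:
        return {}
    if model == "first-touch":
        weights = [1.0] + [0.0] * (len(urls) - 1)
    elif model == "last-touch":
        weights = [0.0] * (len(urls) - 1) + [1.0]
    elif model == "linear":
        weights = [1.0 / len(urls)] * len(urls)
    else:
        raise ValueError(f"unknown model: {model}")
    credit: dict[str, float] = {}
    for url, w in zip(urls, weights):
        if w:  # skip zero-credit touches, so only credited URLs appear
            credit[url] = credit.get(url, 0.0) + amount * w
    return credit


journey = ["/blog/seo-strategy-2025", "/case-study/saas-company-3x-pipeline", "/pricing"]
for model in ("first-touch", "last-touch", "linear"):
    print(model, split_credit(90000, journey, model))
```

First-touch rewards the content that opened the journey, last-touch rewards the page seen just before close, and linear spreads the amount evenly. Comparing the three on the same deals shows how much a conclusion depends on the model chosen.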
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Report Generation
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def generate_attribution_report(
|
||||
deals: list[dict],
|
||||
ga4_data: list[dict],
|
||||
model: str = "linear",
|
||||
) -> dict:
|
||||
"""Generate a full attribution report."""
|
||||
# Run attribution
|
||||
model_func = ATTRIBUTION_MODELS.get(model, linear_attribution)
|
||||
attribution = model_func(deals)
|
||||
|
||||
# Enrich with GA4 data
|
||||
ga4_by_path = {}
|
||||
for row in ga4_data:
|
||||
path = row["page_path"]
|
||||
if path not in ga4_by_path:
|
||||
ga4_by_path[path] = {"sessions": 0, "users": 0, "conversions": 0}
|
||||
ga4_by_path[path]["sessions"] += row["sessions"]
|
||||
ga4_by_path[path]["users"] += row["users"]
|
||||
ga4_by_path[path]["conversions"] += row["conversions"]
|
||||
|
||||
# Build content performance table
|
||||
content_performance = []
|
||||
for url, revenue in sorted(attribution.items(), key=lambda x: -x[1]):
|
||||
ga4 = ga4_by_path.get(url, {"sessions": 0, "users": 0, "conversions": 0})
|
||||
content_type = classify_content_type(url)
|
||||
funnel_stage = classify_funnel_stage(url)
|
||||
|
||||
content_performance.append({
|
||||
"url": url,
|
||||
"content_type": content_type,
|
||||
"funnel_stage": funnel_stage,
|
||||
"attributed_revenue": round(revenue, 2),
|
||||
"sessions": ga4["sessions"],
|
||||
"users": ga4["users"],
|
||||
"conversions": ga4["conversions"],
|
||||
"revenue_per_session": round(revenue / ga4["sessions"], 2) if ga4["sessions"] else 0,
|
||||
"deals_touched": sum(
|
||||
1 for d in deals if any(tp["url"] == url for tp in d.get("touchpoints", []))
|
||||
),
|
||||
})
|
||||
|
||||
# Aggregate by content type
|
||||
by_type = defaultdict(lambda: {"revenue": 0, "sessions": 0, "conversions": 0, "pieces": 0})
|
||||
for cp in content_performance:
|
||||
t = cp["content_type"]
|
||||
by_type[t]["revenue"] += cp["attributed_revenue"]
|
||||
by_type[t]["sessions"] += cp["sessions"]
|
||||
by_type[t]["conversions"] += cp["conversions"]
|
||||
by_type[t]["pieces"] += 1
|
||||
|
||||
type_summary = []
|
||||
for content_type, stats in sorted(by_type.items(), key=lambda x: -x[1]["revenue"]):
|
||||
type_summary.append({
|
||||
"content_type": content_type,
|
||||
"total_revenue": round(stats["revenue"], 2),
|
||||
"total_sessions": stats["sessions"],
|
||||
"total_conversions": stats["conversions"],
|
||||
"piece_count": stats["pieces"],
|
||||
"avg_revenue_per_piece": round(stats["revenue"] / stats["pieces"], 2) if stats["pieces"] else 0,
|
||||
})
|
||||
|
||||
# Summary
|
||||
total_revenue = sum(d["amount"] for d in deals)
|
||||
total_deals = len(deals)
|
||||
|
||||
report = {
|
||||
"generated_at": datetime.utcnow().isoformat() + "Z",
|
||||
"attribution_model": model,
|
||||
"summary": {
|
||||
"total_revenue": total_revenue,
|
||||
"total_deals": total_deals,
|
||||
"avg_deal_size": round(total_revenue / total_deals, 2) if total_deals else 0,
|
||||
"content_pieces_with_attribution": len(content_performance),
|
||||
"avg_touchpoints_per_deal": round(
|
||||
sum(len(d.get("touchpoints", [])) for d in deals) / total_deals, 1
|
||||
) if total_deals else 0,
|
||||
},
|
||||
"top_content": content_performance[:20],
|
||||
"by_content_type": type_summary,
|
||||
}
|
||||
|
||||
return report
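The report joins attributed URLs to GA4 rows by exact string match on the page path. If touchpoint URLs arrive as absolute URLs or carry query strings (an assumption about the CRM data, not something the code above confirms), the lookup silently falls back to zero sessions. A minimal sketch of a normalizer, using a hypothetical helper `to_page_path` and a made-up example URL:

```python
from urllib.parse import urlsplit


def to_page_path(url: str) -> str:
    """Reduce a URL or path to a bare page path (no host, query, or fragment)."""
    # Strip trailing slashes so "/pricing/" and "/pricing" agree; keep "/" for the root
    return urlsplit(url).path.rstrip("/") or "/"


ga4_by_path = {"/pricing": {"sessions": 3200}}  # keyed by normalized path
touch_url = "https://www.example.com/pricing/?utm_source=newsletter"  # made-up URL

print(to_page_path(touch_url))
print(ga4_by_path.get(to_page_path(touch_url)))
```

For the join to work, the same normalization would need to be applied to both sides: the GA4 `page_path` values when building the lookup, and the touchpoint URLs before attribution.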
|
||||
|
||||
|
||||
def calculate_cpa(report: dict, costs: dict) -> dict:
|
||||
"""
|
||||
Calculate cost-per-acquisition by content type.
|
||||
|
||||
costs should be: {"blog": 15000, "video": 8000, "podcast": 3000, ...}
|
||||
representing total spend on each content type in the period.
|
||||
"""
|
||||
cpa_report = []
|
||||
for type_data in report["by_content_type"]:
|
||||
ct = type_data["content_type"]
|
||||
cost = costs.get(ct, 0)
|
||||
revenue = type_data["total_revenue"]
|
||||
conversions = type_data["total_conversions"]
|
||||
|
||||
cpa_report.append({
|
||||
"content_type": ct,
|
||||
"total_cost": cost,
|
||||
"total_revenue": revenue,
|
||||
"conversions": conversions,
|
||||
"cpa": round(cost / conversions, 2) if conversions else None,
|
||||
"roi": round((revenue - cost) / cost, 2) if cost else None,
|
||||
"roi_multiple": f"{round(revenue / cost, 1)}x" if cost else "N/A",
|
||||
})
|
||||
|
||||
cpa_report.sort(key=lambda x: (x["roi"] or 0), reverse=True)
|
||||
return {"cpa_by_content_type": cpa_report}
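The arithmetic behind each row is simple enough to check by hand. The sketch below uses a hypothetical helper `cpa_and_roi` and made-up figures; it mirrors the two formulas used above, CPA = cost / conversions and ROI = (revenue - cost) / cost.

```python
from typing import Optional


def cpa_and_roi(cost: float, revenue: float, conversions: int) -> dict[str, Optional[float]]:
    """Compute cost per conversion and return on investment for one content type."""
    return {
        # None when there is nothing to divide by, rather than raising ZeroDivisionError
        "cpa": round(cost / conversions, 2) if conversions else None,
        "roi": round((revenue - cost) / cost, 2) if cost else None,
    }


# Made-up example: $15,000 spent on blog content, $120,000 attributed, 40 conversions
print(cpa_and_roi(cost=15000, revenue=120000, conversions=40))  # {'cpa': 375.0, 'roi': 7.0}
```

Keep in mind that the conversion counts in the report come from GA4 conversion events, not from closed deals, so the CPA here is a cost per conversion event rather than a cost per customer.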
|
||||
|
||||
|
||||
def find_content_gaps(deals: list[dict]) -> dict:
|
||||
"""
|
||||
Identify funnel stages with no or low content attribution.
|
||||
"""
|
||||
stage_coverage = defaultdict(lambda: {"urls": set(), "deals": 0, "revenue": 0})
|
||||
|
||||
for deal in deals:
|
||||
stages_hit = set()
|
||||
for tp in deal.get("touchpoints", []):
|
||||
stage = classify_funnel_stage(tp["url"])
|
||||
stage_coverage[stage]["urls"].add(tp["url"])
|
||||
stages_hit.add(stage)
|
||||
|
||||
for stage in stages_hit:
|
||||
stage_coverage[stage]["deals"] += 1
|
||||
stage_coverage[stage]["revenue"] += deal["amount"] / len(stages_hit)
|
||||
|
||||
# Check for gaps
|
||||
expected_stages = ["awareness", "consideration", "decision"]
|
||||
gaps = []
|
||||
for stage in expected_stages:
|
||||
data = stage_coverage.get(stage, {"urls": set(), "deals": 0, "revenue": 0})
|
||||
total_deals = len(deals)
|
||||
coverage_pct = round(data["deals"] / total_deals * 100, 1) if total_deals else 0
|
||||
|
||||
if coverage_pct < 30:
|
||||
severity = "critical" if coverage_pct < 10 else "moderate"
|
||||
gaps.append({
|
||||
"stage": stage,
|
||||
"coverage_percent": coverage_pct,
|
||||
"deals_with_stage": data["deals"],
|
||||
"content_pieces": len(data["urls"]),
|
||||
"severity": severity,
|
||||
"recommendation": _gap_recommendation(stage, coverage_pct),
|
||||
})
|
||||
|
||||
stage_summary = []
|
||||
for stage in expected_stages:
|
||||
data = stage_coverage.get(stage, {"urls": set(), "deals": 0, "revenue": 0})
|
||||
stage_summary.append({
|
||||
"stage": stage,
|
||||
"content_pieces": len(data["urls"]),
|
||||
"deals_touched": data["deals"],
|
||||
"attributed_revenue": round(data["revenue"], 2),
|
||||
"top_urls": sorted(data["urls"])[:5],
|
||||
})
|
||||
|
||||
return {
|
||||
"gaps": gaps,
|
||||
"stage_summary": stage_summary,
|
||||
"total_deals_analyzed": len(deals),
|
||||
}
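The gap rule above reduces to two thresholds on coverage: a stage touched by fewer than 30% of deals is flagged, and fewer than 10% is critical. A minimal sketch with a hypothetical helper `gap_severity` (the `"ok"` label is an addition for this example; the function above simply omits such stages from the gap list):

```python
def gap_severity(deals_with_stage: int, total_deals: int) -> tuple[float, str]:
    """Return (coverage percent, severity label) for one funnel stage."""
    coverage = round(deals_with_stage / total_deals * 100, 1) if total_deals else 0.0
    if coverage < 10:
        return coverage, "critical"
    if coverage < 30:
        return coverage, "moderate"
    return coverage, "ok"


# Five deals analyzed; stages touched by 0, 1, and 3 of them
for touched in (0, 1, 3):
    print(touched, gap_severity(touched, 5))
```

With small samples the percentages move in large steps (each deal is worth 20 points when only five are analyzed), so a flagged gap is better read as a prompt to look closer than as a firm finding.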
|
||||
|
||||
|
||||
def _gap_recommendation(stage: str, coverage_pct: float) -> str:
|
||||
"""Generate a recommendation for a content gap."""
|
||||
recs = {
|
||||
"awareness": "Create more top-of-funnel content (blog posts, videos, podcasts) targeting high-volume keywords. Focus on educational content that introduces the problem your product solves.",
|
||||
"consideration": "Build comparison pages, case studies, and webinars that help prospects evaluate solutions. This is where you prove credibility and differentiation.",
|
||||
"decision": "Add pricing pages, ROI calculators, free trials, and demo CTAs. Make it easy for ready-to-buy prospects to take action.",
|
||||
}
|
||||
return recs.get(stage, f"Create content for the {stage} stage to improve coverage from {coverage_pct}%.")
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Output Formatting
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def print_report(report: dict) -> None:
|
||||
"""Print attribution report in human-readable format."""
|
||||
s = report["summary"]
|
||||
print(f"\n{'='*70}")
|
||||
print(f" CONTENT REVENUE ATTRIBUTION REPORT")
|
||||
print(f" Model: {report['attribution_model']}")
|
||||
print(f" Generated: {report['generated_at']}")
|
||||
print(f"{'='*70}")
|
||||
|
||||
print(f"\n 📊 Summary")
|
||||
print(f" Total Revenue: ${s['total_revenue']:,.0f}")
|
||||
print(f" Total Deals: {s['total_deals']}")
|
||||
print(f" Avg Deal Size: ${s['avg_deal_size']:,.0f}")
|
||||
print(f" Content w/ Attribution: {s['content_pieces_with_attribution']}")
|
||||
print(f" Avg Touchpoints/Deal: {s['avg_touchpoints_per_deal']}")
|
||||
|
||||
print(f"\n 📈 Revenue by Content Type")
|
||||
print(f" {'Type':<16} {'Revenue':>12} {'Sessions':>10} {'Pieces':>8} {'Avg/Piece':>12}")
|
||||
print(f" {'-'*58}")
|
||||
for ct in report["by_content_type"]:
|
||||
print(
|
||||
f" {ct['content_type']:<16} "
|
||||
f"${ct['total_revenue']:>10,.0f} "
|
||||
f"{ct['total_sessions']:>10,} "
|
||||
f"{ct['piece_count']:>8} "
|
||||
f"${ct['avg_revenue_per_piece']:>10,.0f}"
|
||||
)
|
||||
|
||||
print(f"\n 🏆 Top Content by Revenue")
|
||||
print(f" {'URL':<45} {'Revenue':>12} {'Sessions':>10} {'Type':<12}")
|
||||
print(f" {'-'*79}")
|
||||
for cp in report["top_content"][:10]:
|
||||
url_display = cp["url"][:43] + ".." if len(cp["url"]) > 45 else cp["url"]
|
||||
print(
|
||||
f" {url_display:<45} "
|
||||
f"${cp['attributed_revenue']:>10,.0f} "
|
||||
f"{cp['sessions']:>10,} "
|
||||
f"{cp['content_type']:<12}"
|
||||
)
|
||||
|
||||
print()
|
||||
|
||||
|
||||
def print_gaps(gaps_report: dict) -> None:
|
||||
"""Print content gap analysis."""
|
||||
print(f"\n{'='*70}")
|
||||
print(f" CONTENT GAP ANALYSIS")
|
||||
print(f"{'='*70}")
|
||||
|
||||
print(f"\n 📊 Funnel Stage Coverage ({gaps_report['total_deals_analyzed']} deals)")
|
||||
for stage in gaps_report["stage_summary"]:
|
||||
print(f"\n {stage['stage'].upper()}")
|
||||
print(f" Content Pieces: {stage['content_pieces']}")
|
||||
print(f" Deals Touched: {stage['deals_touched']}")
|
||||
print(f" Revenue: ${stage['attributed_revenue']:,.0f}")
|
||||
|
||||
if gaps_report["gaps"]:
|
||||
print(f"\n ⚠️ Gaps Identified")
|
||||
for gap in gaps_report["gaps"]:
|
||||
print(f"\n [{gap['severity'].upper()}] {gap['stage'].upper()} — {gap['coverage_percent']}% coverage")
|
||||
print(f" → {gap['recommendation']}")
|
||||
else:
|
||||
print(f"\n ✅ No significant gaps found")
|
||||
|
||||
print()
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Main
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser(
|
||||
description="Map content to revenue with multi-touch attribution.",
|
||||
formatter_class=argparse.RawDescriptionHelpFormatter,
|
||||
epilog="""
|
||||
Examples:
|
||||
%(prog)s --report
|
||||
%(prog)s --report --model time-decay
|
||||
%(prog)s --cpa --costs content_costs.json
|
||||
%(prog)s --gaps
|
||||
%(prog)s --report --start 2025-01-01 --end 2025-03-31 --json
|
||||
""",
|
||||
)
|
||||
|
||||
parser.add_argument("--report", action="store_true", help="Generate attribution report")
|
||||
parser.add_argument("--gaps", action="store_true", help="Identify content gaps in buyer journey")
|
||||
parser.add_argument("--cpa", action="store_true", help="Calculate cost-per-acquisition by content type")
|
||||
|
||||
parser.add_argument("--model", choices=["first-touch", "last-touch", "linear", "time-decay"],
|
||||
default="linear", help="Attribution model (default: linear)")
|
||||
parser.add_argument("--start", help="Start date YYYY-MM-DD (default: 90 days ago)")
|
||||
parser.add_argument("--end", help="End date YYYY-MM-DD (default: today)")
|
||||
parser.add_argument("--costs", help="JSON file with content costs by type (for --cpa)")
|
||||
|
||||
parser.add_argument("--json", action="store_true", help="Output raw JSON")
|
||||
parser.add_argument("--output", "-o", help="Write output to file")
|
||||
|
||||
args = parser.parse_args()
|
||||
|
||||
if not (args.report or args.gaps or args.cpa):
|
||||
parser.error("At least one of --report, --gaps, or --cpa is required")
|
||||
|
||||
# Date range
|
||||
end_date = args.end or datetime.utcnow().strftime("%Y-%m-%d")
|
||||
start_date = args.start or (datetime.utcnow() - timedelta(days=90)).strftime("%Y-%m-%d")
|
||||
|
||||
print(f"Fetching data for {start_date} to {end_date}...", file=sys.stderr)
|
||||
|
||||
# Fetch data
|
||||
ga4_data = fetch_ga4_page_data(start_date, end_date)
|
||||
deals = fetch_hubspot_deals(start_date, end_date)
|
||||
|
||||
output = {
|
||||
"date_range": {"start": start_date, "end": end_date},
|
||||
"generated_at": datetime.utcnow().isoformat() + "Z",
|
||||
}
|
||||
|
||||
if args.report:
|
||||
report = generate_attribution_report(deals, ga4_data, model=args.model)
|
||||
output["attribution_report"] = report
|
||||
if not args.json:
|
||||
print_report(report)
|
||||
|
||||
if args.cpa:
|
||||
if not args.report:
|
||||
report = generate_attribution_report(deals, ga4_data, model=args.model)
|
||||
output["attribution_report"] = report
|
||||
|
||||
costs = {}
|
||||
if args.costs:
|
||||
costs_path = Path(args.costs)
|
||||
if costs_path.exists():
|
||||
costs = json.loads(costs_path.read_text())
|
||||
else:
|
||||
print(f"WARNING: Costs file not found: {args.costs}. Using empty costs.", file=sys.stderr)
|
||||
|
||||
cpa_data = calculate_cpa(output["attribution_report"], costs)
|
||||
output["cpa"] = cpa_data
|
||||
|
||||
if not args.json:
|
||||
print(f"\n{'='*70}")
|
||||
print(f" COST PER ACQUISITION BY CONTENT TYPE")
|
||||
print(f"{'='*70}")
|
||||
print(f" {'Type':<16} {'Cost':>10} {'Revenue':>12} {'CPA':>10} {'ROI':>8}")
|
||||
print(f" {'-'*56}")
|
||||
for row in cpa_data["cpa_by_content_type"]:
|
||||
cpa_str = f"${row['cpa']:,.0f}" if row["cpa"] is not None else "N/A"
|
||||
roi_str = row["roi_multiple"]
|
||||
print(
|
||||
f" {row['content_type']:<16} "
|
||||
f"${row['total_cost']:>8,} "
|
||||
f"${row['total_revenue']:>10,.0f} "
|
||||
f"{cpa_str:>10} "
|
||||
f"{roi_str:>8}"
|
||||
)
|
||||
print()
|
||||
|
||||
if args.gaps:
|
||||
gaps_data = find_content_gaps(deals)
|
||||
output["gaps"] = gaps_data
|
||||
if not args.json:
|
||||
print_gaps(gaps_data)
|
||||
|
||||
if args.json:
|
||||
print(json.dumps(output, indent=2, default=str))
|
||||
|
||||
if args.output:
|
||||
out_path = Path(args.output)
|
||||
out_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
out_path.write_text(json.dumps(output, indent=2, default=str))
|
||||
if not args.json:
|
||||
print(f"✅ Output written to {args.output}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
319
team-ops/README.md
Normal file
|
|
@ -0,0 +1,319 @@
|
|||
# 👥 AI Team Ops
|
||||
|
||||
> **Run your team like an engineer runs a system — measure everything, cut waste, ship faster.**
|
||||
|
||||
Two AI-powered tools for ruthless team optimization: a structured performance audit framework (the "Elon Algorithm") and an intelligent meeting transcript processor that never lets action items fall through the cracks.
|
||||
|
||||
Built for operators who want data-driven team decisions, not vibes-based management.
|
||||
|
||||
---
|
||||
|
||||
## Architecture
|
||||
|
||||
```
|
||||
┌──────────────────────────────────────┐
|
||||
│ TEAM PERFORMANCE AUDIT │
|
||||
│ ("Elon Algorithm") │
|
||||
└──────────────┬───────────────────────┘
|
||||
│
|
||||
┌────────────────────────┼────────────────────────┐
|
||||
│ │ │
|
||||
Role Descriptions OKRs / KPIs Output Data
|
||||
(who does what) (what they should hit) (what they actually did)
|
||||
│ │ │
|
||||
└────────────────────────┼────────────────────────┘
|
||||
│
|
||||
┌──────────────▼───────────────────────┐
|
||||
│ 5-Step Elon Algorithm │
|
||||
│ │
|
||||
│ 1. Question — is this necessary? │
|
||||
│ 2. Delete — flag redundancies │
|
||||
│ 3. Simplify — cut complexity │
|
||||
│ 4. Accelerate — find bottlenecks │
|
||||
│ 5. Automate — what can AI handle? │
|
||||
└──────────────┬───────────────────────┘
|
||||
│
|
||||
┌──────────────▼───────────────────────┐
|
||||
│ Scoring Engine │
|
||||
│ • Output Velocity (30%) │
|
||||
│ • Quality (30%) │
|
||||
│ • Independence (20%) │
|
||||
│ • Initiative (20%) │
|
||||
│ │
|
||||
│ → A/B/C Stack Rank │
|
||||
│ → Promote / Coach / Reassign / Exit │
|
||||
└──────────────────────────────────────┘
|
||||
│
|
||||
▼
|
||||
Executive Summary + Scorecards + Org Recommendations
|
||||
|
||||
|
||||
┌──────────────────────────────────────┐
|
||||
│ MEETING ACTION EXTRACTOR │
|
||||
└──────────────┬───────────────────────┘
|
||||
│
|
||||
Meeting Transcripts (text / stdin / batch)
|
||||
│
|
||||
┌──────────────▼───────────────────────┐
|
||||
│ LLM Extraction Engine │
|
||||
│ │
|
||||
│ • Decisions (who + context) │
|
||||
│ • Action Items (owner + deadline) │
|
||||
│ • Open Questions │
|
||||
│ • Key Insights / Quotes │
|
||||
│ • Follow-up Meetings │
|
||||
│ • Implicit Commitments │
|
||||
│ + Confidence Scores │
|
||||
└──────────────┬───────────────────────┘
|
||||
│
|
||||
┌──────────────▼───────────────────────┐
|
||||
│ Output │
|
||||
│ • Structured JSON │
|
||||
│ • Formatted Markdown │
|
||||
│ • HubSpot Tasks (optional) │
|
||||
└──────────────────────────────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Tools
|
||||
|
||||
### 1. 🏭 Team Performance Audit (`team_performance_audit.py`)
|
||||
|
||||
The "Elon Algorithm" applied to team management. A 5-step framework that questions every role, deletes redundancy, simplifies workflows, accelerates bottlenecks, and flags automation opportunities.
|
||||
|
||||
**What it does:**
|
||||
- Ingests role descriptions, OKRs/KPIs, and output data (CSV or JSON)
|
||||
- Scores each person on 4 dimensions: output velocity, quality, independence, initiative
|
||||
- Computes a weighted composite score and assigns A/B/C tier labels
|
||||
- Runs the 5-step Elon Algorithm via LLM for qualitative organizational analysis
|
||||
- Generates recommended actions: promote, retain, coach, reassign, exit
|
||||
- Outputs executive summary + individual scorecards + org-level recommendations
|
||||
|
||||
```bash
# Run with JSON input
python3 team_performance_audit.py --input team_data.json --output report.md

# Run with CSV input
python3 team_performance_audit.py --input team_data.csv --output report.md

# JSON output
python3 team_performance_audit.py --input team_data.json --format json --output report.json

# Dry run (quantitative only, no LLM calls)
python3 team_performance_audit.py --input team_data.json --dry-run

# Custom scoring weights
python3 team_performance_audit.py --input team_data.json \
  --weights '{"output_velocity":0.4,"quality":0.3,"independence":0.15,"initiative":0.15}'
```
|
||||
|
||||
**JSON Input Format:**
|
||||
```json
{
  "team_members": [
    {
      "name": "Alice Chen",
      "role": "Senior Engineer",
      "role_description": "Owns backend API development",
      "okrs": [
        {"objective": "Reduce API latency", "key_result": "P95 < 200ms", "progress": 0.85}
      ],
      "metrics": {
        "tasks_completed": 47,
        "tasks_assigned": 52,
        "avg_completion_days": 3.2,
        "quality_score": 92,
        "peer_feedback_score": 4.5,
        "initiatives_proposed": 3,
        "initiatives_shipped": 2
      },
      "deliverables": [
        {"name": "API v2 Migration", "status": "completed", "date": "2024-02-15"}
      ]
    }
  ],
  "org_context": {
    "company_goals": ["Ship v3 by Q2", "Reduce infra costs 30%"],
    "team_size": 12,
    "evaluation_period": "Q1 2024"
  }
}
```
|
||||
|
||||
**CSV Input Format:**
|
||||
```csv
name,role,tasks_completed,tasks_assigned,avg_completion_days,quality_score,peer_feedback_score,initiatives_proposed,initiatives_shipped
Alice Chen,Senior Engineer,47,52,3.2,92,4.5,3,2
Bob Park,Junior Dev,28,40,5.1,68,3.2,0,0
```
|
||||
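
The CSV columns line up with the `metrics` keys of the JSON format. As a rough sketch of that mapping (the helper name `csv_to_team_data` is hypothetical and not part of `team_performance_audit.py`, whose own CSV loader may behave differently), a CSV export could be converted to the JSON shape like this:

```python
import csv
import io
import json

# Columns that map onto the "metrics" object; names taken from the CSV header.
METRIC_FIELDS = [
    "tasks_completed", "tasks_assigned", "avg_completion_days",
    "quality_score", "peer_feedback_score",
    "initiatives_proposed", "initiatives_shipped",
]


def csv_to_team_data(csv_text: str) -> dict:
    """Convert CSV rows into the {"team_members": [...]} JSON structure."""
    members = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        members.append({
            "name": row["name"],
            "role": row["role"],
            # Every metric is numeric; float() accepts both "47" and "3.2".
            "metrics": {field: float(row[field]) for field in METRIC_FIELDS},
        })
    return {"team_members": members}


sample = (
    "name,role,tasks_completed,tasks_assigned,avg_completion_days,"
    "quality_score,peer_feedback_score,initiatives_proposed,initiatives_shipped\n"
    "Alice Chen,Senior Engineer,47,52,3.2,92,4.5,3,2\n"
    "Bob Park,Junior Dev,28,40,5.1,68,3.2,0,0\n"
)
team_data = csv_to_team_data(sample)
print(json.dumps(team_data, indent=2))
```

Fields that only exist in the JSON format (`role_description`, `okrs`, `deliverables`, `org_context`) have no CSV column, so they are simply absent from the converted data.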
|
||||
**Scoring Dimensions:**
|
||||
|
||||
| Dimension | Weight | What It Measures |
|-----------|--------|-----------------|
| Output Velocity | 30% | Task completion rate + speed |
| Quality | 30% | Deliverable quality + peer feedback |
| Independence | 20% | Self-direction, low management overhead |
| Initiative | 20% | Proactive contributions beyond assigned work |
|
||||
|
||||
**Tier Labels:**
|
||||
|
||||
| Tier | Score | Meaning |
|------|-------|---------|
| 🟢 A-Player | 80+ | Top performer. Promote or retain aggressively. |
| 🟡 B-Player | 55-79 | Solid contributor. Coach to A or maintain. |
| 🔴 C-Player | <55 | Underperforming. Reassign, PIP, or exit. |
|
||||
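
To make the weights and tier cut-offs concrete, here is a minimal sketch of a weighted composite and tier assignment. It assumes each dimension has already been normalized to a 0-100 score; how `team_performance_audit.py` derives those per-dimension scores from raw metrics is not shown here, and the names `composite_score` and `tier_for` are illustrative only.

```python
# Default weights from the Scoring Dimensions table.
DEFAULT_WEIGHTS = {
    "output_velocity": 0.30,
    "quality": 0.30,
    "independence": 0.20,
    "initiative": 0.20,
}


def composite_score(scores: dict, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Weighted average of 0-100 dimension scores."""
    total_weight = sum(weights.values())
    # Dividing by the total keeps the result on a 0-100 scale
    # even when custom weights do not sum exactly to 1.
    return sum(scores[dim] * w for dim, w in weights.items()) / total_weight


def tier_for(score: float) -> str:
    """Map a composite score to the A/B/C tiers (80+, 55-79, below 55)."""
    if score >= 80:
        return "A"
    if score >= 55:
        return "B"
    return "C"


example = {"output_velocity": 90, "quality": 92, "independence": 75, "initiative": 60}
score = composite_score(example)
print(f"{score:.1f} -> {tier_for(score)}-Player")  # prints "81.6 -> A-Player"
```

Passing a different `weights` dictionary mirrors what the `--weights` flag is described as doing; whether the script itself renormalizes weights is an assumption of this sketch.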
|
||||
---
|
||||
|
||||
### 2. 📋 Meeting Action Extractor (`meeting_action_extractor.py`)
|
||||
|
||||
Never lose an action item again. Feed it meeting transcripts; get structured decisions, action items, follow-ups, and insights.
|
||||
|
||||
**What it does:**
|
||||
- Extracts decisions with who made them and context
- Identifies action items with owner, deadline, and priority
- Catches implicit commitments ("I'll take care of that" → action item)
- Flags open questions and unresolved items
- Pulls out key insights and quotable moments
- Identifies follow-up meetings needed
- Assigns confidence scores (1.0 = explicitly stated, 0.8 = strongly implied, 0.5-0.7 = inferred from context, below 0.5 = uncertain)
- Supports batch processing of entire transcript directories
- Optionally pushes action items to HubSpot as tasks
|
||||
|
||||
```bash
# Single transcript → markdown
python3 meeting_action_extractor.py --transcript meeting.txt

# Single transcript → JSON
python3 meeting_action_extractor.py --transcript meeting.txt --format json

# Read from stdin (paste or pipe)
cat meeting.txt | python3 meeting_action_extractor.py --stdin

# Batch process a directory
python3 meeting_action_extractor.py --batch ./transcripts/ --output ./actions/

# Push action items to HubSpot
python3 meeting_action_extractor.py --transcript meeting.txt --push-hubspot

# Dry run
python3 meeting_action_extractor.py --transcript meeting.txt --dry-run
```
|
||||
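
When pushing to HubSpot, the extractor passes the extracted deadline string through as the task's `hs_timestamp`, and its source notes that HubSpot expects an ISO-formatted value. LLM-extracted deadlines are often free text ("Friday March 15"), so a pre-processing step may be needed. The helper below is a hypothetical sketch, not part of the tool: it normalizes ISO dates to UTC and returns `None` for anything it cannot parse.

```python
from datetime import datetime, timezone
from typing import Optional


def deadline_to_iso(deadline: Optional[str]) -> Optional[str]:
    """Return an ISO 8601 UTC timestamp, or None if the deadline is not an ISO date."""
    if not deadline:
        return None
    try:
        parsed = datetime.fromisoformat(deadline.strip())
    except ValueError:
        return None  # free text such as "Friday March 15": leave the due date unset
    if parsed.tzinfo is None:
        # Assumption for this sketch: dates without a timezone are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(deadline_to_iso("2024-03-15"))
print(deadline_to_iso("Friday March 15"))
```

Resolving relative phrases like "by Friday" into real dates needs the meeting date as context, which is outside the scope of this sketch.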
|
||||
**Example Output (Markdown):**
|
||||
|
||||
```markdown
## Action Items

1. 🔴 **Finalize Q2 budget proposal**
   - Owner: **Sarah**
   - Deadline: Friday March 15
   - Confidence: 95%
   - Source: "Sarah, can you get the Q2 budget finalized by Friday?"

2. 🟡 **Look into the API latency issue** *(implicit)*
   - Owner: **Mike**
   - Deadline: No deadline
   - Confidence: 80%
   - Source: "Yeah, I'll look into that"
```
|
||||
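
Because every extracted item carries a `confidence` value, the JSON output is easy to post-process. The snippet below is a sketch, not part of the tool: it splits `action_items` into confident items and ones worth a human review, using a hypothetical threshold of 0.8. The field names follow the JSON schema the extractor asks the LLM to return.

```python
import json

REVIEW_THRESHOLD = 0.8  # hypothetical cut-off; tune for your team


def split_by_confidence(extraction: dict, threshold: float = REVIEW_THRESHOLD):
    """Return (confident, needs_review) lists of action items."""
    confident, needs_review = [], []
    for item in extraction.get("action_items", []):
        bucket = confident if item.get("confidence", 0) >= threshold else needs_review
        bucket.append(item)
    return confident, needs_review


# Inline sample shaped like the output of `--format json`.
extraction = json.loads("""
{
  "action_items": [
    {"action": "Finalize Q2 budget proposal", "owner": "Sarah", "confidence": 0.95},
    {"action": "Look into the API latency issue", "owner": "Mike",
     "is_implicit": true, "confidence": 0.6}
  ]
}
""")

confident, needs_review = split_by_confidence(extraction)
for item in needs_review:
    print(f"Review: {item['action']} (owner: {item['owner']}, {item['confidence']:.0%})")
```

Items missing a `confidence` field are treated as 0 and land in the review list, which errs on the side of a human double-check.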
|
||||
---
|
||||
|
||||
## Quick Start
|
||||
|
||||
### 1. Clone and install
|
||||
|
||||
```bash
git clone https://github.com/singlegrain/ai-marketing-skills.git
cd ai-marketing-skills/team-ops
pip install -r requirements.txt
```
|
||||
|
||||
### 2. Configure environment
|
||||
|
||||
```bash
# Set at least one LLM provider
export ANTHROPIC_API_KEY="sk-ant-..."
# OR
export OPENAI_API_KEY="sk-..."

# Optional: HubSpot for meeting action push
export HUBSPOT_API_KEY="pat-..."

# Optional: Override LLM settings
export LLM_PROVIDER="anthropic"  # or "openai"
export LLM_MODEL="claude-sonnet-4-20250514"  # or "gpt-4o"
```
|
||||
|
||||
### 3. Test with dry runs
|
||||
|
||||
```bash
# Test performance audit (quantitative scoring only)
python3 team_performance_audit.py --input sample_team.json --dry-run

# Test meeting extractor
python3 meeting_action_extractor.py --transcript sample_meeting.txt --dry-run
```
|
||||
|
||||
### 4. Run for real
|
||||
|
||||
```bash
# Full team audit
python3 team_performance_audit.py --input team_data.json --output q1_audit.md

# Extract actions from today's meeting
python3 meeting_action_extractor.py --transcript standup.txt --format markdown

# Batch process last week's meetings
python3 meeting_action_extractor.py --batch ./weekly_transcripts/ --output ./weekly_actions/
```
|
||||
|
||||
---
|
||||
|
||||
## Integrations
|
||||
|
||||
| Tool | Required | Used By |
|------|----------|---------|
| [Anthropic](https://anthropic.com) | One LLM required | Both tools |
| [OpenAI](https://openai.com) | One LLM required | Both tools |
| [HubSpot](https://hubspot.com) | Optional | Meeting Extractor (task push) |
|
||||
|
||||
---
|
||||
|
||||
## File Structure
|
||||
|
||||
```
team-ops/
├── README.md                    # This file
├── SKILL.md                     # Claude Code skill definition
├── requirements.txt             # Python dependencies
├── team_performance_audit.py    # Elon Algorithm team audit
└── meeting_action_extractor.py  # Meeting transcript → action items
```
|
||||
|
||||
---
|
||||
|
||||
## How It Works Together
|
||||
|
||||
1. **Team Performance Audit** gives you the big picture: who's performing, who isn't, where the org is inefficient
|
||||
2. **Meeting Action Extractor** keeps the day-to-day moving: every meeting produces clear, tracked action items
|
||||
3. Together: audit identifies what needs to change, meetings track the execution of those changes
|
||||
|
||||
Run the audit quarterly. Run the extractor after every meeting. Watch accountability compound.
|
||||
|
||||
---
|
||||
|
||||
<div align="center">
|
||||
|
||||
**🧠 [Want these built and managed for you? →](https://singlebrain.com/?utm_source=github&utm_medium=skill_repo&utm_campaign=ai_marketing_skills)**
|
||||
|
||||
*This is how we build agents at [Single Brain](https://singlebrain.com/?utm_source=github&utm_medium=skill_repo&utm_campaign=ai_marketing_skills) for our clients.*
|
||||
|
||||
[Single Grain](https://www.singlegrain.com/?utm_source=github&utm_medium=skill_repo&utm_campaign=ai_marketing_skills) · our marketing agency
|
||||
|
||||
📬 **[Level up your marketing with 14,000+ marketers and founders →](https://levelingup.beehiiv.com/subscribe)** *(free)*
|
||||
|
||||
</div>
|
||||
93
team-ops/SKILL.md
Normal file
|
|
@ -0,0 +1,93 @@
|
|||
# AI Team Ops
|
||||
|
||||
AI-powered team performance analysis and meeting intelligence: ruthless performance audits using the "Elon Algorithm" + automatic extraction of action items, decisions, and follow-ups from meeting transcripts.
|
||||
|
||||
## When to Use
|
||||
|
||||
Use this skill when:
|
||||
- Evaluating team performance against OKRs/KPIs with a structured framework
- Stack ranking team members to identify A/B/C players
- Finding redundant roles, bottlenecks, and automation opportunities in your org
- Extracting action items and decisions from meeting transcripts
- Processing batch meeting notes into structured follow-up lists
- Pushing meeting action items to CRM (HubSpot) as tasks
|
||||
|
||||
## Tools
|
||||
|
||||
### Team Performance
|
||||
|
||||
| Script | Purpose | Key Command |
|--------|---------|-------------|
| `team_performance_audit.py` | Elon Algorithm: 5-step team audit + stack rank + scorecards | `python3 team_performance_audit.py --input team_data.json --output report.md` |
|
||||
|
||||
### Meeting Intelligence
|
||||
|
||||
| Script | Purpose | Key Command |
|--------|---------|-------------|
| `meeting_action_extractor.py` | Extract decisions, actions, follow-ups from transcripts | `python3 meeting_action_extractor.py --transcript meeting.txt --format markdown` |
|
||||
|
||||
## Configuration
|
||||
|
||||
All scripts use environment variables for LLM API access. Copy `.env.example` to `.env` and fill in your values.
|
||||
|
||||
### Required Environment Variables (set at least one)
|
||||
|
||||
- `ANTHROPIC_API_KEY` — Anthropic API key (Claude for analysis)
|
||||
- `OPENAI_API_KEY` — OpenAI API key (alternative LLM provider)
|
||||
|
||||
### Optional Environment Variables
|
||||
|
||||
- `HUBSPOT_API_KEY` — HubSpot private app token (for pushing meeting action items as tasks)
|
||||
- `LLM_PROVIDER` — `anthropic` (default) or `openai`
|
||||
- `LLM_MODEL` — Model name override (default: `claude-sonnet-4-20250514` or `gpt-4o`)
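
As an illustration of how these variables interact, the sketch below resolves the provider, model, and API key variable using the defaults listed here. `resolve_llm_config` is a hypothetical helper written for this example; `meeting_action_extractor.py` performs a similar lookup inside its `call_llm` function.

```python
import os

DEFAULT_MODELS = {"anthropic": "claude-sonnet-4-20250514", "openai": "gpt-4o"}
KEY_VARS = {"anthropic": "ANTHROPIC_API_KEY", "openai": "OPENAI_API_KEY"}


def resolve_llm_config(env=os.environ) -> dict:
    """Work out which provider, model, and API key variable would be used."""
    provider = env.get("LLM_PROVIDER", "anthropic").lower()
    if provider not in DEFAULT_MODELS:
        raise ValueError(f"Unknown LLM_PROVIDER: {provider!r}")
    key_var = KEY_VARS[provider]
    return {
        "provider": provider,
        # An empty or unset LLM_MODEL falls back to the provider default.
        "model": env.get("LLM_MODEL") or DEFAULT_MODELS[provider],
        "api_key_var": key_var,
        "has_api_key": bool(env.get(key_var)),
    }


# Example with an explicit mapping instead of the real environment.
print(resolve_llm_config({"LLM_PROVIDER": "openai", "OPENAI_API_KEY": "sk-example"}))
```

Running a check like this before a batch job makes a missing key visible up front; the extractor itself falls back to an empty placeholder result in that case.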
|
||||
|
||||
## Data Flow
|
||||
|
||||
```
Role Descriptions + OKRs + Output Data (CSV/JSON)
                 │
                 ▼
┌──────────────────────────────────┐
│  team_performance_audit.py       │
│  5-Step Elon Algorithm:          │
│  1. Question requirements        │
│  2. Delete redundancies          │
│  3. Simplify workflows           │
│  4. Accelerate bottlenecks       │
│  5. Automate what's possible     │
│                                  │
│  + Score: velocity, quality,     │
│    independence, initiative      │
│  + Stack rank: A/B/C players     │
│  + Actions: promote/coach/exit   │
└──────────────────────────────────┘
                 │
                 ▼
Executive Summary + Individual Scorecards + Org Recommendations


Meeting Transcripts (text files or stdin)
                 │
                 ▼
┌──────────────────────────────────┐
│  meeting_action_extractor.py     │
│  Extract:                        │
│  • Decisions (who + context)     │
│  • Action items (owner +         │
│    deadline + priority)          │
│  • Open questions                │
│  • Key insights / quotes         │
│  • Follow-up meetings needed     │
│  • Implicit commitments          │
│  + Confidence scores             │
└──────────────────────────────────┘
                 │
                 ▼
Structured JSON / Markdown + Optional CRM Push
```
|
||||
|
||||
## Dependencies
|
||||
|
||||
- Python 3.9+
|
||||
- `anthropic` or `openai` (for LLM-powered analysis)
|
||||
- `requests` (for optional HubSpot integration)
|
||||
666
team-ops/meeting_action_extractor.py
Normal file
|
|
@ -0,0 +1,666 @@
|
|||
#!/usr/bin/env python3
|
||||
"""
|
||||
Meeting-to-Action Extractor
|
||||
|
||||
Takes meeting transcripts and extracts structured action items, decisions,
|
||||
follow-ups, and insights using LLM analysis.
|
||||
|
||||
Usage:
|
||||
# Single transcript
|
||||
python3 meeting_action_extractor.py --transcript meeting.txt
|
||||
|
||||
# Output as JSON
|
||||
python3 meeting_action_extractor.py --transcript meeting.txt --format json
|
||||
|
||||
# Output as markdown (default)
|
||||
python3 meeting_action_extractor.py --transcript meeting.txt --format markdown
|
||||
|
||||
# Batch mode — process a directory of transcripts
|
||||
python3 meeting_action_extractor.py --batch ./transcripts/ --output ./actions/
|
||||
|
||||
# Read from stdin (pipe or paste)
|
||||
cat meeting.txt | python3 meeting_action_extractor.py --stdin
|
||||
|
||||
# Push action items to HubSpot as tasks
|
||||
python3 meeting_action_extractor.py --transcript meeting.txt --push-hubspot
|
||||
|
||||
# Dry run (no LLM calls, shows what would be processed)
|
||||
python3 meeting_action_extractor.py --transcript meeting.txt --dry-run
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import glob
|
||||
import json
|
||||
import os
|
||||
import sys
|
||||
from datetime import datetime
|
||||
from pathlib import Path
|
||||
from typing import Optional
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# LLM Integration
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
EXTRACTION_SYSTEM_PROMPT = """You are an expert meeting analyst. Your job is to extract structured information from meeting transcripts with high accuracy.
|
||||
|
||||
You must return ONLY valid JSON (no markdown, no explanation) matching this exact schema:
|
||||
|
||||
{
|
||||
"meeting_title": "string — inferred from context",
|
||||
"meeting_date": "string — date if mentioned, else 'Unknown'",
|
||||
"attendees": ["string — names mentioned as present"],
|
||||
"decisions": [
|
||||
{
|
||||
"decision": "string — what was decided",
|
||||
"made_by": "string — who made or drove the decision",
|
||||
"context": "string — brief context/reasoning",
|
||||
"confidence": 0.0-1.0
|
||||
}
|
||||
],
|
||||
"action_items": [
|
||||
{
|
||||
"action": "string — specific task to be done",
|
||||
"owner": "string — person responsible",
|
||||
"deadline": "string — deadline if mentioned, else null",
|
||||
"priority": "high|medium|low",
|
||||
"is_implicit": false,
|
||||
"source_quote": "string — the relevant quote from transcript",
|
||||
"confidence": 0.0-1.0
|
||||
}
|
||||
],
|
||||
"open_questions": [
|
||||
{
|
||||
"question": "string — unresolved question or topic",
|
||||
"raised_by": "string — who raised it, if clear",
|
||||
"context": "string — brief context",
|
||||
"confidence": 0.0-1.0
|
||||
}
|
||||
],
|
||||
"key_insights": [
|
||||
{
|
||||
"insight": "string — notable observation, data point, or quotable moment",
|
||||
"speaker": "string — who said it",
|
||||
"quote": "string — direct quote if available",
|
||||
"confidence": 0.0-1.0
|
||||
}
|
||||
],
|
||||
"follow_up_meetings": [
|
||||
{
|
||||
"topic": "string — what needs follow-up discussion",
|
||||
"suggested_attendees": ["string"],
|
||||
"urgency": "high|medium|low",
|
||||
"confidence": 0.0-1.0
|
||||
}
|
||||
]
|
||||
}
|
||||
|
||||
RULES:
|
||||
- Detect implicit commitments. Phrases like "I'll handle that", "let me look into it", "I can take care of that", "we should probably..." are action items.
|
||||
- Assign confidence scores: 1.0 = explicitly stated, 0.8 = strongly implied, 0.5-0.7 = inferred from context, <0.5 = uncertain.
|
||||
- For priority: high = mentioned as urgent/blocking/deadline-sensitive. medium = important but not blocking. low = nice-to-have or background task.
|
||||
- If someone says "I'll do X by Friday" — that's an action item with owner and deadline.
|
||||
- If a question is asked and not answered in the transcript, it's an open question.
|
||||
- Be exhaustive. Missing an action item is worse than including a low-confidence one."""
|
||||
|
||||
|
||||
def call_llm(prompt: str, system_prompt: str = "") -> str:
|
||||
"""
|
||||
Call the configured LLM provider.
|
||||
Set LLM_PROVIDER to 'anthropic' or 'openai'.
|
||||
"""
|
||||
provider = os.getenv("LLM_PROVIDER", "anthropic").lower()
|
||||
model = os.getenv("LLM_MODEL", "")
|
||||
|
||||
if provider == "anthropic":
|
||||
api_key = os.getenv("ANTHROPIC_API_KEY", "")
|
||||
if not api_key:
|
||||
return _fallback_extraction()
|
||||
|
||||
try:
|
||||
import anthropic
|
||||
client = anthropic.Anthropic(api_key=api_key)
|
||||
message = client.messages.create(
|
||||
model=model or "claude-sonnet-4-20250514",
|
||||
max_tokens=4096,
|
||||
system=system_prompt,
|
||||
messages=[{"role": "user", "content": prompt}],
|
||||
)
|
||||
return message.content[0].text
|
||||
except ImportError:
|
||||
print("Warning: 'anthropic' package not installed. Using fallback.", file=sys.stderr)
|
||||
return _fallback_extraction()
|
||||
except Exception as e:
|
||||
print(f"Warning: Anthropic API error: {e}. Using fallback.", file=sys.stderr)
|
||||
return _fallback_extraction()
|
||||
|
||||
elif provider == "openai":
|
||||
api_key = os.getenv("OPENAI_API_KEY", "")
|
||||
if not api_key:
|
||||
return _fallback_extraction()
|
||||
|
||||
try:
|
||||
import openai
|
||||
client = openai.OpenAI(api_key=api_key)
|
||||
response = client.chat.completions.create(
|
||||
model=model or "gpt-4o",
|
||||
messages=[
|
||||
{"role": "system", "content": system_prompt},
|
||||
{"role": "user", "content": prompt},
|
||||
],
|
||||
max_tokens=4096,
|
||||
response_format={"type": "json_object"},
|
||||
)
|
||||
return response.choices[0].message.content
|
||||
except ImportError:
|
||||
print("Warning: 'openai' package not installed. Using fallback.", file=sys.stderr)
|
||||
return _fallback_extraction()
|
||||
except Exception as e:
|
||||
print(f"Warning: OpenAI API error: {e}. Using fallback.", file=sys.stderr)
|
||||
return _fallback_extraction()
|
||||
|
||||
else:
|
||||
print(f"Warning: Unknown LLM provider '{provider}'.", file=sys.stderr)
|
||||
return _fallback_extraction()
|
||||
|
||||
|
||||
def _fallback_extraction() -> str:
|
||||
"""Return a placeholder when no LLM is available."""
|
||||
    return json.dumps({
        "meeting_title": "Unknown (LLM unavailable)",
        "meeting_date": "Unknown",
        "attendees": [],
        "decisions": [],
        "action_items": [],
        "open_questions": [],
        "key_insights": [],
        "follow_up_meetings": [],
        # This fallback is also reached on missing packages, API errors,
        # and unknown providers, so the message covers all of those cases.
        "_error": (
            "LLM unavailable: missing API key, missing client package, or API error. "
            "Set ANTHROPIC_API_KEY or OPENAI_API_KEY and check the warnings on stderr."
        ),
    })
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Extraction
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def extract_from_transcript(transcript: str) -> dict:
|
||||
"""
|
||||
Send a transcript to the LLM and parse the structured extraction.
|
||||
"""
|
||||
# Truncate extremely long transcripts to avoid token limits
|
||||
max_chars = 100_000 # ~25k tokens
|
||||
if len(transcript) > max_chars:
|
||||
print(
|
||||
f"Warning: Transcript truncated from {len(transcript)} to {max_chars} chars.",
|
||||
file=sys.stderr,
|
||||
)
|
||||
transcript = transcript[:max_chars] + "\n\n[TRANSCRIPT TRUNCATED]"
|
||||
|
||||
prompt = f"""Extract all decisions, action items, open questions, key insights, and follow-up meetings from this meeting transcript.
|
||||
|
||||
Return ONLY valid JSON matching the schema in your instructions.
|
||||
|
||||
---
|
||||
TRANSCRIPT:
|
||||
{transcript}
|
||||
---"""
|
||||
|
||||
raw_response = call_llm(prompt, system_prompt=EXTRACTION_SYSTEM_PROMPT)
|
||||
|
||||
    # Parse JSON from response (handle potential markdown wrapping).
    # Candidates are tried in order: the raw response first, then the
    # contents of the first fenced code block, if there is one.
    candidates = [raw_response]
    if "```json" in raw_response:
        candidates.append(raw_response.split("```json", 1)[1].split("```", 1)[0].strip())
    elif "```" in raw_response:
        candidates.append(raw_response.split("```", 1)[1].split("```", 1)[0].strip())

    for candidate in candidates:
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue  # fall through to the next candidate, if any

    # Nothing parsed: return an empty result instead of raising.
    print("Error: Could not parse LLM response as JSON.", file=sys.stderr)
    return {
        "meeting_title": "Parse Error",
        "decisions": [],
        "action_items": [],
        "open_questions": [],
        "key_insights": [],
        "follow_up_meetings": [],
        "_error": f"Failed to parse LLM response. Raw: {raw_response[:500]}",
    }
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Output Formatting
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def format_markdown(extraction: dict, source_file: Optional[str] = None) -> str:
|
||||
"""Format extraction results as readable markdown."""
|
||||
lines = []
|
||||
title = extraction.get("meeting_title", "Meeting Notes")
|
||||
date = extraction.get("meeting_date", "Unknown")
|
||||
|
||||
lines.extend([
|
||||
f"# {title}",
|
||||
"",
|
||||
f"**Date:** {date}",
|
||||
f"**Extracted:** {datetime.now().strftime('%Y-%m-%d %H:%M')}",
|
||||
])
|
||||
|
||||
if source_file:
|
||||
lines.append(f"**Source:** {source_file}")
|
||||
|
||||
attendees = extraction.get("attendees", [])
|
||||
if attendees:
|
||||
lines.append(f"**Attendees:** {', '.join(attendees)}")
|
||||
|
||||
lines.append("")
|
||||
|
||||
# --- Decisions ---
|
||||
decisions = extraction.get("decisions", [])
|
||||
if decisions:
|
||||
lines.extend(["## Decisions Made", ""])
|
||||
for i, d in enumerate(decisions, 1):
|
||||
conf = d.get("confidence", 0)
|
||||
conf_bar = "🟢" if conf >= 0.8 else "🟡" if conf >= 0.5 else "🔴"
|
||||
lines.append(f"{i}. **{d.get('decision', 'Unknown')}**")
|
||||
lines.append(f" - Made by: {d.get('made_by', 'Unknown')}")
|
||||
lines.append(f" - Context: {d.get('context', 'N/A')}")
|
||||
lines.append(f" - Confidence: {conf_bar} {conf:.0%}")
|
||||
lines.append("")
|
||||
|
||||
# --- Action Items ---
|
||||
actions = extraction.get("action_items", [])
|
||||
if actions:
|
||||
lines.extend(["## Action Items", ""])
|
||||
|
||||
# Sort by priority
|
||||
priority_order = {"high": 0, "medium": 1, "low": 2}
|
||||
actions_sorted = sorted(actions, key=lambda a: priority_order.get(a.get("priority", "medium"), 1))
|
||||
|
||||
for i, a in enumerate(actions_sorted, 1):
|
||||
priority = a.get("priority", "medium")
|
||||
priority_emoji = {"high": "🔴", "medium": "🟡", "low": "🟢"}.get(priority, "⚪")
|
||||
implicit_tag = " *(implicit)*" if a.get("is_implicit") else ""
|
||||
deadline = a.get("deadline") or "No deadline"
|
||||
conf = a.get("confidence", 0)
|
||||
|
||||
lines.append(f"{i}. {priority_emoji} **{a.get('action', 'Unknown')}**{implicit_tag}")
|
||||
lines.append(f" - Owner: **{a.get('owner', 'Unassigned')}**")
|
||||
lines.append(f" - Deadline: {deadline}")
|
||||
lines.append(f" - Confidence: {conf:.0%}")
|
||||
if a.get("source_quote"):
|
||||
lines.append(f' - Source: "{a["source_quote"]}"')
|
||||
lines.append("")
|
||||
|
||||
# --- Open Questions ---
|
||||
questions = extraction.get("open_questions", [])
|
||||
if questions:
|
||||
lines.extend(["## Open Questions", ""])
|
||||
for i, q in enumerate(questions, 1):
|
||||
lines.append(f"{i}. **{q.get('question', 'Unknown')}**")
|
||||
if q.get("raised_by"):
|
||||
lines.append(f" - Raised by: {q['raised_by']}")
|
||||
if q.get("context"):
|
||||
lines.append(f" - Context: {q['context']}")
|
||||
lines.append("")
|
||||
|
||||
# --- Key Insights ---
|
||||
insights = extraction.get("key_insights", [])
|
||||
if insights:
|
||||
lines.extend(["## Key Insights", ""])
|
||||
for i, ins in enumerate(insights, 1):
|
||||
lines.append(f"{i}. **{ins.get('insight', 'Unknown')}**")
|
||||
if ins.get("speaker"):
|
||||
lines.append(f" - Speaker: {ins['speaker']}")
|
||||
if ins.get("quote"):
|
||||
lines.append(f' - Quote: "{ins["quote"]}"')
|
||||
lines.append("")
|
||||
|
||||
# --- Follow-up Meetings ---
|
||||
followups = extraction.get("follow_up_meetings", [])
|
||||
if followups:
|
||||
lines.extend(["## Follow-up Meetings Needed", ""])
|
||||
for i, fu in enumerate(followups, 1):
|
||||
urgency_emoji = {"high": "🔴", "medium": "🟡", "low": "🟢"}.get(fu.get("urgency", "medium"), "⚪")
|
||||
lines.append(f"{i}. {urgency_emoji} **{fu.get('topic', 'Unknown')}**")
|
||||
attendees_list = fu.get("suggested_attendees", [])
|
||||
if attendees_list:
|
||||
lines.append(f" - Attendees: {', '.join(attendees_list)}")
|
||||
lines.append("")
|
||||
|
||||
# --- Summary Stats ---
|
||||
lines.extend([
|
||||
"---",
|
||||
"",
|
||||
"### Extraction Summary",
|
||||
f"- Decisions: {len(decisions)}",
|
||||
f"- Action Items: {len(actions)} ({sum(1 for a in actions if a.get('priority') == 'high')} high priority)",
|
||||
f"- Open Questions: {len(questions)}",
|
||||
f"- Key Insights: {len(insights)}",
|
||||
f"- Follow-up Meetings: {len(followups)}",
|
||||
"",
|
||||
"*Generated by Meeting-to-Action Extractor*",
|
||||
])
|
||||
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# HubSpot Integration (stub with real API structure)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def push_to_hubspot(extraction: dict) -> dict:
|
||||
"""
|
||||
Push action items to HubSpot as tasks.
|
||||
|
||||
Requires HUBSPOT_API_KEY env var.
|
||||
Creates a task for each action item with owner, deadline, and priority.
|
||||
|
||||
Returns a summary of created/failed tasks.
|
||||
"""
|
||||
api_key = os.getenv("HUBSPOT_API_KEY", "")
|
||||
if not api_key:
|
||||
return {
|
||||
"success": False,
|
||||
"error": "HUBSPOT_API_KEY not set. Cannot push to HubSpot.",
|
||||
"created": 0,
|
||||
"failed": 0,
|
||||
}
|
||||
|
||||
actions = extraction.get("action_items", [])
|
||||
if not actions:
|
||||
return {"success": True, "created": 0, "failed": 0, "message": "No action items to push."}
|
||||
|
||||
# --- HubSpot Task Creation ---
|
||||
# Uses the HubSpot CRM API v3 to create tasks (engagements)
|
||||
# Docs: https://developers.hubspot.com/docs/api/crm/tasks
|
||||
|
||||
import requests # only imported when actually pushing
|
||||
|
||||
results = {"created": 0, "failed": 0, "errors": []}
|
||||
hubspot_url = "https://api.hubapi.com/crm/v3/objects/tasks"
|
||||
headers = {
|
||||
"Authorization": f"Bearer {api_key}",
|
||||
"Content-Type": "application/json",
|
||||
}
|
||||
|
||||
# Map priority to HubSpot priority values
|
||||
priority_map = {"high": "HIGH", "medium": "MEDIUM", "low": "LOW"}
|
||||
|
||||
for action in actions:
|
||||
# Build the task payload
|
||||
task_body = action.get("action", "Meeting action item")
|
||||
owner_name = action.get("owner", "Unassigned")
|
||||
deadline = action.get("deadline")
|
||||
priority = priority_map.get(action.get("priority", "medium"), "MEDIUM")
|
||||
|
||||
meeting_title = extraction.get("meeting_title", "Meeting")
|
||||
task_subject = f"[{meeting_title}] {task_body[:100]}"
|
||||
|
||||
payload = {
|
||||
"properties": {
|
||||
"hs_task_subject": task_subject,
|
||||
"hs_task_body": (
|
||||
f"Action: {task_body}\n"
|
||||
f"Owner: {owner_name}\n"
|
||||
f"Source: Meeting transcript extraction\n"
|
||||
f"Confidence: {action.get('confidence', 'N/A')}"
|
||||
),
|
||||
"hs_task_status": "NOT_STARTED",
|
||||
"hs_task_priority": priority,
|
||||
}
|
||||
}
|
||||
|
||||
# Add due date if we have a deadline
|
||||
if deadline and deadline.lower() not in ("none", "no deadline", "null", "tbd"):
|
||||
payload["properties"]["hs_timestamp"] = deadline  # must already be an ISO 8601 timestamp; free-text deadlines such as "Friday" will be rejected by HubSpot
|
||||
|
||||
try:
|
||||
resp = requests.post(hubspot_url, headers=headers, json=payload, timeout=10)
|
||||
if resp.status_code in (200, 201):
|
||||
results["created"] += 1
|
||||
else:
|
||||
results["failed"] += 1
|
||||
results["errors"].append(f"Task '{task_body[:50]}': HTTP {resp.status_code}")
|
||||
except requests.RequestException as e:
|
||||
results["failed"] += 1
|
||||
results["errors"].append(f"Task '{task_body[:50]}': {str(e)}")
|
||||
|
||||
results["success"] = results["failed"] == 0
|
||||
return results
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Batch Processing
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def process_batch(directory: str, output_dir: Optional[str], fmt: str, push_hs: bool) -> list[dict]:
|
||||
"""
|
||||
Process all transcript files in a directory.
|
||||
|
||||
Supports .txt, .md, and .json files.
|
||||
"""
|
||||
transcript_files = []
|
||||
for ext in ("*.txt", "*.md", "*.json"):
|
||||
transcript_files.extend(glob.glob(os.path.join(directory, ext)))
|
||||
|
||||
transcript_files.sort()
|
||||
|
||||
if not transcript_files:
|
||||
print(f"No transcript files found in {directory}", file=sys.stderr)
|
||||
return []
|
||||
|
||||
print(f"📂 Found {len(transcript_files)} transcripts to process", file=sys.stderr)
|
||||
|
||||
if output_dir:
|
||||
os.makedirs(output_dir, exist_ok=True)
|
||||
|
||||
results = []
|
||||
for i, filepath in enumerate(transcript_files, 1):
|
||||
filename = os.path.basename(filepath)
|
||||
print(f"\n[{i}/{len(transcript_files)}] Processing: {filename}", file=sys.stderr)
|
||||
|
||||
with open(filepath, "r", encoding="utf-8") as f:
|
||||
transcript = f.read()
|
||||
|
||||
extraction = extract_from_transcript(transcript)
|
||||
extraction["_source_file"] = filepath
|
||||
|
||||
if fmt == "markdown":
|
||||
output = format_markdown(extraction, source_file=filename)
|
||||
ext = ".md"
|
||||
else:
|
||||
output = json.dumps(extraction, indent=2, default=str)
|
||||
ext = ".json"
|
||||
|
||||
if output_dir:
|
||||
out_filename = Path(filename).stem + f"_actions{ext}"
|
||||
out_path = os.path.join(output_dir, out_filename)
|
||||
with open(out_path, "w", encoding="utf-8") as f:  # output contains emoji; do not rely on the platform default encoding
|
||||
f.write(output)
|
||||
print(f" ✅ Saved to {out_path}", file=sys.stderr)
|
||||
else:
|
||||
print(output)
|
||||
print("\n" + "=" * 80 + "\n")
|
||||
|
||||
if push_hs:
|
||||
hs_result = push_to_hubspot(extraction)
|
||||
print(
|
||||
f" 📤 HubSpot: {hs_result.get('created', 0)} created, "
|
||||
f"{hs_result.get('failed', 0)} failed",
|
||||
file=sys.stderr,
|
||||
)
|
||||
|
||||
results.append(extraction)
|
||||
|
||||
return results
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------

def main():
    parser = argparse.ArgumentParser(
        description="Meeting-to-Action Extractor — Extract decisions, action items, and follow-ups from meeting transcripts.",
        epilog="Supports single transcripts, stdin, and batch processing of entire directories.",
    )

    # Input source (mutually exclusive)
    input_group = parser.add_mutually_exclusive_group(required=True)
    input_group.add_argument(
        "--transcript", "-t",
        help="Path to a single transcript file (.txt, .md).",
    )
    input_group.add_argument(
        "--batch", "-b",
        help="Directory of transcript files to process in batch.",
    )
    input_group.add_argument(
        "--stdin",
        action="store_true",
        help="Read transcript from stdin.",
    )

    # Output options
    parser.add_argument(
        "--output", "-o",
        help="Output file (single and stdin modes) or directory (batch mode).",
    )
    parser.add_argument(
        "--format", "-f",
        choices=["markdown", "json"],
        default="markdown",
        help="Output format (default: markdown).",
    )

    # Integration options
    parser.add_argument(
        "--push-hubspot",
        action="store_true",
        help="Push action items to HubSpot as tasks (requires HUBSPOT_API_KEY).",
    )

    # Execution options
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="Show what would be processed without making LLM calls.",
    )

    args = parser.parse_args()

    # --- Single transcript mode ---
    if args.transcript:
        if not os.path.exists(args.transcript):
            print(f"Error: File not found: {args.transcript}", file=sys.stderr)
            sys.exit(1)

        # Read as UTF-8 explicitly so behavior does not depend on the platform locale
        with open(args.transcript, "r", encoding="utf-8") as f:
            transcript = f.read()

        if args.dry_run:
            word_count = len(transcript.split())
            print(f"📄 Would process: {args.transcript} ({word_count} words, {len(transcript)} chars)")
            print(f" Format: {args.format}")
            print(f" HubSpot push: {'yes' if args.push_hubspot else 'no'}")
            return

        print(f"📄 Processing: {args.transcript}", file=sys.stderr)
        extraction = extract_from_transcript(transcript)

        if args.format == "markdown":
            output = format_markdown(extraction, source_file=args.transcript)
        else:
            output = json.dumps(extraction, indent=2, default=str)

        if args.output:
            with open(args.output, "w", encoding="utf-8") as f:
                f.write(output)
            print(f"✅ Written to {args.output}", file=sys.stderr)
        else:
            print(output)

        if args.push_hubspot:
            hs_result = push_to_hubspot(extraction)
            print(
                f"📤 HubSpot: {hs_result.get('created', 0)} tasks created, "
                f"{hs_result.get('failed', 0)} failed",
                file=sys.stderr,
            )

        # Print summary to stderr
        actions = extraction.get("action_items", [])
        decisions = extraction.get("decisions", [])
        print(
            f"\n📊 Extracted: {len(decisions)} decisions, {len(actions)} action items "
            f"({sum(1 for a in actions if a.get('priority') == 'high')} high priority)",
            file=sys.stderr,
        )

    # --- Batch mode ---
    elif args.batch:
        if not os.path.isdir(args.batch):
            print(f"Error: Directory not found: {args.batch}", file=sys.stderr)
            sys.exit(1)

        if args.dry_run:
            files = []
            for ext in ("*.txt", "*.md", "*.json"):
                files.extend(glob.glob(os.path.join(args.batch, ext)))
            print(f"📂 Would process {len(files)} files from {args.batch}:")
            for f in sorted(files):
                print(f" - {os.path.basename(f)}")
            return

        results = process_batch(args.batch, args.output, args.format, args.push_hubspot)

        # Batch summary
        total_actions = sum(len(r.get("action_items", [])) for r in results)
        total_decisions = sum(len(r.get("decisions", [])) for r in results)
        print(
            f"\n📊 Batch complete: {len(results)} transcripts → "
            f"{total_decisions} decisions, {total_actions} action items",
            file=sys.stderr,
        )

    # --- Stdin mode ---
    elif args.stdin:
        if args.dry_run:
            print("📄 Would process transcript from stdin")
            return

        print("📄 Reading transcript from stdin...", file=sys.stderr)
        transcript = sys.stdin.read()

        if not transcript.strip():
            print("Error: Empty input.", file=sys.stderr)
            sys.exit(1)

        extraction = extract_from_transcript(transcript)

        if args.format == "markdown":
            output = format_markdown(extraction)
        else:
            output = json.dumps(extraction, indent=2, default=str)

        if args.output:
            with open(args.output, "w", encoding="utf-8") as f:
                f.write(output)
            print(f"✅ Written to {args.output}", file=sys.stderr)
        else:
            print(output)

        if args.push_hubspot:
            hs_result = push_to_hubspot(extraction)
            # Report failures too, matching single-transcript mode
            print(
                f"📤 HubSpot: {hs_result.get('created', 0)} tasks created, "
                f"{hs_result.get('failed', 0)} failed",
                file=sys.stderr,
            )


if __name__ == "__main__":
    main()
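The input-source flags above rely on an `argparse` mutually exclusive group with `required=True`, so exactly one of `--transcript`, `--batch`, or `--stdin` must be given. The following is a minimal sketch of that behavior in isolation. `build_parser` and `input_mode` are hypothetical helper names introduced only for illustration; they are not part of the extractor.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical, trimmed-down parser mirroring the input-source flags."""
    parser = argparse.ArgumentParser(prog="extractor-sketch")
    # required=True: argparse exits with an error unless exactly one is given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--transcript", "-t")
    source.add_argument("--batch", "-b")
    source.add_argument("--stdin", action="store_true")
    parser.add_argument("--dry-run", action="store_true")
    return parser


def input_mode(argv: list[str]) -> str:
    """Return which input mode the given argument list selects."""
    args = build_parser().parse_args(argv)
    if args.transcript:
        return "single"
    if args.batch:
        return "batch"
    return "stdin"


print(input_mode(["--transcript", "standup.txt"]))  # single
print(input_mode(["--stdin", "--dry-run"]))          # stdin
```

Passing two sources at once (for example `--stdin --batch notes/`) makes `argparse` print a usage error and raise `SystemExit`, which is why `main()` never needs to check for conflicting modes itself.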
6
team-ops/requirements.txt
Normal file
@@ -0,0 +1,6 @@
# Core LLM providers (install at least one)
anthropic>=0.39.0
openai>=1.50.0

# For HubSpot CRM integration (optional)
requests>=2.31.0
746
team-ops/team_performance_audit.py
Normal file
@@ -0,0 +1,746 @@
#!/usr/bin/env python3
"""
Team Performance Audit — The "Elon Algorithm"

A ruthless, structured team performance evaluation framework.
5-step analysis → individual scorecards → stack rank → recommended actions.

The 5 Steps:
    1. Question every requirement — is this role/task actually necessary?
    2. Delete redundant processes — flag overlap between team members
    3. Simplify — identify overcomplicated workflows
    4. Accelerate — find bottlenecks slowing the team
    5. Automate — flag tasks that AI/automation could handle

Usage:
    # Analyze from JSON input
    python3 team_performance_audit.py --input team_data.json --output report.md

    # Analyze from CSV
    python3 team_performance_audit.py --input team_data.csv --output report.md

    # Dry run (print to stdout, no LLM calls)
    python3 team_performance_audit.py --input team_data.json --dry-run

    # JSON output instead of markdown
    python3 team_performance_audit.py --input team_data.json --format json --output report.json

Input format (JSON):
    {
        "team_members": [
            {
                "name": "Alice Chen",
                "role": "Senior Engineer",
                "role_description": "Owns backend API development and database optimization",
                "okrs": [
                    {"objective": "Reduce API latency", "key_result": "P95 < 200ms", "progress": 0.85}
                ],
                "metrics": {
                    "tasks_completed": 47,
                    "tasks_assigned": 52,
                    "avg_completion_days": 3.2,
                    "quality_score": 92,
                    "peer_feedback_score": 4.5,
                    "initiatives_proposed": 3,
                    "initiatives_shipped": 2
                },
                "deliverables": [
                    {"name": "API v2 Migration", "status": "completed", "date": "2024-02-15"},
                    {"name": "DB Index Optimization", "status": "completed", "date": "2024-03-01"}
                ]
            }
        ],
        "org_context": {
            "company_goals": ["Ship v3 by Q2", "Reduce infrastructure costs 30%"],
            "team_size": 12,
            "evaluation_period": "Q1 2024"
        }
    }

Input format (CSV):
    name,role,tasks_completed,tasks_assigned,avg_completion_days,quality_score,peer_feedback_score,initiatives_proposed,initiatives_shipped
    Alice Chen,Senior Engineer,47,52,3.2,92,4.5,3,2
"""

import argparse
import csv
import json
import os
import sys
from datetime import datetime
from typing import Any

# ---------------------------------------------------------------------------
# LLM Integration (stubs with real API structure)
# ---------------------------------------------------------------------------

def call_llm(prompt: str, system_prompt: str = "") -> str:
    """
    Call the configured LLM provider for analysis.

    Supports Anthropic (Claude) and OpenAI (GPT-4).
    Set LLM_PROVIDER env var to 'anthropic' or 'openai'.
    Set the corresponding API key env var.
    Optionally set LLM_MODEL to override the provider's default model.

    Returns the LLM response text, or a placeholder notice if the provider is
    unknown, the API key is missing, the package is not installed, or the
    API call fails.
    """
    provider = os.getenv("LLM_PROVIDER", "anthropic").lower()
    model = os.getenv("LLM_MODEL", "")

    if provider == "anthropic":
        api_key = os.getenv("ANTHROPIC_API_KEY", "")
        if not api_key:
            return _fallback_analysis(prompt)

        # --- Anthropic API call ---
        try:
            import anthropic
            client = anthropic.Anthropic(api_key=api_key)
            message = client.messages.create(
                model=model or "claude-sonnet-4-20250514",
                max_tokens=4096,
                system=system_prompt or "You are an expert organizational analyst and management consultant.",
                messages=[{"role": "user", "content": prompt}],
            )
            return message.content[0].text
        except ImportError:
            print("Warning: 'anthropic' package not installed. Using fallback analysis.", file=sys.stderr)
            return _fallback_analysis(prompt)
        except Exception as e:
            print(f"Warning: Anthropic API error: {e}. Using fallback analysis.", file=sys.stderr)
            return _fallback_analysis(prompt)

    elif provider == "openai":
        api_key = os.getenv("OPENAI_API_KEY", "")
        if not api_key:
            return _fallback_analysis(prompt)

        # --- OpenAI API call ---
        try:
            import openai
            client = openai.OpenAI(api_key=api_key)
            response = client.chat.completions.create(
                model=model or "gpt-4o",
                messages=[
                    {"role": "system", "content": system_prompt or "You are an expert organizational analyst and management consultant."},
                    {"role": "user", "content": prompt},
                ],
                max_tokens=4096,
            )
            # message.content can be None; keep the declared str return type
            return response.choices[0].message.content or ""
        except ImportError:
            print("Warning: 'openai' package not installed. Using fallback analysis.", file=sys.stderr)
            return _fallback_analysis(prompt)
        except Exception as e:
            print(f"Warning: OpenAI API error: {e}. Using fallback analysis.", file=sys.stderr)
            return _fallback_analysis(prompt)

    else:
        print(f"Warning: Unknown LLM provider '{provider}'. Using fallback.", file=sys.stderr)
        return _fallback_analysis(prompt)


def _fallback_analysis(prompt: str) -> str:
    """Fallback when no LLM API is available. Returns a notice."""
    return (
        "[LLM analysis unavailable — set ANTHROPIC_API_KEY or OPENAI_API_KEY]\n"
        "The quantitative scores in this report are computed locally. "
        "For qualitative analysis (redundancy detection, simplification recommendations, "
        "automation opportunities), configure an LLM provider."
    )

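`call_llm` picks its provider, model, and API key from environment variables and silently degrades to the fallback notice when something is missing. A small sketch like the one below can help check a configuration before running an audit. `resolve_llm_config` is a hypothetical helper, not part of the script; the default model names and variable names are copied from `call_llm` above.

```python
import os
from typing import Mapping

# Defaults copied from call_llm; the helper itself is hypothetical.
DEFAULT_MODELS = {
    "anthropic": "claude-sonnet-4-20250514",
    "openai": "gpt-4o",
}
API_KEY_VARS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def resolve_llm_config(env: Mapping[str, str] = os.environ) -> dict:
    """Report which provider and model would be used, and whether a key is set."""
    provider = env.get("LLM_PROVIDER", "anthropic").lower()
    if provider not in DEFAULT_MODELS:
        # call_llm would warn and fall back for an unknown provider.
        return {"provider": provider, "model": None, "usable": False}
    model = env.get("LLM_MODEL", "") or DEFAULT_MODELS[provider]
    usable = bool(env.get(API_KEY_VARS[provider], ""))
    return {"provider": provider, "model": model, "usable": usable}


print(resolve_llm_config({"LLM_PROVIDER": "openai", "OPENAI_API_KEY": "sk-test"}))
```

Here `usable` only means an API key is present. It does not confirm that the provider package is installed or that the key is valid, both of which `call_llm` handles at call time.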
# ---------------------------------------------------------------------------
# Data Loading
# ---------------------------------------------------------------------------

def load_json_input(filepath: str) -> dict:
    """Load team data from a JSON file."""
    with open(filepath, "r", encoding="utf-8") as f:
        data = json.load(f)

    if "team_members" not in data:
        raise ValueError("JSON input must contain a 'team_members' array.")
    return data


def _csv_number(row: dict, key: str) -> float:
    """Read a numeric CSV cell as a float, treating missing or blank cells as 0."""
    raw = (row.get(key) or "").strip()
    return float(raw) if raw else 0.0


def load_csv_input(filepath: str) -> dict:
    """
    Load team data from a CSV file.

    Expected columns: name, role, tasks_completed, tasks_assigned,
    avg_completion_days, quality_score, peer_feedback_score,
    initiatives_proposed, initiatives_shipped
    """
    team_members = []
    with open(filepath, "r", newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        for row in reader:
            member = {
                "name": row.get("name") or "Unknown",
                "role": row.get("role") or "Unknown",
                "role_description": row.get("role_description") or "",
                "okrs": [],
                "metrics": {
                    "tasks_completed": int(_csv_number(row, "tasks_completed")),
                    "tasks_assigned": int(_csv_number(row, "tasks_assigned")),
                    "avg_completion_days": _csv_number(row, "avg_completion_days"),
                    "quality_score": _csv_number(row, "quality_score"),
                    "peer_feedback_score": _csv_number(row, "peer_feedback_score"),
                    "initiatives_proposed": int(_csv_number(row, "initiatives_proposed")),
                    "initiatives_shipped": int(_csv_number(row, "initiatives_shipped")),
                },
                "deliverables": [],
            }
            team_members.append(member)

    return {"team_members": team_members, "org_context": {}}


def load_input(filepath: str) -> dict:
    """Load team data from JSON or CSV based on file extension."""
    if filepath.lower().endswith(".csv"):
        return load_csv_input(filepath)
    else:
        return load_json_input(filepath)

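To make the CSV path concrete, the sketch below parses the sample row from the module docstring into the same `metrics` shape the scoring functions expect. It reads from an in-memory string rather than a file, and `parse_members`, `INT_FIELDS`, and `FLOAT_FIELDS` are hypothetical names used only for this illustration.

```python
import csv
import io

# Header and row taken from the "Input format (CSV)" docstring example.
SAMPLE_CSV = (
    "name,role,tasks_completed,tasks_assigned,avg_completion_days,"
    "quality_score,peer_feedback_score,initiatives_proposed,initiatives_shipped\n"
    "Alice Chen,Senior Engineer,47,52,3.2,92,4.5,3,2\n"
)

INT_FIELDS = ("tasks_completed", "tasks_assigned",
              "initiatives_proposed", "initiatives_shipped")
FLOAT_FIELDS = ("avg_completion_days", "quality_score", "peer_feedback_score")


def parse_members(text: str) -> list[dict]:
    """Turn CSV text into a list of member dicts with typed metrics."""
    members = []
    for row in csv.DictReader(io.StringIO(text)):
        # csv yields strings, so every metric needs an explicit cast.
        metrics = {key: int(row[key]) for key in INT_FIELDS}
        metrics.update({key: float(row[key]) for key in FLOAT_FIELDS})
        members.append({"name": row["name"], "role": row["role"], "metrics": metrics})
    return members


members = parse_members(SAMPLE_CSV)
print(members[0]["metrics"])
```

Note that CSV input carries no OKRs, deliverables, or `org_context`, so reports built from CSV omit those sections.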
# ---------------------------------------------------------------------------
# Scoring Engine
# ---------------------------------------------------------------------------

# Weight configuration for the composite score
SCORE_WEIGHTS = {
    "output_velocity": 0.30,  # Speed and throughput
    "quality": 0.30,          # Quality of deliverables
    "independence": 0.20,     # Self-direction, low management overhead
    "initiative": 0.20,       # Proactive contributions beyond assigned work
}

# Tier thresholds (minimum composite score for each tier)
TIER_THRESHOLDS = {
    "A": 80,  # A-player: top performers, promote/retain
    "B": 55,  # B-player: solid contributors, coach to A or maintain
    "C": 0,   # C-player: underperforming, reassign or exit
}


def compute_output_velocity(metrics: dict) -> float:
    """
    Score output velocity (0-100).

    Factors:
    - Task completion rate (completed / assigned)
    - Speed (inverse of avg_completion_days, normalized)
    """
    completed = metrics.get("tasks_completed", 0)
    assigned = metrics.get("tasks_assigned", 1)  # avoid division by zero
    avg_days = metrics.get("avg_completion_days", 5)

    # Completion rate: 0-60 points
    completion_rate = min(completed / max(assigned, 1), 1.0)
    completion_score = completion_rate * 60

    # Speed: 0-40 points (<=1 day earns all 40, >=10 days earns 0, linear in between)
    if avg_days <= 1:
        speed_score = 40
    elif avg_days >= 10:
        speed_score = 0
    else:
        speed_score = max(0, 40 * (1 - (avg_days - 1) / 9))

    return round(completion_score + speed_score, 1)

def compute_quality(metrics: dict) -> float:
    """
    Score quality (0-100).

    Factors:
    - Quality score from reviews/metrics (0-100 scale expected)
    - Peer feedback score (1-5 scale, normalized to 0-100)
    """
    quality_raw = metrics.get("quality_score", 50)
    peer_score = metrics.get("peer_feedback_score", 3.0)

    # Quality component: 60% weight (clamped to the expected 0-100 range)
    quality_component = min(max(quality_raw, 0), 100) * 0.6

    # Peer feedback: 40% weight (1-5 scale → 0-100)
    peer_normalized = max(0, min((peer_score - 1) / 4 * 100, 100))
    peer_component = peer_normalized * 0.4

    return round(quality_component + peer_component, 1)


def compute_independence(metrics: dict) -> float:
    """
    Score independence (0-100).

    Heuristic based on:
    - High completion rate (doesn't need hand-holding)
    - Peer feedback as proxy for collaboration without dependency

    avg_completion_days is not used here; it is scored under output velocity.

    Note: For richer scoring, add fields like 'escalations_to_manager',
    'blockers_raised', 'self_unblocked_count' to your input data.
    """
    completed = metrics.get("tasks_completed", 0)
    assigned = metrics.get("tasks_assigned", 1)
    peer_score = metrics.get("peer_feedback_score", 3.0)

    # Completion without escalation proxy: 60% weight
    completion_rate = min(completed / max(assigned, 1), 1.0)
    completion_component = completion_rate * 60

    # Peer score as collaboration proxy: 40% weight
    peer_normalized = max(0, min((peer_score - 1) / 4 * 100, 100))
    peer_component = peer_normalized * 0.4

    return round(completion_component + peer_component, 1)

def compute_initiative(metrics: dict) -> float:
    """
    Score initiative (0-100).

    Factors:
    - Initiatives proposed (ideas beyond assigned work)
    - Initiatives shipped (executed, not just suggested)
    - Ship rate (proposed → shipped conversion)
    """
    proposed = metrics.get("initiatives_proposed", 0)
    shipped = metrics.get("initiatives_shipped", 0)

    # Volume: 0-50 points (caps at 5+ proposed)
    volume_score = min(proposed / 5, 1.0) * 50

    # Ship rate: 0-30 points
    if proposed > 0:
        ship_rate = min(shipped / proposed, 1.0)
        ship_score = ship_rate * 30
    else:
        ship_score = 0

    # Shipped count bonus: 0-20 points (caps at 3+ shipped)
    shipped_bonus = min(shipped / 3, 1.0) * 20

    return round(volume_score + ship_score + shipped_bonus, 1)


def compute_composite_score(metrics: dict) -> dict:
    """Compute all dimension scores and weighted composite."""
    velocity = compute_output_velocity(metrics)
    quality = compute_quality(metrics)
    independence = compute_independence(metrics)
    initiative = compute_initiative(metrics)

    composite = (
        velocity * SCORE_WEIGHTS["output_velocity"]
        + quality * SCORE_WEIGHTS["quality"]
        + independence * SCORE_WEIGHTS["independence"]
        + initiative * SCORE_WEIGHTS["initiative"]
    )

    # Determine tier
    if composite >= TIER_THRESHOLDS["A"]:
        tier = "A"
    elif composite >= TIER_THRESHOLDS["B"]:
        tier = "B"
    else:
        tier = "C"

    return {
        "output_velocity": velocity,
        "quality": quality,
        "independence": independence,
        "initiative": initiative,
        "composite": round(composite, 1),
        "tier": tier,
    }

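As a worked example of the scoring above, the sketch below applies the same formulas, default weights, and tier thresholds to the sample metrics for Alice Chen from the module docstring. It is a condensed restatement written for illustration (the `score` and `peer_to_100` names are hypothetical), not a replacement for the functions in the script.

```python
# Default weights copied from SCORE_WEIGHTS.
WEIGHTS = {"output_velocity": 0.30, "quality": 0.30,
           "independence": 0.20, "initiative": 0.20}


def peer_to_100(peer: float) -> float:
    """Map a 1-5 peer feedback score onto 0-100."""
    return max(0, min((peer - 1) / 4 * 100, 100))


def score(m: dict) -> dict:
    """Condensed version of the four dimension scores plus composite and tier."""
    rate = min(m["tasks_completed"] / max(m["tasks_assigned"], 1), 1.0)
    days = m["avg_completion_days"]
    speed = 40 if days <= 1 else 0 if days >= 10 else 40 * (1 - (days - 1) / 9)
    peer = peer_to_100(m["peer_feedback_score"])
    proposed, shipped = m["initiatives_proposed"], m["initiatives_shipped"]
    ship_rate = min(shipped / proposed, 1.0) if proposed > 0 else 0.0
    dims = {
        "output_velocity": round(rate * 60 + speed, 1),
        "quality": round(min(m["quality_score"], 100) * 0.6 + peer * 0.4, 1),
        "independence": round(rate * 60 + peer * 0.4, 1),
        "initiative": round(min(proposed / 5, 1.0) * 50 + ship_rate * 30
                            + min(shipped / 3, 1.0) * 20, 1),
    }
    composite = round(sum(dims[k] * w for k, w in WEIGHTS.items()), 1)
    tier = "A" if composite >= 80 else "B" if composite >= 55 else "C"
    return {**dims, "composite": composite, "tier": tier}


alice = {"tasks_completed": 47, "tasks_assigned": 52, "avg_completion_days": 3.2,
         "quality_score": 92, "peer_feedback_score": 4.5,
         "initiatives_proposed": 3, "initiatives_shipped": 2}
result = score(alice)
print(result)
# velocity 84.5, quality 90.2, independence 89.2, initiative 63.3 -> composite 82.9, tier A
```

The arithmetic: completion rate 47/52 contributes about 54.2 of 60 points, and 3.2 days earns about 30.2 of 40 speed points. The composite is 84.5 × 0.3 + 90.2 × 0.3 + 89.2 × 0.2 + 63.3 × 0.2 = 82.91, which clears the A threshold of 80 even though initiative is the weakest dimension.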
def recommend_action(tier: str, scores: dict) -> str:
    """Generate recommended action based on tier and score profile."""
    if tier == "A":
        if scores["initiative"] >= 80:
            return "PROMOTE — High performer with strong initiative. Leadership candidate."
        return "RETAIN & REWARD — Top performer. Ensure compensation and growth path are competitive."

    elif tier == "B":
        weakest = min(
            ["output_velocity", "quality", "independence", "initiative"],
            key=lambda k: scores[k],
        )
        weak_labels = {
            "output_velocity": "speed/throughput",
            "quality": "deliverable quality",
            "independence": "self-direction",
            "initiative": "proactive contribution",
        }
        return f"COACH — Solid contributor. Focus development on {weak_labels[weakest]} (score: {scores[weakest]})."

    else:  # C
        if scores["composite"] < 30:
            return "EXIT — Significant underperformance across dimensions. Consider transition plan."
        return "REASSIGN or PIP — Underperforming in current role. Evaluate fit for different position."

# ---------------------------------------------------------------------------
# Elon Algorithm: 5-Step Analysis (LLM-powered)
# ---------------------------------------------------------------------------

def run_elon_algorithm(data: dict) -> str:
    """
    Run the 5-step Elon Algorithm analysis using LLM.

    Steps:
    1. Question every requirement
    2. Delete redundant processes
    3. Simplify workflows
    4. Accelerate bottlenecks
    5. Automate what's possible
    """
    team_summary = []
    for m in data["team_members"]:
        team_summary.append(
            f"- {m['name']} ({m['role']}): {m.get('role_description', 'No description')}"
        )

    org_ctx = data.get("org_context", {})
    goals = org_ctx.get("company_goals", ["Not specified"])

    prompt = f"""Analyze this team using the Elon Algorithm — a ruthless 5-step organizational optimization framework.

## Team ({len(data['team_members'])} members)
{chr(10).join(team_summary)}

## Company Goals
{chr(10).join(f'- {g}' for g in goals)}

## Evaluation Period
{org_ctx.get('evaluation_period', 'Current quarter')}

## Full Team Data
{json.dumps(data['team_members'], indent=2, default=str)}

---

For each of the 5 steps below, provide SPECIFIC, ACTIONABLE findings (not generic advice):

### Step 1: Question Every Requirement
For each role, ask: Is this role necessary? Is every task they do necessary? Could the team function without this position? Which of their tasks have no clear connection to company goals?

### Step 2: Delete Redundant Processes
Identify: Overlapping responsibilities between team members. Duplicate efforts. Roles that could be consolidated. Meetings or processes that exist by inertia.

### Step 3: Simplify
Find: Overcomplicated workflows. Multi-step processes that could be 1-2 steps. Unnecessary approval chains. Reports nobody reads.

### Step 4: Accelerate
Identify bottlenecks: Who/what is the slowest link? Where do tasks get stuck? What dependencies create wait times? What would unblock the most throughput?

### Step 5: Automate
Flag tasks ripe for AI/automation: Data entry, reporting, scheduling, template-based work, monitoring, routing, classification. Estimate effort saved.

Be specific. Name names. Reference actual data. This is a performance audit, not a feel-good exercise."""

    return call_llm(
        prompt,
        system_prompt=(
            "You are a ruthless organizational efficiency consultant. "
            "Your job is to find waste, redundancy, and inefficiency. "
            "Be direct, specific, and actionable. Name names when the data supports it. "
            "Do not hedge or soften findings."
        ),
    )

# ---------------------------------------------------------------------------
# Report Generation
# ---------------------------------------------------------------------------

def generate_scorecards(data: dict) -> list[dict]:
    """Score every team member and return sorted scorecards."""
    scorecards = []
    for member in data["team_members"]:
        metrics = member.get("metrics", {})
        scores = compute_composite_score(metrics)
        action = recommend_action(scores["tier"], scores)

        # OKR progress summary (mean of per-OKR progress, each expected in 0-1)
        okrs = member.get("okrs", [])
        okr_avg = 0.0
        if okrs:
            okr_avg = sum(o.get("progress", 0) for o in okrs) / len(okrs)

        scorecards.append({
            # Tolerate JSON entries that omit name or role
            "name": member.get("name", "Unknown"),
            "role": member.get("role", "Unknown"),
            "scores": scores,
            "action": action,
            "okr_progress": round(okr_avg * 100, 1),
            "deliverables": member.get("deliverables", []),
        })

    # Sort by composite score descending (stack rank)
    scorecards.sort(key=lambda x: x["scores"]["composite"], reverse=True)

    # Add rank
    for i, sc in enumerate(scorecards, 1):
        sc["rank"] = i

    return scorecards

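The stack rank is simply a descending sort on the composite score followed by 1-based numbering. A minimal sketch with made-up names and scores is shown below; `stack_rank` is a hypothetical helper that returns new dicts instead of mutating in place as `generate_scorecards` does.

```python
def stack_rank(scorecards: list[dict]) -> list[dict]:
    """Sort by composite score (highest first) and attach a 1-based rank."""
    ranked = sorted(scorecards, key=lambda sc: sc["composite"], reverse=True)
    return [{**sc, "rank": i} for i, sc in enumerate(ranked, 1)]


# Hypothetical team; names and scores are invented for illustration.
team = [
    {"name": "Bo", "composite": 61.0},
    {"name": "Alice", "composite": 82.9},
    {"name": "Cy", "composite": 61.0},
]
ranked = stack_rank(team)
for sc in ranked:
    print(sc["rank"], sc["name"], sc["composite"])
```

Python's sort is stable, including with `reverse=True`, so members with equal composites (Bo and Cy here) keep their input order and receive consecutive ranks rather than a shared one. If tied ranks matter for your process, that would need separate handling.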
def format_markdown_report(scorecards: list[dict], elon_analysis: str, data: dict) -> str:
    """Generate the full markdown report."""
    org_ctx = data.get("org_context", {})
    now = datetime.now().strftime("%Y-%m-%d %H:%M")

    lines = [
        "# Team Performance Audit",
        "",
        f"**Generated:** {now}",
        f"**Team Size:** {len(scorecards)}",
        f"**Period:** {org_ctx.get('evaluation_period', 'Current')}",
        "",
    ]

    # --- Executive Summary ---
    a_count = sum(1 for s in scorecards if s["scores"]["tier"] == "A")
    b_count = sum(1 for s in scorecards if s["scores"]["tier"] == "B")
    c_count = sum(1 for s in scorecards if s["scores"]["tier"] == "C")
    avg_composite = sum(s["scores"]["composite"] for s in scorecards) / max(len(scorecards), 1)

    lines.extend([
        "## Executive Summary",
        "",
        "| Metric | Value |",
        "|--------|-------|",
        f"| Team Average Score | {avg_composite:.1f}/100 |",
        f"| A-Players | {a_count} ({a_count/max(len(scorecards),1)*100:.0f}%) |",
        f"| B-Players | {b_count} ({b_count/max(len(scorecards),1)*100:.0f}%) |",
        f"| C-Players | {c_count} ({c_count/max(len(scorecards),1)*100:.0f}%) |",
        "",
    ])

    # Health assessment
    if a_count / max(len(scorecards), 1) >= 0.3:
        lines.append("**Assessment:** Strong team core. Focus on coaching B-players up and addressing C-players decisively.")
    elif c_count / max(len(scorecards), 1) >= 0.3:
        lines.append("**Assessment:** ⚠️ Significant underperformance. Org restructuring recommended.")
    else:
        lines.append("**Assessment:** Average team composition. Targeted development can move the needle.")

    lines.append("")

    # --- Stack Rank ---
    lines.extend([
        "## Stack Rank",
        "",
        "| Rank | Name | Role | Composite | Tier | Action |",
        "|------|------|------|-----------|------|--------|",
    ])
    for sc in scorecards:
        tier_emoji = {"A": "🟢", "B": "🟡", "C": "🔴"}[sc["scores"]["tier"]]
        lines.append(
            f"| {sc['rank']} | {sc['name']} | {sc['role']} | "
            f"{sc['scores']['composite']} | {tier_emoji} {sc['scores']['tier']} | "
            f"{sc['action'].split(' — ')[0]} |"
        )
    lines.append("")

    # --- Elon Algorithm Analysis ---
    lines.extend([
        "## Elon Algorithm — 5-Step Analysis",
        "",
        elon_analysis,
        "",
    ])

    # --- Individual Scorecards ---
    lines.extend([
        "## Individual Scorecards",
        "",
    ])

    for sc in scorecards:
        tier_emoji = {"A": "🟢", "B": "🟡", "C": "🔴"}[sc["scores"]["tier"]]
        scores = sc["scores"]
        lines.extend([
            f"### #{sc['rank']} — {sc['name']} ({sc['role']})",
            "",
            f"**Tier:** {tier_emoji} {scores['tier']}-Player | **Composite:** {scores['composite']}/100",
            "",
            "| Dimension | Score |",
            "|-----------|-------|",
            f"| Output Velocity | {scores['output_velocity']}/100 |",
            f"| Quality | {scores['quality']}/100 |",
            f"| Independence | {scores['independence']}/100 |",
            f"| Initiative | {scores['initiative']}/100 |",
            "",
        ])

        if sc["okr_progress"] > 0:
            lines.append(f"**OKR Progress:** {sc['okr_progress']}%")
            lines.append("")

        if sc["deliverables"]:
            lines.append("**Recent Deliverables:**")
            for d in sc["deliverables"]:
                status_emoji = "✅" if d.get("status") == "completed" else "🔄"
                lines.append(f"- {status_emoji} {d.get('name', 'Unknown')} ({d.get('status', 'unknown')}, {d.get('date', 'no date')})")
            lines.append("")

        lines.append(f"**Recommended Action:** {sc['action']}")
        lines.append("")
        lines.append("---")
        lines.append("")

    # --- Org-Level Recommendations ---
    lines.extend([
        "## Org-Level Recommendations",
        "",
        f"1. **Immediate:** Address {c_count} C-player(s) — each underperformer costs the team velocity.",
        f"2. **Short-term:** Invest in coaching for {b_count} B-player(s) — targeted development on their weakest dimension.",
        f"3. **Strategic:** Retain and challenge {a_count} A-player(s) — they leave when bored, not when overworked.",
    ])

    if avg_composite < 60:
        lines.append("4. **Warning:** Team average below 60. Consider structural changes, not just individual coaching.")

    lines.append("")
    lines.append("---")
    lines.append("*Generated by Team Performance Audit (Elon Algorithm)*")

    return "\n".join(lines)

def format_json_report(scorecards: list[dict], elon_analysis: str, data: dict) -> str:
    """Generate the full JSON report."""
    org_ctx = data.get("org_context", {})
    report = {
        "generated": datetime.now().isoformat(),
        "team_size": len(scorecards),
        "evaluation_period": org_ctx.get("evaluation_period", "Current"),
        "summary": {
            "average_composite": round(
                sum(s["scores"]["composite"] for s in scorecards) / max(len(scorecards), 1), 1
            ),
            "tier_distribution": {
                "A": sum(1 for s in scorecards if s["scores"]["tier"] == "A"),
                "B": sum(1 for s in scorecards if s["scores"]["tier"] == "B"),
                "C": sum(1 for s in scorecards if s["scores"]["tier"] == "C"),
            },
        },
        "stack_rank": scorecards,
        "elon_algorithm_analysis": elon_analysis,
    }
    return json.dumps(report, indent=2, default=str)

# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------

def main():
    parser = argparse.ArgumentParser(
        description="Team Performance Audit — The Elon Algorithm",
        epilog="Scores team members on velocity, quality, independence, and initiative. "
               "Stack ranks with A/B/C tiers and generates actionable recommendations.",
    )
    parser.add_argument(
        "--input", "-i",
        required=True,
        help="Path to team data file (JSON or CSV). See the module docstring for format details.",
    )
    parser.add_argument(
        "--output", "-o",
        help="Output file path. If omitted, prints to stdout.",
    )
    parser.add_argument(
        "--format", "-f",
        choices=["markdown", "json"],
        default="markdown",
        help="Output format (default: markdown).",
    )
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="Skip LLM calls. Only compute quantitative scores.",
    )
    parser.add_argument(
        "--weights",
        type=str,
        help='Custom score weights as JSON: \'{"output_velocity":0.4,"quality":0.3,"independence":0.15,"initiative":0.15}\'',
    )

    args = parser.parse_args()

    # Apply custom weights if provided
    if args.weights:
        try:
            custom_weights = json.loads(args.weights)
            if not isinstance(custom_weights, dict):
                raise ValueError("--weights must be a JSON object")
            for key in SCORE_WEIGHTS:
                if key in custom_weights:
                    SCORE_WEIGHTS[key] = float(custom_weights[key])
            # Validate weights: non-negative, positive total, sum to ~1.0
            total = sum(SCORE_WEIGHTS.values())
            if total <= 0 or any(w < 0 for w in SCORE_WEIGHTS.values()):
                raise ValueError("weights must be non-negative and sum to a positive number")
            if abs(total - 1.0) > 0.01:
                print(f"Warning: Weights sum to {total}, not 1.0. Normalizing.", file=sys.stderr)
                for key in SCORE_WEIGHTS:
                    SCORE_WEIGHTS[key] /= total
        except (json.JSONDecodeError, ValueError, TypeError) as e:
            print(f"Error parsing --weights: {e}", file=sys.stderr)
            sys.exit(1)

    # Load data
    try:
        data = load_input(args.input)
    except FileNotFoundError:
        print(f"Error: File not found: {args.input}", file=sys.stderr)
        sys.exit(1)
    except (json.JSONDecodeError, ValueError, TypeError) as e:
        print(f"Error: Invalid input file: {e}", file=sys.stderr)
        sys.exit(1)

    print(f"📊 Loaded {len(data['team_members'])} team members", file=sys.stderr)

    # Compute scorecards
    scorecards = generate_scorecards(data)
    print("✅ Scored and ranked all members", file=sys.stderr)

    # Run Elon Algorithm (LLM analysis)
    if args.dry_run:
        elon_analysis = "[Dry run — LLM analysis skipped. Quantitative scores only.]"
        print("⏭️ Dry run — skipping LLM analysis", file=sys.stderr)
    else:
        print("🤖 Running Elon Algorithm analysis...", file=sys.stderr)
        elon_analysis = run_elon_algorithm(data)
        print("✅ Analysis complete", file=sys.stderr)

    # Generate report
    if args.format == "json":
        report = format_json_report(scorecards, elon_analysis, data)
    else:
        report = format_markdown_report(scorecards, elon_analysis, data)

    # Output (UTF-8 so the emoji in the report survive on any platform)
    if args.output:
        with open(args.output, "w", encoding="utf-8") as f:
            f.write(report)
        print(f"📝 Report written to {args.output}", file=sys.stderr)
    else:
        print(report)

    # Summary to stderr
    a_count = sum(1 for s in scorecards if s["scores"]["tier"] == "A")
    b_count = sum(1 for s in scorecards if s["scores"]["tier"] == "B")
    c_count = sum(1 for s in scorecards if s["scores"]["tier"] == "C")
    print(f"\n🏆 Results: {a_count}A / {b_count}B / {c_count}C players", file=sys.stderr)


if __name__ == "__main__":
    main()
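The `--weights` option overrides only the keys it names and renormalizes when the total drifts more than 0.01 from 1.0. The sketch below isolates that logic as a pure function so the effect of a partial override is easy to see. `apply_custom_weights` is a hypothetical name, and the guard against a non-positive total is included here as a defensive choice.

```python
# Defaults copied from SCORE_WEIGHTS.
DEFAULT_WEIGHTS = {"output_velocity": 0.30, "quality": 0.30,
                   "independence": 0.20, "initiative": 0.20}


def apply_custom_weights(defaults: dict, overrides: dict) -> dict:
    """Merge known keys from overrides, then normalize so weights sum to 1."""
    weights = dict(defaults)
    for key in weights:
        if key in overrides:          # unknown keys in overrides are ignored
            weights[key] = float(overrides[key])
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    if abs(total - 1.0) > 0.01:       # same tolerance as main()
        weights = {k: v / total for k, v in weights.items()}
    return weights


# Raising velocity to 0.5 makes the total 1.2, so every weight is divided by 1.2.
weights = apply_custom_weights(DEFAULT_WEIGHTS, {"output_velocity": 0.5})
print({k: round(v, 3) for k, v in weights.items()})
```

One consequence worth knowing: overriding a single weight shrinks the others proportionally, so quality drops from 0.30 to 0.25 in this example. Passing all four weights explicitly avoids surprises.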