feat: add agent/skill quality audit tooling + Grenier evaluation

AUDIT TOOLING (3 templates):
- Command: /audit-agents-skills (quick project audits)
  - 16-criteria framework (Identity 3x, Prompt 2x, Validation 1x, Design 2x)
  - Weighted scoring: 32 pts (agents/skills), 20 pts (commands)
  - Production grading (A-F, 80% threshold)
  - Fix mode with actionable suggestions
- Skill: audit-agents-skills (advanced audits)
  - 3 modes: Quick (top-5), Full (all 16), Comparative (vs templates)
  - JSON + Markdown output for CI/CD
- Scoring grids: criteria.yaml (externalized for reuse)

EVALUATION:
- Grenier agent/skill quality (3/5 - Moderate Value)
  - Gap: 29.5% deploy without evaluation (LangChang 2026)
  - Integration: Created audit command + skill + criteria
  - Industry context: 18% cite agent bugs as top challenge

DOCUMENTATION:
- Guide refs: 2 strategic call-outs (after Agent/Skill validation)
- CHANGELOG: New "Added" section + evaluation details
- README: Templates 106→107, Evaluations 49→24 (count corrections)
- reference.yaml: 10 new audit entries + updated counts

SYNC:
- Landing index.html: Templates 107, Evals 24, Quiz 257
- Landing examples/index.html: Templates 107

FILES: 14 changed, 4148 insertions (+1250 lines new audit content)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit is contained in:
Florian BRUNIAUX 2026-02-07 15:40:18 +01:00
parent c5fad9f092
commit b48d95c024
14 changed files with 4148 additions and 13 deletions

View file

@ -0,0 +1,317 @@
# Resource Evaluation: Awesome Claude Skills (BehiSecc)
**URL**: https://github.com/BehiSecc/awesome-claude-skills
**Maintainer**: BehiSecc
**Created**: 2025-10-17
**Evaluated**: 2026-02-07
**Evaluator**: Claude (via /eval-resource skill)
---
## Executive Summary
| Criterion | Value |
|-----------|-------|
| **Initial Score** | 3/5 |
| **Score after challenge** | 3/5 (maintained) |
| **Score after fact-check** | **3/5** (Moderate) |
| **Final Decision** | Integrate with specialized mention |
| **Reason** | Skills-only taxonomy, complementary to awesome-claude-code |
---
## Content Summary
GitHub repository curating Claude Code skills across 12 categories:
**Actual skill count**: 62 skills (not 125+ as initially observed)
### Category Breakdown
| Category | Skills | Notable Items |
|----------|--------|---------------|
| Development & Code Tools | 14 | Web artifact builders, testing frameworks, AWS integrations |
| Collaboration & Project Management | 10 | Git, Linear, meeting analysis |
| Security & Web Testing | 7 | OWASP compliance, fuzzing, systematic debugging |
| Media & Content | 6 | Video/image processing, generation tools |
| Document Skills | 5 | Word, PDF, PowerPoint, spreadsheet manipulation |
| Writing & Research | 5 | Content creation, article extraction, brainstorming |
| Utility & Automation | 5 | File organization, invoice processing, deployment |
| Scientific & Research Tools | 4 | Links to K-Dense-AI (125+ external skills) |
| Data & Analysis | 3 | CSV analysis, PostgreSQL queries, root-cause tracing |
| Learning & Knowledge | 2 | Document linking, knowledge network creation |
| Health & Life Sciences | 1 | Medical report analysis, wellness tracking |
**Key distinction**: The "125+ scientific skills" referenced in repository descriptions refers to an *external repository* (K-Dense-AI/claude-scientific-skills), not to skills within this collection.
---
## Fact-Check Results
### Claims Verified Against Repository
| Claim | Reality | Status |
|-------|---------|--------|
| 5.5k stars, 489 forks | ✅ Confirmed | Verified |
| 27 contributors, 81 commits | ✅ Confirmed | Verified |
| Created October 2025 | ✅ 2025-10-17 | Verified |
| 12 categories | ✅ Confirmed | Verified |
| **125+ scientific skills** | ⚠️ **External link** (K-Dense-AI) | **Clarified** |
| **Actual skill count** | **62 skills** (recount) | **Corrected** |
| Detailed documentation | ❌ Link-only (minimal docs) | Verified |
| LICENSE file | ❌ None present | Verified |
| 0 open issues, 5 open PRs | ✅ Confirmed | Verified |
### Repository Quality Indicators
| Aspect | Assessment |
|--------|------------|
| **Documentation** | Minimal - One-line descriptions + GitHub links only |
| **Installation guides** | ❌ Not provided |
| **Usage examples** | ❌ Not provided |
| **Maintenance** | ✅ Active (5 PRs open, recent activity) |
| **Community** | ✅ Strong (5.5k stars in 3 months) |
| **License** | ❌ Not specified |
---
## Gap Analysis
### What awesome-claude-skills Covers
**Unique aspects**:
- Skills-only taxonomy (vs awesome-claude-code covering everything)
- 12-category organization
- Recent curation (reflects 2025-2026 ecosystem)
- Strong community traction (5.5k stars in 3 months)
### What Claude Code Ultimate Guide Already Has
**Existing coverage**:
- awesome-claude-code (20k stars) - general ecosystem curation
- skills.sh marketplace (35K+ installs) - installation-focused
- Plugin ecosystem documentation (Section 8.5)
- 66+ examples in `examples/` directory
### Estimated Overlap
**~30-40%** with awesome-claude-code (partial duplication)
### True Gap Identified
**Research/Science skills NOT substantially covered**:
- BehiSecc has only **4 scientific skills** directly
- K-Dense-AI (125+ skills) is external and should be evaluated separately
- Ultimate Guide has **zero research-focused workflows** or examples
---
## Challenge Results (technical-writer agent)
### Agent Critique Summary
**Initial proposal**: Score should be 4/5 (agent's position)
**Arguments for higher score**:
1. 5.5k stars in 3 months = exceptional traction
2. 27 contributors = active community (vs centralized curation)
3. 125+ scientific skills = massive gap in Ultimate Guide
4. Research audience completely missed (20-30% of advanced use cases)
**Counter-arguments after fact-check**:
1. ✅ Traction confirmed, but doesn't change content quality
2. ✅ Active community validated
3. ❌ **125+ scientific claim is misleading** (external link, not direct content)
4. ❌ **Research gap exists but BehiSecc doesn't fill it** (only 4 skills)
**Agent's recommended actions** (adjusted after fact-check):
- Phase 1: Ecosystem mention (3-5 lines) ← **Adopted**
- Phase 2: Research section (500-1000 lines) ← **Deferred** (evaluate K-Dense-AI separately)
- Phase 3: Example skills ← **Deferred**
### Final Agent Assessment
**Score maintained at 3/5** after fact-check revealed:
- Actual content (62 skills) < claimed content (125+)
- Scientific gap less substantial than initially perceived
- Documentation quality is minimal (link directory, not instructional guide)
---
## Comparison Matrix
| Aspect | awesome-claude-skills (BehiSecc) | Claude Code Ultimate Guide |
|--------|----------------------------------|----------------------------|
| **Total skills** | 62 curated | 66+ examples (agents/skills/commands) |
| **Documentation depth** | ❌ Links only | ✅ Full guides with usage |
| **Scientific/Research** | 4 skills + external link | ❌ Zero dedicated section |
| **Development** | ✅ 14 skills | ✅ Extensive (TDD, design patterns, etc.) |
| **Collaboration** | ✅ 10 skills | Git MCP documented, Linear not detailed |
| **Security** | ✅ 7 skills | ✅ security-hardening.md + examples |
| **Installation** | ❌ Not provided | ✅ scripts/install-templates.sh |
| **Maintenance** | ✅ Active (5 PRs, 27 contributors) | ✅ Active (v3.23.1, 24 evaluations) |
| **License** | ❌ Not specified | ✅ MIT |
| **Audience** | 🎯 Quick discovery (directory) | 🎯 Deep learning (education) |
---
## Integration Plan
### Primary Integration Points
#### 1. `guide/ultimate-guide.md` (Section 8.5 - Line ~9720)
**Context**: Community Resources & Ecosystem
**Content to add**:
```markdown
- [awesome-claude-skills](https://github.com/BehiSecc/awesome-claude-skills) - Skills-only taxonomy (62 skills across 12 categories)
```
**Rationale**: Positioned after awesome-claude-code (general) and awesome-claude-code-plugins (specialized), following the progression: general → specialized by component type.
#### 2. `guide/ultimate-guide.md` (Appendix - Line ~17521)
**Context**: External Resources table
**Content to add**:
```markdown
| [awesome-claude-skills (BehiSecc)](https://github.com/BehiSecc/awesome-claude-skills) | Skills taxonomy (62 skills, 12 categories) |
```
**Note**: Differentiation from existing ComposioHQ/awesome-claude-skills entry required (different maintainer, different taxonomy approach).
#### 3. `machine-readable/reference.yaml` (Line ~1003)
**Context**: ecosystem.complementary section
**Content to add**:
```yaml
awesome_claude_skills:
url: "github.com/BehiSecc/awesome-claude-skills"
maintainer: "BehiSecc"
focus: "Skills taxonomy - 62 skills across 12 categories"
categories: ["Development", "Design", "Documentation", "Testing", "DevOps", "Security", "Data", "AI/ML", "Productivity", "Content", "Integration", "Fun"]
positioning: "Complementary to awesome-claude-code (skills-only vs full ecosystem)"
evaluation: "docs/resource-evaluations/awesome-claude-skills-github.md"
score: "3/5 (Moderate - Useful complement)"
note: "Distinct from ComposioHQ/awesome-claude-skills (different maintainer, taxonomy approach)"
```
#### 4. `README.md` (Line ~342)
**Context**: Complementary Resources table
**Content to add**:
```markdown
| [awesome-claude-skills](https://github.com/BehiSecc/awesome-claude-skills) | Skills taxonomy | 62 skills across 12 categories |
```
### CHANGELOG Entry
**Section**: Unreleased → Documentation
```markdown
- **Ecosystem**: Added awesome-claude-skills (BehiSecc) to curated lists
- 62 skills taxonomy across 12 categories
- Positioned as complementary to awesome-claude-code (skills-only focus)
- Distinct from ComposioHQ version (different taxonomy approach)
- Referenced in guide section 8.5, Further Reading, reference.yaml
```
---
## Positioning Strategy
### Value Proposition
awesome-claude-skills serves as a **specialized taxonomy** for users who want:
- Skills-only filtering (not mixed with agents/commands/hooks)
- 12-category organization for discovery
- Community-curated collection with active maintenance
### Differentiation from Existing Resources
| Resource | Scope | Best For |
|----------|-------|----------|
| **awesome-claude-code** | Full ecosystem | Discovering all types of resources |
| **awesome-claude-skills (BehiSecc)** | Skills-only | Finding skills by category |
| **awesome-claude-skills (ComposioHQ)** | General skills | Alternative curation |
| **skills.sh marketplace** | Installation-focused | Installing via CLI |
| **Ultimate Guide examples/** | Educational | Learning with documentation |
### Risks of Non-Integration
**Low-to-moderate risk**:
- Partial overlap with existing resources (~30-40%)
- Alternative discovery paths exist (awesome-claude-code, skills.sh)
- Scientific/research gap exists but BehiSecc doesn't fully address it (only 4 skills)
**Opportunity cost**:
- Missing a specialized taxonomy approach (12 categories)
- Not acknowledging community traction (5.5k stars in 3 months)
- Potential user confusion (2 awesome-claude-skills exist)
---
## Deferred Actions
### Evaluate K-Dense-AI Separately
**Rationale**: The "125+ scientific skills" claim refers to an external repository. If research/science audience is a priority, K-Dense-AI should receive its own evaluation.
**Proposed evaluation criteria**:
- Skill quality (documentation, tests, examples)
- Maintenance status (last update, issue count)
- Overlap with existing scientific tools
- Integration feasibility (dependencies, prerequisites)
### Research/Science Section (Future)
If K-Dense-AI scores 4/5 or higher, consider:
- `guide/workflows/research-science.md` (500-1000 lines)
- Top 10-15 scientific skills documented
- Use cases: bioinformatics, ML, data analysis
- MCP integration (Context7 for scientific docs, Sequential for workflows)
---
## Lessons Learned
1. **Verify skill counts manually** - Repository descriptions can be misleading (125+ vs 62)
2. **Distinguish direct vs external content** - Links to other repos ≠ integrated content
3. **Documentation quality matters** - Link directories have lower value than instructional guides
4. **Community traction ≠ content quality** - 5.5k stars impressive, but doesn't change documentation depth
5. **Scientific gap exists but requires separate evaluation** - BehiSecc points to K-Dense-AI, evaluate that repo independently
---
## Related Evaluations
- [agentskills-io-specification.md](./agentskills-io-specification.md) - Skills open standard (4/5)
- [self-improve-skill.md](./self-improve-skill.md) - Skill lifecycle automation (3/5)
- [grenier-agent-skill-quality.md](./grenier-agent-skill-quality.md) - Quality audit framework (3/5)
---
## Metadata
```yaml
evaluated_by: Claude Sonnet 4.5
skill_used: /eval-resource
date: 2026-02-07
time_spent: ~45 minutes
verification_method: WebFetch (2 passes) + agent challenge + manual recount
stats_verified: Yes (5.5k stars, 489 forks, 62 skills, 12 categories)
primary_sources_checked: GitHub repository, README, category listings
integration_status: Pending (4 files to modify)
version_impact: None (minor addition, no version bump required)
```
---
**Next Steps**:
1. ✅ Create this evaluation file
2. ⏳ Modify 4 files (guide, reference.yaml, README, CHANGELOG)
3. ⏳ Verify cross-references
4. ⏳ Consider K-Dense-AI separate evaluation (if research audience prioritized)