feat: add Voice-to-Text section and voice-refine skill
- Add section 5 "Voice-to-Text Tools" to ai-ecosystem.md
  - Tool comparison (Wispr Flow, Superwhisper, MacWhisper)
  - Vibe Coding workflow and trade-offs
  - Recommendations by user profile
- Create voice-refine skill in examples/skills/
  - Transforms verbose voice input into structured prompts
  - 4-step pipeline: Dedupe → Extract → Structure → Compress
  - Before/after examples with ~3.3x compression ratio
- Update reference.yaml with new entries and corrected line numbers
  - ai_ecosystem_voice_to_text: line 449
  - voice_refine_skill: new skill reference
  - cowork_section: 701 → 760
  - alternative_providers: 902 → 959

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
parent 09eb141976
commit 360e5203f6
4 changed files with 390 additions and 12 deletions
125
examples/skills/voice-refine/SKILL.md
Normal file
@@ -0,0 +1,125 @@
---
name: voice-refine
description: Transform verbose voice input into optimized Claude prompts
allowed-tools: Read
context: inherit
agent: specialist
---

# Voice Refine Skill

Transform verbose, stream-of-consciousness voice dictation into structured,
token-efficient prompts for Claude Code.

## When to Use

- Input from voice dictation (Wispr Flow, Superwhisper, macOS Dictation)
- Verbose text >150 words
- Contains filler words, repetitions, or tangents
- Natural speech patterns that need structure

## Transformation Pipeline

```
1. DEDUPE → Remove repetitions and filler words
2. EXTRACT → Identify core requirements and constraints
3. STRUCTURE → Organize into standard sections
4. COMPRESS → Reduce to ~30% of original while preserving intent
```
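
As a rough illustration of step 1 (DEDUPE) only, the following Python sketch strips filler words and collapses immediately repeated words using regular expressions. The `FILLERS` list and the `dedupe` function are hypothetical names chosen for this example. The skill itself is a prompt executed by Claude, not this code, and a regex pass like this will also remove legitimate uses of words such as "like".

```python
import re

# Hypothetical filler list, loosely based on the fillers this skill names.
FILLERS = ["you know", "basically", "euh", "um", "like"]

# Longest fillers first so multi-word phrases are matched whole.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(f) for f in sorted(FILLERS, key=len, reverse=True)) + r")\b,?",
    flags=re.IGNORECASE,
)


def dedupe(text: str) -> str:
    """Remove filler words and immediately repeated words from dictated text."""
    cleaned = _FILLER_RE.sub("", text)
    # Collapse immediate repetitions such as "need need".
    cleaned = re.sub(r"\b(\w+)(\s+\1\b)+", r"\1", cleaned, flags=re.IGNORECASE)
    # Normalize the whitespace left behind by the removals.
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    # Remove any space that ended up before punctuation.
    return re.sub(r"\s+([,.!?])", r"\1", cleaned)
```

For example, `dedupe("So um basically we need need a table, you know, with pagination")` returns `"So we need a table, with pagination"`. The EXTRACT, STRUCTURE, and COMPRESS steps depend on understanding the request and are not covered by this sketch.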

## Output Format

```markdown
## Context
[Project context, existing stack, relevant files]

## Objective
[Single sentence: what needs to be built/changed]

## Constraints
- [Constraint 1]
- [Constraint 2]
- [etc.]

## Expected output
[Expected deliverables: files, format, tests]
```

## Flags

| Flag | Effect |
|------|--------|
| `--confirm` | Show refined prompt before sending to Claude (default) |
| `--direct` | Send refined prompt directly without confirmation |
| `--verbose` | Keep more detail, less compression |
| `--en` | Output in English (default: matches input language) |
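
The flag semantics in the table can be modelled with a small parser. This is a sketch using Python's `argparse`, assuming `--confirm` and `--direct` are mutually exclusive and that `--confirm` is the default. `parse_flags` is a hypothetical helper: in practice the skill's flags are interpreted by Claude from the prompt text rather than by a command-line parser.

```python
import argparse
from typing import List


def parse_flags(argv: List[str]) -> argparse.Namespace:
    """Parse /voice-refine style flags into a namespace (illustrative only)."""
    parser = argparse.ArgumentParser(prog="/voice-refine", add_help=False)
    # Assumption: confirming and sending directly cannot both be requested.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--confirm", dest="confirm", action="store_true", default=True)
    mode.add_argument("--direct", dest="confirm", action="store_false")
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("--en", action="store_true")
    return parser.parse_args(argv)
```

With no flags, `parse_flags([]).confirm` is `True`. With `["--direct", "--en"]`, `confirm` is `False` and `en` is `True`.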

## Usage Examples

### Basic Usage

```
/voice-refine

Alors euh j'aimerais que tu m'aides à faire un truc, en fait j'ai une API
qui renvoie des données utilisateurs et je voudrais les afficher dans un
tableau React, mais attention il faut que ça soit paginé parce que y'a
beaucoup de données, genre des milliers d'utilisateurs, et aussi faudrait
pouvoir trier par nom ou par date d'inscription, ah et on utilise Tailwind
dans le projet donc faut que ça matche avec ça...
```

### With Flags

```
/voice-refine --direct --en

[voice input in any language → sends English prompt directly]
```

## Compression Metrics

| Metric | Target |
|--------|--------|
| Token reduction | 60-70% |
| Information retention | >95% |
| Structure clarity | High |

## Integration with Voice Tools

### Wispr Flow
1. Dictate with `Cmd+Shift+Space`
2. Paste into Claude Code
3. Run `/voice-refine`

### Superwhisper
1. Record with hotkey
2. Text appears in active window
3. Run `/voice-refine` to structure

### macOS Dictation
1. `Fn Fn` to start
2. Speak naturally
3. Run `/voice-refine` to clean up

## What Gets Removed

- Filler words: "euh", "um", "like", "you know", "basically"
- Repetitions: same concept stated multiple ways
- Tangents: off-topic thoughts
- Hedging: "maybe", "I think", "probably" (unless relevant)
- Politeness padding: "please", "could you", "I'd like"

## What Gets Preserved

- Technical requirements
- Constraints and limitations
- Context about existing code
- Expected output format
- Edge cases mentioned
- Business logic rules

## See Also

- `guide/ai-ecosystem.md` - Voice-to-Text Tools section
- `examples/before-after.md` - Full transformation examples
192
examples/skills/voice-refine/examples/before-after.md
Normal file
@@ -0,0 +1,192 @@
# Voice Refine - Before/After Examples

Real-world transformations from verbose voice input to structured prompts.

---

## Example 1: React Table Component (French)

### Before (147 words)

> Alors euh j'aimerais que tu m'aides à faire un truc, en fait j'ai une API
> qui renvoie des données utilisateurs et je voudrais les afficher dans un
> tableau React, mais attention il faut que ça soit paginé parce que y'a
> beaucoup de données, genre des milliers d'utilisateurs, et aussi faudrait
> pouvoir trier par nom ou par date d'inscription, ah et on utilise Tailwind
> dans le projet donc faut que ça matche avec ça, et euh je sais pas si c'est
> important mais on a déjà un composant Button qu'on utilise partout donc
> faudrait l'utiliser aussi, et puis j'imagine qu'il faut un état de
> chargement aussi quand les données arrivent...

### After (45 words)

```markdown
## Contexte
App React + Tailwind, API users existante, composant Button réutilisable

## Objectif
Tableau utilisateurs avec pagination et tri

## Contraintes
- Pagination (milliers d'entrées)
- Tri: nom, date inscription
- Loading state
- Utiliser Button existant

## Output
Composant React + TypeScript
```

**Ratio**: 3.3x compression

---

## Example 2: Authentication Flow (English)

### Before (189 words)

> So I've been thinking about the authentication flow and like basically
> what we need is something that handles both regular login and also
> social login with Google and maybe GitHub too, and um the thing is
> we already have some code for this but it's kind of a mess and I want
> to clean it up, so we're using Next.js 15 by the way and we have
> Prisma for the database, and I think we should use JWT tokens but
> maybe with refresh tokens too because we don't want users to have to
> log in every hour, that would be annoying, and oh we also need to
> handle the case where someone's token expires while they're in the
> middle of doing something, like filling out a form, we shouldn't just
> kick them out, and I guess we need some kind of middleware to check
> the auth status on protected routes, does that make sense?

### After (52 words)

```markdown
## Context
Next.js 15 + Prisma, existing auth code needs refactoring

## Objective
Implement clean auth flow with credentials + OAuth (Google, GitHub)

## Constraints
- JWT + refresh tokens
- Graceful token expiry handling (no mid-action logout)
- Middleware for protected routes
- Silent token refresh

## Output
Auth module with middleware and hooks
```

**Ratio**: 3.6x compression

---

## Example 3: Bug Report (Mixed)

### Before (112 words)

> Hey so there's this weird bug, basically when users try to upload
> an image that's like really big, you know like over 5MB or something,
> the app just kind of hangs and then eventually it times out I think,
> and the error message isn't helpful at all it just says "something
> went wrong" which is pretty useless, so we need to fix this, probably
> add some validation before the upload starts, and show a proper
> error message if the file is too big, oh and maybe we should also
> add a progress bar so users know something is happening...

### After (38 words)

```markdown
## Bug
Image upload hangs/times out for files >5MB, unhelpful error message

## Fix Required
- Client-side size validation before upload
- Clear error message for oversized files
- Progress indicator during upload

## Files
Likely: upload component, API route
```

**Ratio**: 2.9x compression

---

## Example 4: Feature Request (Detailed)

### Before (203 words)

> OK so product wants this new dashboard feature and basically the idea
> is that we show some key metrics at the top, like total users and
> active users and maybe revenue if we have access to that, and then
> below that there should be some charts, probably a line chart showing
> user growth over time, and maybe a bar chart for like monthly active
> users or something, and they want it to be filterable by date range,
> like last 7 days, last 30 days, last 90 days, that kind of thing,
> and oh they also mentioned wanting to export the data to CSV at some
> point but that might be phase 2, and the design team sent over some
> mockups but they're not final yet so we should probably make it
> flexible enough to change the styling later, and we're using Chart.js
> in other parts of the app so we should probably stick with that for
> consistency, and this needs to be ready for the demo next week so
> it's kind of urgent...

### After (58 words)

```markdown
## Context
Dashboard feature, mockups pending finalization, Chart.js already in use

## Objective
Metrics dashboard with KPIs + charts

## Requirements
- KPIs: total users, active users, revenue (if available)
- Charts: user growth (line), MAU (bar)
- Date filters: 7d, 30d, 90d
- CSV export (phase 2)

## Constraints
- Flexible styling (design WIP)
- Demo deadline: next week

## Output
Dashboard page + components
```

**Ratio**: 3.5x compression

---

## Compression Summary

| Example | Before (words) | After (words) | Ratio | Info Retained |
|---------|----------------|---------------|-------|---------------|
| React Table | 147 | 45 | 3.3x | 100% |
| Auth Flow | 189 | 52 | 3.6x | 100% |
| Bug Report | 112 | 38 | 2.9x | 100% |
| Feature Request | 203 | 58 | 3.5x | 100% |
| **Average** | **163** | **48** | **3.3x** | **100%** |
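
The ratios and averages in the summary table can be recomputed from the word counts. The sketch below takes the table's counts as given and assumes the average ratio is the mean of the per-example ratios. `EXAMPLES` and `summarize` are names introduced for illustration.

```python
from typing import Dict, Tuple

# Word counts copied from the summary table: (before, after).
EXAMPLES: Dict[str, Tuple[int, int]] = {
    "React Table": (147, 45),
    "Auth Flow": (189, 52),
    "Bug Report": (112, 38),
    "Feature Request": (203, 58),
}


def summarize(examples: Dict[str, Tuple[int, int]]):
    """Return per-example ratios plus average before, after, and ratio."""
    ratios = {name: round(before / after, 1) for name, (before, after) in examples.items()}
    n = len(examples)
    avg_before = round(sum(before for before, _ in examples.values()) / n)
    avg_after = round(sum(after for _, after in examples.values()) / n)
    # Assumption: the table averages the per-example ratios.
    avg_ratio = round(sum(before / after for before, after in examples.values()) / n, 1)
    return ratios, avg_before, avg_after, avg_ratio
```

Under that assumption the result matches the table (163, 48, 3.3x). Dividing total words before by total words after (651 / 193) gives about 3.4x instead, so it is worth stating the averaging method when quoting the figure.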

---

## Patterns Identified

### Common Filler Phrases Removed

- "basically", "like", "you know", "I mean"
- "kind of", "sort of", "I think", "I guess"
- "so yeah", "that kind of thing", "or something"
- "by the way", "oh and", "also"

### Structure Mapping

| Voice Pattern | Structured Section |
|---------------|-------------------|
| "we're using X" | Context |
| "I want to..." | Objective |
| "it needs to..." | Constraints |
| "probably should..." | Constraints (if technical) |
| "deadline is..." | Constraints |
| "output should be..." | Output |
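
The mapping table can be read as a set of trigger phrases. The following is a minimal sketch, assuming simple regular-expression matching on each clause. `SECTION_RULES` and `classify` are hypothetical names, the "if technical" condition on "probably should..." is not modelled, and real dictation would need far more patterns than these.

```python
import re
from typing import Optional

# Trigger phrases taken from the Structure Mapping table.
SECTION_RULES = [
    (re.compile(r"\bwe(?:'re| are) using\b", re.IGNORECASE), "Context"),
    (re.compile(r"\bI want to\b", re.IGNORECASE), "Objective"),
    (re.compile(r"\b(?:it needs to|probably should|deadline is)\b", re.IGNORECASE), "Constraints"),
    (re.compile(r"\boutput should be\b", re.IGNORECASE), "Output"),
]


def classify(clause: str) -> Optional[str]:
    """Return the section for the first matching trigger phrase, or None."""
    for pattern, section in SECTION_RULES:
        if pattern.search(clause):
            return section
    return None
```

For instance, `classify("we're using Chart.js in other parts of the app")` returns `"Context"`, and a clause with no trigger phrase returns `None`.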

guide/ai-ecosystem.md
@@ -13,11 +13,12 @@
 - [2. Google Gemini (Visual Understanding)](#2-google-gemini-visual-understanding)
 - [3. Kimi (PPTX & Long Document Generation)](#3-kimi-pptx--long-document-generation)
 - [4. NotebookLM (Synthesis & Audio)](#4-notebooklm-synthesis--audio)
-- [5. IDE-Based Tools (Cursor, Windsurf, Cline)](#5-ide-based-tools-cursor-windsurf-cline)
-- [6. UI Prototypers (v0, Bolt, Lovable)](#6-ui-prototypers-v0-bolt-lovable)
-- [7. Workflow Orchestration](#7-workflow-orchestration)
-- [8. Cost & Subscription Strategy](#8-cost--subscription-strategy)
-- [9. Claude Cowork (Research Preview)](#9-claude-cowork-research-preview)
+- [5. Voice-to-Text Tools (Wispr Flow, Superwhisper)](#5-voice-to-text-tools-wispr-flow-superwhisper)
+- [6. IDE-Based Tools (Cursor, Windsurf, Cline)](#6-ide-based-tools-cursor-windsurf-cline)
+- [7. UI Prototypers (v0, Bolt, Lovable)](#7-ui-prototypers-v0-bolt-lovable)
+- [8. Workflow Orchestration](#8-workflow-orchestration)
+- [9. Cost & Subscription Strategy](#9-cost--subscription-strategy)
+- [10. Claude Cowork (Research Preview)](#10-claude-cowork-research-preview)
 - [Appendix: Ready-to-Use Prompts](#appendix-ready-to-use-prompts)
 - [Alternative Providers (Community Workarounds)](#alternative-providers-community-workarounds)
 
@@ -445,7 +446,65 @@ After NotebookLM synthesis, export key insights to your project:
 
 ---
 
-## 5. IDE-Based Tools (Cursor, Windsurf, Cline)
+## 5. Voice-to-Text Tools (Wispr Flow, Superwhisper)
+
+**Philosophy**: "Vibe coding", meaning you dictate intent and let AI implement
+
+Voice input delivers ~4x typing speed (~150 WPM vs ~40 WPM) with richer context.
+You say more when you don't have to type it.
+
+### Tool Comparison
+
+| Tool | Processing | Latency | Privacy | Price | Platform |
+|------|------------|---------|---------|-------|----------|
+| **Wispr Flow** | Cloud | ~500ms | SOC 2 certified | $12/mo | Mac, Win, iOS |
+| **Superwhisper** | Local | 1-2s | 100% offline | ~$50 one-time | Mac only |
+| **MacWhisper** | Local | Variable | 100% offline | $49 one-time | Mac only |
+
+### When Voice + Claude Code Shines
+
+| Scenario | Why voice wins |
+|----------|---------------|
+| Long context dumps | You naturally include constraints, edge cases, business context |
+| Brainstorming | Less self-filtering, more raw ideas |
+| Multi-agent management | Dictate to 3-4 Claude sessions simultaneously |
+| Accessibility | RSI, mobility constraints, eye strain |
+
+### Vibe Coding Workflow
+
+1. Open Claude Code or Cursor
+2. Activate voice (Wispr hotkey or system dictation)
+3. Dictate naturally: "I need a component that shows user stats,
+   it should have pagination because we have thousands of users,
+   and sorting by name or signup date, use our existing Tailwind setup"
+4. Let Claude process the verbose input
+5. Iterate vocally: "Add loading state and error handling"
+
+### Trade-offs
+
+| Advantage | Limitation |
+|-----------|------------|
+| ~4x faster input | ~3x more verbose text |
+| Richer context | Cloud privacy (Wispr) |
+| Flow state preserved | ~800MB RAM overhead |
+| Natural expression | Technical terms need training |
+
+### Recommendation
+
+| Profile | Tool |
+|---------|------|
+| Productivity-first | Wispr Flow Pro ($12/mo) |
+| Privacy-required | Superwhisper (Mac) |
+| Budget-conscious | MacWhisper ($49 one-time) |
+| Windows user | Wait for Wispr stability improvements |
+
+**Pro tip**: For complex prompts, consider a "refine" step to compress
+verbose voice input into structured prompts before sending to Claude.
+See `/voice-refine` skill template in `examples/skills/`.
+
+---
+
+## 6. IDE-Based Tools (Cursor, Windsurf, Cline)
 
 ### When IDE Tools Complement Claude Code
 
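
To put the ~4x speed and ~3x verbosity figures from the Trade-offs table in the hunk above side by side, the sketch below estimates input time in minutes. It assumes the ~150 WPM and ~40 WPM rates quoted in the section and a hypothetical 50-word typed prompt that becomes 150 words when dictated. `input_minutes` is an illustrative helper, not part of any tool.

```python
def input_minutes(words: int, wpm: float) -> float:
    """Time in minutes to enter `words` words at `wpm` words per minute."""
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    return words / wpm


# Assumed scenario: the same request typed tersely vs dictated ~3x longer.
typed = input_minutes(50, 40)        # typing at ~40 WPM
dictated = input_minutes(150, 150)   # dictating at ~150 WPM
```

With these assumed numbers, dictation takes 1.0 minute against 1.25 minutes of typing. The time saved is therefore modest once the extra verbosity is counted, and the larger benefit described in the section is the richer context.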
@@ -506,7 +565,7 @@ When you need Claude's deeper analysis:
 
 ---
 
-## 6. UI Prototypers (v0, Bolt, Lovable)
+## 7. UI Prototypers (v0, Bolt, Lovable)
 
 ### When to Use Prototypers
 
@@ -554,7 +613,7 @@ When you need Claude's deeper analysis:
 
 ---
 
-## 7. Workflow Orchestration
+## 8. Workflow Orchestration
 
 ### The Complete Pipeline
 
@@ -652,7 +711,7 @@ claude
 
 ---
 
-## 8. Cost & Subscription Strategy
+## 9. Cost & Subscription Strategy
 
 ### Monthly Cost Comparison
 
@@ -698,7 +757,7 @@ claude
 
 ---
 
-## 9. Claude Cowork (Research Preview)
+## 10. Claude Cowork (Research Preview)
 
 > **Research Preview** (January 2026): limited documentation, expect bugs, local-only access. No production use recommended yet.
 

reference.yaml
@@ -116,7 +116,9 @@ deep_dive:
 ai_ecosystem_workflows: 10545
 ai_ecosystem_integration: 10671
 ai_ecosystem_detailed: "guide/ai-ecosystem.md"
-ai_ecosystem_alternative_providers: "guide/ai-ecosystem.md:902"
+ai_ecosystem_voice_to_text: "guide/ai-ecosystem.md:449"
+ai_ecosystem_alternative_providers: "guide/ai-ecosystem.md:959"
+voice_refine_skill: "examples/skills/voice-refine/SKILL.md"
 # Cowork documentation (expanded - see cowork/)
 cowork_hub: "cowork/README.md"
 cowork_summary: "guide/cowork.md"
@@ -130,7 +132,7 @@ deep_dive:
 cowork_faq: "cowork/reference/faq.md"
 cowork_prompts: "cowork/prompts/README.md"
 cowork_workflows: "cowork/workflows/README.md"
-cowork_section: "guide/ai-ecosystem.md:701"
+cowork_section: "guide/ai-ecosystem.md:760"
 cowork_ultimate_guide: 10725
 
 # ════════════════════════════════════════════════════════════════
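
Because entries such as `cowork_section` store a `path:line` pair, they go stale whenever lines are added above the target, which is what this commit corrects. The sketch below shows one possible way to check such a reference. `parse_ref` and `points_at_heading` are hypothetical helpers, and the check assumes each referenced line should be a Markdown heading.

```python
from typing import Tuple


def parse_ref(ref: str) -> Tuple[str, int]:
    """Split 'path/to/file.md:123' into (path, line_number)."""
    path, sep, line = ref.rpartition(":")
    if not sep or not line.isdigit():
        raise ValueError(f"not a path:line reference: {ref!r}")
    return path, int(line)


def points_at_heading(ref: str, file_text: str) -> bool:
    """True if the 1-indexed line named by `ref` starts a Markdown heading."""
    _, line_no = parse_ref(ref)
    lines = file_text.splitlines()
    return 1 <= line_no <= len(lines) and lines[line_no - 1].startswith("#")
```

For example, `parse_ref("guide/ai-ecosystem.md:449")` returns `("guide/ai-ecosystem.md", 449)`. A plain path with no line number, such as the `voice_refine_skill` entry, raises `ValueError` and would need separate handling.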