Combos - Custom Fallback Chains

Create custom model combinations with automatic fallback. Combos let you define your own routing strategy based on cost, quality, and availability.


What Are Combos?

Combos are custom fallback chains that you create in the dashboard. Instead of using a single model, you define a sequence of models that 9Router tries in order.

Example:

Combo name: premium-coding
Models:
  1. cc/claude-opus-4-5-20251101 (try first)
  2. glm/glm-4.7 (if #1 quota exhausted)
  3. minimax/MiniMax-M2.1 (if #2 quota exhausted)

Usage in CLI:

Model: premium-coding

9Router automatically tries each model in sequence until one succeeds.
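
The try-in-order behavior can be pictured with a minimal Python sketch. This is not 9Router's actual implementation: `run_combo`, `QuotaExhausted`, and `fake_call` are hypothetical names, and the sketch assumes that only quota errors trigger a fallback.

```python
from typing import Callable, Sequence


class QuotaExhausted(Exception):
    """Hypothetical error a provider call raises when its quota is used up."""


def run_combo(models: Sequence[str], call: Callable[[str], str]) -> tuple[str, str]:
    """Try each model in order; return (model, reply) from the first that succeeds."""
    failures = []
    for model in models:
        try:
            return model, call(model)
        except QuotaExhausted as exc:
            failures.append(f"{model}: {exc}")  # remember why this model was skipped
    # Every model in the chain failed, so the caller gets an error.
    raise RuntimeError("all models in combo failed: " + "; ".join(failures))


PREMIUM_CODING = [
    "cc/claude-opus-4-5-20251101",
    "glm/glm-4.7",
    "minimax/MiniMax-M2.1",
]


def fake_call(model: str) -> str:
    """Stand-in provider: pretend the Claude quota is exhausted."""
    if model.startswith("cc/"):
        raise QuotaExhausted("quota exhausted")
    return f"reply from {model}"


used, reply = run_combo(PREMIUM_CODING, fake_call)
print(used)  # glm/glm-4.7
```

The caller only ever names the combo. Which model answered is decided per request by the order of the list.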


Why Use Combos?

1. Maximize Subscription Value

cc/claude-opus → glm/glm-4.7 → if/kimi-k2-thinking

→ Use subscription first, cheap backup, free emergency
→ Get full value from subscriptions you already pay for

2. Minimize Costs

glm/glm-4.7 → minimax/MiniMax-M2.1 → if/kimi-k2-thinking

→ Start with a cheap paid option ($0.60/1M)
→ Fall back to an even cheaper one ($0.20/1M)
→ Emergency free tier
→ Total cost: ~$5-10/month vs $2000 on ChatGPT API

3. Ensure 24/7 Availability

cc/claude-opus → cx/gpt-5.2-codex → glm/glm-4.7 → if/kimi-k2-thinking

→ Always include free tier at the end
→ Never run out of quota
→ Code anytime, anywhere

4. Optimize for Quality

cc/claude-opus-4-5 → cx/gpt-5.2-codex → gc/gemini-3-pro

→ Best models first
→ Fall back to other premium models
→ Maintain high quality across fallback chain

How to Create Combos

Step 1: Open Dashboard

http://localhost:20128
→ Login with your password

Step 2: Navigate to Combos

Dashboard → Combos → Create New Combo

Step 3: Configure Combo

Combo Name:

premium-coding

Description (optional):

Subscription first, cheap backup, free emergency

Select Models:

1. cc/claude-opus-4-5-20251101
2. glm/glm-4.7
3. minimax/MiniMax-M2.1

Drag to reorder. Priority runs from top to bottom.

Step 4: Save

Click "Save Combo"
→ Combo appears in model list

Step 5: Use in CLI

Cursor/Cline/Any tool:
  Model: premium-coding

Example Combos

Example 1: Premium Coding (Subscription → Cheap → Free)

Goal: Maximize subscription value, minimize extra costs.

Dashboard → Combos → Create New

Name: premium-coding
Models:
  1. cc/claude-opus-4-5-20251101
  2. glm/glm-4.7
  3. minimax/MiniMax-M2.1

Usage:

Cursor IDE:
  Model: premium-coding

Behavior:

Morning (fresh quota):
  Request → cc/claude-opus-4-5 ✅

Afternoon (Claude quota out):
  Request → glm/glm-4.7 ✅ (auto switched)

Evening (GLM quota out):
  Request → minimax/MiniMax-M2.1 ✅ (auto switched)

Monthly cost (100M tokens):

80M via Claude Code: $0 (subscription)
15M via GLM: $9
5M via MiniMax: $1
Total: $10 + your subscription

Savings: ~99% vs ChatGPT API ($2000).
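
The arithmetic behind these figures can be checked with a short Python sketch. The per-million prices and the $2000 baseline are the numbers quoted on this page, not values read from 9Router. `monthly_cost` is a hypothetical helper.

```python
# Per-million-token prices quoted on this page; subscription traffic adds no extra cost.
PRICE_PER_MILLION = {"cc": 0.00, "glm": 0.60, "minimax": 0.20}

# Baseline used on this page for the same volume on the ChatGPT API.
BASELINE_USD = 2000


def monthly_cost(tokens_millions: dict[str, float]) -> float:
    """Sum price * volume over each provider (volumes in millions of tokens)."""
    return sum(PRICE_PER_MILLION[p] * m for p, m in tokens_millions.items())


usage = {"cc": 80, "glm": 15, "minimax": 5}  # 100M tokens in total
extra = monthly_cost(usage)
savings = 1 - extra / BASELINE_USD

print(f"${extra:.2f}")   # $10.00
print(f"{savings:.1%}")  # 99.5%
```

Swapping in the volumes from another combo gives that combo's extra monthly cost. Prices for providers not listed here would have to be added to the table.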


Example 2: Budget Combo (Cheap → Free)

Goal: Minimize costs, use free tier as backup.

Dashboard → Combos → Create New

Name: budget-combo
Models:
  1. glm/glm-4.7
  2. minimax/MiniMax-M2.1
  3. if/kimi-k2-thinking

Usage:

Cline:
  Provider: OpenAI Compatible
  Base URL: http://localhost:20128/v1
  Model: budget-combo

Behavior:

Request → glm/glm-4.7
  ✅ Daily quota available → Use GLM ($0.60/1M)
  ❌ Quota exhausted → Try MiniMax ($0.20/1M)
  ❌ MiniMax quota out → Use iFlow (FREE)

Monthly cost (100M tokens):

70M via GLM: $42
20M via MiniMax: $4
10M via iFlow: $0
Total: $46 vs $2000 on ChatGPT API

Savings: ~98%.


Example 3: Free Combo (Zero Cost)

Goal: 100% free, no costs ever.

Dashboard → Combos → Create New

Name: free-combo
Models:
  1. if/kimi-k2-thinking
  2. qw/qwen3-coder-plus
  3. kr/claude-sonnet-4.5

Usage:

Claude Desktop:
  Model: free-combo

Behavior:

Request → if/kimi-k2-thinking
  ✅ Available → Use iFlow
  ❌ Error → Try Qwen
  ❌ Error → Try Kiro

Monthly cost:

100M tokens via free providers: $0
Total: $0 forever

Use case: Personal projects, learning, experimentation.


Example 4: Quality First (Premium Models Only)

Goal: Best quality, no cheap fallback.

Dashboard → Combos → Create New

Name: quality-first
Models:
  1. cc/claude-opus-4-5-20251101
  2. cx/gpt-5.2-codex
  3. gc/gemini-3-pro-preview

Usage:

Codex CLI:
  export OPENAI_BASE_URL="http://localhost:20128"
  Model: quality-first

Behavior:

Request → cc/claude-opus-4-5
  ❌ Quota out → cx/gpt-5.2-codex
  ❌ Quota out → gc/gemini-3-pro-preview
  ❌ All out → Return error (no cheap fallback)

Use case: Critical production code, complex refactoring.


Example 5: Multi-Subscription (Maximize All)

Goal: Use all subscriptions before paying extra.

Dashboard → Combos → Create New

Name: multi-sub
Models:
  1. gc/gemini-3-flash-preview (FREE 180K/month)
  2. cc/claude-opus-4-5-20251101 (Pro subscription)
  3. cx/gpt-5.2-codex (Plus subscription)
  4. gh/gpt-5 (Copilot subscription)
  5. glm/glm-4.7 (Cheap backup)
  6. if/kimi-k2-thinking (Free emergency)

Monthly cost (200M tokens):

50M via Gemini CLI: $0 (free tier)
80M via Claude Code: $0 (subscription)
40M via Codex: $0 (subscription)
20M via Copilot: $0 (subscription)
8M via GLM: $4.80
2M via iFlow: $0
Total: $4.80 + existing subscriptions

Result: 190M tokens come from subscriptions and the Gemini free tier, 2M from iFlow, and only 8M are paid ($4.80 extra).


Example 6: Quota Reset Optimization

Goal: Distribute usage based on reset times.

Dashboard → Combos → Create New

Name: reset-optimized
Models:
  1. cc/claude-opus-4-5 (5h reset, use morning)
  2. gc/gemini-3-flash (1K/day, use afternoon)
  3. glm/glm-4.7 (daily 10AM reset, use evening)
  4. minimax/MiniMax-M2.1 (5h rolling, use night)
  5. if/kimi-k2-thinking (unlimited, emergency)

Daily routine:

08:00 - 13:00: Claude Code (fresh 5h quota)
13:00 - 18:00: Gemini CLI (1K/day quota)
18:00 - 22:00: GLM (resets 10AM next day)
22:00 - 08:00: MiniMax (5h rolling) or iFlow

Result: Code 24/7 with minimal costs.
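
The daily routine can be written down as a lookup table. The Python sketch below only restates the expected outcome of the schedule above. It is not how 9Router picks a model, since the combo still falls back in list order. `expected_provider` is a hypothetical helper.

```python
# (start_hour, end_hour, provider) windows copied from the daily routine above.
ROUTINE = [
    (8, 13, "Claude Code"),
    (13, 18, "Gemini CLI"),
    (18, 22, "GLM"),
    (22, 8, "MiniMax or iFlow"),  # this window wraps past midnight
]


def expected_provider(hour: int) -> str:
    """Return the provider expected to serve requests at the given hour (0-23)."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    for start, end, provider in ROUTINE:
        if start < end:
            if start <= hour < end:
                return provider
        elif hour >= start or hour < end:  # wrap-around window
            return provider
    raise ValueError(f"no window covers hour {hour}")


print(expected_provider(9))   # Claude Code
print(expected_provider(23))  # MiniMax or iFlow
```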


Use Combos in CLI Tools

Cursor IDE

Settings → Models → Advanced:
  OpenAI API Base URL: http://localhost:20128/v1
  OpenAI API Key: [from dashboard]
  Model: premium-coding

Claude Desktop

Edit ~/.claude/config.json:

{
  "anthropic_api_base": "http://localhost:20128/v1",
  "anthropic_api_key": "your-9router-api-key",
  "model": "budget-combo"
}

Codex CLI

export OPENAI_BASE_URL="http://localhost:20128"
export OPENAI_API_KEY="your-9router-api-key"

codex --model quality-first "your prompt"

Cline / Continue / RooCode

Provider: OpenAI Compatible
Base URL: http://localhost:20128/v1
API Key: [from dashboard]
Model: free-combo

API Request

curl http://localhost:20128/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "premium-coding",
    "messages": [
      {"role": "user", "content": "Write a function to..."}
    ],
    "stream": true
  }'
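
The same request can be built from Python with only the standard library. This sketch mirrors the curl call above. `build_chat_request` and `stream_chat` are hypothetical helpers, and `stream_chat` assumes a running 9Router instance that streams its response line by line.

```python
import json
import urllib.request

BASE_URL = "http://localhost:20128/v1"


def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build the POST request for /chat/completions without sending it."""
    body = {
        "model": model,  # a combo name is passed exactly like a model name
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def stream_chat(request: urllib.request.Request):
    """Yield non-empty response lines as they arrive (needs a running 9Router)."""
    with urllib.request.urlopen(request) as response:
        for raw in response:
            line = raw.decode("utf-8").strip()
            if line:
                yield line


request = build_chat_request("premium-coding", "Write a function to...", "your-api-key")
print(request.full_url)  # http://localhost:20128/v1/chat/completions
```

To send it, iterate over `stream_chat(request)` and print each line.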

Best Practices

1. Always Include Free Tier

✅ Good:
cc/claude-opus → glm/glm-4.7 → if/kimi-k2-thinking

❌ Bad:
cc/claude-opus → glm/glm-4.7
(no free fallback, can run out of quota)

Why: Ensures 24/7 availability, never blocked by quota.

2. Order by Cost (Cheap to Expensive)

✅ Good:
minimax/MiniMax-M2.1 → glm/glm-4.7 → cc/claude-opus

❌ Bad:
cc/claude-opus → glm/glm-4.7
(wastes subscription quota on simple tasks)

Exception: If you want to maximize subscription value, put subscription first.

3. Match Quality Requirements

For production code:
cc/claude-opus → cx/gpt-5.2-codex → glm/glm-4.7

For quick tasks:
glm/glm-4.7 → if/kimi-k2-thinking

For experimentation:
if/kimi-k2-thinking → qw/qwen3-coder-plus

4. Consider Quota Reset Times

Morning combo (fresh quotas):
cc/claude-opus → cx/gpt-5.2-codex

Evening combo (quotas likely exhausted):
glm/glm-4.7 → minimax/MiniMax-M2.1 → if/kimi-k2-thinking

5. Create Multiple Combos for Different Use Cases

premium-coding: For complex tasks
budget-combo: For simple tasks
free-combo: For experimentation
quality-first: For production code

Switch between combos based on task requirements.

6. Monitor Combo Performance

Dashboard → Analytics → Combo Usage:
  premium-coding:
    80% via cc/claude-opus (good, using subscription)
    15% via glm/glm-4.7 (acceptable backup)
    5% via minimax (rare fallback)

Optimize: If fallback usage is too high, increase the primary model's quota or reorder the models.
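
The percentages in such a breakdown are per-model request shares. Here is a minimal Python sketch of the calculation, assuming a plain list that records which model served each request. The dashboard's real data format is not described here, and `usage_share` is a hypothetical helper.

```python
from collections import Counter


def usage_share(served_by: list[str]) -> dict[str, float]:
    """Return the fraction of requests served by each model."""
    counts = Counter(served_by)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {model: count / total for model, count in counts.items()}


# Hypothetical log: one entry per request, naming the model that answered it.
log = ["cc/claude-opus"] * 80 + ["glm/glm-4.7"] * 15 + ["minimax"] * 5

share = usage_share(log)
fallback = 1 - share["cc/claude-opus"]  # everything not served by the primary model

print(f"{share['cc/claude-opus']:.0%}")  # 80%
print(f"{fallback:.0%}")                 # 20%
```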


Advanced Configuration

Set Budget Limits per Combo

Dashboard → Combos → Edit → Budget:
  Daily limit: $5
  Monthly limit: $50

When a limit is reached, 9Router skips paid models and uses the free tier only.
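
The budget rule can be sketched in Python as a filter over the combo's model list. This is an illustration, not 9Router's code. `ComboModel`, its `paid` flag, and `eligible_models` are hypothetical, and the defaults are the $5 daily and $50 monthly limits shown above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComboModel:
    name: str
    paid: bool  # assumption: True for pay-per-token models, False for free tiers


BUDGET_COMBO = [
    ComboModel("glm/glm-4.7", paid=True),
    ComboModel("minimax/MiniMax-M2.1", paid=True),
    ComboModel("if/kimi-k2-thinking", paid=False),
]


def eligible_models(models, spent_today, spent_month,
                    daily_limit=5.0, monthly_limit=50.0):
    """Return model names in priority order, dropping paid ones once a limit is hit."""
    over_budget = spent_today >= daily_limit or spent_month >= monthly_limit
    return [m.name for m in models if not (over_budget and m.paid)]


# Under both limits: the full chain is available.
print(eligible_models(BUDGET_COMBO, spent_today=1.20, spent_month=18.00))
# Daily limit reached: only the free tier remains.
print(eligible_models(BUDGET_COMBO, spent_today=5.00, spent_month=18.00))
```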

Enable/Disable Models in Combo

Dashboard → Combos → Edit → Models:
  ✅ cc/claude-opus-4-5 (enabled)
  ❌ glm/glm-4.7 (temporarily disabled)
  ✅ if/kimi-k2-thinking (enabled)

Use case: Temporarily disable expensive models without deleting the combo.

Clone Existing Combo

Dashboard → Combos → Clone "premium-coding"
→ Creates copy with "-copy" suffix
→ Modify and save as new combo

Use case: Create variations for different scenarios.


Troubleshooting

Issue: Combo not appearing in model list

Solution:

  1. Refresh the dashboard
  2. Check that the combo is saved (green checkmark)
  3. Restart the CLI tool to refresh its model list

Issue: Combo always uses last model (free tier)

Solution:

  1. Check the quota for primary models (Dashboard → Quota)
  2. Verify that API keys are valid (Dashboard → Providers)
  3. Check that budget limits are not exceeded

Issue: Combo costs more than expected

Solution:

  1. Dashboard → Analytics → Review combo usage
  2. Check if primary models are quota-exhausted
  3. Reorder models (put cheaper first)
  4. Set budget limits