# Combos - Custom Fallback Chains

Create custom model combinations with automatic fallback. Combos let you define your own routing strategy based on cost, quality, and availability.

---

## What Are Combos?

Combos are **custom fallback chains** that you create in the dashboard. Instead of using a single model, you define a sequence of models that 9Router tries in order.

**Example:**
```
Combo name: premium-coding
Models:
  1. cc/claude-opus-4-5-20251101 (try first)
  2. glm/glm-4.7 (if #1 quota exhausted)
  3. minimax/MiniMax-M2.1 (if #2 quota exhausted)
```

**Usage in CLI:**
```
Model: premium-coding
```

9Router automatically tries each model in sequence until one succeeds.
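
The sequential behavior can be modeled in a few lines of Python. This is a minimal sketch of the idea, not 9Router's actual implementation: `QuotaExhausted`, `run_combo`, and `fake_send` are hypothetical names introduced only for illustration.

```python
from typing import Callable, Sequence, Tuple


class QuotaExhausted(Exception):
    """Hypothetical error raised when a model has no quota left."""


def run_combo(models: Sequence[str], send: Callable[[str], str]) -> Tuple[str, str]:
    """Try each model in order; return (model, response) from the first success."""
    failures = []
    for model in models:
        try:
            return model, send(model)
        except QuotaExhausted as exc:
            failures.append(f"{model}: {exc}")  # remember why we moved on
    # Every model in the chain failed, so surface the error to the caller.
    raise RuntimeError("all models in combo failed: " + "; ".join(failures))


# Stand-in for a real provider call: pretend the first model is out of quota.
def fake_send(model: str) -> str:
    if model == "cc/claude-opus-4-5-20251101":
        raise QuotaExhausted("quota exhausted")
    return f"response from {model}"


premium_coding = [
    "cc/claude-opus-4-5-20251101",
    "glm/glm-4.7",
    "minimax/MiniMax-M2.1",
]
used_model, reply = run_combo(premium_coding, fake_send)
print(used_model)
```

In this sketch the caller only ever sees one model name (the combo); which underlying model answered is decided inside the loop.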

---

## Why Use Combos?

### 1. Maximize Subscription Value
```
cc/claude-opus → glm/glm-4.7 → if/kimi-k2-thinking

→ Use subscription first, cheap backup, free emergency
→ Get full value from subscriptions you already pay for
```

### 2. Minimize Costs
```
glm/glm-4.7 → minimax/MiniMax-M2.1 → if/kimi-k2-thinking

→ Start with cheapest paid option ($0.60/1M)
→ Fall back to an even cheaper option ($0.20/1M)
→ Emergency free tier
→ Total cost: ~$5-10/month vs $2000 on ChatGPT API
```

### 3. Ensure 24/7 Availability
```
cc/claude-opus → cx/gpt-5.2-codex → glm/glm-4.7 → if/kimi-k2-thinking

→ Always include free tier at the end
→ Never run out of quota
→ Code anytime, anywhere
```

### 4. Optimize for Quality
```
cc/claude-opus-4-5 → cx/gpt-5.2-codex → gc/gemini-3-pro

→ Best models first
→ Fallback to other premium models
→ Maintain high quality across fallback chain
```

---

## How to Create Combos

### Step 1: Open Dashboard

```
http://localhost:20128
→ Login with your password
```

### Step 2: Navigate to Combos

```
Dashboard → Combos → Create New Combo
```

### Step 3: Configure Combo

**Combo Name:**
```
premium-coding
```

**Description (optional):**
```
Subscription first, cheap backup, free emergency
```

**Select Models:**
```
1. cc/claude-opus-4-5-20251101
2. glm/glm-4.7
3. minimax/MiniMax-M2.1
```

**Drag to reorder**: priority runs from top to bottom.

### Step 4: Save

```
Click "Save Combo"
→ Combo appears in model list
```

### Step 5: Use in CLI

```
Cursor/Cline/Any tool:
  Model: premium-coding
```

---

## Example Combos

### Example 1: Premium Coding (Subscription → Cheap → Free)

**Goal**: Maximize subscription value, minimize extra costs.

```
Dashboard → Combos → Create New

Name: premium-coding
Models:
  1. cc/claude-opus-4-5-20251101
  2. glm/glm-4.7
  3. minimax/MiniMax-M2.1
```

**Usage:**
```
Cursor IDE:
  Model: premium-coding
```

**Behavior:**
```
Morning (fresh quota):
  Request → cc/claude-opus-4-5 ✅

Afternoon (Claude quota out):
  Request → glm/glm-4.7 ✅ (auto switched)

Evening (GLM quota out):
  Request → minimax/MiniMax-M2.1 ✅ (auto switched)
```

**Monthly cost (100M tokens):**
```
80M via Claude Code: $0 (subscription)
15M via GLM: $9
5M via MiniMax: $1
Total: $10 + your subscription
```

**Savings**: ~99% vs ChatGPT API ($2000).
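
The arithmetic behind these figures can be reproduced with a short script. It is a sketch under stated assumptions: the per-million prices ($0.60 for GLM, $0.20 for MiniMax) and the $2000 baseline come from this guide, subscription usage is treated as $0 extra cost, and the function names are hypothetical.

```python
# Prices in USD per 1M tokens, taken from the examples in this guide.
# Subscription usage is treated as $0 marginal cost (an assumption).
PRICE_PER_MILLION = {
    "cc/claude-opus-4-5-20251101": 0.00,  # covered by subscription
    "glm/glm-4.7": 0.60,
    "minimax/MiniMax-M2.1": 0.20,
}


def monthly_cost(usage_millions: dict) -> float:
    """Sum the extra cost for a {model: millions of tokens} usage split."""
    total = sum(PRICE_PER_MILLION[m] * t for m, t in usage_millions.items())
    return round(total, 2)


def savings_percent(cost: float, baseline: float) -> float:
    """Percentage saved relative to a baseline bill."""
    return round(100 * (1 - cost / baseline), 1)


usage = {
    "cc/claude-opus-4-5-20251101": 80,
    "glm/glm-4.7": 15,
    "minimax/MiniMax-M2.1": 5,
}
extra = monthly_cost(usage)
print(extra, savings_percent(extra, 2000))
```

Changing the `usage` split shows how sensitive the bill is to how often the chain falls through to paid models.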

---

### Example 2: Budget Combo (Cheap → Free)

**Goal**: Minimize costs, use free tier as backup.

```
Dashboard → Combos → Create New

Name: budget-combo
Models:
  1. glm/glm-4.7
  2. minimax/MiniMax-M2.1
  3. if/kimi-k2-thinking
```

**Usage:**
```
Cline:
  Provider: OpenAI Compatible
  Base URL: http://localhost:20128/v1
  Model: budget-combo
```

**Behavior:**
```
Request → glm/glm-4.7
  ✅ Daily quota available → Use GLM ($0.60/1M)
  ❌ Quota exhausted → Try MiniMax ($0.20/1M)
  ❌ MiniMax quota out → Use iFlow (FREE)
```

**Monthly cost (100M tokens):**
```
70M via GLM: $42
20M via MiniMax: $4
10M via iFlow: $0
Total: $46 vs $2000 on ChatGPT API
```

**Savings**: ~97.7% vs ChatGPT API ($2000).

---

### Example 3: Free Combo (Zero Cost)

**Goal**: 100% free, no costs ever.

```
Dashboard → Combos → Create New

Name: free-combo
Models:
  1. if/kimi-k2-thinking
  2. qw/qwen3-coder-plus
  3. kr/claude-sonnet-4.5
```

**Usage:**
```
Claude Desktop:
  Model: free-combo
```

**Behavior:**
```
Request → if/kimi-k2-thinking
  ✅ Available → Use iFlow
  ❌ Error → Try Qwen
  ❌ Error → Try Kiro
```

**Monthly cost:**
```
100M tokens via free providers: $0
Total: $0 forever
```

**Use case**: Personal projects, learning, experimentation.

---

### Example 4: Quality First (Premium Models Only)

**Goal**: Best quality, no cheap fallback.

```
Dashboard → Combos → Create New

Name: quality-first
Models:
  1. cc/claude-opus-4-5-20251101
  2. cx/gpt-5.2-codex
  3. gc/gemini-3-pro-preview
```

**Usage:**
```
Codex CLI:
  export OPENAI_BASE_URL="http://localhost:20128"
  Model: quality-first
```

**Behavior:**
```
Request → cc/claude-opus-4-5
  ❌ Quota out → cx/gpt-5.2-codex
  ❌ Quota out → gc/gemini-3-pro-preview
  ❌ All out → Return error (no cheap fallback)
```

**Use case**: Critical production code, complex refactoring.

---

### Example 5: Multi-Subscription (Maximize All)

**Goal**: Use all subscriptions before paying extra.

```
Dashboard → Combos → Create New

Name: multi-sub
Models:
  1. gc/gemini-3-flash-preview (FREE 180K/month)
  2. cc/claude-opus-4-5-20251101 (Pro subscription)
  3. cx/gpt-5.2-codex (Plus subscription)
  4. gh/gpt-5 (Copilot subscription)
  5. glm/glm-4.7 (Cheap backup)
  6. if/kimi-k2-thinking (Free emergency)
```

**Monthly cost (200M tokens):**
```
50M via Gemini CLI: $0 (free tier)
80M via Claude Code: $0 (subscription)
40M via Codex: $0 (subscription)
20M via Copilot: $0 (subscription)
8M via GLM: $4.80
2M via iFlow: $0
Total: $4.80 + existing subscriptions
```

**Result**: 190M tokens come from the Gemini free tier and your subscriptions; the remaining 10M cost only $4.80 extra.

---

### Example 6: Quota Reset Optimization

**Goal**: Distribute usage based on reset times.

```
Dashboard → Combos → Create New

Name: reset-optimized
Models:
  1. cc/claude-opus-4-5 (5h reset, use morning)
  2. gc/gemini-3-flash (1K/day, use afternoon)
  3. glm/glm-4.7 (daily 10AM reset, use evening)
  4. minimax/MiniMax-M2.1 (5h rolling, use night)
  5. if/kimi-k2-thinking (unlimited, emergency)
```

**Daily routine:**
```
08:00 - 13:00: Claude Code (fresh 5h quota)
13:00 - 18:00: Gemini CLI (1K/day quota)
18:00 - 22:00: GLM (resets 10AM next day)
22:00 - 08:00: MiniMax (5h rolling) or iFlow
```

**Result**: Code 24/7 with minimal costs.
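
The routine above is the expected outcome of quota resets, not a scheduler: the combo still just tries models in order. As a rough sketch, the hypothetical helper below encodes the timetable so you can compare it against what your analytics actually show.

```python
def expected_model(hour: int) -> str:
    """Return the model the daily routine expects to be serving at a given hour (0-23)."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    if 8 <= hour < 13:
        return "cc/claude-opus-4-5"
    if 13 <= hour < 18:
        return "gc/gemini-3-flash"
    if 18 <= hour < 22:
        return "glm/glm-4.7"
    # 22:00 - 08:00 wraps past midnight; MiniMax is expected first,
    # with if/kimi-k2-thinking as the emergency fallback behind it.
    return "minimax/MiniMax-M2.1"


for hour in (9, 15, 20, 2):
    print(hour, expected_model(hour))
```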

---

## Use Combos in CLI Tools

### Cursor IDE

```
Settings → Models → Advanced:
  OpenAI API Base URL: http://localhost:20128/v1
  OpenAI API Key: [from dashboard]
  Model: premium-coding
```

### Claude Desktop

Edit `~/.claude/config.json`:
```json
{
  "anthropic_api_base": "http://localhost:20128/v1",
  "anthropic_api_key": "your-9router-api-key",
  "model": "budget-combo"
}
```

### Codex CLI

```bash
export OPENAI_BASE_URL="http://localhost:20128"
export OPENAI_API_KEY="your-9router-api-key"

codex --model quality-first "your prompt"
```

### Cline / Continue / RooCode

```
Provider: OpenAI Compatible
Base URL: http://localhost:20128/v1
API Key: [from dashboard]
Model: free-combo
```

### API Request

```bash
curl http://localhost:20128/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "premium-coding",
    "messages": [
      {"role": "user", "content": "Write a function to..."}
    ],
    "stream": true
  }'
```
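
The same request can be built from Python using only the standard library. This sketch assumes the endpoint accepts and returns OpenAI-style JSON, as the curl example suggests; `build_chat_request` and `send_chat` are hypothetical helper names, and streaming is turned off to keep the example short.

```python
import json
import urllib.request


def build_chat_request(model: str, prompt: str, api_key: str,
                       base_url: str = "http://localhost:20128/v1") -> urllib.request.Request:
    """Build (but do not send) a chat completion request addressed to a combo."""
    body = {
        "model": model,  # a combo name is passed exactly like a model name
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # non-streaming keeps the sketch simple
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (requires a running 9Router)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


request = build_chat_request("premium-coding", "Write a function to...", "your-api-key")
print(request.full_url)
```

Call `send_chat(request)` once 9Router is running locally; the combo name is the only thing that differs from a single-model request.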

---

## Best Practices

### 1. Always Include Free Tier

```
✅ Good:
cc/claude-opus → glm/glm-4.7 → if/kimi-k2-thinking

❌ Bad:
cc/claude-opus → glm/glm-4.7
(no free fallback, can run out of quota)
```

**Why**: Ensures 24/7 availability, never blocked by quota.

### 2. Order by Cost (Cheap to Expensive)

```
✅ Good:
glm/glm-4.7 → minimax/MiniMax-M2.1 → cc/claude-opus

❌ Bad:
cc/claude-opus → glm/glm-4.7
(wastes subscription quota on simple tasks)
```

**Exception**: To maximize subscription value, put the subscription model first, as in the `premium-coding` example.

### 3. Match Quality Requirements

```
For production code:
cc/claude-opus → cx/gpt-5.2-codex → glm/glm-4.7

For quick tasks:
glm/glm-4.7 → if/kimi-k2-thinking

For experimentation:
if/kimi-k2-thinking → qw/qwen3-coder-plus
```

### 4. Consider Quota Reset Times

```
Morning combo (fresh quotas):
cc/claude-opus → cx/gpt-5.2-codex

Evening combo (quotas likely exhausted):
glm/glm-4.7 → minimax/MiniMax-M2.1 → if/kimi-k2-thinking
```

### 5. Create Multiple Combos for Different Use Cases

```
premium-coding: For complex tasks
budget-combo: For simple tasks
free-combo: For experimentation
quality-first: For production code
```

**Switch between combos** based on task requirements.

### 6. Monitor Combo Performance

```
Dashboard → Analytics → Combo Usage:
  premium-coding:
    80% via cc/claude-opus (good, using subscription)
    15% via glm/glm-4.7 (acceptable backup)
    5% via minimax (rare fallback)
```

**Optimize**: If fallback usage is too high, increase the primary model's quota or reorder the models.
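
If you keep your own record of which model served each request, the same percentages can be computed offline. The sketch below assumes a hypothetical log with one entry per request; the function names are illustrative and not part of 9Router.

```python
from collections import Counter


def usage_share(served_by: list) -> dict:
    """Return the percentage of requests served by each model."""
    counts = Counter(served_by)
    total = sum(counts.values())
    if total == 0:
        return {}  # nothing logged yet
    return {model: round(100 * n / total, 1) for model, n in counts.items()}


def fallback_share(served_by: list, primary: str) -> float:
    """Percentage of requests that were NOT served by the primary model."""
    return round(100 - usage_share(served_by).get(primary, 0.0), 1)


# Hypothetical log: one entry per request, naming the model that answered it.
log = ["cc/claude-opus"] * 80 + ["glm/glm-4.7"] * 15 + ["minimax"] * 5
shares = usage_share(log)
print(shares, fallback_share(log, "cc/claude-opus"))
```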

---

## Advanced Configuration

### Set Budget Limits per Combo

```
Dashboard → Combos → Edit → Budget:
  Daily limit: $5
  Monthly limit: $50
```

When a limit is reached, 9Router skips paid models and uses only the free tier.
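
One way to picture this rule is as a filter over the chain. The sketch below is an assumption about the behavior described here, not 9Router's code: `models_within_budget`, its parameters, and the set of free models are hypothetical.

```python
def models_within_budget(models: list, free_models: set,
                         spent: float, limit: float) -> list:
    """Keep the full chain while under the limit; keep only free models once it is reached."""
    if spent < limit:
        return list(models)
    return [m for m in models if m in free_models]


chain = ["cc/claude-opus-4-5", "glm/glm-4.7", "if/kimi-k2-thinking"]
free = {"if/kimi-k2-thinking"}

print(models_within_budget(chain, free, spent=2.50, limit=5.00))
print(models_within_budget(chain, free, spent=5.00, limit=5.00))
```

The same check would apply to the monthly limit by passing the month's spend and the monthly cap.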

### Enable/Disable Models in Combo

```
Dashboard → Combos → Edit → Models:
  ✅ cc/claude-opus-4-5 (enabled)
  ❌ glm/glm-4.7 (temporarily disabled)
  ✅ if/kimi-k2-thinking (enabled)
```

**Use case**: Temporarily disable expensive models without deleting the combo.

### Clone Existing Combo

```
Dashboard → Combos → Clone "premium-coding"
→ Creates copy with "-copy" suffix
→ Modify and save as new combo
```

**Use case**: Create variations for different scenarios.
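
Conceptually, cloning is a deep copy with a renamed result. The sketch below represents a combo as a plain dictionary, which is an assumption made for illustration; `clone_combo` is a hypothetical helper, not a 9Router API.

```python
import copy


def clone_combo(combo: dict) -> dict:
    """Return an independent copy of a combo with a "-copy" suffix on its name."""
    clone = copy.deepcopy(combo)  # deep copy so the model lists are not shared
    clone["name"] = combo["name"] + "-copy"
    return clone


premium = {
    "name": "premium-coding",
    "models": ["cc/claude-opus-4-5-20251101", "glm/glm-4.7", "minimax/MiniMax-M2.1"],
}
variant = clone_combo(premium)
variant["models"].append("if/kimi-k2-thinking")  # edit the copy only
print(variant["name"], len(premium["models"]), len(variant["models"]))
```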

---

## Troubleshooting

**Issue: Combo not appearing in model list**

**Solution:**
1. Refresh the dashboard
2. Check that the combo is saved (green checkmark)
3. Restart the CLI tool to refresh its model list

**Issue: Combo always uses last model (free tier)**

**Solution:**
1. Check quota for primary models (Dashboard → Quota)
2. Verify API keys are valid (Dashboard → Providers)
3. Check that budget limits have not been exceeded

**Issue: Combo costs more than expected**

**Solution:**
1. Dashboard → Analytics → Review combo usage
2. Check if primary models are quota-exhausted
3. Reorder models (put cheaper first)
4. Set budget limits

---

## Related

- [Smart Routing](./smart-routing.md) - How auto fallback works
- [Quota Tracking](./quota-tracking.md) - Monitor usage and costs