2026-05-11 11:50:24 +07:00

8.6 KiB

Raw Blame History

Smart Routing & Auto Fallback

9Router automatically routes your requests through the best available provider using a 3-tier fallback system. Never stop coding due to quota limits or rate limiting.

How It Works

9Router uses intelligent routing to maximize your existing subscriptions, minimize costs, and ensure 24/7 availability:

Request → 9Router → Check Tier 1 (Subscription)
                     ↓ quota exhausted
                     Check Tier 2 (Cheap)
                     ↓ budget limit
                     Check Tier 3 (Free)
                     ↓
                     Response

3-Tier Fallback System

Tier 1: SUBSCRIPTION (Primary)

Claude Code (Pro/Max)
OpenAI Codex (Plus/Pro)
Gemini CLI (FREE 180K/month)
GitHub Copilot
Antigravity (Google)

Goal: Maximize value from subscriptions you already pay for.

Tier 2: CHEAP (Backup)

GLM-4.7 ($0.60/1M input)
MiniMax M2.1 ($0.20/1M input)
Kimi K2 ($9/month flat)

Goal: Ultra-cheap backup when subscription quota runs out (~90% cheaper than ChatGPT API).

Tier 3: FREE (Emergency)

iFlow (8 models)
Qwen (3 models)
Kiro (Claude FREE)

Goal: Zero-cost fallback for unlimited coding.

Automatic Switching

9Router monitors quota in real-time and switches providers automatically:

Scenario 1: Subscription Quota Exhausted

User request → cc/claude-opus-4-5
               ↓ quota exhausted (5-hour limit reached)
               Auto switch → glm/glm-4.7
               ↓ daily quota exhausted
               Auto switch → minimax/MiniMax-M2.1
               ↓ 5-hour quota exhausted
               Auto switch → if/kimi-k2-thinking (FREE)
               ↓
               Response delivered ✅

Result: Zero downtime, seamless experience.

Scenario 2: Rate Limiting

User request → cx/gpt-5.2-codex
               ↓ rate limited (too many requests)
               Auto switch → glm/glm-4.7
               ↓
               Response delivered ✅

Scenario 3: Provider Unavailable

User request → cc/claude-opus-4-5
               ↓ provider error (503)
               Auto switch → next available model
               ↓
               Response delivered ✅

Model Selection Logic

9Router selects the best model based on:

Quota availability - Check if provider has remaining quota
Cost tier - Prefer subscription → cheap → free
Reset timing - Consider when quota resets
Provider health - Skip providers with errors

Priority Order Example

For a request to cc/claude-opus-4-5:

1. Check Claude Code quota
   ✅ Available → Use cc/claude-opus-4-5
   ❌ Exhausted → Continue to step 2

2. Check fallback tier (if configured)
   ✅ GLM quota available → Use glm/glm-4.7
   ❌ Exhausted → Continue to step 3

3. Check free tier
   ✅ iFlow available → Use if/kimi-k2-thinking
   ❌ All exhausted → Return quota error

Configuration Options

Dashboard Settings

1. Enable/Disable Auto Fallback

Dashboard → Settings → Smart Routing
→ Toggle "Auto Fallback" ON/OFF

ON (default): Automatic tier switching
OFF: Strict mode, return error if primary model unavailable

2. Set Budget Limits

Dashboard → Settings → Budget Control
→ Daily limit: $5
→ Monthly limit: $50

When budget reached, 9Router automatically switches to free tier.

3. Configure Fallback Order

Dashboard → Settings → Fallback Priority
→ Drag to reorder providers within each tier

Example custom order:

Tier 1: Gemini CLI → Claude Code → Codex
Tier 2: MiniMax → GLM → Kimi
Tier 3: iFlow → Kiro → Qwen

4. Quota Reset Notifications

Dashboard → Settings → Notifications
→ Email when quota resets
→ Alert when 80% quota used

Examples

Example 1: Basic Auto Fallback

Setup:

Model: cc/claude-opus-4-5-20251101
Fallback: Auto (default 3-tier)

Behavior:

Morning (fresh quota):
  Request → cc/claude-opus-4-5 ✅

Afternoon (quota exhausted):
  Request → glm/glm-4.7 ✅ (auto switched)

Evening (GLM quota out):
  Request → minimax/MiniMax-M2.1 ✅ (auto switched)

Late night (all paid quota out):
  Request → if/kimi-k2-thinking ✅ (free tier)

Cost: ~$5-10/month extra (mostly covered by subscription).

Example 2: Budget-Conscious Routing

Setup:

Dashboard → Settings:
  Daily budget: $2
  Monthly budget: $20
  Fallback: Enabled

Behavior:

Day 1-15 (within budget):
  Requests → glm/glm-4.7 (cheap tier)
  Cost: $1.50/day

Day 16 (budget reached):
  Requests → if/kimi-k2-thinking (free tier)
  Cost: $0

Next month (budget resets):
  Requests → glm/glm-4.7 again

Result: Never exceed $20/month, always available.

Example 3: Subscription-Only Mode

Setup:

Dashboard → Settings:
  Auto Fallback: OFF
  Strict mode: ON

Behavior:

Request → cc/claude-opus-4-5
  ✅ Quota available → Success
  ❌ Quota exhausted → Return error (no fallback)

Use case: When you only want to use paid subscriptions, no extra costs.

Example 4: Free-Only Mode

Setup:

Model: if/kimi-k2-thinking
Fallback: qw/qwen3-coder-plus → kr/claude-sonnet-4.5

Behavior:

All requests → Free tier only
Cost: $0 forever

Use case: Personal projects, learning, experimentation.

Best Practices

1. Maximize Subscription Value

Strategy:
- Set subscription models as Tier 1
- Monitor quota usage in dashboard
- Use cheap tier only when subscription exhausted

Example combo:

cc/claude-opus-4-5 → glm/glm-4.7 → if/kimi-k2-thinking

2. Optimize for Cost

Strategy:
- Use Gemini CLI free tier first (180K/month)
- Fallback to GLM/MiniMax (ultra-cheap)
- Emergency: iFlow (free)

Example combo:

gc/gemini-3-flash-preview → glm/glm-4.7 → if/kimi-k2-thinking

3. Optimize for Quality

Strategy:
- Use best models (Claude Opus, GPT-5.2)
- Fallback to good cheap models (GLM-4.7)
- Last resort: Free tier

Example combo:

cc/claude-opus-4-5 → cx/gpt-5.2-codex → glm/glm-4.7

4. 24/7 Availability

Strategy:
- Always include free tier in fallback
- Monitor quota reset times
- Distribute usage across providers

Example combo:

cc/claude-opus-4-5 → glm/glm-4.7 → minimax/MiniMax-M2.1 → if/kimi-k2-thinking

Result: Never run out of quota, code anytime.

Quota Reset Strategy

Plan your usage around quota reset times:

Provider	Quota Reset	Strategy
Claude Code	5-hour + weekly	Use in morning, fresh quota
Codex	5-hour + weekly	Use after Claude quota out
Gemini CLI	Daily (1K) + Monthly (180K)	Use throughout day
GLM-4.7	Daily 10:00 AM	Use evening, resets next morning
MiniMax M2.1	5-hour rolling	Use anytime, tracks rolling window
iFlow/Qwen/Kiro	No limit	Emergency backup

Daily routine example:

08:00 - 13:00: Claude Code (fresh 5h quota)
13:00 - 18:00: Gemini CLI (1K/day quota)
18:00 - 22:00: GLM-4.7 (cheap, resets 10AM)
22:00 - 08:00: MiniMax or iFlow (5h rolling or free)

Monitoring & Alerts

Dashboard Quota Tracker

Dashboard → Quota Overview:
  Claude Code: 2.5h / 5h remaining (50%)
  Gemini CLI: 450 / 1000 requests today
  GLM-4.7: 5M / 10M tokens (resets in 8h)
  MiniMax: 3M / 5M tokens (rolling 5h)

Real-Time Notifications

Dashboard → Notifications:
  ⚠️ Claude Code quota 80% used (1h remaining)
  ✅ GLM-4.7 quota reset (10M tokens available)
  💰 Daily budget 50% used ($2.50 / $5)

Usage Analytics

Dashboard → Analytics:
  Today: 50M tokens
    - 30M via Claude Code (subscription)
    - 15M via GLM-4.7 ($9)
    - 5M via iFlow (free)
  
  Cost: $9 (vs $1000 on ChatGPT API)
  Savings: 99%

Troubleshooting

Issue: "All providers quota exhausted"

Solution:

Check dashboard quota tracker
Wait for quota reset (see countdown)
Add free tier to fallback chain
Or increase budget limit

Issue: "Too many fallback switches"

Solution:

Check if primary provider is down
Increase quota limits (upgrade subscription)
Use cheaper primary model (GLM instead of Claude)

Issue: "Unexpected costs"

Solution:

Dashboard → Analytics → Review usage
Set daily/monthly budget limits
Switch to free tier for non-critical tasks
Use combos with free fallback

Combos - Create custom fallback chains
Quota Tracking - Monitor usage and costs

8.6 KiB Raw Blame History

Smart Routing & Auto Fallback

How It Works

3-Tier Fallback System

Automatic Switching

Scenario 1: Subscription Quota Exhausted

Scenario 2: Rate Limiting

Scenario 3: Provider Unavailable

Model Selection Logic

Priority Order Example

Configuration Options

Dashboard Settings

Examples

Example 1: Basic Auto Fallback

Example 2: Budget-Conscious Routing

Example 3: Subscription-Only Mode

Example 4: Free-Only Mode

Best Practices

1. Maximize Subscription Value

2. Optimize for Cost

3. Optimize for Quality

4. 24/7 Availability

Quota Reset Strategy

Monitoring & Alerts

Dashboard Quota Tracker

Real-Time Notifications

Usage Analytics

Troubleshooting

Related

8.6 KiB

Raw Blame History