marketing-shibata50/9router

Unlimited FREE AI coding. Connect Claude Code, Codex, Cursor, Cline, Copilot, Antigravity to FREE Claude/GPT/Gemini via 40+ providers. Auto-fallback, RTK -40% tokens, never hit limits. https://9router.com

9Router - Free AI Router

Never stop coding. Auto-route to FREE & cheap AI models with smart fallback.

Free AI Provider for OpenClaw.

🚀 Quick Start • 💡 Features • 📖 Setup • 🌐 Website

🤔 Why 9Router?

Stop wasting money and hitting limits:

❌ Subscription quota expires unused every month
❌ Rate limits stop you mid-coding
❌ Expensive APIs ($20-50/month per provider)
❌ Manual switching between providers

9Router solves this:

✅ Maximize subscriptions - Track quota, use every bit before reset
✅ Auto fallback - Subscription → Cheap → Free, zero downtime
✅ Multi-account - Round-robin between accounts per provider
✅ Universal - Works with Claude Code, Codex, Gemini CLI, Cursor, Cline, any CLI tool

🔄 How It Works

┌─────────────┐
│  Your CLI   │  (Claude Code, Codex, Gemini CLI, OpenClaw, Cursor, Cline...)
│   Tool      │
└──────┬──────┘
       │ http://localhost:20128/v1
       ↓
┌─────────────────────────────────────────┐
│           9Router (Smart Router)        │
│  • Format translation (OpenAI ↔ Claude) │
│  • Quota tracking                       │
│  • Auto token refresh                   │
└──────┬──────────────────────────────────┘
       │
       ├─→ [Tier 1: SUBSCRIPTION] Claude Code, Codex, Gemini CLI
       │   ↓ quota exhausted
       ├─→ [Tier 2: CHEAP] GLM ($0.6/1M), MiniMax ($0.2/1M)
       │   ↓ budget limit
       └─→ [Tier 3: FREE] iFlow, Qwen, Kiro (unlimited)

Result: Never stop coding, minimal cost

⚡ Quick Start

1. Install globally:

npm install -g 9router
9router

🎉 Dashboard opens at http://localhost:20128

2. Connect a FREE provider (no signup needed):

Dashboard → Providers → Connect Claude Code or Antigravity → OAuth login → Done!

3. Use in your CLI tool:

Claude Code/Codex/Gemini CLI/OpenClaw/Cursor/Cline Settings:
  Endpoint: http://localhost:20128/v1
  API Key: [copy from dashboard]
  Model: if/kimi-k2-thinking

That's it! Start coding with FREE AI models.

Alternative: run from source (this repository):

The package in this repository is private (9router-app), so running from source or Docker is the expected path for local development.

cp .env.example .env
npm install
PORT=20128 NEXT_PUBLIC_BASE_URL=http://localhost:20128 npm run dev

Production mode:

npm run build
PORT=20128 HOSTNAME=0.0.0.0 NEXT_PUBLIC_BASE_URL=http://localhost:20128 npm run start

Default URLs:

Dashboard: http://localhost:20128/dashboard
OpenAI-compatible API: http://localhost:20128/v1

💡 Key Features

Feature	What It Does	Why It Matters
🎯 Smart 3-Tier Fallback	Auto-route: Subscription → Cheap → Free	Never stop coding, zero downtime
📊 Real-Time Quota Tracking	Live token count + reset countdown	Maximize subscription value
🔄 Format Translation	OpenAI ↔ Claude ↔ Gemini seamless	Works with any CLI tool
👥 Multi-Account Support	Multiple accounts per provider	Load balancing + redundancy
🔄 Auto Token Refresh	OAuth tokens refresh automatically	No manual re-login needed
🎨 Custom Combos	Create unlimited model combinations	Tailor fallback to your needs
📝 Request Logging	Debug mode with full request/response logs	Troubleshoot issues easily
💾 Cloud Sync	Sync config across devices	Same setup everywhere
📊 Usage Analytics	Track tokens, cost, trends over time	Optimize spending
🌐 Deploy Anywhere	Localhost, VPS, Docker, Cloudflare Workers	Flexible deployment options

📖 Feature Details

🎯 Smart 3-Tier Fallback

Create combos with automatic fallback:

Combo: "my-coding-stack"
  1. cc/claude-opus-4-6        (your subscription)
  2. glm/glm-4.7               (cheap backup, $0.6/1M)
  3. if/kimi-k2-thinking       (free fallback)

→ Auto switches when quota runs out or errors occur
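
The fallback order above can be sketched in a few lines. This is a minimal illustration, not 9Router's actual routing code: `run_combo`, `QuotaExhausted`, and `fake_call` are hypothetical names, and the provider call is a stub.

```python
from typing import Callable, Sequence


class QuotaExhausted(Exception):
    """Hypothetical error raised by a provider call when its quota is used up."""


def run_combo(models: Sequence[str], call: Callable[[str], str]) -> tuple[str, str]:
    """Try each combo entry in order; return (model, reply) from the first success."""
    errors = []
    for model in models:
        try:
            return model, call(model)
        except Exception as exc:  # quota exhausted or any other provider error
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all combo entries failed: " + "; ".join(errors))


def fake_call(model: str) -> str:
    """Stub provider call: the subscription entry is out of quota."""
    if model == "cc/claude-opus-4-6":
        raise QuotaExhausted("quota exhausted")
    return f"reply from {model}"


combo = ["cc/claude-opus-4-6", "glm/glm-4.7", "if/kimi-k2-thinking"]
used_model, reply = run_combo(combo, fake_call)
print(used_model, "->", reply)
```

Keeping the combo as an ordered list means the priority is just the list order, so a free model at the end only runs after everything above it has failed.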

📊 Real-Time Quota Tracking

Token consumption per provider
Reset countdown (5-hour, daily, weekly)
Cost estimation for paid tiers
Monthly spending reports

🔄 Format Translation

Seamless translation between formats:

OpenAI ↔ Claude ↔ Gemini ↔ OpenAI Responses
Your CLI tool sends OpenAI format → 9Router translates → Provider receives native format
Works with any tool that supports custom OpenAI endpoints
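
As a rough sketch of what one direction of this translation involves, the function below maps an OpenAI-style chat request to a Claude Messages-style body. It assumes plain text messages only (no tools, images, or thinking blocks), passes the model name through unchanged, and uses a hypothetical `default_max_tokens`; 9Router's real translator handles far more cases.

```python
def openai_to_claude(request: dict, default_max_tokens: int = 1024) -> dict:
    """Convert an OpenAI chat request into a Claude Messages-style request body."""
    system_parts = []
    messages = []
    for msg in request["messages"]:
        if msg["role"] == "system":
            # Claude takes the system prompt as a top-level field, not as a message.
            system_parts.append(msg["content"])
        else:
            messages.append({"role": msg["role"], "content": msg["content"]})

    body = {
        "model": request["model"],
        "messages": messages,
        # max_tokens is required by the Messages API, so fall back to an assumed default.
        "max_tokens": request.get("max_tokens", default_max_tokens),
    }
    if system_parts:
        body["system"] = "\n".join(system_parts)
    if "temperature" in request:
        body["temperature"] = request["temperature"]
    if request.get("stream"):
        body["stream"] = True
    return body


claude_body = openai_to_claude({
    "model": "claude-opus-4-6",
    "messages": [
        {"role": "system", "content": "Be brief."},
        {"role": "user", "content": "Write a function to..."},
    ],
})
print(claude_body)
```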

👥 Multi-Account Support

Add multiple accounts per provider
Auto round-robin or priority-based routing
Fallback to next account when one hits quota
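
Round-robin selection with quota fallback could look roughly like this. `AccountPool` is a hypothetical helper written for illustration; it is not taken from the 9Router source.

```python
class AccountPool:
    """Rotate through accounts, skipping any that have been marked as out of quota."""

    def __init__(self, accounts):
        self.accounts = list(accounts)
        self.exhausted = set()
        self._next = 0  # index of the account to try first on the next pick

    def mark_exhausted(self, account):
        self.exhausted.add(account)

    def pick(self):
        n = len(self.accounts)
        for offset in range(n):
            idx = (self._next + offset) % n
            account = self.accounts[idx]
            if account not in self.exhausted:
                self._next = (idx + 1) % n
                return account
        raise LookupError("no account with remaining quota")


pool = AccountPool(["work", "personal"])
print(pool.pick(), pool.pick(), pool.pick())
```

Priority-based routing would be the same loop without advancing `_next`, so the first healthy account is always preferred.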

🔄 Auto Token Refresh

OAuth tokens automatically refresh before expiration
No manual re-authentication needed
Seamless experience across all providers
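
Refreshing before expiration usually comes down to a check like the one below. The 300-second margin and the `needs_refresh` name are assumptions made for this sketch, not documented 9Router settings.

```python
import time

REFRESH_MARGIN_SECONDS = 300  # assumed safety margin, not a 9Router setting


def needs_refresh(expires_at, now=None, margin=REFRESH_MARGIN_SECONDS):
    """Return True when the token expires within `margin` seconds (or already has)."""
    current = time.time() if now is None else now
    return expires_at - current <= margin


# A token that expires in one hour does not need a refresh yet.
print(needs_refresh(time.time() + 3600))
```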

🎨 Custom Combos

Create unlimited model combinations
Mix subscription, cheap, and free tiers
Name your combos for easy access
Share combos across devices with Cloud Sync

📝 Request Logging

Enable debug mode for full request/response logs
Track API calls, headers, and payloads
Troubleshoot integration issues
Export logs for analysis

💾 Cloud Sync

Sync providers, combos, and settings across devices
Automatic background sync
Secure encrypted storage
Access your setup from anywhere

Cloud Runtime Notes

Prefer server-side cloud variables in production:
- BASE_URL (internal callback URL used by sync scheduler)
- CLOUD_URL (cloud sync endpoint base)
NEXT_PUBLIC_BASE_URL and NEXT_PUBLIC_CLOUD_URL are still supported for compatibility/UI, but server runtime now prioritizes BASE_URL/CLOUD_URL.
Cloud sync requests now use timeout + fail-fast behavior to avoid UI hanging when cloud DNS/network is unavailable.
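
The precedence described above can be illustrated as follows. This is a sketch under the assumption that an empty or unset server-side variable falls through to the NEXT_PUBLIC_* value and then to the documented default; `resolve_urls` is a hypothetical helper.

```python
import os


def resolve_server_url(env, server_key, public_key, default):
    """Prefer the server-side variable, then the NEXT_PUBLIC_* one, then the default."""
    return env.get(server_key) or env.get(public_key) or default


def resolve_urls(env=os.environ):
    return {
        "base_url": resolve_server_url(
            env, "BASE_URL", "NEXT_PUBLIC_BASE_URL", "http://localhost:20128"
        ),
        "cloud_url": resolve_server_url(
            env, "CLOUD_URL", "NEXT_PUBLIC_CLOUD_URL", "https://9router.com"
        ),
    }


print(resolve_urls({"NEXT_PUBLIC_BASE_URL": "http://localhost:3000"}))
```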

📊 Usage Analytics

Track token usage per provider and model
Cost estimation and spending trends
Monthly reports and insights
Optimize your AI spending

🌐 Deploy Anywhere

💻 Localhost - Default, works offline
☁️ VPS/Cloud - Share across devices
🐳 Docker - One-command deployment
🚀 Cloudflare Workers - Global edge network

💰 Pricing at a Glance

Tier	Provider	Cost	Quota / Reset	Best For
💳 SUBSCRIPTION	Claude Code (Pro)	$20/mo	5h + weekly	Already subscribed
💳 SUBSCRIPTION	Codex (Plus/Pro)	$20-200/mo	5h + weekly	OpenAI users
💳 SUBSCRIPTION	Gemini CLI	FREE	180K completions/mo + 1K/day	Everyone!
💳 SUBSCRIPTION	GitHub Copilot	$10-19/mo	Monthly	GitHub users
💰 CHEAP	GLM-4.7	$0.6/1M	Daily 10AM	Budget backup
💰 CHEAP	MiniMax M2.1	$0.2/1M	5-hour rolling	Cheapest option
💰 CHEAP	Kimi K2	$9/mo flat	10M tokens/mo	Predictable cost
🆓 FREE	iFlow	$0	Unlimited	8 models free
🆓 FREE	Qwen	$0	Unlimited	3 models free
🆓 FREE	Kiro	$0	Unlimited	Claude free

💡 Pro Tip: Start with a Gemini CLI (180K free completions/month) + iFlow (unlimited free) combo for a total cost of $0!

🎯 Use Cases

Case 1: "I have Claude Pro subscription"

Problem: Quota expires unused, rate limits during heavy coding

Solution:

Combo: "maximize-claude"
  1. cc/claude-opus-4-6        (use subscription fully)
  2. glm/glm-4.7               (cheap backup when quota out)
  3. if/kimi-k2-thinking       (free emergency fallback)

Monthly cost: $20 (subscription) + ~$5 (backup) = $25 total
vs. $20 + hitting limits = frustration

Case 2: "I want zero cost"

Problem: Can't afford subscriptions, need reliable AI coding

Solution:

Combo: "free-forever"
  1. gc/gemini-3-flash         (180K free/month)
  2. if/kimi-k2-thinking       (unlimited free)
  3. qw/qwen3-coder-plus       (unlimited free)

Monthly cost: $0
Quality: Production-ready models

Case 3: "I need 24/7 coding, no interruptions"

Problem: Deadlines, can't afford downtime

Solution:

Combo: "always-on"
  1. cc/claude-opus-4-6        (best quality)
  2. cx/gpt-5.2-codex          (second subscription)
  3. glm/glm-4.7               (cheap, resets daily)
  4. minimax/MiniMax-M2.1      (cheapest, 5h reset)
  5. if/kimi-k2-thinking       (free unlimited)

Result: 5 layers of fallback = zero downtime
Monthly cost: $20-200 (subscriptions) + $10-20 (backup)

Case 4: "I want FREE AI in OpenClaw"

Problem: Need AI assistant in messaging apps (WhatsApp, Telegram, Slack...), completely free

Solution:

Combo: "openclaw-free"
  1. if/glm-4.7                (unlimited free)
  2. if/minimax-m2.1           (unlimited free)
  3. if/kimi-k2-thinking       (unlimited free)

Monthly cost: $0
Access via: WhatsApp, Telegram, Slack, Discord, iMessage, Signal...

📖 Setup Guide

🔐 Subscription Providers (Maximize Value)

Claude Code (Pro/Max)

Dashboard → Providers → Connect Claude Code
→ OAuth login → Auto token refresh
→ 5-hour + weekly quota tracking

Models:
  cc/claude-opus-4-6
  cc/claude-sonnet-4-5-20250929
  cc/claude-haiku-4-5-20251001

Pro Tip: Use Opus for complex tasks, Sonnet for speed. 9Router tracks quota per model!

OpenAI Codex (Plus/Pro)

Dashboard → Providers → Connect Codex
→ OAuth login (port 1455)
→ 5-hour + weekly reset

Models:
  cx/gpt-5.2-codex
  cx/gpt-5.1-codex-max

Gemini CLI (FREE 180K/month!)

Dashboard → Providers → Connect Gemini CLI
→ Google OAuth
→ 180K completions/month + 1K/day

Models:
  gc/gemini-3-flash-preview
  gc/gemini-2.5-pro

Best Value: Huge free tier! Use this before paid tiers.

GitHub Copilot

Dashboard → Providers → Connect GitHub
→ OAuth via GitHub
→ Monthly reset (1st of month)

Models:
  gh/gpt-5
  gh/claude-4.5-sonnet
  gh/gemini-3-pro

💰 Cheap Providers (Backup)

GLM-4.7 (Daily reset, $0.6/1M)

Sign up: Zhipu AI
Get API key from Coding Plan
Dashboard → Add API Key:
- Provider: glm
- API Key: your-key

Use: glm/glm-4.7

Pro Tip: Coding Plan offers 3× quota at 1/7 cost! Reset daily 10:00 AM.

MiniMax M2.1 (5h reset, $0.20/1M)

Sign up: MiniMax
Get API key
Dashboard → Add API Key

Use: minimax/MiniMax-M2.1

Pro Tip: Cheapest option for long context (1M tokens)!

Kimi K2 ($9/month flat)

Subscribe: Moonshot AI
Get API key
Dashboard → Add API Key

Use: kimi/kimi-latest

Pro Tip: Fixed $9/month for 10M tokens = $0.90/1M effective cost!

🆓 FREE Providers (Emergency Backup)

iFlow (8 FREE models)

Dashboard → Connect iFlow
→ iFlow OAuth login
→ Unlimited usage

Models:
  if/kimi-k2-thinking
  if/qwen3-coder-plus
  if/glm-4.7
  if/minimax-m2
  if/deepseek-r1

Qwen (3 FREE models)

Dashboard → Connect Qwen
→ Device code authorization
→ Unlimited usage

Models:
  qw/qwen3-coder-plus
  qw/qwen3-coder-flash

Kiro (Claude FREE)

Dashboard → Connect Kiro
→ AWS Builder ID or Google/GitHub
→ Unlimited usage

Models:
  kr/claude-sonnet-4.5
  kr/claude-haiku-4.5

🎨 Create Combos

Example 1: Maximize Subscription → Cheap Backup

Dashboard → Combos → Create New

Name: premium-coding
Models:
  1. cc/claude-opus-4-6 (Subscription primary)
  2. glm/glm-4.7 (Cheap backup, $0.6/1M)
  3. minimax/MiniMax-M2.1 (Cheapest fallback, $0.20/1M)

Use in CLI: premium-coding

Monthly cost example (100M tokens):
  80M via Claude (subscription): $0 extra
  15M via GLM: $9
  5M via MiniMax: $1
  Total: $10 + your subscription
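
The arithmetic in this example can be reproduced with the per-million prices quoted above. Traffic served by the subscription adds no extra cost, so it is left out; `backup_cost` is a name chosen for this sketch.

```python
# Prices in USD per 1M tokens, as quoted in this README.
PRICE_PER_MILLION = {
    "glm/glm-4.7": 0.60,
    "minimax/MiniMax-M2.1": 0.20,
}


def backup_cost(tokens_millions: dict) -> float:
    """Sum price x volume (in millions of tokens) over the paid backup models."""
    total = sum(PRICE_PER_MILLION[model] * volume for model, volume in tokens_millions.items())
    return round(total, 2)


usage = {"glm/glm-4.7": 15, "minimax/MiniMax-M2.1": 5}
total = backup_cost(usage)
print(f"Backup cost: ${total:.2f} + your subscription")
```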

Example 2: Free-Only (Zero Cost)

Name: free-combo
Models:
  1. gc/gemini-3-flash-preview (180K free/month)
  2. if/kimi-k2-thinking (unlimited)
  3. qw/qwen3-coder-plus (unlimited)

Cost: $0 forever!

🔧 CLI Integration

Cursor IDE

Settings → Models → Advanced:
  OpenAI API Base URL: http://localhost:20128/v1
  OpenAI API Key: [from 9router dashboard]
  Model: cc/claude-opus-4-6

Or use combo: premium-coding

Claude Code

Edit ~/.claude/config.json:

{
  "anthropic_api_base": "http://localhost:20128/v1",
  "anthropic_api_key": "your-9router-api-key"
}

Codex CLI

export OPENAI_BASE_URL="http://localhost:20128/v1"
export OPENAI_API_KEY="your-9router-api-key"

codex "your prompt"

OpenClaw

Edit ~/.openclaw/openclaw.json:

{
  "agents": {
    "defaults": {
      "model": {
        "primary": "9router/if/glm-4.7"
      }
    }
  },
  "models": {
    "providers": {
      "9router": {
        "baseUrl": "http://localhost:20128/v1",
        "apiKey": "your-9router-api-key",
        "api": "openai-completions",
        "models": [
          {
            "id": "if/glm-4.7",
            "name": "glm-4.7"
          }
        ]
      }
    }
  }
}

Or use Dashboard: CLI Tools → OpenClaw → Auto-config

Cline / Continue / RooCode

Provider: OpenAI Compatible
Base URL: http://localhost:20128/v1
API Key: [from dashboard]
Model: cc/claude-opus-4-6

🚀 Deployment

VPS Deployment

# Clone and install
git clone https://github.com/decolua/9router.git
cd 9router
npm install
npm run build

# Configure
export JWT_SECRET="your-secure-secret-change-this"
export INITIAL_PASSWORD="your-password"
export DATA_DIR="/var/lib/9router"
export PORT="20128"
export HOSTNAME="0.0.0.0"
export NODE_ENV="production"
export NEXT_PUBLIC_BASE_URL="http://localhost:20128"
export NEXT_PUBLIC_CLOUD_URL="https://9router.com"
export API_KEY_SECRET="endpoint-proxy-api-key-secret"
export MACHINE_ID_SALT="endpoint-proxy-salt"

# Start
npm run start

# Or use PM2
npm install -g pm2
pm2 start npm --name 9router -- start
pm2 save
pm2 startup

Docker

# Build image (from repository root)
docker build -t 9router .

# Run container (command used in current setup)
docker run -d \
  --name 9router \
  -p 20128:20128 \
  --env-file /root/dev/9router/.env \
  -v 9router-data:/app/data \
  -v 9router-usage:/root/.9router \
  9router

Portable command (if you are already at repository root):

docker run -d \
  --name 9router \
  -p 20128:20128 \
  --env-file ./.env \
  -v 9router-data:/app/data \
  -v 9router-usage:/root/.9router \
  9router

Container defaults:

PORT=20128
HOSTNAME=0.0.0.0

Useful commands:

docker logs -f 9router
docker restart 9router
docker stop 9router && docker rm 9router

Environment Variables

Variable	Default	Description
`JWT_SECRET`	`9router-default-secret-change-me`	JWT signing secret for dashboard auth cookie (change in production)
`INITIAL_PASSWORD`	`123456`	First login password when no saved hash exists
`DATA_DIR`	`~/.9router`	Main app database location (`db.json`)
`PORT`	framework default	Service port (`20128` in examples)
`HOSTNAME`	framework default	Bind host (Docker defaults to `0.0.0.0`)
`NODE_ENV`	runtime default	Set `production` for deploy
`BASE_URL`	`http://localhost:20128`	Server-side internal base URL used by cloud sync jobs
`CLOUD_URL`	`https://9router.com`	Server-side cloud sync endpoint base URL
`NEXT_PUBLIC_BASE_URL`	`http://localhost:3000`	Backward-compatible/public base URL (prefer `BASE_URL` for server runtime)
`NEXT_PUBLIC_CLOUD_URL`	`https://9router.com`	Backward-compatible/public cloud URL (prefer `CLOUD_URL` for server runtime)
`API_KEY_SECRET`	`endpoint-proxy-api-key-secret`	HMAC secret for generated API keys
`MACHINE_ID_SALT`	`endpoint-proxy-salt`	Salt for stable machine ID hashing
`ENABLE_REQUEST_LOGS`	`false`	Enables request/response logs under `logs/`
`AUTH_COOKIE_SECURE`	`false`	Force `Secure` auth cookie (set `true` behind HTTPS reverse proxy)
`REQUIRE_API_KEY`	`false`	Enforce Bearer API key on `/v1/*` routes (recommended for internet-exposed deploys)
`HTTP_PROXY`, `HTTPS_PROXY`, `ALL_PROXY`, `NO_PROXY`	empty	Optional outbound proxy for upstream provider calls

Notes:

Lowercase proxy variables are also supported: http_proxy, https_proxy, all_proxy, no_proxy.
.env is not baked into Docker image (.dockerignore); inject runtime config with --env-file or -e.
On Windows, APPDATA can be used for local storage path resolution.
INSTANCE_NAME appears in older docs/env templates, but is currently not used at runtime.
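
A lookup that accepts both spellings of the proxy variables might look like this. Giving the uppercase name precedence when both are set is an assumption of this sketch; the README only states that both forms are supported.

```python
import os

PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY", "NO_PROXY")


def read_proxy_settings(env=os.environ):
    """Collect proxy settings, accepting uppercase or lowercase variable names."""
    settings = {}
    for name in PROXY_VARS:
        # Assumption: uppercase wins if both spellings are present.
        value = env.get(name) or env.get(name.lower())
        if value:
            settings[name] = value
    return settings


print(read_proxy_settings({"https_proxy": "http://proxy.local:8080"}))
```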

Runtime Files and Storage

Main app state: ${DATA_DIR}/db.json (providers, combos, aliases, keys, settings), managed by src/lib/localDb.js.
Usage history and logs: ~/.9router/usage.json and ~/.9router/log.txt, managed by src/lib/usageDb.js.
Optional request/translator logs: <repo>/logs/... when ENABLE_REQUEST_LOGS=true.
Usage storage currently follows the ~/.9router path logic and is independent of DATA_DIR.

📊 Available Models

View all available models

Claude Code (cc/) - Pro/Max:

cc/claude-opus-4-6
cc/claude-sonnet-4-5-20250929
cc/claude-haiku-4-5-20251001

Codex (cx/) - Plus/Pro:

cx/gpt-5.2-codex
cx/gpt-5.1-codex-max

Gemini CLI (gc/) - FREE:

gc/gemini-3-flash-preview
gc/gemini-2.5-pro

GitHub Copilot (gh/):

gh/gpt-5
gh/claude-4.5-sonnet

GLM (glm/) - $0.6/1M:

glm/glm-4.7

MiniMax (minimax/) - $0.2/1M:

minimax/MiniMax-M2.1

iFlow (if/) - FREE:

if/kimi-k2-thinking
if/qwen3-coder-plus
if/deepseek-r1

Qwen (qw/) - FREE:

qw/qwen3-coder-plus
qw/qwen3-coder-flash

Kiro (kr/) - FREE:

kr/claude-sonnet-4.5
kr/claude-haiku-4.5

🐛 Troubleshooting

"Language model did not provide messages"

Provider quota exhausted → Check dashboard quota tracker
Solution: Use combo fallback or switch to cheaper tier

Rate limiting

Subscription quota out → Fallback to GLM/MiniMax
Add combo: cc/claude-opus-4-6 → glm/glm-4.7 → if/kimi-k2-thinking

OAuth token expired

Auto-refreshed by 9Router
If issues persist: Dashboard → Provider → Reconnect

High costs

Check usage stats in Dashboard
Switch primary model to GLM/MiniMax
Use free tier (Gemini CLI, iFlow) for non-critical tasks

Dashboard opens on wrong port

Set PORT=20128 and NEXT_PUBLIC_BASE_URL=http://localhost:20128

Cloud sync errors

Verify BASE_URL points to your running instance (example: http://localhost:20128)
Verify CLOUD_URL points to your expected cloud endpoint (example: https://9router.com)
Keep NEXT_PUBLIC_* values aligned with server-side values when possible.

Cloud endpoint stream=false returns 500 (Unexpected token 'd'...)

Symptom usually appears on public cloud endpoint (https://9router.com/v1) for non-streaming calls.
Root cause: upstream returns SSE payload (data: ...) while client expects JSON.
Workaround: use stream=true for cloud direct calls.
Local 9Router runtime includes SSE→JSON fallback for non-streaming calls when upstream returns text/event-stream.
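
To make the failure mode concrete: a JSON parser that receives an SSE body stops at the first character of `data:`, which is where the "Unexpected token 'd'" message comes from. The sketch below shows one way an SSE stream of OpenAI-style chat chunks could be collapsed into a single non-streaming response. It is an illustration only, assumes text-only `delta.content` chunks, and is not the fallback code used by 9Router.

```python
import json


def sse_to_chat_completion(sse_text: str) -> dict:
    """Merge OpenAI-style streaming chunks into one chat.completion-shaped dict."""
    content_parts = []
    model = None
    finish_reason = None
    for line in sse_text.splitlines():
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and event/id fields
        payload = line[len("data:"):].strip()
        if not payload or payload == "[DONE]":
            continue
        chunk = json.loads(payload)
        model = chunk.get("model", model)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                content_parts.append(delta["content"])
            finish_reason = choice.get("finish_reason") or finish_reason
    return {
        "object": "chat.completion",
        "model": model,
        "choices": [{
            "index": 0,
            "message": {"role": "assistant", "content": "".join(content_parts)},
            "finish_reason": finish_reason,
        }],
    }


sample = (
    'data: {"model": "m", "choices": [{"delta": {"content": "Hel"}}]}\n\n'
    'data: {"model": "m", "choices": [{"delta": {"content": "lo"}, "finish_reason": "stop"}]}\n\n'
    "data: [DONE]\n"
)
result = sse_to_chat_completion(sample)
print(result["choices"][0]["message"]["content"])
```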

Cloud says connected, but request still fails with Invalid API key

Create a fresh key from local dashboard (/api/keys) and run cloud sync (Enable Cloud then Sync Now).
Old/non-synced keys can still return 401 on cloud even if local endpoint works.

First login not working

Check INITIAL_PASSWORD in .env
If unset, fallback password is 123456

No request logs under logs/

Set ENABLE_REQUEST_LOGS=true

🛠️ Tech Stack

Runtime: Node.js 20+
Framework: Next.js 16
UI: React 19 + Tailwind CSS 4
Database: LowDB (JSON file-based)
Streaming: Server-Sent Events (SSE)
Auth: OAuth 2.0 (PKCE) + JWT + API Keys

📝 API Reference

Chat Completions

POST http://localhost:20128/v1/chat/completions
Authorization: Bearer your-api-key
Content-Type: application/json

{
  "model": "cc/claude-opus-4-6",
  "messages": [
    {"role": "user", "content": "Write a function to..."}
  ],
  "stream": true
}
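
The same request can be built from Python with only the standard library. This sketch constructs the request without sending it, so it runs without a live 9Router instance; `build_chat_request` is a helper name chosen here, and the API key is a placeholder.

```python
import json
import urllib.request


def build_chat_request(api_key, model, prompt,
                       base_url="http://localhost:20128/v1", stream=False):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request("your-api-key", "cc/claude-opus-4-6", "Write a function to...")

# To send it against a running instance (non-streaming):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```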

List Models

GET http://localhost:20128/v1/models
Authorization: Bearer your-api-key

→ Returns all models + combos in OpenAI format

Compatibility Endpoints

POST /v1/chat/completions
POST /v1/messages
POST /v1/responses
GET /v1/models
POST /v1/messages/count_tokens
GET /v1beta/models
POST /v1beta/models/{...path} (Gemini-style generateContent)
POST /v1/api/chat (Ollama-style transform path)

Cloud Validation Scripts

Test scripts under tester/security/:

tester/security/test-docker-hardening.sh
- Builds Docker image and validates hardening checks (/api/cloud/auth auth guard, REQUIRE_API_KEY, secure auth cookie behavior).
tester/security/test-cloud-openai-compatible.sh
- Sends a direct OpenAI-compatible request to cloud endpoint (https://9router.com/v1/chat/completions) with provided model/key.
tester/security/test-cloud-sync-and-call.sh
- End-to-end flow: create local key -> enable/sync cloud -> call cloud endpoint with retry.
- Includes fallback check with stream=true to distinguish auth errors from non-streaming parse issues.

Security note for cloud test scripts:

Never hardcode real API keys in scripts/commits.
Provide keys only via environment variables:
- API_KEY, CLOUD_API_KEY, or OPENAI_API_KEY (supported by test-cloud-openai-compatible.sh)
Example:

OPENAI_API_KEY="your-cloud-key" bash tester/security/test-cloud-openai-compatible.sh

Expected behavior from recent validation:

Local runtime (http://127.0.0.1:20128/v1/chat/completions): works with stream=false and stream=true.
Docker runtime (same API path exposed by container): hardening checks pass, cloud auth guard works, strict API key mode works when enabled.
Public cloud endpoint (https://9router.com/v1/chat/completions):
- stream=true: expected to succeed (SSE chunks returned).
- stream=false: may fail with 500 + parse error (Unexpected token 'd') when upstream returns SSE content to a non-streaming client path.

Dashboard and Management API

Auth/settings: /api/auth/login, /api/auth/logout, /api/settings, /api/settings/require-login
Provider management: /api/providers, /api/providers/[id], /api/providers/[id]/test, /api/providers/[id]/models, /api/providers/validate, /api/provider-nodes*
OAuth flows: /api/oauth/[provider]/[action] (+ provider-specific imports like Cursor/Kiro)
Routing config: /api/models/alias, /api/combos*, /api/keys*, /api/pricing
Usage/logs: /api/usage/history, /api/usage/logs, /api/usage/request-logs, /api/usage/[connectionId]
Cloud sync: /api/sync/cloud, /api/sync/initialize, /api/cloud/*
CLI helpers: /api/cli-tools/claude-settings, /api/cli-tools/codex-settings, /api/cli-tools/droid-settings, /api/cli-tools/openclaw-settings

Authentication Behavior

Dashboard routes (/dashboard/*) use auth_token cookie protection.
Login uses saved password hash when present; otherwise it falls back to INITIAL_PASSWORD.
requireLogin can be toggled via /api/settings/require-login.

Request Processing (High Level)

1. Client sends request to /v1/*.
2. Route handler calls handleChat (src/sse/handlers/chat.js).
3. Model is resolved (direct provider/model or alias/combo resolution).
4. Credentials are selected from local DB with account availability filtering.
5. handleChatCore (open-sse/handlers/chatCore.js) detects format and translates request.
6. Provider executor sends upstream request.
7. Stream is translated back to client format when needed.
8. Usage/logging is recorded (src/lib/usageDb.js).
9. Fallback applies on provider/account/model errors according to combo rules.
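
The model-resolution step can be pictured with a small sketch. It assumes that a model name is either an alias, a combo name, or a direct provider/model string split on the first slash; `resolve_model` and `split_target` are hypothetical names, and the real logic lives in the handlers listed above.

```python
def split_target(target):
    """Split 'provider/model' on the first slash, e.g. 'cc/claude-opus-4-6'."""
    provider, sep, model = target.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected provider/model, got {target!r}")
    return provider, model


def resolve_model(name, combos, aliases=None):
    """Return the ordered list of (provider, model) targets for a requested name."""
    aliases = aliases or {}
    name = aliases.get(name, name)  # alias lookup first (assumed order)
    if name in combos:
        return [split_target(entry) for entry in combos[name]]
    return [split_target(name)]


combos = {
    "premium-coding": ["cc/claude-opus-4-6", "glm/glm-4.7", "minimax/MiniMax-M2.1"],
}
print(resolve_model("premium-coding", combos))
print(resolve_model("if/kimi-k2-thinking", combos))
```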

Full architecture reference: docs/ARCHITECTURE.md

📧 Support

Website: 9router.com
GitHub: github.com/decolua/9router
Issues: github.com/decolua/9router/issues

👥 Contributors

Thanks to all contributors who helped make 9Router better!

How to Contribute

1. Fork the repository
2. Create your feature branch (git checkout -b feature/amazing-feature)
3. Commit your changes (git commit -m 'Add amazing feature')
4. Push to the branch (git push origin feature/amazing-feature)
5. Open a Pull Request

See CONTRIBUTING.md for detailed guidelines.

🙏 Acknowledgments

Special thanks to CLIProxyAPI - the original Go implementation that inspired this JavaScript port.

📄 License

MIT License - see LICENSE for details.

Built with ❤️ for developers who code 24/7
