15 KiB
15 KiB
Quota Tracking & Giám sát Usage
Theo dõi tiêu thụ token thời gian thực, giám sát giới hạn quota, ước tính chi phí và nhận cảnh báo trước khi hết. Không bao giờ lãng phí quota subscription hoặc vượt giới hạn ngân sách.
Tổng quan
9Router cung cấp quota tracking toàn diện cho mọi provider:
- Tiêu thụ token thời gian thực - Xem tokens dùng mỗi request
- Giới hạn quota & còn lại - Theo dõi usage so với giới hạn
- Đếm ngược Reset - Biết khi nào quota refresh
- Ước tính chi phí - Tính chi tiêu cho tier trả phí
- Báo cáo hàng tháng - Phân tích pattern sử dụng
- Cảnh báo & thông báo - Nhận cảnh báo trước giới hạn
Tổng quan Dashboard
Tóm tắt Quota
Dashboard → Home → Quota Overview
┌─────────────────────────────────────────────┐
│ Claude Code (cc/) │
│ ████████████░░░░░░░░ 2.5h / 5h (50%) │
│ Resets in: 2h 30m │
│ Cost: $0 (subscription) │
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│ Gemini CLI (gc/) │
│ ████████░░░░░░░░░░░░ 450 / 1000 (45%) │
│ Daily reset in: 18h 30m │
│ Monthly: 45K / 180K (25%) │
│ Cost: $0 (free tier) │
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│ GLM-4.7 (glm/) │
│ ██████████████░░░░░░ 7M / 10M tokens (70%) │
│ Resets: Daily 10:00 AM (in 5h 35m) │
│ Cost today: $4.20 │
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│ MiniMax M2.1 (minimax/) │
│ ████████████████░░░░ 4M / 5M tokens (80%) │
│ Rolling 5h window │
│ Cost (5h): $0.80 │
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│ iFlow (if/) │
│ ████████████████████ Unlimited │
│ Cost: $0 (free forever) │
└─────────────────────────────────────────────┘
Tiêu thụ Token Thời gian thực
Theo dõi từng Request
Mỗi request hiển thị usage token chi tiết:
Dashboard → Activity → Recent Requests
Request #1234
Model: cc/claude-opus-4-5-20251101
Timestamp: 2026-02-04 04:15:32
Tokens:
Input: 1,250 tokens
Output: 850 tokens
Total: 2,100 tokens
Cost: $0 (subscription quota)
Duration: 3.2s
Status: ✅ Success
Live Usage Monitor
Dashboard → Live Monitor
Current request:
Model: glm/glm-4.7
Tokens streamed: 450 / ~800 estimated
Cost so far: $0.0009
Duration: 1.8s
Phân tích Token theo Model
Dashboard → Analytics → Token Usage
Today (Feb 4, 2026):
cc/claude-opus-4-5: 15M tokens ($0, subscription)
glm/glm-4.7: 8M tokens ($4.80)
if/kimi-k2-thinking: 3M tokens ($0, free)
Total: 26M tokens
Cost: $4.80
Giới hạn Quota & Thời gian Reset
Subscription Providers
Claude Code (Pro/Max)
Quota type: Time-based (5-hour rolling)
Limit: 5 hours of usage
Reset: Rolling 5-hour window + Weekly refresh
Tracking: Usage time per model
Dashboard shows:
Opus: 2.5h / 5h used
Sonnet: 1.2h / 5h used
Haiku: 0.8h / 5h used
Weekly reset: Every Monday 00:00 UTC
OpenAI Codex (Plus/Pro)
Quota type: Time-based (5-hour rolling)
Limit: 5 hours (Plus) / 10 hours (Pro)
Reset: Rolling 5-hour window + Weekly refresh
Dashboard shows:
GPT-5.2 Codex: 3.5h / 5h used
Resets in: 1h 30m
Gemini CLI (MIỄN PHÍ)
Quota type: Request count + Monthly tokens
Daily limit: 1,000 requests
Monthly limit: 180,000 completions
Reset: Daily 00:00 UTC + Monthly 1st
Dashboard shows:
Today: 450 / 1,000 requests (45%)
This month: 45K / 180K completions (25%)
Daily reset in: 18h 30m
Monthly reset in: 26 days
GitHub Copilot
Quota type: Monthly usage
Limit: Varies by plan
Reset: 1st of each month
Dashboard shows:
Usage: 60% of monthly quota
Resets: March 1, 2026 (in 25 days)
Cheap Providers
GLM-4.7
Quota type: Daily token limit
Limit: 10M tokens/day (Coding Plan)
Reset: Daily 10:00 AM Beijing Time (UTC+8)
Dashboard shows:
Used: 7M / 10M tokens (70%)
Remaining: 3M tokens
Resets in: 5h 35m
Cost today: $4.20
MiniMax M2.1
Quota type: Rolling 5-hour window
Limit: 5M tokens per 5 hours
Reset: Continuous rolling window
Dashboard shows:
Used (5h): 4M / 5M tokens (80%)
Oldest usage expires in: 45m
Cost (5h): $0.80
Kimi K2
Quota type: Monthly subscription
Limit: 10M tokens/month ($9 flat)
Reset: Monthly on subscription date
Dashboard shows:
Used: 6M / 10M tokens (60%)
Resets: Feb 15, 2026 (in 11 days)
Cost: $9/month (prepaid)
Free Providers
iFlow / Qwen / Kiro
Quota type: Unlimited (rate-limited)
Limit: No hard limit
Reset: N/A
Dashboard shows:
Used today: 5M tokens
Cost: $0 (free forever)
Status: ✅ Available
Ước tính Chi phí
Theo dõi Chi phí Thời gian thực
Dashboard → Costs → Today
Subscription providers: $0
Claude Code: 15M tokens ($0, included)
Gemini CLI: 3M tokens ($0, free tier)
Paid providers: $4.80
GLM-4.7: 8M tokens ($4.80)
Input: 6M × $0.60/1M = $3.60
Output: 2M × $2.20/1M = $4.40
Total: $4.80
Free providers: $0
iFlow: 3M tokens ($0)
Total today: $4.80
Báo cáo Chi tiêu Hàng tháng
Dashboard → Costs → This Month (February 2026)
Week 1 (Feb 1-7):
Subscription: $0 (80M tokens)
Paid: $15.20 (25M tokens)
Free: $0 (10M tokens)
Total: $15.20
Week 2 (Feb 8-14):
Subscription: $0 (75M tokens)
Paid: $12.80 (20M tokens)
Free: $0 (8M tokens)
Total: $12.80
Month to date: $28.00
Projected (30 days): ~$120
Breakdown by provider:
GLM-4.7: $22.00 (78%)
MiniMax M2.1: $6.00 (22%)
Average cost per 1M tokens: $0.62
Savings vs ChatGPT API: 97% ($4,000 → $120)
Dự kiến Chi phí
Dashboard → Costs → Projections
Based on last 7 days usage:
Daily average: 50M tokens
Daily cost: $4.50
Monthly projection:
Tokens: 1,500M (1.5B)
Cost: $135
Breakdown:
Subscription: 900M tokens ($0)
GLM-4.7: 450M tokens ($90)
MiniMax: 120M tokens ($24)
Free: 30M tokens ($0)
Budget status:
Daily limit: $5 → 90% used today
Monthly limit: $150 → 90% projected
⚠️ Warning: May exceed monthly budget
Dashboard Usage
Thống kê Tổng quan
Dashboard → Analytics → Overview
Today (Feb 4, 2026):
Requests: 1,234
Tokens: 26M
Cost: $4.80
Avg response time: 2.1s
This week:
Requests: 8,456
Tokens: 180M
Cost: $28.00
Success rate: 99.2%
This month:
Requests: 15,234
Tokens: 320M
Cost: $52.00
Top model: cc/claude-opus-4-5 (45%)
Usage theo Model
Dashboard → Analytics → Models
Top models (this month):
1. cc/claude-opus-4-5: 145M tokens (45%)
2. glm/glm-4.7: 95M tokens (30%)
3. if/kimi-k2-thinking: 50M tokens (16%)
4. minimax/MiniMax-M2.1: 20M tokens (6%)
5. gc/gemini-3-flash: 10M tokens (3%)
Cost breakdown:
cc/claude-opus: $0 (subscription)
glm/glm-4.7: $45.00
if/kimi-k2-thinking: $0 (free)
minimax/MiniMax-M2.1: $7.00
gc/gemini-3-flash: $0 (free)
Usage theo Thời gian
Dashboard → Analytics → Timeline
Hourly usage (today):
00:00 - 01:00: 0.5M tokens
01:00 - 02:00: 0.2M tokens
...
08:00 - 09:00: 3.2M tokens (peak)
09:00 - 10:00: 2.8M tokens
...
23:00 - 00:00: 0.8M tokens
Peak hours: 08:00 - 12:00 (morning coding)
Low hours: 00:00 - 06:00 (night)
Usage theo Combo
Dashboard → Analytics → Combos
premium-coding:
Requests: 456
Tokens: 12M
Cost: $2.40
Breakdown:
cc/claude-opus: 8M tokens (67%, $0)
glm/glm-4.7: 3M tokens (25%, $1.80)
minimax/MiniMax-M2.1: 1M tokens (8%, $0.20)
budget-combo:
Requests: 234
Tokens: 6M
Cost: $1.20
Breakdown:
glm/glm-4.7: 4M tokens (67%, $2.40)
if/kimi-k2-thinking: 2M tokens (33%, $0)
Cảnh báo & Thông báo
Cảnh báo Quota
Dashboard → Settings → Alerts
Quota warnings:
✅ Alert at 80% quota used
✅ Alert at 90% quota used
✅ Alert when quota exhausted
✅ Notify when quota resets
Delivery:
✅ Dashboard notification
✅ Email (optional)
✅ Webhook (optional)
Ví dụ thông báo:
⚠️ Claude Code quota 80% used
2.5h remaining (resets in 1h 30m)
⚠️ GLM-4.7 quota 90% used
1M tokens remaining (resets in 5h)
✅ Gemini CLI quota reset
1,000 requests available (daily limit)
Cảnh báo Ngân sách
Dashboard → Settings → Budget Alerts
Daily budget: $5
✅ Alert at 80% ($4)
✅ Alert at 100% ($5)
✅ Auto-switch to free tier when exceeded
Monthly budget: $150
✅ Alert at 50% ($75)
✅ Alert at 80% ($120)
✅ Alert at 100% ($150)
Ví dụ thông báo:
⚠️ Daily budget 80% used
$4.00 / $5.00 spent today
⚠️ Monthly budget 50% reached
$75 / $150 spent this month
Projected: $135 (within budget)
🚨 Daily budget exceeded
$5.20 / $5.00 spent today
Auto-switched to free tier
Phát hiện Bất thường Chi phí
Dashboard → Settings → Anomaly Detection
✅ Detect unusual spending patterns
✅ Alert on cost spikes (>2× daily average)
✅ Warn on quota exhaustion patterns
Example alert:
⚠️ Cost spike detected
Today: $12.50 (2.5× daily average)
Reason: High GLM-4.7 usage (20M tokens)
Suggestion: Check if primary models quota-exhausted
Best Practices
1. Theo dõi Quota Hàng ngày
Daily routine:
1. Check dashboard quota overview (30 seconds)
2. Review reset times
3. Plan usage around quota availability
Ví dụ:
Morning check:
✅ Claude Code: 5h available (fresh reset)
✅ Gemini CLI: 1K requests available
⚠️ GLM-4.7: 2M tokens left (resets 10AM)
Action: Use Claude Code for morning work
2. Đặt Giới hạn Ngân sách
Dashboard → Settings → Budget:
Daily: $5 (prevents overspending)
Monthly: $150 (aligns with budget)
Kết quả: Auto-switch sang free tier khi đạt giới hạn.
3. Tối ưu Combo Usage
Dashboard → Analytics → Combos:
Review which models are used most
Adjust combo order to minimize costs
Ví dụ:
Current: cc/claude-opus → glm/glm-4.7
80% via Claude (good)
20% via GLM ($12/month)
Optimized: gc/gemini-3-flash → cc/claude-opus → glm/glm-4.7
50% via Gemini (free)
40% via Claude (subscription)
10% via GLM ($6/month)
Savings: $6/month
4. Theo dõi Thời gian Reset
Dashboard → Quota → Reset Schedule:
Claude Code: 5h rolling + Weekly Monday
Gemini CLI: Daily 00:00 UTC + Monthly 1st
GLM-4.7: Daily 10:00 AM Beijing Time
MiniMax: Rolling 5h window
Chiến lược: Dùng provider khi quota mới reset.
5. Xem Báo cáo Hàng tháng
Dashboard → Analytics → Monthly Report:
Total tokens: 1.5B
Total cost: $120
Savings: 97% vs ChatGPT API
Insights:
- 60% usage via subscriptions ($0)
- 30% via GLM ($90)
- 10% via free tier ($0)
Optimization:
- Increase Gemini CLI usage (free)
- Reduce GLM usage (expensive)
Truy cập API
Lấy trạng thái Quota
GET http://localhost:20128/api/quota
Authorization: Bearer your-api-key
Response:
{
"providers": [
{
"id": "cc",
"name": "Claude Code",
"quota": {
"used": 2.5,
"limit": 5,
"unit": "hours",
"percentage": 50
},
"reset": {
"type": "rolling",
"window": "5h",
"nextReset": "2026-02-04T06:45:00Z"
},
"cost": {
"today": 0,
"month": 0,
"currency": "USD"
}
},
{
"id": "glm",
"name": "GLM-4.7",
"quota": {
"used": 7000000,
"limit": 10000000,
"unit": "tokens",
"percentage": 70
},
"reset": {
"type": "daily",
"time": "10:00 AM UTC+8",
"nextReset": "2026-02-04T10:00:00+08:00"
},
"cost": {
"today": 4.20,
"month": 52.00,
"currency": "USD"
}
}
]
}
Lấy Usage Stats
GET http://localhost:20128/api/usage?period=today
Authorization: Bearer your-api-key
Response:
{
"period": "today",
"date": "2026-02-04",
"summary": {
"requests": 1234,
"tokens": 26000000,
"cost": 4.80
},
"byModel": [
{
"model": "cc/claude-opus-4-5",
"requests": 456,
"tokens": 15000000,
"cost": 0
},
{
"model": "glm/glm-4.7",
"requests": 234,
"tokens": 8000000,
"cost": 4.80
}
]
}
Troubleshooting
Issue: Quota hiển thị 0% nhưng request thất bại
Giải pháp:
- Kiểm tra kết nối provider (Dashboard → Providers)
- Xác minh API keys hợp lệ
- Kiểm tra provider có down không (trang status)
- Thử kết nối lại OAuth providers
Issue: Ước tính chi phí sai
Giải pháp:
- Dashboard → Settings → Pricing
- Xác minh giá mỗi provider khớp với mức hiện tại
- Cập nhật giá nếu provider thay đổi
- Liên hệ support nếu vẫn lệch
Issue: Thời gian reset không cập nhật
Giải pháp:
- Refresh dashboard (F5)
- Kiểm tra thời gian hệ thống đúng
- Xác minh cài đặt timezone
- Khởi động lại 9Router nếu vẫn lỗi
Issue: Không nhận được cảnh báo
Giải pháp:
- Dashboard → Settings → Alerts
- Xác minh địa chỉ email đúng
- Kiểm tra folder spam
- Test notification (nút Send Test)
Liên quan
- Smart Routing - Auto fallback dựa trên quota
- Combos - Tạo chuỗi fallback tùy chỉnh