Cherry-picked from decolua/9router PR #183.
Note: open-sse changes included but need further review due to extensive modifications.
Co-authored-by: Cursor <cursoragent@cursor.com>
Providers like Antigravity maintain separate quota buckets per model family
(e.g. Claude vs Gemini). A 429 on claude-opus previously locked the entire
account, preventing gemini-pro requests even though its quota was full.
This adds in-memory per-model locking so that only the specific model is
skipped during account selection while other models remain accessible.
Changes:
- Add model-aware lock tracking in auth.js (Map<connectionId:model, expiry>)
- Pass model context from chat handler to auth service
- Multi-bucket behavior gated to known providers (MULTI_BUCKET_PROVIDERS set)
- No database schema changes — locks are in-memory and clear on restart
Closes#110
Adds a mutex to serialize account selection and updates in the
proxy engine. This ensures that concurrent requests respect the
sticky limit and don't distribute to the same account simultaneously.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Implements a "sticky" round-robin strategy that uses the same provider
account for a configurable number of consecutive calls (default 3)
before switching to the next one. This optimizes for prompt caching
by reducing organization/account rotation. Adds a configuration input
to the Profile settings page.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Implements a round-robin (least recently used) account selection strategy
alongside the existing fill-first priority system. Adds a toggle in the
Profile dashboard to switch between strategies.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>