diff --git a/.gitignore b/.gitignore index b4b1d85..bf08aba 100644 --- a/.gitignore +++ b/.gitignore @@ -67,3 +67,4 @@ README1.md deploy.sh ecosystem.config.* start.sh +src/mitm/server2.js diff --git a/README.md b/README.md index 2776022..fa08180 100644 --- a/README.md +++ b/README.md @@ -5,11 +5,7 @@ **Never stop coding. Auto-route to FREE & cheap AI models with smart fallback.** - **Free AI Provider for OpenClaw.** - -


+ **Connect All AI Code Tools (Claude Code, Cursor, Antigravity, Copilot, Codex, Gemini, OpenCode, Cline, OpenClaw...) to 40+ AI Providers & 100+ Models.** [![npm](https://img.shields.io/npm/v/9router.svg)](https://www.npmjs.com/package/9router) [![Downloads](https://img.shields.io/npm/dm/9router.svg)](https://www.npmjs.com/package/9router) @@ -1122,21 +1118,6 @@ Notes: **Dashboard opens on wrong port** - Set `PORT=20128` and `NEXT_PUBLIC_BASE_URL=http://localhost:20128` -**Cloud sync errors** -- Verify `BASE_URL` points to your running instance (example: `http://localhost:20128`) -- Verify `CLOUD_URL` points to your expected cloud endpoint (example: `https://9router.com`) -- Keep `NEXT_PUBLIC_*` values aligned with server-side values when possible. - -**Cloud endpoint `stream=false` returns 500 (`Unexpected token 'd'...`)** -- Symptom usually appears on public cloud endpoint (`https://9router.com/v1`) for non-streaming calls. -- Root cause: upstream returns SSE payload (`data: ...`) while client expects JSON. -- Workaround: use `stream=true` for cloud direct calls. -- Local 9Router runtime includes SSE→JSON fallback for non-streaming calls when upstream returns `text/event-stream`. - -**Cloud says connected, but request still fails with `Invalid API key`** -- Create a fresh key from local dashboard (`/api/keys`) and run cloud sync (`Enable Cloud` then `Sync Now`). -- Old/non-synced keys can still return `401` on cloud even if local endpoint works. 
- **First login not working** - Check `INITIAL_PASSWORD` in `.env` - If unset, fallback password is `123456` @@ -1184,80 +1165,6 @@ Authorization: Bearer your-api-key → Returns all models + combos in OpenAI format ``` -### Compatibility Endpoints - -- `POST /v1/chat/completions` -- `POST /v1/messages` -- `POST /v1/responses` -- `GET /v1/models` -- `POST /v1/messages/count_tokens` -- `GET /v1beta/models` -- `POST /v1beta/models/{...path}` (Gemini-style `generateContent`) -- `POST /v1/api/chat` (Ollama-style transform path) - -### Cloud Validation Scripts - -Added test scripts under `tester/security/`: - -- `tester/security/test-docker-hardening.sh` - - Builds Docker image and validates hardening checks (`/api/cloud/auth` auth guard, `REQUIRE_API_KEY`, secure auth cookie behavior). -- `tester/security/test-cloud-openai-compatible.sh` - - Sends a direct OpenAI-compatible request to cloud endpoint (`https://9router.com/v1/chat/completions`) with provided model/key. -- `tester/security/test-cloud-sync-and-call.sh` - - End-to-end flow: create local key -> enable/sync cloud -> call cloud endpoint with retry. - - Includes fallback check with `stream=true` to distinguish auth errors from non-streaming parse issues. - -Security note for cloud test scripts: - -- Never hardcode real API keys in scripts/commits. -- Provide keys only via environment variables: - - `API_KEY`, `CLOUD_API_KEY`, or `OPENAI_API_KEY` (supported by `test-cloud-openai-compatible.sh`) -- Example: - -```bash -OPENAI_API_KEY="your-cloud-key" bash tester/security/test-cloud-openai-compatible.sh -``` - -Expected behavior from recent validation: - -- Local runtime (`http://127.0.0.1:20128/v1/chat/completions`): works with `stream=false` and `stream=true`. -- Docker runtime (same API path exposed by container): hardening checks pass, cloud auth guard works, strict API key mode works when enabled. 
-- Public cloud endpoint (`https://9router.com/v1/chat/completions`): - - `stream=true`: expected to succeed (SSE chunks returned). - - `stream=false`: may fail with `500` + parse error (`Unexpected token 'd'`) when upstream returns SSE content to a non-streaming client path. - -### Dashboard and Management API - -- Auth/settings: `/api/auth/login`, `/api/auth/logout`, `/api/settings`, `/api/settings/require-login` -- Provider management: `/api/providers`, `/api/providers/[id]`, `/api/providers/[id]/test`, `/api/providers/[id]/models`, `/api/providers/validate`, `/api/provider-nodes*` -- OAuth flows: `/api/oauth/[provider]/[action]` (+ provider-specific imports like Cursor/Kiro) -- Routing config: `/api/models/alias`, `/api/combos*`, `/api/keys*`, `/api/pricing` -- Usage/logs: `/api/usage/history`, `/api/usage/logs`, `/api/usage/request-logs`, `/api/usage/[connectionId]` -- Cloud sync: `/api/sync/cloud`, `/api/sync/initialize`, `/api/cloud/*` -- CLI helpers: `/api/cli-tools/claude-settings`, `/api/cli-tools/codex-settings`, `/api/cli-tools/droid-settings`, `/api/cli-tools/openclaw-settings` - -### Authentication Behavior - -- Dashboard routes (`/dashboard/*`) use `auth_token` cookie protection. -- Login uses saved password hash when present; otherwise it falls back to `INITIAL_PASSWORD`. -- `requireLogin` can be toggled via `/api/settings/require-login`. - -### Request Processing (High Level) - -1. Client sends request to `/v1/*`. -2. Route handler calls `handleChat` (`src/sse/handlers/chat.js`). -3. Model is resolved (direct provider/model or alias/combo resolution). -4. Credentials are selected from local DB with account availability filtering. -5. `handleChatCore` (`open-sse/handlers/chatCore.js`) detects format and translates request. -6. Provider executor sends upstream request. -7. Stream is translated back to client format when needed. -8. Usage/logging is recorded (`src/lib/usageDb.js`). -9. 
Fallback applies on provider/account/model errors according to combo rules. - -Full architecture reference: [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md) - ---- - ## 📧 Support - **Website**: [9router.com](https://9router.com) @@ -1278,17 +1185,7 @@ Thanks to all contributors who helped make 9Router better! [![Star Chart](https://starchart.cc/decolua/9router.svg?variant=adaptive)](https://starchart.cc/decolua/9router) -### How to Contribute -1. Fork the repository -2. Create your feature branch (`git checkout -b feature/amazing-feature`) -3. Commit your changes (`git commit -m 'Add amazing feature'`) -4. Push to the branch (`git push origin feature/amazing-feature`) -5. Open a Pull Request - -See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines. - ---- ## 🔀 Forks diff --git a/cloud/src/handlers/chat.js b/cloud/src/handlers/chat.js index 0b4a206..a6b8e4d 100644 --- a/cloud/src/handlers/chat.js +++ b/cloud/src/handlers/chat.js @@ -3,7 +3,7 @@ import { handleChatCore } from "open-sse/handlers/chatCore.js"; import { errorResponse } from "open-sse/utils/error.js"; import { checkFallbackError, isAccountUnavailable, getUnavailableUntil, getEarliestRateLimitedUntil, formatRetryAfter } from "open-sse/services/accountFallback.js"; import { getComboModelsFromData, handleComboChat } from "open-sse/services/combo.js"; -import { HTTP_STATUS } from "open-sse/config/constants.js"; +import { HTTP_STATUS } from "open-sse/config/runtimeConfig.js"; import * as log from "../utils/logger.js"; import { refreshTokenByProvider } from "../services/tokenRefresh.js"; import { parseApiKey, extractBearerToken } from "../utils/apiKey.js"; diff --git a/cloud/src/handlers/embeddings.js b/cloud/src/handlers/embeddings.js index 52d86be..41a9108 100644 --- a/cloud/src/handlers/embeddings.js +++ b/cloud/src/handlers/embeddings.js @@ -8,7 +8,7 @@ import { getUnavailableUntil, formatRetryAfter } from "open-sse/services/accountFallback.js"; -import { HTTP_STATUS } from 
"open-sse/config/constants.js"; +import { HTTP_STATUS } from "open-sse/config/runtimeConfig.js"; import * as log from "../utils/logger.js"; import { parseApiKey, extractBearerToken } from "../utils/apiKey.js"; import { getMachineData, saveMachineData } from "../services/storage.js"; diff --git a/images/9router.png b/images/9router.png index 47bac22..283210a 100644 Binary files a/images/9router.png and b/images/9router.png differ diff --git a/open-sse/config/appConstants.js b/open-sse/config/appConstants.js new file mode 100644 index 0000000..e9609d3 --- /dev/null +++ b/open-sse/config/appConstants.js @@ -0,0 +1,133 @@ +import { platform, arch } from "os"; + +// === Gemini CLI === +export const GEMINI_CLI_VERSION = "0.31.0"; +export const GEMINI_CLI_API_CLIENT = "google-genai-sdk/1.41.0 gl-node/v22.19.0"; + +export function geminiCLIUserAgent(model = "unknown") { + const os = platform() === "win32" ? "windows" : platform(); + return `GeminiCLI/${GEMINI_CLI_VERSION}/${model || "unknown"} (${os}; ${arch()})`; +} + +// === GitHub Copilot === +export const GITHUB_COPILOT = { + VSCODE_VERSION: "1.110.0", + COPILOT_CHAT_VERSION: "0.38.0", + USER_AGENT: "GitHubCopilotChat/0.38.0", + API_VERSION: "2025-04-01", +}; + +// === Antigravity enums === +export const IDE_TYPE = { + UNSPECIFIED: 0, + JETSKI: 10, + ANTIGRAVITY: 9, + PLUGINS: 7 +}; + +export const PLATFORM = { + UNSPECIFIED: 0, + DARWIN_AMD64: 1, + DARWIN_ARM64: 2, + LINUX_AMD64: 3, + LINUX_ARM64: 4, + WINDOWS_AMD64: 5 +}; + +export const PLUGIN_TYPE = { + UNSPECIFIED: 0, + CLOUD_CODE: 1, + GEMINI: 2 +}; + +export function getPlatformEnum() { + const os = platform(); + const architecture = arch(); + if (os === "darwin") return architecture === "arm64" ? PLATFORM.DARWIN_ARM64 : PLATFORM.DARWIN_AMD64; + if (os === "linux") return architecture === "arm64" ? 
PLATFORM.LINUX_ARM64 : PLATFORM.LINUX_AMD64; + if (os === "win32") return PLATFORM.WINDOWS_AMD64; + return PLATFORM.UNSPECIFIED; +} + +export function getPlatformUserAgent() { + return `antigravity/1.104.0 ${platform()}/${arch()}`; +} + +export const CLIENT_METADATA = { + ideType: IDE_TYPE.ANTIGRAVITY, + platform: getPlatformEnum(), + pluginType: PLUGIN_TYPE.GEMINI +}; + +// Internal anti-loop header +export const INTERNAL_REQUEST_HEADER = { name: "x-request-source", value: "local" }; + +// Antigravity chat/stream headers +export const ANTIGRAVITY_HEADERS = { + "X-Client-Name": "antigravity", + "X-Client-Version": "1.107.0", + "x-goog-api-client": "gl-node/18.18.2 fire/0.8.6 grpc/1.10.x", + "User-Agent": "antigravity/1.107.0 darwin/arm64" +}; + +// Cloud Code Assist API +export const CLOUD_CODE_API = { + loadCodeAssist: "https://cloudcode-pa.googleapis.com/v1internal:loadCodeAssist", + onboardUser: "https://cloudcode-pa.googleapis.com/v1internal:onboardUser", +}; + +export const LOAD_CODE_ASSIST_HEADERS = { + "Content-Type": "application/json", + "User-Agent": "google-api-nodejs-client/9.15.1", + "X-Goog-Api-Client": "google-cloud-sdk vscode_cloudshelleditor/0.1", + "Client-Metadata": JSON.stringify({ ideType: "IDE_UNSPECIFIED", platform: "PLATFORM_UNSPECIFIED", pluginType: "GEMINI" }), +}; + +export const LOAD_CODE_ASSIST_METADATA = { + ideType: "IDE_UNSPECIFIED", + platform: "PLATFORM_UNSPECIFIED", + pluginType: "GEMINI", +}; + +// System prompts +export const CLAUDE_SYSTEM_PROMPT = "You are a Claude agent, built on Anthropic's Claude Agent SDK."; +export const ANTIGRAVITY_DEFAULT_SYSTEM = "You are Antigravity, a powerful agentic AI coding assistant designed by the Google Deepmind team working on Advanced Agentic Coding.You are pair programming with a USER to solve their coding task. 
The task may require creating a new codebase, modifying or debugging an existing codebase, or simply answering a question.**Absolute paths only****Proactiveness**"; + +// OAuth endpoints +export const OAUTH_ENDPOINTS = { + google: { + token: "https://oauth2.googleapis.com/token", + auth: "https://accounts.google.com/o/oauth2/auth" + }, + openai: { + token: "https://auth.openai.com/oauth/token", + auth: "https://auth.openai.com/oauth/authorize" + }, + anthropic: { + token: "https://api.anthropic.com/v1/oauth/token", + auth: "https://api.anthropic.com/v1/oauth/authorize" + }, + qwen: { + token: "https://chat.qwen.ai/api/v1/oauth2/token", + auth: "https://chat.qwen.ai/api/v1/oauth2/device/code" + }, + iflow: { + token: "https://iflow.cn/oauth/token", + auth: "https://iflow.cn/oauth" + }, + github: { + token: "https://github.com/login/oauth/access_token", + auth: "https://github.com/login/oauth/authorize", + deviceCode: "https://github.com/login/device/code" + } +}; + +// Generate Kimi OAuth custom headers +export function buildKimiHeaders() { + return { + "X-Msh-Platform": "9router", + "X-Msh-Version": "2.1.2", + "X-Msh-Device-Model": typeof process !== "undefined" ? 
`${process.platform} ${process.arch}` : "unknown", + "X-Msh-Device-Id": `kimi-${Date.now()}` + }; +} diff --git a/open-sse/config/constants.js b/open-sse/config/constants.js index 63f2991..7607376 100644 --- a/open-sse/config/constants.js +++ b/open-sse/config/constants.js @@ -1,566 +1,4 @@ -import { platform, arch } from "os"; - -function mapStainlessOs() { - switch (platform()) { - case "darwin": return "MacOS"; - case "win32": return "Windows"; - case "linux": return "Linux"; - case "freebsd": return "FreeBSD"; - default: return `Other::${platform()}`; - } -} - -function mapStainlessArch() { - switch (arch()) { - case "x64": return "x64"; - case "arm64": return "arm64"; - case "ia32": return "x86"; - default: return `other::${arch()}`; - } -} - -// === Gemini CLI Version Constants === -export const GEMINI_CLI_VERSION = "0.31.0"; -export const GEMINI_CLI_API_CLIENT = "google-genai-sdk/1.41.0 gl-node/v22.19.0"; - -function mapGeminiCLIOs() { - switch (platform()) { - case "darwin": return "darwin"; - case "win32": return "windows"; - case "linux": return "linux"; - case "freebsd": return "freebsd"; - default: return platform(); - } -} - -function mapGeminiCLIArch() { - switch (arch()) { - case "x64": return "x64"; - case "arm64": return "arm64"; - case "ia32": return "x86"; - default: return arch(); - } -} - -/** Returns User-Agent matching native Gemini CLI format: GeminiCLI// (; ) */ -export function geminiCLIUserAgent(model = "unknown") { - return `GeminiCLI/${GEMINI_CLI_VERSION}/${model || "unknown"} (${mapGeminiCLIOs()}; ${mapGeminiCLIArch()})`; -} - -// === GitHub Copilot Version Constants === -export const GITHUB_COPILOT = { - VSCODE_VERSION: "1.110.0", - COPILOT_CHAT_VERSION: "0.38.0", - USER_AGENT: "GitHubCopilotChat/0.38.0", - API_VERSION: "2025-04-01", -}; - -// === Antigravity Binary Alignment: Numeric Enums === -// Reference: Antigravity binary analysis - google.internal.cloud.code.v1internal.ClientMetadata - -// IDE Type enum (numeric values as 
expected by Cloud Code API) -export const IDE_TYPE = { - UNSPECIFIED: 0, - JETSKI: 10, // Internal codename for Gemini CLI - ANTIGRAVITY: 9, - PLUGINS: 7 -}; - -// Platform enum (as specified in Antigravity binary) -export const PLATFORM = { - UNSPECIFIED: 0, - DARWIN_AMD64: 1, - DARWIN_ARM64: 2, - LINUX_AMD64: 3, - LINUX_ARM64: 4, - WINDOWS_AMD64: 5 -}; - -// Plugin type enum (as specified in Antigravity binary) -export const PLUGIN_TYPE = { - UNSPECIFIED: 0, - CLOUD_CODE: 1, - GEMINI: 2 -}; - -/** - * Get the platform enum value based on the current OS. - * @returns {number} Platform enum value - */ -export function getPlatformEnum() { - const os = platform(); - const architecture = arch(); - - if (os === "darwin") { - return architecture === "arm64" ? PLATFORM.DARWIN_ARM64 : PLATFORM.DARWIN_AMD64; - } else if (os === "linux") { - return architecture === "arm64" ? PLATFORM.LINUX_ARM64 : PLATFORM.LINUX_AMD64; - } else if (os === "win32") { - return PLATFORM.WINDOWS_AMD64; - } - return PLATFORM.UNSPECIFIED; -} - -/** - * Generate platform-specific User-Agent string. - * @returns {string} User-Agent in format "antigravity/version os/arch" - */ -export function getPlatformUserAgent() { - const os = platform(); - const architecture = arch(); - return `antigravity/1.104.0 ${os}/${architecture}`; -} - -// Centralized client metadata (used in request bodies for loadCodeAssist, onboardUser, etc.) 
-// Using numeric enum values as expected by the Cloud Code API -export const CLIENT_METADATA = { - ideType: IDE_TYPE.ANTIGRAVITY, // 9 - identifies as Antigravity client - platform: getPlatformEnum(), // Runtime platform detection - pluginType: PLUGIN_TYPE.GEMINI // 2 -}; - -// Internal anti-loop header to identify requests originating from this proxy -export const INTERNAL_REQUEST_HEADER = { name: "x-request-source", value: "local" }; - -// Antigravity headers (for chat/stream requests) -export const ANTIGRAVITY_HEADERS = { - "X-Client-Name": "antigravity", - "X-Client-Version": "1.107.0", - "x-goog-api-client": "gl-node/18.18.2 fire/0.8.6 grpc/1.10.x", - "User-Agent": "antigravity/1.107.0 darwin/arm64" -}; - -// Cloud Code Assist API endpoints (for Project ID discovery) -export const CLOUD_CODE_API = { - loadCodeAssist: "https://cloudcode-pa.googleapis.com/v1internal:loadCodeAssist", - onboardUser: "https://cloudcode-pa.googleapis.com/v1internal:onboardUser", -}; - -// Headers for loadCodeAssist / onboardUser API calls (matches CLIProxyAPI Go source) -export const LOAD_CODE_ASSIST_HEADERS = { - "Content-Type": "application/json", - "User-Agent": "google-api-nodejs-client/9.15.1", - "X-Goog-Api-Client": "google-cloud-sdk vscode_cloudshelleditor/0.1", - "Client-Metadata": JSON.stringify({ ideType: "IDE_UNSPECIFIED", platform: "PLATFORM_UNSPECIFIED", pluginType: "GEMINI" }), -}; - -// Metadata body for loadCodeAssist / onboardUser (string enum, matches CLIProxyAPI Go source) -export const LOAD_CODE_ASSIST_METADATA = { - ideType: "IDE_UNSPECIFIED", - platform: "PLATFORM_UNSPECIFIED", - pluginType: "GEMINI", -}; - -// Provider configurations -export const PROVIDERS = { - claude: { - baseUrl: "https://api.anthropic.com/v1/messages", - format: "claude", - headers: { - "Anthropic-Version": "2023-06-01", - "Anthropic-Beta": 
"claude-code-20250219,oauth-2025-04-20,interleaved-thinking-2025-05-14,fine-grained-tool-streaming-2025-05-14,context-management-2025-06-27,prompt-caching-scope-2026-01-05", - "Anthropic-Dangerous-Direct-Browser-Access": "true", - "User-Agent": "claude-cli/2.1.63 (external, cli)", - "X-App": "cli", - "X-Stainless-Helper-Method": "stream", - "X-Stainless-Retry-Count": "0", - "X-Stainless-Runtime-Version": "v24.3.0", - "X-Stainless-Package-Version": "0.74.0", - "X-Stainless-Runtime": "node", - "X-Stainless-Lang": "js", - "X-Stainless-Arch": mapStainlessArch(), - "X-Stainless-Os": mapStainlessOs(), - "X-Stainless-Timeout": "600" - }, - // Claude OAuth configuration - clientId: "9d1c250a-e61b-44d9-88ed-5944d1962f5e", - tokenUrl: "https://api.anthropic.com/v1/oauth/token" - }, - gemini: { - baseUrl: "https://generativelanguage.googleapis.com/v1beta/models", - format: "gemini", - clientId: "681255809395-oo8ft2oprdrnp9e3aqf6av3hmdib135j.apps.googleusercontent.com", - clientSecret: "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" - }, - "gemini-cli": { - baseUrl: "https://cloudcode-pa.googleapis.com/v1internal", - format: "gemini-cli", - clientId: "681255809395-oo8ft2oprdrnp9e3aqf6av3hmdib135j.apps.googleusercontent.com", - clientSecret: "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" - }, - codex: { - baseUrl: "https://chatgpt.com/backend-api/codex/responses", - format: "openai-responses", // Use OpenAI Responses API format (reuse translator) - headers: { - "originator": "codex-cli", - "User-Agent": "codex-cli/1.0.18 (macOS; arm64)" - }, - // OpenAI OAuth configuration - clientId: "app_EMoamEEZ73f0CkXaXp7hrann", - clientSecret: "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", - tokenUrl: "https://auth.openai.com/oauth/token" - }, - qwen: { - baseUrl: "https://portal.qwen.ai/v1/chat/completions", - format: "openai", - headers: { - "User-Agent": "google-api-nodejs-client/9.15.1", - "X-Goog-Api-Client": "gl-node/22.17.0" - }, - // Qwen OAuth configuration - clientId: "f0304373b74a44d2b584a3fb70ca9e56", // 
From CLIProxyAPI - tokenUrl: "https://chat.qwen.ai/api/v1/oauth2/token", - authUrl: "https://chat.qwen.ai/api/v1/oauth2/device/code" - }, - iflow: { - baseUrl: "https://apis.iflow.cn/v1/chat/completions", - format: "openai", - headers: { - "User-Agent": "iFlow-Cli" - }, - // iFlow OAuth configuration (from CLIProxyAPI) - clientId: "10009311001", - clientSecret: "4Z3YjXycVsQvyGF1etiNlIBB4RsqSDtW", - tokenUrl: "https://iflow.cn/oauth/token", - authUrl: "https://iflow.cn/oauth" - }, - antigravity: { - baseUrls: [ - "https://daily-cloudcode-pa.googleapis.com", - "https://cloudcode-pa.googleapis.com", - ], - format: "antigravity", - headers: { - "User-Agent": getPlatformUserAgent() - }, - clientId: "1071006060591-tmhssin2h21lcre235vtolojh4g403ep.apps.googleusercontent.com", - clientSecret: "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" - }, - openrouter: { - baseUrl: "https://openrouter.ai/api/v1/chat/completions", - format: "openai", - headers: { - "HTTP-Referer": "https://endpoint-proxy.local", - "X-Title": "Endpoint Proxy" - } - }, - openai: { - baseUrl: "https://api.openai.com/v1/chat/completions", - format: "openai" - }, - glm: { - baseUrl: "https://api.z.ai/api/anthropic/v1/messages", - format: "claude", - headers: { - "Anthropic-Version": "2023-06-01", - "Anthropic-Beta": "claude-code-20250219,interleaved-thinking-2025-05-14" - } - }, - "glm-cn": { - baseUrl: "https://open.bigmodel.cn/api/coding/paas/v4/chat/completions", - format: "openai", - headers: {} - }, - kimi: { - baseUrl: "https://api.kimi.com/coding/v1/messages", - format: "claude", - headers: { - "Anthropic-Version": "2023-06-01", - "Anthropic-Beta": "claude-code-20250219,interleaved-thinking-2025-05-14" - } - }, - minimax: { - baseUrl: "https://api.minimax.io/anthropic/v1/messages", - format: "claude", - headers: { - "Anthropic-Version": "2023-06-01", - "Anthropic-Beta": "claude-code-20250219,interleaved-thinking-2025-05-14" - } - }, - "minimax-cn": { - baseUrl: 
"https://api.minimaxi.com/anthropic/v1/messages", - format: "claude", - headers: { - "Anthropic-Version": "2023-06-01", - "Anthropic-Beta": "claude-code-20250219,interleaved-thinking-2025-05-14" - } - }, - alicode: { - baseUrl: "https://coding.dashscope.aliyuncs.com/v1/chat/completions", - format: "openai", - headers: {} - }, - "alicode-intl": { - baseUrl: "https://coding-intl.dashscope.aliyuncs.com/v1/chat/completions", - format: "openai", - headers: {} - }, - github: { - baseUrl: "https://api.githubcopilot.com/chat/completions", // GitHub Copilot API endpoint for chat - responsesUrl: "https://api.githubcopilot.com/responses", - format: "openai", // GitHub Copilot uses OpenAI-compatible format - headers: { - "copilot-integration-id": "vscode-chat", - "editor-version": `vscode/${GITHUB_COPILOT.VSCODE_VERSION}`, - "editor-plugin-version": `copilot-chat/${GITHUB_COPILOT.COPILOT_CHAT_VERSION}`, - "user-agent": GITHUB_COPILOT.USER_AGENT, - "openai-intent": "conversation-panel", - "x-github-api-version": GITHUB_COPILOT.API_VERSION, - "x-vscode-user-agent-library-version": "electron-fetch", - "X-Initiator": "user", - "Accept": "application/json", - "Content-Type": "application/json" - } - }, - kiro: { - baseUrl: "https://codewhisperer.us-east-1.amazonaws.com/generateAssistantResponse", - format: "kiro", - headers: { - "Content-Type": "application/json", - "Accept": "application/vnd.amazon.eventstream", - "X-Amz-Target": "AmazonCodeWhispererStreamingService.GenerateAssistantResponse", - "User-Agent": "AWS-SDK-JS/3.0.0 kiro-ide/1.0.0", - "X-Amz-User-Agent": "aws-sdk-js/3.0.0 kiro-ide/1.0.0" - }, - // Kiro OAuth endpoints - tokenUrl: "https://prod.us-east-1.auth.desktop.kiro.dev/refreshToken", - authUrl: "https://prod.us-east-1.auth.desktop.kiro.dev" - }, - cursor: { - baseUrl: "https://api2.cursor.sh", - chatPath: "/aiserver.v1.ChatService/StreamUnifiedChatWithTools", - format: "cursor", - headers: { - "connect-accept-encoding": "gzip", - "connect-protocol-version": "1", - 
"Content-Type": "application/connect+proto", - "User-Agent": "connect-es/1.6.1" - }, - clientVersion: "1.1.3" - }, - "kimi-coding": { - baseUrl: "https://api.kimi.com/coding/v1/messages", - format: "claude", - headers: { - "Anthropic-Version": "2023-06-01", - "Anthropic-Beta": "claude-code-20250219,interleaved-thinking-2025-05-14" - }, - clientId: "17e5f671-d194-4dfb-9706-5516cb48c098", - tokenUrl: "https://auth.kimi.com/api/oauth/token", - refreshUrl: "https://auth.kimi.com/api/oauth/token" - }, - kilocode: { - baseUrl: "https://api.kilo.ai/api/openrouter/chat/completions", - format: "openai", - headers: {} - }, - cline: { - baseUrl: "https://api.cline.bot/api/v1/chat/completions", - format: "openai", - headers: { - "HTTP-Referer": "https://cline.bot", - "X-Title": "Cline" - }, - tokenUrl: "https://api.cline.bot/api/v1/auth/token", - refreshUrl: "https://api.cline.bot/api/v1/auth/refresh" - }, - nvidia: { - baseUrl: "https://integrate.api.nvidia.com/v1/chat/completions", - format: "openai" - }, - anthropic: { - baseUrl: "https://api.anthropic.com/v1/messages", - format: "claude", - headers: { - "Anthropic-Version": "2023-06-01", - "Anthropic-Beta": "claude-code-20250219,interleaved-thinking-2025-05-14" - } - }, - deepseek: { - baseUrl: "https://api.deepseek.com/chat/completions", - format: "openai" - }, - groq: { - baseUrl: "https://api.groq.com/openai/v1/chat/completions", - format: "openai" - }, - xai: { - baseUrl: "https://api.x.ai/v1/chat/completions", - format: "openai" - }, - mistral: { - baseUrl: "https://api.mistral.ai/v1/chat/completions", - format: "openai" - }, - perplexity: { - baseUrl: "https://api.perplexity.ai/chat/completions", - format: "openai" - }, - together: { - baseUrl: "https://api.together.xyz/v1/chat/completions", - format: "openai" - }, - fireworks: { - baseUrl: "https://api.fireworks.ai/inference/v1/chat/completions", - format: "openai" - }, - cerebras: { - baseUrl: "https://api.cerebras.ai/v1/chat/completions", - format: "openai" - }, - 
cohere: { - baseUrl: "https://api.cohere.ai/v1/chat/completions", - format: "openai" - }, - nebius: { - baseUrl: "https://api.studio.nebius.ai/v1/chat/completions", - format: "openai" - }, - siliconflow: { - baseUrl: "https://api.siliconflow.cn/v1/chat/completions", - format: "openai" - }, - hyperbolic: { - baseUrl: "https://api.hyperbolic.xyz/v1/chat/completions", - format: "openai" - }, - deepgram: { - baseUrl: "https://api.deepgram.com/v1/listen", - format: "openai" - }, - assemblyai: { - baseUrl: "https://api.assemblyai.com/v1/audio/transcriptions", - format: "openai" - }, - nanobanana: { - baseUrl: "https://api.nanobananaapi.ai/v1/chat/completions", - format: "openai" - }, - chutes: { - baseUrl: "https://llm.chutes.ai/v1/chat/completions", - format: "openai" - } -}; - -// Claude system prompt -export const CLAUDE_SYSTEM_PROMPT = "You are a Claude agent, built on Anthropic's Claude Agent SDK."; - -// Antigravity default system prompt (required for API to work) -export const ANTIGRAVITY_DEFAULT_SYSTEM = "You are Antigravity, a powerful agentic AI coding assistant designed by the Google Deepmind team working on Advanced Agentic Coding.You are pair programming with a USER to solve their coding task. 
The task may require creating a new codebase, modifying or debugging an existing codebase, or simply answering a question.**Absolute paths only****Proactiveness**"; - -// OAuth endpoints -export const OAUTH_ENDPOINTS = { - google: { - token: "https://oauth2.googleapis.com/token", - auth: "https://accounts.google.com/o/oauth2/auth" - }, - openai: { - token: "https://auth.openai.com/oauth/token", - auth: "https://auth.openai.com/oauth/authorize" - }, - anthropic: { - token: "https://api.anthropic.com/v1/oauth/token", - auth: "https://api.anthropic.com/v1/oauth/authorize" - }, - qwen: { - token: "https://chat.qwen.ai/api/v1/oauth2/token", // From CLIProxyAPI - auth: "https://chat.qwen.ai/api/v1/oauth2/device/code" // From CLIProxyAPI - }, - iflow: { - token: "https://iflow.cn/oauth/token", - auth: "https://iflow.cn/oauth" - }, - github: { - token: "https://github.com/login/oauth/access_token", - auth: "https://github.com/login/oauth/authorize", - deviceCode: "https://github.com/login/device/code" - } -}; - -// Cache TTLs (seconds) -export const CACHE_TTL = { - userInfo: 300, // 5 minutes - modelAlias: 3600 // 1 hour -}; - -// Default max tokens -export const DEFAULT_MAX_TOKENS = 64000; - -// Minimum max tokens for tool calling (to prevent truncated arguments) -export const DEFAULT_MIN_TOKENS = 32000; - -// Retry config for 429 responses (used by BaseExecutor) -export const RETRY_CONFIG = { - maxAttempts: 2, - delayMs: 2000 -}; - -// HTTP status codes -export const HTTP_STATUS = { - BAD_REQUEST: 400, - UNAUTHORIZED: 401, - PAYMENT_REQUIRED: 402, - FORBIDDEN: 403, - NOT_FOUND: 404, - NOT_ACCEPTABLE: 406, - REQUEST_TIMEOUT: 408, - RATE_LIMITED: 429, - SERVER_ERROR: 500, - BAD_GATEWAY: 502, - SERVICE_UNAVAILABLE: 503, - GATEWAY_TIMEOUT: 504 -}; - -// OpenAI-compatible error types mapping -export const ERROR_TYPES = { - [HTTP_STATUS.BAD_REQUEST]: { type: "invalid_request_error", code: "bad_request" }, - [HTTP_STATUS.UNAUTHORIZED]: { type: "authentication_error", code: 
"invalid_api_key" }, - [HTTP_STATUS.FORBIDDEN]: { type: "permission_error", code: "insufficient_quota" }, - [HTTP_STATUS.NOT_FOUND]: { type: "invalid_request_error", code: "model_not_found" }, - [HTTP_STATUS.NOT_ACCEPTABLE]: { type: "invalid_request_error", code: "model_not_supported" }, - [HTTP_STATUS.RATE_LIMITED]: { type: "rate_limit_error", code: "rate_limit_exceeded" }, - [HTTP_STATUS.SERVER_ERROR]: { type: "server_error", code: "internal_server_error" }, - [HTTP_STATUS.BAD_GATEWAY]: { type: "server_error", code: "bad_gateway" }, - [HTTP_STATUS.SERVICE_UNAVAILABLE]: { type: "server_error", code: "service_unavailable" }, - [HTTP_STATUS.GATEWAY_TIMEOUT]: { type: "server_error", code: "gateway_timeout" } -}; - -// Default error messages per status code -export const DEFAULT_ERROR_MESSAGES = { - [HTTP_STATUS.BAD_REQUEST]: "Bad request", - [HTTP_STATUS.UNAUTHORIZED]: "Invalid API key provided", - [HTTP_STATUS.FORBIDDEN]: "You exceeded your current quota", - [HTTP_STATUS.NOT_FOUND]: "Model not found", - [HTTP_STATUS.NOT_ACCEPTABLE]: "Model not supported", - [HTTP_STATUS.RATE_LIMITED]: "Rate limit exceeded", - [HTTP_STATUS.SERVER_ERROR]: "Internal server error", - [HTTP_STATUS.BAD_GATEWAY]: "Bad gateway - upstream provider error", - [HTTP_STATUS.SERVICE_UNAVAILABLE]: "Service temporarily unavailable", - [HTTP_STATUS.GATEWAY_TIMEOUT]: "Gateway timeout" -}; - -// Exponential backoff config for rate limits (like CLIProxyAPI) -export const BACKOFF_CONFIG = { - base: 1000, // 1 second base - max: 2 * 60 * 1000, // 2 minutes max - maxLevel: 15 // Cap backoff level -}; - -// Error-based cooldown times (aligned with CLIProxyAPI) -export const COOLDOWN_MS = { - unauthorized: 2 * 60 * 1000, // 401 → 30 min - paymentRequired: 2 * 60 * 1000, // 402/403 → 30 min - notFound: 2 * 60 * 1000, // 404 → 2 minutes - transient: 30 * 1000, // 408/500/502/503/504 → 1 min - requestNotAllowed: 5 * 1000, // "Request not allowed" → 5 sec - // Legacy aliases for backward compatibility - 
rateLimit: 2 * 60 * 1000, - serviceUnavailable: 2 * 1000, - authExpired: 2 * 60 * 1000 -}; - -// Skip patterns - requests containing these texts will bypass provider -export const SKIP_PATTERNS = [ - "Please write a 5-10 word title for the following conversation:" -]; - +// Barrel re-export — consumers can migrate to specific files over time +export * from "./providers.js"; +export * from "./appConstants.js"; +export * from "./runtimeConfig.js"; diff --git a/open-sse/config/providerModels.js b/open-sse/config/providerModels.js index 2360bd6..1cc4ee5 100644 --- a/open-sse/config/providerModels.js +++ b/open-sse/config/providerModels.js @@ -1,3 +1,5 @@ +import { PROVIDERS } from "./providers.js"; + // Provider models - Single source of truth // Key = alias (cc, cx, gc, qw, if, ag, gh for OAuth; id for API Key) // Field "provider" for special cases (e.g. AntiGravity models that call different backends) @@ -105,6 +107,9 @@ export const PROVIDER_MODELS = { // { id: "claude-opus-4.5", name: "Claude Opus 4.5" }, { id: "claude-sonnet-4.5", name: "Claude Sonnet 4.5" }, { id: "claude-haiku-4.5", name: "Claude Haiku 4.5" }, + { id: "deepseek-3.2", name: "DeepSeek 3.2" }, + { id: "deepseek-3.1", name: "DeepSeek 3.1" }, + { id: "qwen3-coder-next", name: "Qwen3 Coder Next" }, ], cu: [ // Cursor IDE { id: "default", name: "Auto (Server Picks)" }, @@ -161,7 +166,10 @@ export const PROVIDER_MODELS = { { id: "claude-3-5-sonnet-20241022", name: "Claude 3.5 Sonnet" }, ], gemini: [ + { id: "gemini-3.1-pro-preview", name: "Gemini 3.1 Pro Preview" }, + { id: "gemini-3.1-flash-lite-preview", name: "Gemini 3.1 Flash Lite Preview" }, { id: "gemini-3-pro-preview", name: "Gemini 3 Pro Preview" }, + { id: "gemini-3-flash-preview", name: "Gemini 3 Flash Preview" }, { id: "gemini-2.5-pro", name: "Gemini 2.5 Pro" }, { id: "gemini-2.5-flash", name: "Gemini 2.5 Flash" }, { id: "gemini-2.5-flash-lite", name: "Gemini 2.5 Flash Lite" }, @@ -298,6 +306,26 @@ export const PROVIDER_MODELS = { { id: 
"Qwen/Qwen2.5-Coder-32B-Instruct", name: "Qwen 2.5 Coder 32B" }, { id: "NousResearch/Hermes-3-Llama-3.1-70B", name: "Hermes 3 70B" }, ], + ollama: [ + { id: "gpt-oss:120b", name: "GPT OSS 120B" }, + { id: "kimi-k2.5", name: "Kimi K2.5" }, + { id: "glm-5", name: "GLM 5" }, + { id: "minimax-m2.5", name: "MiniMax M2.5" }, + { id: "glm-4.7-flash", name: "GLM 4.7 Flash" }, + { id: "qwen3.5", name: "Qwen3.5" }, + ], + vertex: [ + { id: "gemini-3.1-pro-preview", name: "Gemini 3.1 Pro Preview" }, + { id: "gemini-3.1-flash-lite-preview", name: "Gemini 3.1 Flash Lite Preview" }, + { id: "gemini-3-flash-preview", name: "Gemini 3 Flash Preview" }, + { id: "gemini-2.5-flash", name: "Gemini 2.5 Flash" }, + ], + "vertex-partner": [ + { id: "deepseek-ai/deepseek-v3.2-maas", name: "DeepSeek V3.2 (Vertex)" }, + { id: "qwen/qwen3-next-80b-a3b-thinking-maas", name: "Qwen3 Next 80B Thinking (Vertex)" }, + { id: "qwen/qwen3-next-80b-a3b-instruct-maas", name: "Qwen3 Next 80B Instruct (Vertex)" }, + { id: "zai-org/glm-5-maas", name: "GLM-5 (Vertex)" }, + ], }; // Helper functions @@ -331,8 +359,8 @@ export function getModelTargetFormat(aliasOrId, modelId) { return found?.targetFormat || null; } -// Provider ID to alias mapping -export const PROVIDER_ID_TO_ALIAS = { +// OAuth providers that use short aliases (everything else: alias = id) +const OAUTH_ALIASES = { claude: "cc", codex: "cx", "gemini-cli": "gc", @@ -345,32 +373,15 @@ export const PROVIDER_ID_TO_ALIAS = { "kimi-coding": "kmc", kilocode: "kc", cline: "cl", - openai: "openai", - anthropic: "anthropic", - gemini: "gemini", - openrouter: "openrouter", - glm: "glm", - "glm-cn": "glm-cn", - kimi: "kimi", - minimax: "minimax", - "minimax-cn": "minimax-cn", - alicode: "alicode", - "alicode-intl": "alicode-intl", - deepseek: "deepseek", - groq: "groq", - xai: "xai", - mistral: "mistral", - perplexity: "perplexity", - together: "together", - fireworks: "fireworks", - cerebras: "cerebras", - cohere: "cohere", - nvidia: "nvidia", - nebius: 
"nebius", - siliconflow: "siliconflow", - hyperbolic: "hyperbolic", + vertex: "vertex", + "vertex-partner": "vertex-partner", }; +// Derived from PROVIDERS — no need to maintain manually +export const PROVIDER_ID_TO_ALIAS = Object.fromEntries( + Object.keys(PROVIDERS).map(id => [id, OAUTH_ALIASES[id] || id]) +); + export function getModelsByProviderId(providerId) { const alias = PROVIDER_ID_TO_ALIAS[providerId] || providerId; return PROVIDER_MODELS[alias] || []; diff --git a/open-sse/config/providers.js b/open-sse/config/providers.js new file mode 100644 index 0000000..f945bcc --- /dev/null +++ b/open-sse/config/providers.js @@ -0,0 +1,313 @@ +import { platform, arch } from "os"; + +// === OS/Arch helpers === +function mapStainlessOs() { + switch (platform()) { + case "darwin": return "MacOS"; + case "win32": return "Windows"; + case "linux": return "Linux"; + case "freebsd": return "FreeBSD"; + default: return `Other::${platform()}`; + } +} + +function mapStainlessArch() { + switch (arch()) { + case "x64": return "x64"; + case "arm64": return "arm64"; + case "ia32": return "x86"; + default: return `other::${arch()}`; + } +} + +// Shared Claude-compatible API headers (reused across claude-format providers) +const CLAUDE_API_HEADERS = { + "Anthropic-Version": "2023-06-01", + "Anthropic-Beta": "claude-code-20250219,interleaved-thinking-2025-05-14" +}; + +// Shared baseUrls +const KIMI_CODING_BASE_URL = "https://api.kimi.com/coding/v1/messages"; + +export const PROVIDERS = { + claude: { + baseUrl: "https://api.anthropic.com/v1/messages", + format: "claude", + headers: { + "Anthropic-Version": "2023-06-01", + "Anthropic-Beta": "claude-code-20250219,oauth-2025-04-20,interleaved-thinking-2025-05-14,fine-grained-tool-streaming-2025-05-14,context-management-2025-06-27,prompt-caching-scope-2026-01-05", + "Anthropic-Dangerous-Direct-Browser-Access": "true", + "User-Agent": "claude-cli/2.1.63 (external, cli)", + "X-App": "cli", + "X-Stainless-Helper-Method": "stream", + 
"X-Stainless-Retry-Count": "0", + "X-Stainless-Runtime-Version": "v24.3.0", + "X-Stainless-Package-Version": "0.74.0", + "X-Stainless-Runtime": "node", + "X-Stainless-Lang": "js", + "X-Stainless-Arch": mapStainlessArch(), + "X-Stainless-Os": mapStainlessOs(), + "X-Stainless-Timeout": "600" + }, + clientId: "9d1c250a-e61b-44d9-88ed-5944d1962f5e", + tokenUrl: "https://api.anthropic.com/v1/oauth/token" + }, + gemini: { + baseUrl: "https://generativelanguage.googleapis.com/v1beta/models", + format: "gemini", + clientId: "681255809395-oo8ft2oprdrnp9e3aqf6av3hmdib135j.apps.googleusercontent.com", + clientSecret: "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" + }, + "gemini-cli": { + baseUrl: "https://cloudcode-pa.googleapis.com/v1internal", + format: "gemini-cli", + clientId: "681255809395-oo8ft2oprdrnp9e3aqf6av3hmdib135j.apps.googleusercontent.com", + clientSecret: "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" + }, + codex: { + baseUrl: "https://chatgpt.com/backend-api/codex/responses", + format: "openai-responses", + headers: { + "originator": "codex-cli", + "User-Agent": "codex-cli/1.0.18 (macOS; arm64)" + }, + clientId: "app_EMoamEEZ73f0CkXaXp7hrann", + clientSecret: "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", + tokenUrl: "https://auth.openai.com/oauth/token" + }, + qwen: { + baseUrl: "https://portal.qwen.ai/v1/chat/completions", + format: "openai", + headers: { + "User-Agent": "google-api-nodejs-client/9.15.1", + "X-Goog-Api-Client": "gl-node/22.17.0" + }, + clientId: "f0304373b74a44d2b584a3fb70ca9e56", + tokenUrl: "https://chat.qwen.ai/api/v1/oauth2/token", + authUrl: "https://chat.qwen.ai/api/v1/oauth2/device/code" + }, + iflow: { + baseUrl: "https://apis.iflow.cn/v1/chat/completions", + format: "openai", + headers: { "User-Agent": "iFlow-Cli" }, + clientId: "10009311001", + clientSecret: "4Z3YjXycVsQvyGF1etiNlIBB4RsqSDtW", + tokenUrl: "https://iflow.cn/oauth/token", + authUrl: "https://iflow.cn/oauth" + }, + antigravity: { + baseUrls: [ + "https://daily-cloudcode-pa.googleapis.com", + 
"https://cloudcode-pa.googleapis.com", + ], + format: "antigravity", + headers: { "User-Agent": `antigravity/1.104.0 ${platform()}/${arch()}` }, + clientId: "1071006060591-tmhssin2h21lcre235vtolojh4g403ep.apps.googleusercontent.com", + clientSecret: "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" + }, + openrouter: { + baseUrl: "https://openrouter.ai/api/v1/chat/completions", + format: "openai", + headers: { + "HTTP-Referer": "https://endpoint-proxy.local", + "X-Title": "Endpoint Proxy" + } + }, + openai: { + baseUrl: "https://api.openai.com/v1/chat/completions", + format: "openai" + }, + glm: { + baseUrl: "https://api.z.ai/api/anthropic/v1/messages", + format: "claude", + headers: { ...CLAUDE_API_HEADERS } + }, + "glm-cn": { + baseUrl: "https://open.bigmodel.cn/api/coding/paas/v4/chat/completions", + format: "openai", + headers: {} + }, + kimi: { + baseUrl: KIMI_CODING_BASE_URL, + format: "claude", + headers: { ...CLAUDE_API_HEADERS } + }, + minimax: { + baseUrl: "https://api.minimax.io/anthropic/v1/messages", + format: "claude", + headers: { ...CLAUDE_API_HEADERS } + }, + "minimax-cn": { + baseUrl: "https://api.minimaxi.com/anthropic/v1/messages", + format: "claude", + headers: { ...CLAUDE_API_HEADERS } + }, + alicode: { + baseUrl: "https://coding.dashscope.aliyuncs.com/v1/chat/completions", + format: "openai", + headers: {} + }, + "alicode-intl": { + baseUrl: "https://coding-intl.dashscope.aliyuncs.com/v1/chat/completions", + format: "openai", + headers: {} + }, + github: { + baseUrl: "https://api.githubcopilot.com/chat/completions", + responsesUrl: "https://api.githubcopilot.com/responses", + format: "openai", + headers: { + "copilot-integration-id": "vscode-chat", + "editor-version": "vscode/1.110.0", + "editor-plugin-version": "copilot-chat/0.38.0", + "user-agent": "GitHubCopilotChat/0.38.0", + "openai-intent": "conversation-panel", + "x-github-api-version": "2025-04-01", + "x-vscode-user-agent-library-version": "electron-fetch", + "X-Initiator": "user", + "Accept": 
"application/json", + "Content-Type": "application/json" + } + }, + kiro: { + baseUrl: "https://codewhisperer.us-east-1.amazonaws.com/generateAssistantResponse", + format: "kiro", + headers: { + "Content-Type": "application/json", + "Accept": "application/vnd.amazon.eventstream", + "X-Amz-Target": "AmazonCodeWhispererStreamingService.GenerateAssistantResponse", + "User-Agent": "AWS-SDK-JS/3.0.0 kiro-ide/1.0.0", + "X-Amz-User-Agent": "aws-sdk-js/3.0.0 kiro-ide/1.0.0" + }, + tokenUrl: "https://prod.us-east-1.auth.desktop.kiro.dev/refreshToken", + authUrl: "https://prod.us-east-1.auth.desktop.kiro.dev" + }, + cursor: { + baseUrl: "https://api2.cursor.sh", + chatPath: "/aiserver.v1.ChatService/StreamUnifiedChatWithTools", + format: "cursor", + headers: { + "connect-accept-encoding": "gzip", + "connect-protocol-version": "1", + "Content-Type": "application/connect+proto", + "User-Agent": "connect-es/1.6.1" + }, + clientVersion: "1.1.3" + }, + "kimi-coding": { + baseUrl: KIMI_CODING_BASE_URL, + format: "claude", + headers: { ...CLAUDE_API_HEADERS }, + clientId: "17e5f671-d194-4dfb-9706-5516cb48c098", + tokenUrl: "https://auth.kimi.com/api/oauth/token", + refreshUrl: "https://auth.kimi.com/api/oauth/token" + }, + kilocode: { + baseUrl: "https://api.kilo.ai/api/openrouter/chat/completions", + format: "openai", + headers: {} + }, + cline: { + baseUrl: "https://api.cline.bot/api/v1/chat/completions", + format: "openai", + headers: { + "HTTP-Referer": "https://cline.bot", + "X-Title": "Cline" + }, + tokenUrl: "https://api.cline.bot/api/v1/auth/token", + refreshUrl: "https://api.cline.bot/api/v1/auth/refresh" + }, + nvidia: { + baseUrl: "https://integrate.api.nvidia.com/v1/chat/completions", + format: "openai" + }, + anthropic: { + baseUrl: "https://api.anthropic.com/v1/messages", + format: "claude", + headers: { ...CLAUDE_API_HEADERS } + }, + deepseek: { + baseUrl: "https://api.deepseek.com/chat/completions", + format: "openai" + }, + groq: { + baseUrl: 
"https://api.groq.com/openai/v1/chat/completions", + format: "openai" + }, + xai: { + baseUrl: "https://api.x.ai/v1/chat/completions", + format: "openai" + }, + mistral: { + baseUrl: "https://api.mistral.ai/v1/chat/completions", + format: "openai" + }, + perplexity: { + baseUrl: "https://api.perplexity.ai/chat/completions", + format: "openai" + }, + together: { + baseUrl: "https://api.together.xyz/v1/chat/completions", + format: "openai" + }, + fireworks: { + baseUrl: "https://api.fireworks.ai/inference/v1/chat/completions", + format: "openai" + }, + cerebras: { + baseUrl: "https://api.cerebras.ai/v1/chat/completions", + format: "openai" + }, + cohere: { + baseUrl: "https://api.cohere.ai/v1/chat/completions", + format: "openai" + }, + nebius: { + baseUrl: "https://api.studio.nebius.ai/v1/chat/completions", + format: "openai" + }, + siliconflow: { + baseUrl: "https://api.siliconflow.cn/v1/chat/completions", + format: "openai" + }, + hyperbolic: { + baseUrl: "https://api.hyperbolic.xyz/v1/chat/completions", + format: "openai" + }, + deepgram: { + baseUrl: "https://api.deepgram.com/v1/listen", + format: "openai" + }, + assemblyai: { + baseUrl: "https://api.assemblyai.com/v1/audio/transcriptions", + format: "openai" + }, + nanobanana: { + baseUrl: "https://api.nanobananaapi.ai/v1/chat/completions", + format: "openai" + }, + chutes: { + baseUrl: "https://llm.chutes.ai/v1/chat/completions", + format: "openai" + }, + ollama: { + baseUrl: "https://ollama.com/api/chat", + format: "ollama" + }, + "ollama-local": { + baseUrl: "http://localhost:11434/api/chat", + format: "ollama" + }, + // Vertex AI - Gemini models via Service Account JSON + // baseUrl is not used; VertexExecutor.buildUrl() constructs it dynamically + vertex: { + baseUrl: "https://aiplatform.googleapis.com", + format: "gemini" + }, + // Vertex AI - Partner models (Claude, Llama, Mistral, GLM) via SA JSON + // Uses OpenAI-compatible global endpoint (or rawPredict for Anthropic) + "vertex-partner": { + baseUrl: 
"https://aiplatform.googleapis.com", + format: "openai" + }, +}; diff --git a/open-sse/config/runtimeConfig.js b/open-sse/config/runtimeConfig.js new file mode 100644 index 0000000..19ac15a --- /dev/null +++ b/open-sse/config/runtimeConfig.js @@ -0,0 +1,92 @@ +// HTTP status codes +export const HTTP_STATUS = { + BAD_REQUEST: 400, + UNAUTHORIZED: 401, + PAYMENT_REQUIRED: 402, + FORBIDDEN: 403, + NOT_FOUND: 404, + NOT_ACCEPTABLE: 406, + REQUEST_TIMEOUT: 408, + RATE_LIMITED: 429, + SERVER_ERROR: 500, + BAD_GATEWAY: 502, + SERVICE_UNAVAILABLE: 503, + GATEWAY_TIMEOUT: 504 +}; + +// OpenAI-compatible error types mapping +export const ERROR_TYPES = { + [HTTP_STATUS.BAD_REQUEST]: { type: "invalid_request_error", code: "bad_request" }, + [HTTP_STATUS.UNAUTHORIZED]: { type: "authentication_error", code: "invalid_api_key" }, + [HTTP_STATUS.FORBIDDEN]: { type: "permission_error", code: "insufficient_quota" }, + [HTTP_STATUS.NOT_FOUND]: { type: "invalid_request_error", code: "model_not_found" }, + [HTTP_STATUS.NOT_ACCEPTABLE]: { type: "invalid_request_error", code: "model_not_supported" }, + [HTTP_STATUS.RATE_LIMITED]: { type: "rate_limit_error", code: "rate_limit_exceeded" }, + [HTTP_STATUS.SERVER_ERROR]: { type: "server_error", code: "internal_server_error" }, + [HTTP_STATUS.BAD_GATEWAY]: { type: "server_error", code: "bad_gateway" }, + [HTTP_STATUS.SERVICE_UNAVAILABLE]: { type: "server_error", code: "service_unavailable" }, + [HTTP_STATUS.GATEWAY_TIMEOUT]: { type: "server_error", code: "gateway_timeout" } +}; + +// Default error messages per status code +export const DEFAULT_ERROR_MESSAGES = { + [HTTP_STATUS.BAD_REQUEST]: "Bad request", + [HTTP_STATUS.UNAUTHORIZED]: "Invalid API key provided", + [HTTP_STATUS.FORBIDDEN]: "You exceeded your current quota", + [HTTP_STATUS.NOT_FOUND]: "Model not found", + [HTTP_STATUS.NOT_ACCEPTABLE]: "Model not supported", + [HTTP_STATUS.RATE_LIMITED]: "Rate limit exceeded", + [HTTP_STATUS.SERVER_ERROR]: "Internal server error", + 
[HTTP_STATUS.BAD_GATEWAY]: "Bad gateway - upstream provider error", + [HTTP_STATUS.SERVICE_UNAVAILABLE]: "Service temporarily unavailable", + [HTTP_STATUS.GATEWAY_TIMEOUT]: "Gateway timeout" +}; + +// Cache TTLs (seconds) +export const CACHE_TTL = { + userInfo: 300, // 5 minutes + modelAlias: 3600 // 1 hour +}; + +// Memory management config +export const MEMORY_CONFIG = { + sessionTtlMs: 2 * 60 * 60 * 1000, + sessionCleanupIntervalMs: 30 * 60 * 1000, + dnsCacheTtlMs: 5 * 60 * 1000, + proxyDispatchersMaxSize: 20, +}; + +// Default token limits +export const DEFAULT_MAX_TOKENS = 64000; +export const DEFAULT_MIN_TOKENS = 32000; + +// Retry config for 429 responses +export const RETRY_CONFIG = { + maxAttempts: 2, + delayMs: 2000 +}; + +// Exponential backoff config for rate limits +export const BACKOFF_CONFIG = { + base: 1000, + max: 2 * 60 * 1000, + maxLevel: 15 +}; + +// Error-based cooldown times +export const COOLDOWN_MS = { + unauthorized: 2 * 60 * 1000, + paymentRequired: 2 * 60 * 1000, + notFound: 2 * 60 * 1000, + transient: 30 * 1000, + requestNotAllowed: 5 * 1000, + // Legacy aliases + rateLimit: 2 * 60 * 1000, + serviceUnavailable: 2 * 1000, + authExpired: 2 * 60 * 1000 +}; + +// Requests containing these texts will bypass provider +export const SKIP_PATTERNS = [ + "Please write a 5-10 word title for the following conversation:" +]; diff --git a/open-sse/executors/antigravity.js b/open-sse/executors/antigravity.js index 30e0db8..7cb0e8e 100644 --- a/open-sse/executors/antigravity.js +++ b/open-sse/executors/antigravity.js @@ -1,6 +1,8 @@ import crypto from "crypto"; import { BaseExecutor } from "./base.js"; -import { PROVIDERS, OAUTH_ENDPOINTS, HTTP_STATUS, ANTIGRAVITY_HEADERS, INTERNAL_REQUEST_HEADER } from "../config/constants.js"; +import { PROVIDERS } from "../config/providers.js"; +import { OAUTH_ENDPOINTS, ANTIGRAVITY_HEADERS, INTERNAL_REQUEST_HEADER } from "../config/appConstants.js"; +import { HTTP_STATUS } from "../config/runtimeConfig.js"; import { 
deriveSessionId } from "../utils/sessionManager.js"; import { proxyAwareFetch } from "../utils/proxyFetch.js"; @@ -167,7 +169,9 @@ export class AntigravityExecutor extends BaseExecutor { let lastError = null; let lastStatus = 0; const MAX_AUTO_RETRIES = 3; + const MAX_RETRY_AFTER_RETRIES = 3; const retryAttemptsByUrl = {}; // Track retry attempts per URL + const retryAfterAttemptsByUrl = {}; // Track Retry-After retries per URL for (let urlIndex = 0; urlIndex < fallbackCount; urlIndex++) { const url = this.buildUrl(model, stream, urlIndex); @@ -175,10 +179,13 @@ export class AntigravityExecutor extends BaseExecutor { const sessionId = transformedBody.request?.sessionId; const headers = this.buildHeaders(credentials, stream, sessionId); - // Initialize retry counter for this URL + // Initialize retry counters for this URL if (!retryAttemptsByUrl[urlIndex]) { retryAttemptsByUrl[urlIndex] = 0; } + if (!retryAfterAttemptsByUrl[urlIndex]) { + retryAfterAttemptsByUrl[urlIndex] = 0; + } try { const response = await proxyAwareFetch(url, { @@ -204,8 +211,9 @@ export class AntigravityExecutor extends BaseExecutor { } } - if (retryMs && retryMs <= MAX_RETRY_AFTER_MS) { - log?.debug?.("RETRY", `${response.status} with Retry-After: ${Math.ceil(retryMs / 1000)}s, waiting...`); + if (retryMs && retryMs <= MAX_RETRY_AFTER_MS && retryAfterAttemptsByUrl[urlIndex] < MAX_RETRY_AFTER_RETRIES) { + retryAfterAttemptsByUrl[urlIndex]++; + log?.debug?.("RETRY", `${response.status} with Retry-After: ${Math.ceil(retryMs / 1000)}s, waiting... 
(${retryAfterAttemptsByUrl[urlIndex]}/${MAX_RETRY_AFTER_RETRIES})`); await new Promise(resolve => setTimeout(resolve, retryMs)); urlIndex--; continue; diff --git a/open-sse/executors/base.js b/open-sse/executors/base.js index 17f351d..61fb0f6 100644 --- a/open-sse/executors/base.js +++ b/open-sse/executors/base.js @@ -1,4 +1,4 @@ -import { HTTP_STATUS, RETRY_CONFIG } from "../config/constants.js"; +import { HTTP_STATUS, RETRY_CONFIG } from "../config/runtimeConfig.js"; import { proxyAwareFetch } from "../utils/proxyFetch.js"; /** diff --git a/open-sse/executors/codex.js b/open-sse/executors/codex.js index e49edf5..ca062ec 100644 --- a/open-sse/executors/codex.js +++ b/open-sse/executors/codex.js @@ -1,6 +1,6 @@ import { BaseExecutor } from "./base.js"; import { CODEX_DEFAULT_INSTRUCTIONS } from "../config/codexInstructions.js"; -import { PROVIDERS } from "../config/constants.js"; +import { PROVIDERS } from "../config/providers.js"; import { normalizeResponsesInput } from "../translator/helpers/responsesApiHelper.js"; /** diff --git a/open-sse/executors/cursor.js b/open-sse/executors/cursor.js index 6dfa9cc..18b9adf 100644 --- a/open-sse/executors/cursor.js +++ b/open-sse/executors/cursor.js @@ -1,16 +1,15 @@ import { BaseExecutor } from "./base.js"; -import { PROVIDERS, HTTP_STATUS } from "../config/constants.js"; +import { PROVIDERS } from "../config/providers.js"; +import { HTTP_STATUS } from "../config/runtimeConfig.js"; import { generateCursorBody, parseConnectRPCFrame, extractTextFromResponse } from "../utils/cursorProtobuf.js"; +import { buildCursorHeaders } from "../utils/cursorChecksum.js"; import { estimateUsage } from "../utils/usageTracking.js"; import { FORMATS } from "../translator/formats.js"; -import { buildCursorRequest } from "../translator/request/openai-to-cursor.js"; import { proxyAwareFetch } from "../utils/proxyFetch.js"; -import crypto from "crypto"; -import { v5 as uuidv5 } from "uuid"; import zlib from "zlib"; // Detect cloud environment @@ 
-37,18 +36,50 @@ const COMPRESS_FLAG = { GZIP_TRAILER: 0x03 }; +const CURSOR_STREAM_DEBUG = process.env.CURSOR_STREAM_DEBUG === "1"; +const debugLog = (...args) => { + if (CURSOR_STREAM_DEBUG) console.log(...args); +}; + function decompressPayload(payload, flags) { - // ConnectRPC trailer frame (flags & 0x02) - contains status JSON, not compressed data - if (flags & COMPRESS_FLAG.TRAILER) { - return payload; + // Check if payload is JSON error (starts with {"error") + if (payload.length > 10 && payload[0] === 0x7b && payload[1] === 0x22) { + try { + const text = payload.toString("utf-8"); + if (text.startsWith('{"error"')) { + debugLog(`[DECOMPRESS] Detected JSON error, skipping decompression`); + return payload; + } + } catch {} } - if (flags === COMPRESS_FLAG.GZIP) { + if ( + flags === COMPRESS_FLAG.GZIP || + flags === COMPRESS_FLAG.TRAILER || + flags === COMPRESS_FLAG.GZIP_TRAILER + ) { + // Primary: try gzip decompression (standard gzip header 0x1f 0x8b) try { return zlib.gunzipSync(payload); - } catch (err) { - console.log(`[DECOMPRESS ERROR] flags=${flags}, payloadSize=${payload.length}, error=${err.message}`); - return payload; + } catch (gzipErr) { + // Fallback: TRAILER and GZIP_TRAILER frames sometimes use raw zlib deflate format + try { + return zlib.inflateSync(payload); + } catch (deflateErr) { + // Last resort: try raw deflate (no zlib header) + try { + return zlib.inflateRawSync(payload); + } catch (rawErr) { + debugLog( + `[DECOMPRESS ERROR] flags=${flags}, payloadSize=${payload.length}, gzip=${gzipErr.message}, deflate=${deflateErr.message}, raw=${rawErr.message}` + ); + debugLog( + `[DECOMPRESS ERROR] First 50 bytes (hex):`, + payload.slice(0, 50).toString("hex") + ); + return payload; + } + } } } return payload; @@ -83,46 +114,6 @@ export class CursorExecutor extends BaseExecutor { return `${this.config.baseUrl}${this.config.chatPath}`; } - // Jyh cipher checksum for Cursor API authentication - generateChecksum(machineId) { - const timestamp = 
Math.floor(Date.now() / 1000000); - const byteArray = new Uint8Array([ - (timestamp >> 40) & 0xFF, - (timestamp >> 32) & 0xFF, - (timestamp >> 24) & 0xFF, - (timestamp >> 16) & 0xFF, - (timestamp >> 8) & 0xFF, - timestamp & 0xFF - ]); - - let t = 165; - for (let i = 0; i < byteArray.length; i++) { - byteArray[i] = ((byteArray[i] ^ t) + (i % 256)) & 0xFF; - t = byteArray[i]; - } - - const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"; - let encoded = ""; - - for (let i = 0; i < byteArray.length; i += 3) { - const a = byteArray[i]; - const b = i + 1 < byteArray.length ? byteArray[i + 1] : 0; - const c = i + 2 < byteArray.length ? byteArray[i + 2] : 0; - - encoded += alphabet[a >> 2]; - encoded += alphabet[((a & 3) << 4) | (b >> 4)]; - - if (i + 1 < byteArray.length) { - encoded += alphabet[((b & 15) << 2) | (c >> 6)]; - } - if (i + 2 < byteArray.length) { - encoded += alphabet[c & 63]; - } - } - - return `${encoded}${machineId}`; - } - buildHeaders(credentials) { const accessToken = credentials.accessToken; const machineId = credentials.providerSpecificData?.machineId; @@ -132,34 +123,14 @@ export class CursorExecutor extends BaseExecutor { throw new Error("Machine ID is required for Cursor API"); } - const cleanToken = accessToken.includes("::") ? accessToken.split("::")[1] : accessToken; - - return { - "authorization": `Bearer ${cleanToken}`, - "connect-accept-encoding": "gzip", - "connect-protocol-version": "1", - "content-type": "application/connect+proto", - "user-agent": "connect-es/1.6.1", - "x-amzn-trace-id": `Root=${crypto.randomUUID()}`, - "x-client-key": crypto.createHash("sha256").update(cleanToken).digest("hex"), - "x-cursor-checksum": this.generateChecksum(machineId), - "x-cursor-client-version": "2.3.41", - "x-cursor-client-type": "ide", - "x-cursor-client-os": process.platform === "win32" ? "windows" : process.platform === "darwin" ? "macos" : "linux", - "x-cursor-client-arch": process.arch === "arm64" ? 
"aarch64" : "x64", - "x-cursor-client-device-type": "desktop", - "x-cursor-config-version": crypto.randomUUID(), - "x-cursor-timezone": Intl.DateTimeFormat().resolvedOptions().timeZone || "UTC", - "x-ghost-mode": ghostMode ? "true" : "false", - "x-request-id": crypto.randomUUID(), - "x-session-id": uuidv5(cleanToken, uuidv5.DNS), - }; + return buildCursorHeaders(accessToken, machineId, ghostMode); } transformRequest(model, body, stream, credentials) { - const translatedBody = buildCursorRequest(model, body, stream, credentials); - const messages = translatedBody.messages || []; - const tools = translatedBody.tools || body.tools || []; + // Messages are already translated by chatCore (claude→openai→cursor) + // Do NOT call buildCursorRequest again — double-translation drops tool_results + const messages = body.messages || []; + const tools = body.tools || []; const reasoningEffort = body.reasoning_effort || null; return generateCursorBody(messages, model, tools, reasoningEffort); } @@ -184,13 +155,30 @@ export class CursorExecutor extends BaseExecutor { throw new Error("http2 module not available"); } + const HTTP2_TIMEOUT_MS = 60000; // 60s max — prevent hung sessions + return new Promise((resolve, reject) => { const urlObj = new URL(url); const client = http2.connect(`https://${urlObj.host}`); const chunks = []; let responseHeaders = {}; + let settled = false; - client.on("error", reject); + // Ensure client is always closed on settle + const finish = (fn) => (...args) => { + if (settled) return; + settled = true; + clearTimeout(hangTimeout); + client.close(); + fn(...args); + }; + + // Hard timeout: close session if server never responds + const hangTimeout = setTimeout(finish(() => { + reject(new Error("HTTP/2 request timed out")); + }), HTTP2_TIMEOUT_MS); + + client.on("error", finish(reject)); const req = client.request({ ":method": "POST", @@ -202,25 +190,18 @@ export class CursorExecutor extends BaseExecutor { req.on("response", (hdrs) => { responseHeaders = 
hdrs; }); req.on("data", (chunk) => { chunks.push(chunk); }); - req.on("end", () => { - client.close(); + req.on("end", finish(() => { resolve({ status: responseHeaders[":status"], headers: responseHeaders, body: Buffer.concat(chunks) }); - }); - req.on("error", (err) => { - client.close(); - reject(err); - }); + })); + req.on("error", finish(reject)); if (signal) { - signal.addEventListener("abort", () => { - req.close(); - client.close(); - reject(new Error("Request aborted")); - }); + const onAbort = finish(() => reject(new Error("Request aborted"))); + signal.addEventListener("abort", onAbort, { once: true }); } req.write(body); @@ -282,53 +263,87 @@ export class CursorExecutor extends BaseExecutor { let totalContent = ""; const toolCalls = []; const toolCallsMap = new Map(); // Track streaming tool calls by ID + const finalizedIds = new Set(); let frameCount = 0; + debugLog(`[CURSOR BUFFER] Total length: ${buffer.length} bytes`); + while (offset < buffer.length) { - if (offset + 5 > buffer.length) break; + if (offset + 5 > buffer.length) { + debugLog( + `[CURSOR BUFFER] Reached end, offset=${offset}, remaining=${buffer.length - offset}` + ); + break; + } const flags = buffer[offset]; const length = buffer.readUInt32BE(offset + 1); - if (offset + 5 + length > buffer.length) break; + debugLog( + `[CURSOR BUFFER] Frame ${frameCount + 1}: flags=0x${flags.toString(16).padStart(2, "0")}, length=${length}` + ); + + if (offset + 5 + length > buffer.length) { + debugLog( + `[CURSOR BUFFER] Incomplete frame, offset=${offset}, length=${length}, buffer.length=${buffer.length}` + ); + break; + } let payload = buffer.slice(offset + 5, offset + 5 + length); offset += 5 + length; frameCount++; - // Stop at ConnectRPC trailer frame (end of response, anything after is a separate response) - if (flags & COMPRESS_FLAG.TRAILER) { - break; + payload = decompressPayload(payload, flags); + if (!payload) { + debugLog(`[CURSOR BUFFER] Frame ${frameCount}: decompression failed, 
skipping`); + continue; } - payload = decompressPayload(payload, flags); - if (!payload) continue; - - try { - const text = payload.toString("utf-8"); - if (text.startsWith("{") && text.includes('"error"')) { - return createErrorResponse(JSON.parse(text)); - } - } catch {} + // Check for JSON error frames (byte guard: skip toString on non-JSON frames) + if (payload.length > 0 && payload[0] === 0x7b) { + try { + const text = payload.toString("utf-8"); + if (text.includes('"error"')) { + const hasContent = totalContent || toolCallsMap.size > 0; + debugLog( + `[CURSOR BUFFER] Error frame (hasContent=${hasContent}): ${text.slice(0, 500)}` + ); + if (hasContent) { + break; + } + return createErrorResponse(JSON.parse(text)); + } + } catch {} + } const result = extractTextFromResponse(new Uint8Array(payload)); + debugLog(`[CURSOR DECODED] Frame ${frameCount}:`, result); if (result.error) { - return new Response(JSON.stringify({ - error: { - message: result.error, - type: "rate_limit_error", - code: "rate_limited" + const hasContent = totalContent || toolCallsMap.size > 0; + debugLog(`[CURSOR BUFFER] Decoded error (hasContent=${hasContent}): ${result.error}`); + if (hasContent) { + break; + } + return new Response( + JSON.stringify({ + error: { + message: result.error, + type: "rate_limit_error", + code: "rate_limited" + } + }), + { + status: HTTP_STATUS.RATE_LIMITED, + headers: { "Content-Type": "application/json" } } - }), { - status: HTTP_STATUS.RATE_LIMITED, - headers: { "Content-Type": "application/json" } - }); + ); } if (result.toolCall) { const tc = result.toolCall; - + if (toolCallsMap.has(tc.id)) { // Accumulate arguments for existing tool call const existing = toolCallsMap.get(tc.id); @@ -338,10 +353,11 @@ export class CursorExecutor extends BaseExecutor { // New tool call toolCallsMap.set(tc.id, { ...tc }); } - + // Push to final array when isLast is true if (tc.isLast) { const finalToolCall = toolCallsMap.get(tc.id); + finalizedIds.add(tc.id); toolCalls.push({ 
id: finalToolCall.id, type: finalToolCall.type, @@ -352,14 +368,19 @@ export class CursorExecutor extends BaseExecutor { }); } } - + if (result.text) totalContent += result.text; } + debugLog( + `[CURSOR BUFFER] Parsed ${frameCount} frames, toolCallsMap size: ${toolCallsMap.size}, finalized toolCalls: ${toolCalls.length}` + ); + // Finalize all remaining tool calls in map (in case stream ended without isLast=true) for (const [id, tc] of toolCallsMap.entries()) { // Check if already in final array - if (!toolCalls.find(t => t.id === id)) { + if (!finalizedIds.has(id)) { + debugLog(`[CURSOR BUFFER] Finalizing incomplete tool call: ${id}, isLast=${tc.isLast}`); toolCalls.push({ id: tc.id, type: tc.type, @@ -371,6 +392,8 @@ export class CursorExecutor extends BaseExecutor { } } + debugLog(`[CURSOR BUFFER] Final toolCalls count: ${toolCalls.length}`); + const message = { role: "assistant", @@ -411,176 +434,294 @@ export class CursorExecutor extends BaseExecutor { let totalContent = ""; const toolCalls = []; const toolCallsMap = new Map(); // Track streaming tool calls by ID + const finalizedIds = new Set(); + const emittedToolCallIds = new Set(); let frameCount = 0; + debugLog(`[CURSOR BUFFER SSE] Total length: ${buffer.length} bytes`); + while (offset < buffer.length) { - if (offset + 5 > buffer.length) break; + if (offset + 5 > buffer.length) { + debugLog( + `[CURSOR BUFFER SSE] Reached end, offset=${offset}, remaining=${buffer.length - offset}` + ); + break; + } const flags = buffer[offset]; const length = buffer.readUInt32BE(offset + 1); - if (offset + 5 + length > buffer.length) break; + debugLog( + `[CURSOR BUFFER SSE] Frame ${frameCount + 1}: flags=0x${flags.toString(16).padStart(2, "0")}, length=${length}` + ); + + if (offset + 5 + length > buffer.length) { + debugLog( + `[CURSOR BUFFER SSE] Incomplete frame, offset=${offset}, length=${length}, buffer.length=${buffer.length}` + ); + break; + } let payload = buffer.slice(offset + 5, offset + 5 + length); offset 
+= 5 + length; frameCount++; - // Stop at ConnectRPC trailer frame (end of response, anything after is a separate response) - if (flags & COMPRESS_FLAG.TRAILER) { - break; + payload = decompressPayload(payload, flags); + if (!payload) { + debugLog(`[CURSOR BUFFER SSE] Frame ${frameCount}: decompression failed, skipping`); + continue; } - payload = decompressPayload(payload, flags); - if (!payload) continue; - - try { - const text = payload.toString("utf-8"); - if (text.startsWith("{") && text.includes('"error"')) { - return createErrorResponse(JSON.parse(text)); - } - } catch {} + // Check for JSON error frames (byte-guard: only decode if starts with '{') + if (payload[0] === 0x7b) { + try { + const text = payload.toString("utf-8"); + if (text.includes('"error"')) { + const hasContent = chunks.length > 0 || totalContent || toolCallsMap.size > 0; + debugLog( + `[CURSOR BUFFER SSE] Error frame (hasContent=${hasContent}): ${text.slice(0, 500)}` + ); + if (hasContent) { + break; + } + return createErrorResponse(JSON.parse(text)); + } + } catch {} + } const result = extractTextFromResponse(new Uint8Array(payload)); + debugLog(`[CURSOR DECODED SSE] Frame ${frameCount}:`, result); if (result.error) { - return new Response(JSON.stringify({ - error: { - message: result.error, - type: "rate_limit_error", - code: "rate_limited" + const hasContent = chunks.length > 0 || totalContent || toolCallsMap.size > 0; + debugLog(`[CURSOR BUFFER SSE] Decoded error (hasContent=${hasContent}): ${result.error}`); + if (hasContent) { + break; + } + return new Response( + JSON.stringify({ + error: { + message: result.error, + type: "rate_limit_error", + code: "rate_limited" + } + }), + { + status: HTTP_STATUS.RATE_LIMITED, + headers: { "Content-Type": "application/json" } } - }), { - status: HTTP_STATUS.RATE_LIMITED, - headers: { "Content-Type": "application/json" } - }); + ); } if (result.toolCall) { const tc = result.toolCall; - + if (chunks.length === 0) { - chunks.push(`data: 
${JSON.stringify({ - id: responseId, - object: "chat.completion.chunk", - created, - model, - choices: [{ - index: 0, - delta: { role: "assistant", content: "" }, - finish_reason: null - }] - })}\n\n`); + chunks.push( + `data: ${JSON.stringify({ + id: responseId, + object: "chat.completion.chunk", + created, + model, + choices: [ + { + index: 0, + delta: { role: "assistant", content: "" }, + finish_reason: null + } + ] + })}\n\n` + ); } - + if (toolCallsMap.has(tc.id)) { // Accumulate arguments for existing tool call const existing = toolCallsMap.get(tc.id); const oldArgsLen = existing.function.arguments.length; existing.function.arguments += tc.function.arguments; existing.isLast = tc.isLast; - + // Stream the delta arguments if (tc.function.arguments) { - chunks.push(`data: ${JSON.stringify({ - id: responseId, - object: "chat.completion.chunk", - created, - model, - choices: [{ - index: 0, - delta: { - tool_calls: [{ - index: existing.index, - id: tc.id, - type: "function", - function: { - name: tc.function.name, - arguments: tc.function.arguments - } - }] - }, - finish_reason: null - }] - })}\n\n`); + emittedToolCallIds.add(tc.id); + chunks.push( + `data: ${JSON.stringify({ + id: responseId, + object: "chat.completion.chunk", + created, + model, + choices: [ + { + index: 0, + delta: { + tool_calls: [ + { + index: existing.index, + id: tc.id, + type: "function", + function: { + name: tc.function.name, + arguments: tc.function.arguments + } + } + ] + }, + finish_reason: null + } + ] + })}\n\n` + ); } } else { // New tool call - assign index and add to map const toolCallIndex = toolCalls.length; + finalizedIds.add(tc.id); toolCalls.push({ ...tc, index: toolCallIndex }); toolCallsMap.set(tc.id, { ...tc, index: toolCallIndex }); - + // Stream initial tool call with name - chunks.push(`data: ${JSON.stringify({ - id: responseId, - object: "chat.completion.chunk", - created, - model, - choices: [{ - index: 0, - delta: { - tool_calls: [{ - index: toolCallIndex, - id: 
tc.id, - type: "function", - function: { - name: tc.function.name, - arguments: tc.function.arguments - } - }] - }, - finish_reason: null - }] - })}\n\n`); + emittedToolCallIds.add(tc.id); + chunks.push( + `data: ${JSON.stringify({ + id: responseId, + object: "chat.completion.chunk", + created, + model, + choices: [ + { + index: 0, + delta: { + tool_calls: [ + { + index: toolCallIndex, + id: tc.id, + type: "function", + function: { + name: tc.function.name, + arguments: tc.function.arguments + } + } + ] + }, + finish_reason: null + } + ] + })}\n\n` + ); } } if (result.text) { totalContent += result.text; - chunks.push(`data: ${JSON.stringify({ + chunks.push( + `data: ${JSON.stringify({ + id: responseId, + object: "chat.completion.chunk", + created, + model, + choices: [ + { + index: 0, + delta: + chunks.length === 0 && toolCalls.length === 0 + ? { role: "assistant", content: result.text } + : { content: result.text }, + finish_reason: null + } + ] + })}\n\n` + ); + } + } + + debugLog( + `[CURSOR BUFFER SSE] Parsed ${frameCount} frames, toolCallsMap size: ${toolCallsMap.size}, toolCalls array: ${toolCalls.length}` + ); + + // Finalize all remaining tool calls in map (stream may have ended without isLast=true) + for (const [id, tc] of toolCallsMap.entries()) { + if (!finalizedIds.has(id)) { + debugLog(`[CURSOR BUFFER SSE] Finalizing incomplete tool call: ${id}, isLast=${tc.isLast}`); + const toolCallIndex = toolCalls.length; + toolCalls.push({ + id: tc.id, + type: tc.type, + index: toolCallIndex, + function: { + name: tc.function.name, + arguments: tc.function.arguments + } + }); + + // Emit SSE chunk for the finalized tool call if not already emitted + if (!emittedToolCallIds.has(tc.id)) { + chunks.push( + `data: ${JSON.stringify({ + id: responseId, + object: "chat.completion.chunk", + created, + model, + choices: [ + { + index: 0, + delta: { + tool_calls: [ + { + index: toolCallIndex, + id: tc.id, + type: "function", + function: { + name: tc.function.name, + 
arguments: tc.function.arguments + } + } + ] + }, + finish_reason: null + } + ] + })}\n\n` + ); + } + } + } + + if (chunks.length === 0 && toolCalls.length === 0) { + chunks.push( + `data: ${JSON.stringify({ id: responseId, object: "chat.completion.chunk", created, model, - choices: [{ - index: 0, - delta: chunks.length === 0 && toolCalls.length === 0 - ? { role: "assistant", content: result.text } - : { content: result.text }, - finish_reason: null - }] - })}\n\n`); - } - } - - - if (chunks.length === 0 && toolCalls.length === 0) { - chunks.push(`data: ${JSON.stringify({ - id: responseId, - object: "chat.completion.chunk", - created, - model, - choices: [{ - index: 0, - delta: { role: "assistant", content: "" }, - finish_reason: null - }] - })}\n\n`); + choices: [ + { + index: 0, + delta: { role: "assistant", content: "" }, + finish_reason: null + } + ] + })}\n\n` + ); } const usage = estimateUsage(body, totalContent.length, FORMATS.OPENAI); - chunks.push(`data: ${JSON.stringify({ - id: responseId, - object: "chat.completion.chunk", - created, - model, - choices: [{ - index: 0, - delta: {}, - finish_reason: toolCalls.length > 0 ? "tool_calls" : "stop" - }], - usage - })}\n\n`); + chunks.push( + `data: ${JSON.stringify({ + id: responseId, + object: "chat.completion.chunk", + created, + model, + choices: [ + { + index: 0, + delta: {}, + finish_reason: toolCalls.length > 0 ? 
"tool_calls" : "stop" + } + ], + usage + })}\n\n` + ); chunks.push("data: [DONE]\n\n"); return new Response(chunks.join(""), { diff --git a/open-sse/executors/default.js b/open-sse/executors/default.js index eba3c61..eb3e19d 100644 --- a/open-sse/executors/default.js +++ b/open-sse/executors/default.js @@ -1,5 +1,6 @@ import { BaseExecutor } from "./base.js"; -import { PROVIDERS, OAUTH_ENDPOINTS } from "../config/constants.js"; +import { PROVIDERS } from "../config/providers.js"; +import { OAUTH_ENDPOINTS, buildKimiHeaders } from "../config/appConstants.js"; import { buildClineHeaders } from "../../src/shared/utils/clineAuth.js"; export class DefaultExecutor extends BaseExecutor { @@ -23,10 +24,11 @@ export class DefaultExecutor extends BaseExecutor { case "claude": case "glm": case "kimi": - case "kimi-coding": case "minimax": case "minimax-cn": return `${this.config.baseUrl}?beta=true`; + case "kimi-coding": + return `${this.config.baseUrl}?beta=true`; case "gemini": return `${this.config.baseUrl}/${model}:${stream ? 
"streamGenerateContent?alt=sse" : "generateContent"}`; default: @@ -46,11 +48,14 @@ export class DefaultExecutor extends BaseExecutor { break; case "glm": case "kimi": - case "kimi-coding": case "minimax": case "minimax-cn": headers["x-api-key"] = credentials.apiKey || credentials.accessToken; break; + case "kimi-coding": + headers["Authorization"] = `Bearer ${credentials.accessToken}`; + Object.assign(headers, buildKimiHeaders()); + break; default: if (this.provider?.startsWith?.("anthropic-compatible-")) { if (credentials.apiKey) { @@ -184,9 +189,14 @@ export class DefaultExecutor extends BaseExecutor { } async refreshKimiCoding(refreshToken) { + const kimiHeaders = buildKimiHeaders(); const response = await fetch("https://auth.kimi.com/api/oauth/token", { method: "POST", - headers: { "Content-Type": "application/x-www-form-urlencoded", "Accept": "application/json" }, + headers: { + "Content-Type": "application/x-www-form-urlencoded", + "Accept": "application/json", + ...kimiHeaders + }, body: new URLSearchParams({ grant_type: "refresh_token", refresh_token: refreshToken, client_id: "17e5f671-d194-4dfb-9706-5516cb48c098" }) }); if (!response.ok) return null; diff --git a/open-sse/executors/gemini-cli.js b/open-sse/executors/gemini-cli.js index e6fe3d3..45470c8 100644 --- a/open-sse/executors/gemini-cli.js +++ b/open-sse/executors/gemini-cli.js @@ -1,5 +1,6 @@ import { BaseExecutor } from "./base.js"; -import { PROVIDERS, OAUTH_ENDPOINTS, GEMINI_CLI_API_CLIENT, geminiCLIUserAgent } from "../config/constants.js"; +import { PROVIDERS } from "../config/providers.js"; +import { OAUTH_ENDPOINTS, GEMINI_CLI_API_CLIENT, geminiCLIUserAgent } from "../config/appConstants.js"; export class GeminiCLIExecutor extends BaseExecutor { constructor() { diff --git a/open-sse/executors/github.js b/open-sse/executors/github.js index c38dec8..b174007 100644 --- a/open-sse/executors/github.js +++ b/open-sse/executors/github.js @@ -1,5 +1,7 @@ import { BaseExecutor } from "./base.js"; 
-import { PROVIDERS, OAUTH_ENDPOINTS, HTTP_STATUS, GITHUB_COPILOT } from "../config/constants.js"; +import { PROVIDERS } from "../config/providers.js"; +import { OAUTH_ENDPOINTS, GITHUB_COPILOT } from "../config/appConstants.js"; +import { HTTP_STATUS } from "../config/runtimeConfig.js"; import { openaiToOpenAIResponsesRequest } from "../translator/request/openai-responses.js"; import { openaiResponsesToOpenAIResponse } from "../translator/response/openai-responses.js"; import { initState } from "../translator/index.js"; @@ -42,6 +44,36 @@ export class GithubExecutor extends BaseExecutor { if (!body?.messages) return body; const sanitized = { ...body }; + + // Handle response_format for Claude models via GitHub + // GitHub's internal translation doesn't respect response_format, so we inject it as a system prompt + // AND prepend a reminder to the last user message for maximum effectiveness + if (body.response_format && body.model?.includes('claude')) { + const responseFormat = body.response_format; + let systemInstruction = ''; + if (responseFormat.type === 'json_schema' && responseFormat.json_schema?.schema) { + systemInstruction = 'CRITICAL: You must ONLY output raw JSON. Never use markdown code blocks. Never use backticks. Never wrap JSON in triple backticks. Output ONLY the raw JSON object.'; + } else if (responseFormat.type === 'json_object') { + systemInstruction = 'CRITICAL: You must ONLY output raw JSON. Never use markdown code blocks. Never use backticks.'; + } + if (systemInstruction) { + // Add to system message + const systemIdx = body.messages.findIndex(m => m.role === 'system'); + if (systemIdx >= 0) { + body.messages[systemIdx].content = systemInstruction + '\n\n' + body.messages[systemIdx].content; + } else { + body.messages.unshift({ role: 'system', content: systemInstruction }); + } + + // Also prepend to the last user message as a reminder + const lastUserIdx = body.messages.map((m, i) => m.role === 'user' ? 
i : -1).filter(i => i >= 0).pop(); + if (lastUserIdx >= 0) { + const userMsg = body.messages[lastUserIdx]; + const userContent = typeof userMsg.content === 'string' ? userMsg.content : JSON.stringify(userMsg.content); + userMsg.content = 'Respond with ONLY raw JSON (no markdown, no backticks, no code blocks): ' + userContent; + } + } + } sanitized.messages = body.messages.map(msg => { // assistant messages with only tool_calls have content: null — leave as-is if (!msg.content) return msg; @@ -141,7 +173,7 @@ export class GithubExecutor extends BaseExecutor { const parsed = parseSSELine(trimmed); if (!parsed) continue; - if (parsed.done) { + if (parsed.done && stream === true) { controller.enqueue(new TextEncoder().encode("data: [DONE]\n\n")); continue; } @@ -166,6 +198,9 @@ export class GithubExecutor extends BaseExecutor { } }); + if (!response.body) { + return { response: new Response("", { status: response.status, headers: response.headers }), url, headers, transformedBody }; + } const convertedStream = response.body.pipeThrough(transformStream); return { diff --git a/open-sse/executors/iflow.js b/open-sse/executors/iflow.js index f18753b..701c8c9 100644 --- a/open-sse/executors/iflow.js +++ b/open-sse/executors/iflow.js @@ -1,6 +1,6 @@ import crypto from "crypto"; import { BaseExecutor } from "./base.js"; -import { PROVIDERS } from "../config/constants.js"; +import { PROVIDERS } from "../config/providers.js"; /** * IFlowExecutor - Executor for iFlow API with HMAC-SHA256 signature diff --git a/open-sse/executors/index.js b/open-sse/executors/index.js index 49a4c27..ac76eb7 100644 --- a/open-sse/executors/index.js +++ b/open-sse/executors/index.js @@ -5,6 +5,7 @@ import { IFlowExecutor } from "./iflow.js"; import { KiroExecutor } from "./kiro.js"; import { CodexExecutor } from "./codex.js"; import { CursorExecutor } from "./cursor.js"; +import { VertexExecutor } from "./vertex.js"; import { DefaultExecutor } from "./default.js"; const executors = { @@ -15,7 +16,9 
@@ const executors = { kiro: new KiroExecutor(), codex: new CodexExecutor(), cursor: new CursorExecutor(), - cu: new CursorExecutor() // Alias for cursor + cu: new CursorExecutor(), // Alias for cursor + vertex: new VertexExecutor("vertex"), + "vertex-partner": new VertexExecutor("vertex-partner"), }; const defaultCache = new Map(); @@ -38,4 +41,5 @@ export { IFlowExecutor } from "./iflow.js"; export { KiroExecutor } from "./kiro.js"; export { CodexExecutor } from "./codex.js"; export { CursorExecutor } from "./cursor.js"; +export { VertexExecutor } from "./vertex.js"; export { DefaultExecutor } from "./default.js"; diff --git a/open-sse/executors/kiro.js b/open-sse/executors/kiro.js index 9ad2ed9..cc9ebdc 100644 --- a/open-sse/executors/kiro.js +++ b/open-sse/executors/kiro.js @@ -1,5 +1,5 @@ import { BaseExecutor } from "./base.js"; -import { PROVIDERS } from "../config/constants.js"; +import { PROVIDERS } from "../config/providers.js"; import { v4 as uuidv4 } from "uuid"; import { refreshKiroToken } from "../services/tokenRefresh.js"; import { proxyAwareFetch } from "../utils/proxyFetch.js"; @@ -345,6 +345,9 @@ export class KiroExecutor extends BaseExecutor { }); // Pipe response body through transform stream + if (!response.body) { + return new Response("data: [DONE]\n\n", { status: response.status, headers: { "Content-Type": "text/event-stream" } }); + } const transformedStream = response.body.pipeThrough(transformStream); return new Response(transformedStream, { diff --git a/open-sse/executors/vertex.js b/open-sse/executors/vertex.js new file mode 100644 index 0000000..8928faf --- /dev/null +++ b/open-sse/executors/vertex.js @@ -0,0 +1,120 @@ +import { BaseExecutor } from "./base.js"; +import { PROVIDERS } from "../config/providers.js"; +import { parseVertexSaJson, refreshVertexToken } from "../services/tokenRefresh.js"; +import { proxyAwareFetch } from "../utils/proxyFetch.js"; + +// Cache project IDs resolved from raw API keys { apiKey → projectId } +const 
projectIdCache = new Map(); + +/** + * Resolve GCP project ID from a raw Vertex API key. + * Sends a probe request for a nonexistent model and parses "projects/{id}" from the error message. + */ +async function resolveProjectId(apiKey) { + if (projectIdCache.has(apiKey)) return projectIdCache.get(apiKey); + + const res = await fetch( + `https://aiplatform.googleapis.com/v1/publishers/google/models/__probe__:generateContent?key=${apiKey}`, + { method: "POST", headers: { "Content-Type": "application/json" }, body: "{}" } + ); + const json = await res.json().catch(() => null); + const msg = json?.[0]?.error?.message || json?.error?.message || ""; + const match = msg.match(/projects\/([^/]+)\//); + const projectId = match?.[1] || null; + + if (projectId) projectIdCache.set(apiKey, projectId); + return projectId; +} + +/** + * VertexExecutor - Google Cloud Vertex AI + * + * "vertex" → Gemini models via global Vertex publishers endpoint + * "vertex-partner" → Partner models (Llama, Mistral, GLM, DeepSeek, Qwen) + * via global OpenAI-compatible endpoint + * + * Auth: SA JSON (stored as apiKey) → JWT assertion → Bearer token (via jose) + * Token is minted/cached in tokenRefresh.js, not here. + */ +export class VertexExecutor extends BaseExecutor { + constructor(providerId = "vertex") { + super(providerId, PROVIDERS[providerId] || {}); + } + + buildUrl(model, stream, urlIndex = 0, credentials = null) { + const saJson = parseVertexSaJson(credentials?.apiKey); + const rawKey = !saJson ? credentials?.apiKey : null; + const projectId = saJson?.project_id || credentials?.providerSpecificData?.projectId; + + if (this.provider === "vertex-partner") { + // Partner models require project_id in path regardless of auth method + if (!projectId) throw new Error("Vertex partner models require a project_id. 
Add it in providerSpecificData or use Service Account JSON."); + const url = `https://aiplatform.googleapis.com/v1/projects/${projectId}/locations/global/endpoints/openapi/chat/completions`; + return rawKey ? `${url}?key=${rawKey}` : url; + } + + // Gemini on Vertex: always use global publishers endpoint + const action = stream ? "streamGenerateContent" : "generateContent"; + let url = `https://aiplatform.googleapis.com/v1/publishers/google/models/${model}:${action}`; + + if (rawKey) url += `?key=${rawKey}`; + return url; + } + + buildHeaders(credentials, stream = true) { + const headers = { "Content-Type": "application/json" }; + + // Only set Bearer token if using SA JSON flow (raw key goes in URL ?key=) + if (credentials.accessToken) { + headers["Authorization"] = `Bearer ${credentials.accessToken}`; + } + + if (stream) headers["Accept"] = "text/event-stream"; + + return headers; + } + + async refreshCredentials(credentials, log) { + const saJson = parseVertexSaJson(credentials?.apiKey); + if (!saJson) return null; + + const result = await refreshVertexToken(saJson, log); + if (!result) return null; + + return { accessToken: result.accessToken, expiresAt: result.expiresAt }; + } + + async execute({ model, body, stream, credentials, signal, log, proxyOptions = null }) { + const saJson = parseVertexSaJson(credentials?.apiKey); + + // SA JSON flow: mint Bearer token (cached) + if (saJson) { + const result = await refreshVertexToken(saJson, log); + if (!result?.accessToken) throw new Error("Vertex: failed to mint access token from Service Account JSON"); + credentials.accessToken = result.accessToken; + } + + // vertex-partner with raw key: auto-resolve project_id if not provided + if (this.provider === "vertex-partner" && !saJson && !credentials?.providerSpecificData?.projectId) { + const projectId = await resolveProjectId(credentials.apiKey); + if (!projectId) throw new Error("Vertex: could not resolve project_id from API key. 
Please add it manually in provider settings."); + log?.debug?.("VERTEX", `Resolved project_id: ${projectId}`); + credentials.providerSpecificData = { ...credentials.providerSpecificData, projectId }; + } + + const url = this.buildUrl(model, stream, 0, credentials); + const headers = this.buildHeaders(credentials, stream); + const transformedBody = this.transformRequest(model, body, stream, credentials); + + const response = await proxyAwareFetch(url, { + method: "POST", + headers, + body: JSON.stringify(transformedBody), + signal, + }, proxyOptions); + + return { response, url, headers, transformedBody }; + } +} + +export default VertexExecutor; diff --git a/open-sse/handlers/chatCore.js b/open-sse/handlers/chatCore.js index 92c89c6..b61ed42 100644 --- a/open-sse/handlers/chatCore.js +++ b/open-sse/handlers/chatCore.js @@ -7,7 +7,7 @@ import { refreshWithRetry } from "../services/tokenRefresh.js"; import { createRequestLogger } from "../utils/requestLogger.js"; import { getModelTargetFormat, PROVIDER_ID_TO_ALIAS } from "../config/providerModels.js"; import { createErrorResult, parseUpstreamError, formatProviderError } from "../utils/error.js"; -import { HTTP_STATUS } from "../config/constants.js"; +import { HTTP_STATUS } from "../config/runtimeConfig.js"; import { handleBypassRequest } from "../utils/bypassHandler.js"; import { trackPendingRequest, appendRequestLog, saveRequestDetail } from "@/lib/usageDb.js"; import { getExecutor } from "../executors/index.js"; @@ -23,14 +23,14 @@ import { handleStreamingResponse, buildOnStreamComplete } from "./chatCore/strea * @param {object} options.credentials - Provider credentials * @param {string} options.sourceFormatOverride - Override detected source format (e.g. 
"openai-responses") */ -export async function handleChatCore({ body, modelInfo, credentials, log, onCredentialsRefreshed, onRequestSuccess, onDisconnect, clientRawRequest, connectionId, userAgent, apiKey, sourceFormatOverride }) { +export async function handleChatCore({ body, modelInfo, credentials, log, onCredentialsRefreshed, onRequestSuccess, onDisconnect, clientRawRequest, connectionId, userAgent, apiKey, ccFilterNaming, sourceFormatOverride }) { const { provider, model } = modelInfo; const requestStartTime = Date.now(); const sourceFormat = sourceFormatOverride || detectFormat(body); - // Check for bypass patterns (warmup, skip) - const bypassResponse = handleBypassRequest(body, model, userAgent); + // Check for bypass patterns (warmup, skip, cc naming) + const bypassResponse = handleBypassRequest(body, model, userAgent, ccFilterNaming); if (bypassResponse) return bypassResponse; const alias = PROVIDER_ID_TO_ALIAS[provider] || provider; @@ -39,7 +39,16 @@ export async function handleChatCore({ body, modelInfo, credentials, log, onCred const clientRequestedStreaming = body.stream === true || sourceFormat === FORMATS.ANTIGRAVITY || sourceFormat === FORMATS.GEMINI || sourceFormat === FORMATS.GEMINI_CLI; const providerRequiresStreaming = provider === "openai" || provider === "codex"; - const stream = providerRequiresStreaming ? true : (body.stream !== false); + let stream = providerRequiresStreaming ? 
true : (body.stream !== false); + + // Check client Accept header preference for non-streaming requests + // This fixes AI SDK compatibility where clients send Accept: application/json + const acceptHeader = clientRawRequest?.headers?.accept || ""; + const clientPrefersJson = acceptHeader.includes("application/json"); + const clientPrefersSSE = acceptHeader.includes("text/event-stream"); + if (clientPrefersJson && !clientPrefersSSE && body.stream !== true) { + stream = false; + } const reqLogger = await createRequestLogger(sourceFormat, targetFormat, model); if (clientRawRequest) reqLogger.logClientRawRequest(clientRawRequest.endpoint, clientRawRequest.body, clientRawRequest.headers); @@ -47,6 +56,10 @@ export async function handleChatCore({ body, modelInfo, credentials, log, onCred log?.debug?.("FORMAT", `${sourceFormat} → ${targetFormat} | stream=${stream}`); let translatedBody = translateRequest(sourceFormat, targetFormat, model, body, stream, credentials, provider, reqLogger); + if (!translatedBody) { + trackPendingRequest(model, provider, connectionId, false, true); + return createErrorResult(HTTP_STATUS.BAD_REQUEST, `Failed to translate request for ${sourceFormat} → ${targetFormat}`); + } const toolNameMap = translatedBody._toolNameMap; delete translatedBody._toolNameMap; translatedBody.model = model; @@ -128,17 +141,23 @@ export async function handleChatCore({ body, modelInfo, credentials, log, onCred // Handle 401/403 - try token refresh if (providerResponse.status === HTTP_STATUS.UNAUTHORIZED || providerResponse.status === HTTP_STATUS.FORBIDDEN) { - const newCredentials = await refreshWithRetry(() => executor.refreshCredentials(credentials, log), 3, log); - if (newCredentials?.accessToken || newCredentials?.copilotToken) { - log?.info?.("TOKEN", `${provider.toUpperCase()} | refreshed`); - Object.assign(credentials, newCredentials); - if (onCredentialsRefreshed) await onCredentialsRefreshed(newCredentials); - try { - const retryResult = await 
executor.execute({ model, body: translatedBody, stream, credentials, signal: streamController.signal, log, proxyOptions }); - if (retryResult.response.ok) { providerResponse = retryResult.response; providerUrl = retryResult.url; } - } catch { log?.warn?.("TOKEN", `${provider.toUpperCase()} | retry after refresh failed`); } - } else { - log?.warn?.("TOKEN", `${provider.toUpperCase()} | refresh failed`); + try { + const newCredentials = await refreshWithRetry(() => executor.refreshCredentials(credentials, log), 3, log); + if (newCredentials?.accessToken || newCredentials?.copilotToken) { + log?.info?.("TOKEN", `${provider.toUpperCase()} | refreshed`); + Object.assign(credentials, newCredentials); + if (onCredentialsRefreshed) { + try { await onCredentialsRefreshed(newCredentials); } catch (e) { log?.warn?.("TOKEN", `onCredentialsRefreshed failed: ${e.message}`); } + } + try { + const retryResult = await executor.execute({ model, body: translatedBody, stream, credentials, signal: streamController.signal, log, proxyOptions }); + if (retryResult.response.ok) { providerResponse = retryResult.response; providerUrl = retryResult.url; } + } catch { log?.warn?.("TOKEN", `${provider.toUpperCase()} | retry after refresh failed`); } + } else { + log?.warn?.("TOKEN", `${provider.toUpperCase()} | refresh failed`); + } + } catch (e) { + log?.warn?.("TOKEN", `${provider.toUpperCase()} | refresh threw: ${e.message}`); } } @@ -173,12 +192,14 @@ export async function handleChatCore({ body, modelInfo, credentials, log, onCred // Provider forced streaming but client wants JSON if (!clientRequestedStreaming && providerRequiresStreaming) { const result = await handleForcedSSEToJson({ ...sharedCtx, providerResponse, sourceFormat, trackDone, appendLog }); - if (result) return result; + if (result) { streamController.handleComplete(); return result; } } // True non-streaming response if (!stream) { - return handleNonStreamingResponse({ ...sharedCtx, providerResponse, sourceFormat, 
targetFormat, reqLogger, trackDone, appendLog }); + const result = await handleNonStreamingResponse({ ...sharedCtx, providerResponse, sourceFormat, targetFormat, reqLogger, trackDone, appendLog }); + streamController.handleComplete(); + return result; } // Streaming response diff --git a/open-sse/handlers/chatCore/nonStreamingHandler.js b/open-sse/handlers/chatCore/nonStreamingHandler.js index 190def6..8b92852 100644 --- a/open-sse/handlers/chatCore/nonStreamingHandler.js +++ b/open-sse/handlers/chatCore/nonStreamingHandler.js @@ -1,8 +1,9 @@ import { FORMATS } from "../../translator/formats.js"; import { needsTranslation } from "../../translator/index.js"; +import { ollamaBodyToOpenAI } from "../../translator/response/ollama-to-openai.js"; import { addBufferToUsage, filterUsageForFormat } from "../../utils/usageTracking.js"; import { createErrorResult } from "../../utils/error.js"; -import { HTTP_STATUS } from "../../config/constants.js"; +import { HTTP_STATUS } from "../../config/runtimeConfig.js"; import { parseSSEToOpenAIResponse } from "./sseToJsonHandler.js"; import { buildRequestDetail, extractRequestConfig, extractUsageFromResponse, saveUsageStats } from "./requestDetail.js"; import { appendRequestLog, saveRequestDetail } from "@/lib/usageDb.js"; @@ -76,8 +77,12 @@ export function translateNonStreamingResponse(responseBody, targetFormat, source const toolCalls = []; for (const block of responseBody.content) { - if (block.type === "text") textContent += block.text; - else if (block.type === "thinking") thinkingContent += block.thinking || ""; + if (block.type === "text") { + // Strip markdown code block markers (e.g. kimi wraps JSON in ```json...```) + const raw = block.text ?? 
""; + const text = raw.replace(/^\s*```\s*json\s*\n?/i, "").replace(/\n?\s*```\s*$/i, ""); + textContent += text; + } else if (block.type === "thinking") thinkingContent += block.thinking || ""; else if (block.type === "tool_use") { toolCalls.push({ id: block.id, type: "function", function: { name: block.name, arguments: JSON.stringify(block.input || {}) } }); } @@ -111,6 +116,11 @@ export function translateNonStreamingResponse(responseBody, targetFormat, source return result; } + // Ollama + if (targetFormat === FORMATS.OLLAMA) { + return ollamaBodyToOpenAI(responseBody); + } + return responseBody; } diff --git a/open-sse/handlers/chatCore/sseToJsonHandler.js b/open-sse/handlers/chatCore/sseToJsonHandler.js index 0c71be3..8df3996 100644 --- a/open-sse/handlers/chatCore/sseToJsonHandler.js +++ b/open-sse/handlers/chatCore/sseToJsonHandler.js @@ -1,6 +1,6 @@ import { convertResponsesStreamToJson } from "../../transformer/streamToJsonConverter.js"; import { createErrorResult } from "../../utils/error.js"; -import { HTTP_STATUS } from "../../config/constants.js"; +import { HTTP_STATUS } from "../../config/runtimeConfig.js"; import { FORMATS } from "../../translator/formats.js"; import { buildRequestDetail, extractRequestConfig, saveUsageStats } from "./requestDetail.js"; import { saveRequestDetail, appendRequestLog } from "@/lib/usageDb.js"; diff --git a/open-sse/handlers/chatCore/streamingHandler.js b/open-sse/handlers/chatCore/streamingHandler.js index 415c9b8..03d12c1 100644 --- a/open-sse/handlers/chatCore/streamingHandler.js +++ b/open-sse/handlers/chatCore/streamingHandler.js @@ -76,6 +76,8 @@ export function buildOnStreamComplete({ provider, model, connectionId, apiKey, r ttft: ttftAt ? 
ttftAt - requestStartTime : Date.now() - requestStartTime, total: Date.now() - requestStartTime }; + const safeContent = contentObj?.content || "[Empty streaming response]"; + const safeThinking = contentObj?.thinking || null; saveRequestDetail(buildRequestDetail({ provider, model, connectionId, @@ -83,8 +85,8 @@ export function buildOnStreamComplete({ provider, model, connectionId, apiKey, r tokens: usage || { prompt_tokens: 0, completion_tokens: 0 }, request: extractRequestConfig(body, stream), providerRequest: finalBody || translatedBody || null, - providerResponse: contentObj.content || "[Empty streaming response]", - response: { content: contentObj.content || "[Empty streaming response]", thinking: contentObj.thinking || null, type: "streaming" }, + providerResponse: safeContent, + response: { content: safeContent, thinking: safeThinking, type: "streaming" }, status: "success" }, { id: streamDetailId })).catch(err => { console.error("[RequestDetail] Failed to update streaming content:", err.message); diff --git a/open-sse/handlers/embeddingsCore.js b/open-sse/handlers/embeddingsCore.js index 35fdade..b9eb636 100644 --- a/open-sse/handlers/embeddingsCore.js +++ b/open-sse/handlers/embeddingsCore.js @@ -1,6 +1,6 @@ import { getModelTargetFormat, PROVIDER_ID_TO_ALIAS } from "../config/providerModels.js"; import { createErrorResult, parseUpstreamError, formatProviderError } from "../utils/error.js"; -import { HTTP_STATUS } from "../config/constants.js"; +import { HTTP_STATUS } from "../config/runtimeConfig.js"; import { getExecutor } from "../executors/index.js"; import { refreshWithRetry } from "../services/tokenRefresh.js"; diff --git a/open-sse/index.js b/open-sse/index.js index 7e6fb33..dde009c 100644 --- a/open-sse/index.js +++ b/open-sse/index.js @@ -2,7 +2,9 @@ import "./utils/proxyFetch.js"; // Config -export { PROVIDERS, OAUTH_ENDPOINTS, CACHE_TTL, DEFAULT_MAX_TOKENS, CLAUDE_SYSTEM_PROMPT, COOLDOWN_MS, BACKOFF_CONFIG } from "./config/constants.js"; 
+export { PROVIDERS } from "./config/providers.js"; +export { OAUTH_ENDPOINTS, CLAUDE_SYSTEM_PROMPT } from "./config/appConstants.js"; +export { CACHE_TTL, DEFAULT_MAX_TOKENS, COOLDOWN_MS, BACKOFF_CONFIG } from "./config/runtimeConfig.js"; export { PROVIDER_MODELS, getProviderModels, diff --git a/open-sse/services/accountFallback.js b/open-sse/services/accountFallback.js index 0e1b2d1..573a5d6 100644 --- a/open-sse/services/accountFallback.js +++ b/open-sse/services/accountFallback.js @@ -1,4 +1,4 @@ -import { COOLDOWN_MS, BACKOFF_CONFIG, HTTP_STATUS } from "../config/constants.js"; +import { COOLDOWN_MS, BACKOFF_CONFIG, HTTP_STATUS } from "../config/runtimeConfig.js"; /** * Calculate exponential backoff cooldown for rate limits (429) diff --git a/open-sse/services/compact.js b/open-sse/services/compact.js index 2732d40..812cd27 100644 --- a/open-sse/services/compact.js +++ b/open-sse/services/compact.js @@ -38,8 +38,15 @@ export async function handleComboChat({ body, models, handleSingleModel, log }) const modelStr = models[i]; log.info("COMBO", `Trying model ${i + 1}/${models.length}: ${modelStr}`); - const result = await handleSingleModel(body, modelStr); - + let result; + try { + result = await handleSingleModel(body, modelStr); + } catch (e) { + lastError = `${modelStr}: ${e.message}`; + log.warn("COMBO", `Model threw exception, trying next`, { model: modelStr, error: e.message }); + continue; + } + // Success or client error - return response if (result.ok || result.status < 500) { return result; diff --git a/open-sse/services/model.js b/open-sse/services/model.js index 6b2f7f3..119d726 100644 --- a/open-sse/services/model.js +++ b/open-sse/services/model.js @@ -46,6 +46,10 @@ const ALIAS_TO_PROVIDER_ID = { ch: "chutes", chutes: "chutes", cursor: "cursor", + vx: "vertex", + vertex: "vertex", + vxp: "vertex-partner", + "vertex-partner": "vertex-partner", }; /** diff --git a/open-sse/services/projectId.js b/open-sse/services/projectId.js index 8bb8fde..b7aa300 
100644 --- a/open-sse/services/projectId.js +++ b/open-sse/services/projectId.js @@ -8,7 +8,7 @@ * This significantly reduces the risk of being flagged by Google's anti-abuse systems. */ -import {CLOUD_CODE_API, LOAD_CODE_ASSIST_HEADERS, LOAD_CODE_ASSIST_METADATA} from "../config/constants.js"; +import { CLOUD_CODE_API, LOAD_CODE_ASSIST_HEADERS, LOAD_CODE_ASSIST_METADATA } from "../config/appConstants.js"; // ─── Cache ──────────────────────────────────────────────────────────────────── // connectionId -> { projectId: string, fetchedAt: number } @@ -254,7 +254,9 @@ async function onboardUser(accessToken, tierID, externalSignal) { console.warn(`[ProjectId] onboardUser failed after ${MAX_ATTEMPTS} attempts: ${error.message}`); return null; } - throw error; + // Continue to next attempt instead of throwing (which would skip remaining retries) + console.warn(`[ProjectId] onboardUser attempt ${attempt} failed: ${error.message}, retrying...`); + await new Promise(resolve => setTimeout(resolve, 2000)); } finally { clearTimeout(timeoutId); externalSignal?.removeEventListener("abort", forwardAbort); diff --git a/open-sse/services/provider.js b/open-sse/services/provider.js index 9310136..286ed44 100644 --- a/open-sse/services/provider.js +++ b/open-sse/services/provider.js @@ -1,4 +1,4 @@ -import { PROVIDERS } from "../config/constants.js"; +import { PROVIDERS } from "../config/providers.js"; import { buildClineHeaders } from "../../src/shared/utils/clineAuth.js"; const OPENAI_COMPATIBLE_PREFIX = "openai-compatible-"; @@ -255,7 +255,7 @@ export function buildProviderHeaders(provider, credentials, stream = true, body } break; - case "github": + case "github": { // GitHub Copilot requires special headers to mimic VSCode // Prioritize copilotToken from providerSpecificData, fallback to accessToken const githubToken = credentials.copilotToken || credentials.accessToken; @@ -279,6 +279,7 @@ export function buildProviderHeaders(provider, credentials, stream = true, body 
headers["X-Initiator"] = "user"; headers["Accept"] = "application/json"; break; + } case "codex": case "qwen": @@ -297,6 +298,12 @@ export function buildProviderHeaders(provider, credentials, stream = true, body // Claude-compatible API providers use x-api-key headers["x-api-key"] = credentials.apiKey; break; + + case "vertex": + case "vertex-partner": + // Vertex uses async token minting — headers are set by VertexExecutor._buildHeadersAsync() + // Do NOT set Authorization here; it would leak the raw SA JSON as Bearer token + break; default: headers["Authorization"] = `Bearer ${credentials.apiKey || credentials.accessToken}`; diff --git a/open-sse/services/tokenRefresh.js b/open-sse/services/tokenRefresh.js index 1d4f628..60423f7 100644 --- a/open-sse/services/tokenRefresh.js +++ b/open-sse/services/tokenRefresh.js @@ -1,4 +1,5 @@ -import { PROVIDERS, OAUTH_ENDPOINTS, GITHUB_COPILOT } from "../config/constants.js"; +import { PROVIDERS } from "../config/providers.js"; +import { OAUTH_ENDPOINTS, GITHUB_COPILOT } from "../config/appConstants.js"; // Token expiry buffer (refresh if expires within 5 minutes) export const TOKEN_EXPIRY_BUFFER_MS = 5 * 60 * 1000; @@ -68,83 +69,67 @@ export async function refreshAccessToken(provider, refreshToken, credentials, lo * Specialized refresh for Claude OAuth tokens */ export async function refreshClaudeOAuthToken(refreshToken, log) { - const response = await fetch(OAUTH_ENDPOINTS.anthropic.token, { - method: "POST", - headers: { - "Content-Type": "application/json", - Accept: "application/json", - }, - body: JSON.stringify({ - grant_type: "refresh_token", - refresh_token: refreshToken, - client_id: PROVIDERS.claude.clientId, - }), - }); - - if (!response.ok) { - const errorText = await response.text(); - log?.error?.("TOKEN_REFRESH", "Failed to refresh Claude OAuth token", { - status: response.status, - error: errorText, + try { + const response = await fetch(OAUTH_ENDPOINTS.anthropic.token, { + method: "POST", + headers: { + 
"Content-Type": "application/json", + Accept: "application/json", + }, + body: JSON.stringify({ + grant_type: "refresh_token", + refresh_token: refreshToken, + client_id: PROVIDERS.claude.clientId, + }), }); + + if (!response.ok) { + const errorText = await response.text(); + log?.error?.("TOKEN_REFRESH", "Failed to refresh Claude OAuth token", { status: response.status, error: errorText }); + return null; + } + + const tokens = await response.json(); + log?.info?.("TOKEN_REFRESH", "Successfully refreshed Claude OAuth token", { hasNewAccessToken: !!tokens.access_token, expiresIn: tokens.expires_in }); + return { accessToken: tokens.access_token, refreshToken: tokens.refresh_token || refreshToken, expiresIn: tokens.expires_in }; + } catch (error) { + log?.error?.("TOKEN_REFRESH", `Network error refreshing Claude token: ${error.message}`); return null; } - - const tokens = await response.json(); - - log?.info?.("TOKEN_REFRESH", "Successfully refreshed Claude OAuth token", { - hasNewAccessToken: !!tokens.access_token, - hasNewRefreshToken: !!tokens.refresh_token, - expiresIn: tokens.expires_in, - }); - - return { - accessToken: tokens.access_token, - refreshToken: tokens.refresh_token || refreshToken, - expiresIn: tokens.expires_in, - }; } /** * Specialized refresh for Google providers (Gemini, Antigravity) */ export async function refreshGoogleToken(refreshToken, clientId, clientSecret, log) { - const response = await fetch(OAUTH_ENDPOINTS.google.token, { - method: "POST", - headers: { - "Content-Type": "application/x-www-form-urlencoded", - Accept: "application/json", - }, - body: new URLSearchParams({ - grant_type: "refresh_token", - refresh_token: refreshToken, - client_id: clientId, - client_secret: clientSecret, - }), - }); - - if (!response.ok) { - const errorText = await response.text(); - log?.error?.("TOKEN_REFRESH", "Failed to refresh Google token", { - status: response.status, - error: errorText, + try { + const response = await 
fetch(OAUTH_ENDPOINTS.google.token, { + method: "POST", + headers: { + "Content-Type": "application/x-www-form-urlencoded", + Accept: "application/json", + }, + body: new URLSearchParams({ + grant_type: "refresh_token", + refresh_token: refreshToken, + client_id: clientId, + client_secret: clientSecret, + }), }); + + if (!response.ok) { + const errorText = await response.text(); + log?.error?.("TOKEN_REFRESH", "Failed to refresh Google token", { status: response.status, error: errorText }); + return null; + } + + const tokens = await response.json(); + log?.info?.("TOKEN_REFRESH", "Successfully refreshed Google token", { hasNewAccessToken: !!tokens.access_token, expiresIn: tokens.expires_in }); + return { accessToken: tokens.access_token, refreshToken: tokens.refresh_token || refreshToken, expiresIn: tokens.expires_in }; + } catch (error) { + log?.error?.("TOKEN_REFRESH", `Network error refreshing Google token: ${error.message}`); return null; } - - const tokens = await response.json(); - - log?.info?.("TOKEN_REFRESH", "Successfully refreshed Google token", { - hasNewAccessToken: !!tokens.access_token, - hasNewRefreshToken: !!tokens.refresh_token, - expiresIn: tokens.expires_in, - }); - - return { - accessToken: tokens.access_token, - refreshToken: tokens.refresh_token || refreshToken, - expiresIn: tokens.expires_in, - }; } /** @@ -205,6 +190,7 @@ export async function refreshQwenToken(refreshToken, log) { * Specialized refresh for Codex (OpenAI) OAuth tokens */ export async function refreshCodexToken(refreshToken, log) { + try { const response = await fetch(OAUTH_ENDPOINTS.openai.token, { method: "POST", headers: { @@ -241,6 +227,10 @@ export async function refreshCodexToken(refreshToken, log) { refreshToken: tokens.refresh_token || refreshToken, expiresIn: tokens.expires_in, }; + } catch (error) { + log?.error?.("TOKEN_REFRESH", `Network error refreshing Codex token: ${error.message}`); + return null; + } } /** @@ -507,6 +497,13 @@ export async function 
getAccessToken(provider, credentials, log) { log ); + case "vertex": + case "vertex-partner": { + const saJson = parseVertexSaJson(credentials.apiKey); + if (!saJson) return null; + return await refreshVertexToken(saJson, log); + } + default: log?.warn?.("TOKEN_REFRESH", `Unsupported provider for token refresh: ${provider}`); return null; @@ -544,6 +541,12 @@ export async function refreshTokenByProvider(provider, credentials, log) { credentials.providerSpecificData, log ); + case "vertex": + case "vertex-partner": { + const saJson = parseVertexSaJson(credentials.apiKey); + if (!saJson) return null; + return refreshVertexToken(saJson, log); + } default: return refreshAccessToken(provider, credentials.refreshToken, credentials, log); } @@ -623,6 +626,81 @@ export async function getAllAccessTokens(userInfo, log) { return results; } +/** + * Parse Vertex AI Service Account JSON from apiKey string + */ +export function parseVertexSaJson(apiKey) { + if (typeof apiKey !== "string") return null; + try { + const parsed = JSON.parse(apiKey); + if (parsed.type === "service_account" && parsed.client_email && parsed.private_key && parsed.project_id) { + return parsed; + } + return null; + } catch { + return null; + } +} + +// Cache Vertex tokens keyed by service account email { token, expiresAt } +const vertexTokenCache = new Map(); + +/** + * Mint a short-lived OAuth2 Bearer token for Google Cloud Vertex AI + * using Service Account JSON + jose (RS256 JWT assertion flow). + * Token is cached until 5 minutes before expiry. 
+ */ +export async function refreshVertexToken(saJson, log) { + const cacheKey = saJson.client_email; + const cached = vertexTokenCache.get(cacheKey); + + // Return cached token if still valid (5-min buffer) + if (cached && cached.expiresAt - Date.now() > 5 * 60 * 1000) { + return { accessToken: cached.token, expiresAt: cached.expiresAt }; + } + + try { + const { SignJWT, importPKCS8 } = await import("jose"); + log?.debug?.("TOKEN_REFRESH", `Vertex minting token for ${saJson.client_email}`); + const privateKey = await importPKCS8(saJson.private_key.replace(/\\n/g, "\n"), "RS256"); + const now = Math.floor(Date.now() / 1000); + + const jwt = await new SignJWT({ scope: "https://www.googleapis.com/auth/cloud-platform" }) + .setProtectedHeader({ alg: "RS256" }) + .setIssuer(saJson.client_email) + .setAudience("https://oauth2.googleapis.com/token") + .setIssuedAt(now) + .setExpirationTime(now + 3600) + .sign(privateKey); + + const res = await fetch("https://oauth2.googleapis.com/token", { + method: "POST", + headers: { "Content-Type": "application/x-www-form-urlencoded" }, + body: new URLSearchParams({ + grant_type: "urn:ietf:params:oauth:grant-type:jwt-bearer", + assertion: jwt, + }), + }); + + if (!res.ok) { + const err = await res.text(); + log?.error?.("TOKEN_REFRESH", `Vertex token mint failed: ${err}`); + return null; + } + + const { access_token, expires_in } = await res.json(); + const expiresAt = Date.now() + (expires_in ?? 3600) * 1000; + + vertexTokenCache.set(cacheKey, { token: access_token, expiresAt }); + log?.info?.("TOKEN_REFRESH", `Vertex token minted for ${saJson.client_email}`); + + return { accessToken: access_token, expiresAt }; + } catch (error) { + log?.error?.("TOKEN_REFRESH", `Vertex token error: ${error.message}`); + return null; + } +} + /** * Refresh token with retry and exponential backoff * Retries on failure with increasing delay: 1s, 2s, 3s... 
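Review note on the `refreshVertexToken` hunk above: the caching rule it describes (reuse a minted token until 5 minutes before expiry, otherwise mint a new one) can be checked in isolation. The sketch below is a minimal illustration under stated assumptions, not the code from this diff: `createTokenCache`, `mint`, and the injectable `now` clock are hypothetical names introduced here so the logic can run without network access or the `jose` dependency.

```javascript
// Hypothetical helper for illustration only; not part of the diff.
const EXPIRY_BUFFER_MS = 5 * 60 * 1000; // same 5-minute buffer the hunk uses

// `mint(cacheKey)` is an assumed async function resolving to
// { access_token, expires_in } on success or null on failure.
// `now` is injectable so the expiry logic can be tested with a fake clock.
function createTokenCache(mint, now = () => Date.now()) {
  const cache = new Map(); // cacheKey -> { token, expiresAt }

  return async function getToken(cacheKey) {
    const cached = cache.get(cacheKey);

    // Reuse the cached token only while more than the buffer remains.
    if (cached && cached.expiresAt - now() > EXPIRY_BUFFER_MS) {
      return { accessToken: cached.token, expiresAt: cached.expiresAt, fromCache: true };
    }

    // Otherwise mint a fresh token and remember when it expires.
    const minted = await mint(cacheKey);
    if (!minted) return null;

    const expiresAt = now() + (minted.expires_in ?? 3600) * 1000;
    cache.set(cacheKey, { token: minted.access_token, expiresAt });
    return { accessToken: minted.access_token, expiresAt, fromCache: false };
  };
}
```

Keying the cache by service account email, as the hunk does, means two connections that share one service account also share one token; whether that is desirable depends on how credentials are rotated.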
diff --git a/open-sse/services/usage.js b/open-sse/services/usage.js index 733768a..bf0c28a 100644 --- a/open-sse/services/usage.js +++ b/open-sse/services/usage.js @@ -2,7 +2,7 @@ * Usage Fetcher - Get usage data from provider APIs */ -import { CLIENT_METADATA, getPlatformUserAgent } from "../config/constants.js"; +import { CLIENT_METADATA, getPlatformUserAgent } from "../config/appConstants.js"; // GitHub API config const GITHUB_CONFIG = { @@ -27,8 +27,10 @@ const CODEX_CONFIG = { // Claude API config const CLAUDE_CONFIG = { + oauthUsageUrl: "https://api.anthropic.com/api/oauth/usage", usageUrl: "https://api.anthropic.com/v1/organizations/{org_id}/usage", settingsUrl: "https://api.anthropic.com/v1/settings", + apiVersion: "2023-06-01", }; /** @@ -211,30 +213,34 @@ async function getGeminiUsage(accessToken) { */ async function getAntigravityUsage(accessToken, providerSpecificData) { try { - // First get project ID from subscription info - const projectId = await getAntigravityProjectId(accessToken); + // Fetch subscription info once — reuse for both projectId and plan + const subscriptionInfo = await getAntigravitySubscriptionInfo(accessToken); + const projectId = subscriptionInfo?.cloudaicompanionProject || null; // Fetch quota data with timeout const controller = new AbortController(); const timeoutId = setTimeout(() => controller.abort(), 10000); // 10s timeout - - const response = await fetch(ANTIGRAVITY_CONFIG.quotaApiUrl, { - method: "POST", - headers: { - "Authorization": `Bearer ${accessToken}`, - "User-Agent": ANTIGRAVITY_CONFIG.userAgent, - "Content-Type": "application/json", - "X-Client-Name": "antigravity", - "X-Client-Version": "1.107.0", - "x-request-source": "local", // MITM bypass - }, - body: JSON.stringify({ - ...(projectId ? 
{ project: projectId } : {}) - }), - signal: controller.signal, - }); - - clearTimeout(timeoutId); + + let response; + try { + response = await fetch(ANTIGRAVITY_CONFIG.quotaApiUrl, { + method: "POST", + headers: { + "Authorization": `Bearer ${accessToken}`, + "User-Agent": ANTIGRAVITY_CONFIG.userAgent, + "Content-Type": "application/json", + "X-Client-Name": "antigravity", + "X-Client-Version": "1.107.0", + "x-request-source": "local", // MITM bypass + }, + body: JSON.stringify({ + ...(projectId ? { project: projectId } : {}) + }), + signal: controller.signal, + }); + } finally { + clearTimeout(timeoutId); + } if (response.status === 403) { return { @@ -300,9 +306,6 @@ async function getAntigravityUsage(accessToken, providerSpecificData) { } } - // Get subscription info for plan type - const subscriptionInfo = await getAntigravitySubscriptionInfo(accessToken); - return { plan: subscriptionInfo?.currentTier?.name || "Unknown", quotas, @@ -330,10 +333,9 @@ async function getAntigravityProjectId(accessToken) { * Get Antigravity subscription info */ async function getAntigravitySubscriptionInfo(accessToken) { + const controller = new AbortController(); + const timeoutId = setTimeout(() => controller.abort(), 10000); // 10s timeout try { - const controller = new AbortController(); - const timeoutId = setTimeout(() => controller.abort(), 10000); // 10s timeout - const response = await fetch(ANTIGRAVITY_CONFIG.loadProjectApiUrl, { method: "POST", headers: { @@ -345,46 +347,108 @@ async function getAntigravitySubscriptionInfo(accessToken) { body: JSON.stringify({ metadata: CLIENT_METADATA, mode: 1 }), signal: controller.signal, }); - - clearTimeout(timeoutId); if (!response.ok) return null; - return await response.json(); } catch (error) { console.error("[Antigravity Subscription] Error:", error.message); return null; + } finally { + clearTimeout(timeoutId); } } /** - * Claude Usage - Try to fetch from Anthropic API + * Claude Usage - Primary: OAuth endpoint, Fallback: 
legacy settings/org endpoint */ async function getClaudeUsage(accessToken) { try { - // Try to get organization/account settings first - const settingsResponse = await fetch("https://api.anthropic.com/v1/settings", { + // Primary: OAuth usage endpoint (Claude Code consumer OAuth tokens) + const oauthResponse = await fetch(CLAUDE_CONFIG.oauthUsageUrl, { method: "GET", headers: { "Authorization": `Bearer ${accessToken}`, - "Content-Type": "application/json", - "anthropic-version": "2023-06-01", + "anthropic-beta": "oauth-2025-04-20", + "anthropic-version": CLAUDE_CONFIG.apiVersion, + }, + }); + + if (oauthResponse.ok) { + const data = await oauthResponse.json(); + const quotas = {}; + + // utilization = % USED (e.g. 87 means 87% used, 13% remaining) + const hasUtilization = (window) => + window && typeof window === "object" && typeof window.utilization === "number"; + + const createQuotaObject = (window) => { + const used = window.utilization; + const remaining = Math.max(0, 100 - used); + return { + used, + total: 100, + remaining, + remainingPercentage: remaining, + resetAt: parseResetTime(window.resets_at), + unlimited: false, + }; + }; + + if (hasUtilization(data.five_hour)) { + quotas["session (5h)"] = createQuotaObject(data.five_hour); + } + + if (hasUtilization(data.seven_day)) { + quotas["weekly (7d)"] = createQuotaObject(data.seven_day); + } + + // Parse model-specific weekly windows (e.g. seven_day_sonnet, seven_day_opus) + for (const [key, value] of Object.entries(data)) { + if (key.startsWith("seven_day_") && key !== "seven_day" && hasUtilization(value)) { + const modelName = key.replace("seven_day_", ""); + quotas[`weekly ${modelName} (7d)`] = createQuotaObject(value); + } + } + + return { + plan: "Claude Code", + extraUsage: data.extra_usage ?? 
null, + quotas, + }; + } + + // Fallback: legacy settings + org usage endpoint + console.warn(`[Claude Usage] OAuth endpoint returned ${oauthResponse.status}, falling back to legacy`); + return await getClaudeUsageLegacy(accessToken); + } catch (error) { + return { message: `Claude connected. Unable to fetch usage: ${error.message}` }; + } +} + +/** + * Legacy Claude usage for API key / org admin users + */ +async function getClaudeUsageLegacy(accessToken) { + try { + const settingsResponse = await fetch(CLAUDE_CONFIG.settingsUrl, { + method: "GET", + headers: { + "Authorization": `Bearer ${accessToken}`, + "anthropic-version": CLAUDE_CONFIG.apiVersion, }, }); if (settingsResponse.ok) { const settings = await settingsResponse.json(); - // Try usage endpoint if we have org info if (settings.organization_id) { const usageResponse = await fetch( - `https://api.anthropic.com/v1/organizations/${settings.organization_id}/usage`, + CLAUDE_CONFIG.usageUrl.replace("{org_id}", settings.organization_id), { method: "GET", headers: { "Authorization": `Bearer ${accessToken}`, - "Content-Type": "application/json", - "anthropic-version": "2023-06-01", + "anthropic-version": CLAUDE_CONFIG.apiVersion, }, } ); @@ -406,7 +470,6 @@ async function getClaudeUsage(accessToken) { }; } - // If settings API fails, OAuth token may not have required scope return { message: "Claude connected. Usage API requires admin permissions." }; } catch (error) { return { message: `Claude connected. 
Unable to fetch usage: ${error.message}` }; diff --git a/open-sse/translator/formats.js b/open-sse/translator/formats.js index dffd8dc..6828868 100644 --- a/open-sse/translator/formats.js +++ b/open-sse/translator/formats.js @@ -9,7 +9,8 @@ export const FORMATS = { CODEX: "codex", ANTIGRAVITY: "antigravity", KIRO: "kiro", - CURSOR: "cursor" + CURSOR: "cursor", + OLLAMA: "ollama" }; /** diff --git a/open-sse/translator/helpers/maxTokensHelper.js b/open-sse/translator/helpers/maxTokensHelper.js index 4627fab..6a1c044 100644 --- a/open-sse/translator/helpers/maxTokensHelper.js +++ b/open-sse/translator/helpers/maxTokensHelper.js @@ -1,4 +1,4 @@ -import { DEFAULT_MAX_TOKENS, DEFAULT_MIN_TOKENS } from "../../config/constants.js"; +import { DEFAULT_MAX_TOKENS, DEFAULT_MIN_TOKENS } from "../../config/runtimeConfig.js"; /** * Adjust max_tokens based on request context diff --git a/open-sse/translator/index.js b/open-sse/translator/index.js index b1309e3..fe622ff 100644 --- a/open-sse/translator/index.js +++ b/open-sse/translator/index.js @@ -26,7 +26,7 @@ export function register(from, to, requestFn, responseFn) { function ensureInitialized() { if (initialized) return; initialized = true; - + // Request translators - sync require pattern for bundler require("./request/claude-to-openai.js"); require("./request/openai-to-claude.js"); @@ -36,7 +36,8 @@ function ensureInitialized() { require("./request/openai-responses.js"); require("./request/openai-to-kiro.js"); require("./request/openai-to-cursor.js"); - + require("./request/openai-to-ollama.js"); + // Response translators require("./response/claude-to-openai.js"); require("./response/openai-to-claude.js"); @@ -45,6 +46,7 @@ function ensureInitialized() { require("./response/openai-responses.js"); require("./response/kiro-to-openai.js"); require("./response/cursor-to-openai.js"); + require("./response/ollama-to-openai.js"); } // Translate request: source -> openai -> target diff --git 
a/open-sse/translator/request/openai-responses.js b/open-sse/translator/request/openai-responses.js index ab91997..8067c8f 100644 --- a/open-sse/translator/request/openai-responses.js +++ b/open-sse/translator/request/openai-responses.js @@ -228,12 +228,17 @@ export function openaiToOpenAIResponsesRequest(model, body, stream, credentials) } } - // Convert tool results + // Convert tool results - output must be a string for Responses API if (msg.role === "tool") { + const output = typeof msg.content === "string" + ? msg.content + : Array.isArray(msg.content) + ? msg.content.map(c => c.text || JSON.stringify(c)).join("") + : JSON.stringify(msg.content); result.input.push({ type: "function_call_output", call_id: msg.tool_call_id, - output: msg.content + output }); } } diff --git a/open-sse/translator/request/openai-to-claude.js b/open-sse/translator/request/openai-to-claude.js index e2e9663..bb2c315 100644 --- a/open-sse/translator/request/openai-to-claude.js +++ b/open-sse/translator/request/openai-to-claude.js @@ -1,6 +1,6 @@ import { register } from "../index.js"; import { FORMATS } from "../formats.js"; -import { CLAUDE_SYSTEM_PROMPT } from "../../config/constants.js"; +import { CLAUDE_SYSTEM_PROMPT } from "../../config/appConstants.js"; import { adjustMaxTokens } from "../helpers/maxTokensHelper.js"; // Empty prefix matches real Claude Code behavior (no tool name prefix). 
@@ -100,6 +100,21 @@ export function openaiToClaudeRequest(model, body, stream) { } } + // Handle response_format for JSON mode + if (body.response_format) { + const responseFormat = body.response_format; + if (responseFormat.type === "json_schema" && responseFormat.json_schema?.schema) { + const schemaJson = JSON.stringify(responseFormat.json_schema.schema, null, 2); + systemParts.push(`You must respond with valid JSON that strictly follows this JSON schema: +\`\`\`json +${schemaJson} +\`\`\` +Respond ONLY with the JSON object, no other text.`); + } else if (responseFormat.type === "json_object") { + systemParts.push("You must respond with valid JSON. Respond ONLY with a JSON object, no other text."); + } + } + // System with Claude Code prompt and cache_control const claudeCodePrompt = { type: "text", text: CLAUDE_SYSTEM_PROMPT }; diff --git a/open-sse/translator/request/openai-to-cursor.js b/open-sse/translator/request/openai-to-cursor.js index d2b78d8..a36f25c 100644 --- a/open-sse/translator/request/openai-to-cursor.js +++ b/open-sse/translator/request/openai-to-cursor.js @@ -1,8 +1,10 @@ /** * OpenAI to Cursor Request Translator - * - assistant tool_calls → kept as-is (Cursor generates tool calls) - * - Claude tool_use blocks → converted to OpenAI tool_calls format - * - tool results → converted to user message string + * Converts OpenAI messages to Cursor ask/agent format. + * + * Important: Cursor can loop when tool outputs are sent via protobuf tool_results + * with partial schema mismatches. For stability, tool outputs are represented as + * structured text blocks in user messages. 
*/ import { register } from "../index.js"; import { FORMATS } from "../formats.js"; @@ -10,96 +12,154 @@ import { FORMATS } from "../formats.js"; function extractContent(content) { if (typeof content === "string") return content; if (Array.isArray(content)) { - return content.filter(p => p.type === "text").map(p => p.text).join(""); + return content + .filter(part => { + if (!part || typeof part !== "object") return false; + return part.type === "text" && typeof part.text === "string"; + }) + .map(part => part.text || "") + .join(""); } return ""; } -// Build a map of tool_use_id → tool_name from the previous assistant message -function getToolNameMap(prevMsg) { - const map = {}; - if (!prevMsg?.tool_calls) return map; - for (const tc of prevMsg.tool_calls) { - if (tc.id && tc.function?.name) map[tc.id] = tc.function.name; - } - return map; +function sanitizeToolResultText(text) { + // Strip non-printable control chars that can produce backend request errors + return text.replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g, ""); +} + +function escapeXml(text) { + return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;"); +} + +function buildToolResultBlock(toolName, toolCallId, resultText) { + const cleanResult = sanitizeToolResultText(resultText || ""); + return [ + "", + `${escapeXml(toolName || "tool")}`, + `${escapeXml(toolCallId || "")}`, + `${escapeXml(cleanResult)}`, + "" + ].join("\n"); +} + +function normalizeToolCallId(id) { + return typeof id === "string" ?
id.split("\n")[0] : ""; } function convertMessages(messages) { const result = []; + + // Build a map of tool_call_id -> tool name from assistant tool calls + const toolCallMetaMap = new Map(); + const rememberToolMeta = (toolCallId, toolName) => { + if (!toolCallId) return; + const name = toolName || "tool"; + toolCallMetaMap.set(toolCallId, { name }); + const normalized = normalizeToolCallId(toolCallId); + if (normalized && normalized !== toolCallId) { + toolCallMetaMap.set(normalized, { name }); + } + }; + + for (const msg of messages) { + if (msg.role === "assistant" && msg.tool_calls) { + for (const tc of msg.tool_calls) { + rememberToolMeta(tc.id || "", tc.function?.name || "tool"); + } + } + if (msg.role === "assistant" && Array.isArray(msg.content)) { + for (const part of msg.content) { + if (part?.type !== "tool_use") continue; + rememberToolMeta(part.id || "", part.name || "tool"); + } + } + } for (let i = 0; i < messages.length; i++) { const msg = messages[i]; if (msg.role === "system") { - result.push({ role: "user", content: `[System Instructions]\n${msg.content}` }); - continue; - } - - if (msg.role === "user") { - if (Array.isArray(msg.content)) { - const parts = []; - const prevMsg = result[result.length - 1]; - const nameMap = getToolNameMap(prevMsg); - for (const block of msg.content) { - if (block.type === "text") { - parts.push(block.text); - } else if (block.type === "tool_result") { - // Claude format: user message with tool_result blocks - const toolResultText = extractContent(block.content) || ""; - const toolCallId = block.tool_use_id || ""; - const toolName = nameMap[toolCallId] || ""; - parts.push(`\n${toolName}\n${toolCallId}\n${toolResultText}\n`); - } - } - result.push({ role: "user", content: parts.join("\n") || "" }); - } else { - result.push({ role: "user", content: extractContent(msg.content) || "" }); - } - continue; - } - - if (msg.role === "tool") { - // Strip system-reminder tags injected by Claude Code - const raw = 
extractContent(msg.content) || ""; - const toolContent = raw.replace(/<system-reminder>[\s\S]*?<\/system-reminder>/g, "").trim(); - const prevMsg = result[result.length - 1]; - const nameMap = getToolNameMap(prevMsg); - const toolCallId = msg.tool_call_id || ""; - const toolName = nameMap[toolCallId] || ""; result.push({ role: "user", - content: `\n${toolName}\n${toolCallId}\n${toolContent}\n` + content: `[System Instructions]\n${extractContent(msg.content)}` }); continue; } - if (msg.role === "assistant") { - let content = extractContent(msg.content) || ""; - let tool_calls = null; + if (msg.role === "tool") { + const toolContent = extractContent(msg.content); + const toolCallId = msg.tool_call_id || ""; + const toolMeta = toolCallMetaMap.get(toolCallId) || {}; + const toolName = msg.name || toolMeta.name || "tool"; + result.push({ + role: "user", + content: buildToolResultBlock(toolName, toolCallId, toolContent) + }); + continue; + } - if (msg.tool_calls && msg.tool_calls.length > 0) { - // OpenAI format: strip `index` field - tool_calls = msg.tool_calls.map(({ index, ...tc }) => tc); - } else if (Array.isArray(msg.content)) { - // Claude format: extract tool_use blocks from content array - const extracted = msg.content - .filter(b => b.type === "tool_use") - .map(b => ({ - id: b.id, - type: "function", - function: { - name: b.name, - arguments: JSON.stringify(b.input || {}) + if (msg.role === "user" || msg.role === "assistant") { + if (msg.role === "user" && Array.isArray(msg.content)) { + const parts = []; + for (const block of msg.content) { + if (!block || typeof block !== "object") continue; + if (block.type === "text") { + if (typeof block.text === "string") { + parts.push(block.text || ""); } - })); - if (extracted.length > 0) tool_calls = extracted; + continue; + } + if (block.type === "tool_result") { + const toolCallId = block.tool_use_id || ""; + const toolMeta = + toolCallMetaMap.get(toolCallId) || + toolCallMetaMap.get(normalizeToolCallId(toolCallId)); + const
toolName = toolMeta?.name || "tool"; + const toolContent = extractContent(block.content); + parts.push(buildToolResultBlock(toolName, toolCallId, toolContent)); + } + } + const joined = parts.filter(Boolean).join("\n"); + if (joined) result.push({ role: "user", content: joined }); + continue; } - if (tool_calls) { - result.push({ role: "assistant", content, tool_calls }); - } else if (content) { - result.push({ role: "assistant", content }); + const content = extractContent(msg.content); + + if (msg.role === "assistant" && msg.tool_calls && msg.tool_calls.length > 0) { + const assistantMsg = { role: "assistant", content: content || "" }; + assistantMsg.tool_calls = msg.tool_calls.map(tc => { + const { index, ...rest } = tc || {}; + return rest; + }); + result.push(assistantMsg); + } else if (msg.role === "assistant" && Array.isArray(msg.content)) { + const extractedToolCalls = msg.content + .filter(b => b?.type === "tool_use") + .map(b => ({ + id: b.id || "", + type: "function", + function: { + name: b.name || "tool", + arguments: JSON.stringify(b.input || {}) + } + })) + .filter(tc => tc.id); + + if (extractedToolCalls.length > 0) { + result.push({ + role: "assistant", + content: content || "", + tool_calls: extractedToolCalls + }); + } else if (content) { + result.push({ role: "assistant", content }); + } + } else { + if (content) { + result.push({ role: msg.role, content }); + } } } } diff --git a/open-sse/translator/request/openai-to-gemini.js b/open-sse/translator/request/openai-to-gemini.js index 2874b39..fef3e48 100644 --- a/open-sse/translator/request/openai-to-gemini.js +++ b/open-sse/translator/request/openai-to-gemini.js @@ -1,7 +1,7 @@ import { register } from "../index.js"; import { FORMATS } from "../formats.js"; import { DEFAULT_THINKING_GEMINI_SIGNATURE } from "../../config/defaultThinkingSignature.js"; -import { ANTIGRAVITY_DEFAULT_SYSTEM } from "../../config/constants.js"; +import { ANTIGRAVITY_DEFAULT_SYSTEM } from "../../config/appConstants.js"; 
import { openaiToClaudeRequestForAntigravity } from "./openai-to-claude.js"; function generateUUID() { diff --git a/open-sse/translator/request/openai-to-ollama.js b/open-sse/translator/request/openai-to-ollama.js new file mode 100644 index 0000000..6a316ea --- /dev/null +++ b/open-sse/translator/request/openai-to-ollama.js @@ -0,0 +1,159 @@ +import { register } from "../index.js"; +import { FORMATS } from "../formats.js"; + +/** + * Convert OpenAI request to Ollama format + * + * Ollama expects: + * - model: string + * - messages: Array<{role: string, content: string}> + * - stream: boolean + * - options?: {temperature?: number, num_predict?: number} + * + * Key differences from OpenAI: + * - Content must be string, not array + * - No support for tool_calls in request (tools are handled differently) + * - tool role maps to user + */ +export function openaiToOllamaRequest(model, body, stream) { + const result = { + model: model, + messages: normalizeMessages(body.messages), + stream: stream + }; + + // Temperature + if (body.temperature !== undefined) { + result.options = result.options || {}; + result.options.temperature = body.temperature; + } + + // Max tokens (Ollama uses num_predict) + if (body.max_tokens !== undefined) { + result.options = result.options || {}; + result.options.num_predict = body.max_tokens; + } + + // Top_p + if (body.top_p !== undefined) { + result.options = result.options || {}; + result.options.top_p = body.top_p; + } + + // Tools (Ollama supports tools in OpenAI format) + if (body.tools && Array.isArray(body.tools)) { + result.tools = body.tools; + } + + // Tool choice + if (body.tool_choice) { + result.tool_choice = body.tool_choice; + } + + return result; +} + +/** + * Normalize messages to Ollama format + * - Content must be string + * - tool messages: convert tool_call_id to tool_name + * - assistant messages: keep tool_calls as-is + */ +function normalizeMessages(messages) { + if (!Array.isArray(messages)) return messages; + + const 
result = []; + const toolCallMap = new Map(); // Map tool_call_id -> tool_name + + // First pass: build tool_call_id -> tool_name map from assistant messages + for (const msg of messages) { + if (msg.role === "assistant" && msg.tool_calls) { + for (const tc of msg.tool_calls) { + if (tc.id && tc.function?.name) { + toolCallMap.set(tc.id, tc.function.name); + } + } + } + } + + // Second pass: convert messages + for (const msg of messages) { + // Handle tool result messages (OpenAI format -> Ollama format) + if (msg.role === "tool") { + const toolResult = normalizeContent(msg.content); + if (!toolResult) continue; + + // Get tool_name from map or use msg.name as fallback + const toolName = toolCallMap.get(msg.tool_call_id) || msg.name || "unknown_tool"; + + result.push({ + role: "tool", + tool_name: toolName, + content: toolResult + }); + continue; + } + + // Handle assistant messages with tool_calls + if (msg.role === "assistant" && msg.tool_calls) { + const content = normalizeContent(msg.content) || ""; + + // Convert OpenAI tool_calls format to Ollama format + const ollamaToolCalls = msg.tool_calls.map(tc => ({ + type: "function", + function: { + index: tc.index || 0, + name: tc.function?.name || "", + arguments: typeof tc.function?.arguments === "string" + ? 
JSON.parse(tc.function.arguments || "{}") + : tc.function?.arguments || {} + } + })); + + result.push({ + role: "assistant", + content: content, + tool_calls: ollamaToolCalls + }); + continue; + } + + // Normal messages + const role = msg.role; + const content = normalizeContent(msg.content); + + // Skip empty messages (except assistant) + if (!content && role !== "assistant") continue; + + result.push({ + role: role, + content: content + }); + } + + return result; +} + +/** + * Normalize content to string + * Ollama only accepts string content + */ +function normalizeContent(content) { + if (typeof content === "string") { + return content; + } + + if (Array.isArray(content)) { + // Extract text from content array + const textParts = content + .filter(block => block && block.type === "text" && block.text) + .map(block => block.text); + + return textParts.join("\n") || ""; + } + + return ""; +} + +// Register translator +register(FORMATS.OPENAI, FORMATS.OLLAMA, openaiToOllamaRequest, null); diff --git a/open-sse/translator/response/ollama-to-openai.js b/open-sse/translator/response/ollama-to-openai.js new file mode 100644 index 0000000..9cd5b88 --- /dev/null +++ b/open-sse/translator/response/ollama-to-openai.js @@ -0,0 +1,152 @@ +import { register } from "../index.js"; +import { FORMATS } from "../formats.js"; + +/** + * Convert Ollama NDJSON response to OpenAI SSE format + * + * Ollama response format: + * {"model": "...", "message": {"role": "assistant", "content": "..."}, "done": false} + * {"model": "...", "done": true, "prompt_eval_count": 123, "eval_count": 456} + * + * OpenAI format: + * {"id": "...", "object": "chat.completion.chunk", "created": 123, "model": "...", + * "choices": [{"index": 0, "delta": {"content": "..."}, "finish_reason": null}]} + */ +export function ollamaToOpenAI(chunk, state) { + if (!chunk || typeof chunk !== "object") return null; + + // Initialize state on first chunk + if (!state.ollama) { + state.ollama = { + id: 
`chatcmpl-${Date.now()}`, + created: Math.floor(Date.now() / 1000), + model: chunk.model || state.model + }; + } + + const { id, created, model } = state.ollama; + + // Final chunk with done=true + if (chunk.done) { + const usage = extractUsage(chunk); + + // Determine finish_reason based on done_reason and previous tool_calls + let finishReason = "stop"; + if (chunk.done_reason === "tool_calls" || state.hadToolCalls) { + finishReason = "tool_calls"; + } + + return { + id: id, + object: "chat.completion.chunk", + created: created, + model: model, + choices: [{ + index: 0, + delta: {}, + finish_reason: finishReason + }], + usage: usage + }; + } + + // Content chunk + const message = chunk.message; + if (!message) return null; + + const content = typeof message.content === "string" ? message.content : ""; + const thinking = typeof message.thinking === "string" ? message.thinking : ""; + const toolCalls = Array.isArray(message.tool_calls) ? message.tool_calls : null; + + // Skip empty chunks + if (!content && !thinking && !toolCalls) return null; + + // Accumulate content in state + if (content) { + state.accumulatedContent = (state.accumulatedContent || "") + content; + } + if (thinking) { + state.accumulatedThinking = (state.accumulatedThinking || "") + thinking; + } + + const delta = {}; + if (content) delta.content = content; + if (thinking) delta.reasoning_content = thinking; + + // Convert Ollama tool_calls to OpenAI format + if (toolCalls) { + state.hadToolCalls = true; + delta.tool_calls = convertToolCalls(toolCalls); + } + + return { + id: id, + object: "chat.completion.chunk", + created: created, + model: model, + choices: [{ + index: 0, + delta: delta, + finish_reason: null + }] + }; +} + +/** + * Extract usage stats from Ollama response + */ +function extractUsage(ollamaChunk) { + return { + prompt_tokens: ollamaChunk.prompt_eval_count || 0, + completion_tokens: ollamaChunk.eval_count || 0, + total_tokens: (ollamaChunk.prompt_eval_count || 0) + 
(ollamaChunk.eval_count || 0) + }; +} + +/** + * Convert tool_calls from Ollama format to OpenAI format + */ +function convertToolCalls(toolCalls) { + return toolCalls.map((tc, i) => ({ + index: tc.function?.index ?? i, + id: tc.id || `call_${i}_${Date.now()}`, + type: "function", + function: { + name: tc.function?.name || "", + arguments: typeof tc.function?.arguments === "string" + ? tc.function.arguments + : JSON.stringify(tc.function?.arguments || {}) + } + })); +} + +/** + * Convert Ollama non-streaming response body to OpenAI chat.completion format + */ +export function ollamaBodyToOpenAI(body) { + const msg = body.message || {}; + const content = msg.content || ""; + const thinking = msg.thinking || ""; + const toolCalls = Array.isArray(msg.tool_calls) ? msg.tool_calls : []; + + const message = { role: "assistant" }; + if (content) message.content = content; + if (thinking) message.reasoning_content = thinking; + if (toolCalls.length > 0) message.tool_calls = convertToolCalls(toolCalls); + if (!message.content && !message.tool_calls) message.content = ""; + + let finishReason = body.done_reason || "stop"; + if (toolCalls.length > 0) finishReason = "tool_calls"; + + return { + id: `chatcmpl-${Date.now()}`, + object: "chat.completion", + created: Math.floor(Date.now() / 1000), + model: body.model || "ollama", + choices: [{ index: 0, message, finish_reason: finishReason }], + usage: extractUsage(body) + }; +} + +// Register translator +register(FORMATS.OLLAMA, FORMATS.OPENAI, null, ollamaToOpenAI); diff --git a/open-sse/utils/bypassHandler.js b/open-sse/utils/bypassHandler.js index a4e21f3..57fa2ff 100644 --- a/open-sse/utils/bypassHandler.js +++ b/open-sse/utils/bypassHandler.js @@ -1,14 +1,14 @@ import { detectFormat } from "../services/provider.js"; import { translateResponse, initState } from "../translator/index.js"; import { FORMATS } from "../translator/formats.js"; -import { SKIP_PATTERNS } from "../config/constants.js"; +import { SKIP_PATTERNS } from 
"../config/runtimeConfig.js"; import { formatSSE } from "./stream.js"; /** * Check for bypass patterns - return fake response without calling provider * Only works for Claude CLI requests */ -export function handleBypassRequest(body, model, userAgent = "") { +export function handleBypassRequest(body, model, userAgent = "", ccFilterNaming = false) { if (!userAgent.includes("claude-cli")) return null; if (!body.messages?.length) return null; @@ -22,6 +22,7 @@ export function handleBypassRequest(body, model, userAgent = "") { }; let shouldBypass = false; + let namingBypass = false; // Pattern 1: Title extraction (assistant message = "{") const lastMsg = messages[messages.length - 1]; @@ -54,23 +55,50 @@ export function handleBypassRequest(body, model, userAgent = "") { } } + // Pattern 5: CC naming request (topic title extraction by Claude Code CLI) + // Claude format: system is top-level body.system field, not inside messages + if (!shouldBypass && ccFilterNaming) { + const systemMsg = messages.find(m => m.role === "system"); + const systemFromMessages = getText(systemMsg?.content); + const systemFromBody = Array.isArray(body.system) + ? body.system.filter(s => s.type === "text").map(s => s.text).join(" ") + : (typeof body.system === "string" ? body.system : ""); + const systemText = systemFromMessages || systemFromBody; + if (systemText.includes("isNewTopic")) { + shouldBypass = true; + namingBypass = true; + } + } + if (!shouldBypass) return null; const sourceFormat = detectFormat(body); const stream = body.stream !== false; + // For naming bypass, generate title from user message + if (namingBypass) { + const userMsg = messages.find(m => m.role === "user"); + const userText = getText(userMsg?.content); + const title = userText.trim().split(/\s+/).slice(0, 3).join(" "); + const namingText = JSON.stringify({ isNewTopic: true, title }); + return stream + ? 
createStreamingResponse(sourceFormat, model, namingText) + : createNonStreamingResponse(sourceFormat, model, namingText); + } + return stream ? createStreamingResponse(sourceFormat, model) : createNonStreamingResponse(sourceFormat, model); } +const DEFAULT_BYPASS_TEXT = "CLI Command Execution: Clear Terminal"; + /** * Create OpenAI standard format response */ -function createOpenAIResponse(model) { +function createOpenAIResponse(model, text = DEFAULT_BYPASS_TEXT) { const id = `chatcmpl-${Date.now()}`; const created = Math.floor(Date.now() / 1000); - const text = "CLI Command Execution: Clear Terminal"; return { id, @@ -97,8 +125,8 @@ function createOpenAIResponse(model) { * Create non-streaming response with translation * Use translator to convert OpenAI → sourceFormat */ -function createNonStreamingResponse(sourceFormat, model) { - const openaiResponse = createOpenAIResponse(model); +function createNonStreamingResponse(sourceFormat, model, text) { + const openaiResponse = createOpenAIResponse(model, text); // If sourceFormat is OpenAI, return directly if (sourceFormat === FORMATS.OPENAI) { @@ -151,8 +179,8 @@ function createNonStreamingResponse(sourceFormat, model) { * Create streaming response with translation * Use translator to convert OpenAI chunks → sourceFormat */ -function createStreamingResponse(sourceFormat, model) { - const openaiResponse = createOpenAIResponse(model); +function createStreamingResponse(sourceFormat, model, text) { + const openaiResponse = createOpenAIResponse(model, text); const state = initState(sourceFormat); state.model = model; diff --git a/open-sse/utils/cursorChecksum.js b/open-sse/utils/cursorChecksum.js index e9585f9..c41d5b7 100644 --- a/open-sse/utils/cursorChecksum.js +++ b/open-sse/utils/cursorChecksum.js @@ -106,22 +106,38 @@ export function buildCursorHeaders(accessToken, machineId = null, ghostMode = tr const clientKey = generateHashed64Hex(cleanToken); const checksum = generateCursorChecksum(effectiveMachineId); + // 
Detect OS + let os = "linux"; + if (typeof process !== "undefined") { + if (process.platform === "win32") os = "windows"; + else if (process.platform === "darwin") os = "macos"; + } + + // Detect architecture + let arch = "x64"; + if (typeof process !== "undefined") { + if (process.arch === "arm64") arch = "aarch64"; + } + return { - "Authorization": `Bearer ${cleanToken}`, + "authorization": `Bearer ${cleanToken}`, "connect-accept-encoding": "gzip", "connect-protocol-version": "1", - "Content-Type": "application/connect+proto", - "User-Agent": "connect-es/1.6.1", + "content-type": "application/connect+proto", + "user-agent": "connect-es/1.6.1", "x-amzn-trace-id": `Root=${crypto.randomUUID()}`, "x-client-key": clientKey, "x-cursor-checksum": checksum, - "x-cursor-client-version": "1.1.3", + "x-cursor-client-version": "2.3.41", + "x-cursor-client-type": "ide", + "x-cursor-client-os": os, + "x-cursor-client-arch": arch, + "x-cursor-client-device-type": "desktop", "x-cursor-config-version": crypto.randomUUID(), "x-cursor-timezone": Intl.DateTimeFormat().resolvedOptions().timeZone || "UTC", "x-ghost-mode": ghostMode ? 
"true" : "false", "x-request-id": crypto.randomUUID(), - "x-session-id": sessionId, - "Host": "api2.cursor.sh" + "x-session-id": sessionId }; } diff --git a/open-sse/utils/cursorProtobuf.js b/open-sse/utils/cursorProtobuf.js index 35c4313..e6f7f8b 100644 --- a/open-sse/utils/cursorProtobuf.js +++ b/open-sse/utils/cursorProtobuf.js @@ -6,8 +6,11 @@ import { v4 as uuidv4 } from "uuid"; import zlib from "zlib"; -const DEBUG = true; +const DEBUG = process.env.CURSOR_PROTOBUF_DEBUG === "1"; const log = (tag, ...args) => DEBUG && console.log(`[PROTOBUF:${tag}]`, ...args); +const textDecoder = new TextDecoder(); + +const PROTOBUF_SCHEMA_VERSION = "1.1.3"; // ==================== SCHEMAS ==================== @@ -18,6 +21,8 @@ const ROLE = { USER: 1, ASSISTANT: 2 }; const UNIFIED_MODE = { CHAT: 1, AGENT: 2 }; const THINKING_LEVEL = { UNSPECIFIED: 0, MEDIUM: 1, HIGH: 2 }; +const CLIENT_SIDE_TOOL_V2 = { MCP: 19 }; +const CLIENT_SIDE_TOOL_V2_MCP = 19; const FIELD = { // StreamUnifiedChatRequestWithTools (top level) @@ -55,6 +60,7 @@ const FIELD = { MSG_ID: 13, MSG_TOOL_RESULTS: 18, MSG_IS_AGENTIC: 29, + MSG_SERVER_BUBBLE_ID: 32, MSG_UNIFIED_MODE: 47, MSG_SUPPORTED_TOOLS: 51, @@ -67,18 +73,35 @@ const FIELD = { TOOL_RESULT_TOOL_CALL: 11, TOOL_RESULT_MODEL_CALL_ID: 12, - // ClientSideToolV2Result + // ClientSideToolV2Result (nested inside ToolResult.result) + CLIENT_RESULT_TOOL: 1, + CLIENT_RESULT_MCP_RESULT: 28, + CLIENT_RESULT_TOOL_CALL_ID: 35, + CLIENT_RESULT_MODEL_CALL_ID: 48, + CLIENT_RESULT_TOOL_INDEX: 49, + // Aliases used by encodeClientSideToolV2Result CV2R_TOOL: 1, CV2R_MCP_RESULT: 28, CV2R_CALL_ID: 35, CV2R_MODEL_CALL_ID: 48, CV2R_TOOL_INDEX: 49, - // MCPResult + // MCPResult (nested inside ClientSideToolV2Result.mcp_result) + MCP_RESULT_SELECTED_TOOL: 1, + MCP_RESULT_RESULT: 2, + // Aliases used by encodeMcpResult MCPR_SELECTED_TOOL: 1, MCPR_RESULT: 2, - // ClientSideToolV2Call + // ClientSideToolV2Call (nested inside ToolResult.tool_call) + CLIENT_CALL_TOOL: 1, + 
CLIENT_CALL_MCP_PARAMS: 27, + CLIENT_CALL_TOOL_CALL_ID: 3, + CLIENT_CALL_NAME: 9, + CLIENT_CALL_RAW_ARGS: 10, + CLIENT_CALL_TOOL_INDEX: 48, + CLIENT_CALL_MODEL_CALL_ID: 49, + // Aliases used by encodeClientSideToolV2Call CV2C_TOOL: 1, CV2C_MCP_PARAMS: 27, CV2C_CALL_ID: 3, @@ -87,9 +110,6 @@ const FIELD = { CV2C_TOOL_INDEX: 48, CV2C_MODEL_CALL_ID: 49, - // ConversationMessage extra fields - MSG_SERVER_BUBBLE_ID: 32, - // Model MODEL_NAME: 1, MODEL_EMPTY: 4, @@ -135,6 +155,7 @@ const FIELD = { TOOL_NAME: 9, TOOL_RAW_ARGS: 10, TOOL_IS_LAST: 11, + TOOL_IS_LAST_ALT: 15, TOOL_MCP_PARAMS: 27, // MCPParams @@ -152,6 +173,19 @@ const FIELD = { THINKING_TEXT: 1 }; +// Known response field numbers — used to detect unknown fields from protocol updates +const KNOWN_RESPONSE_FIELDS = new Set([ + FIELD.TOOL_CALL, + FIELD.RESPONSE, + FIELD.TOOL_ID, + FIELD.TOOL_NAME, + FIELD.TOOL_RAW_ARGS, + FIELD.TOOL_IS_LAST, + FIELD.TOOL_MCP_PARAMS, + FIELD.RESPONSE_TEXT, + FIELD.THINKING +]); + // ==================== PRIMITIVE ENCODING ==================== export function encodeVarint(value) { @@ -200,15 +234,46 @@ function concatArrays(...arrays) { // ==================== MESSAGE ENCODING ==================== -// ClientSideToolV2 enum: MCP = 19 -const CLIENT_SIDE_TOOL_V2_MCP = 19; - /** * Format tool name: "toolName" → "mcp_custom_toolName" + * Also handles: "mcp__server__tool" → "mcp_server_tool" */ function formatToolName(name) { - if (name.startsWith("mcp_")) return name; - return `mcp_custom_${name}`; + const base = typeof name === "string" && name.length > 0 ? 
name : "tool"; + + if (base.startsWith("mcp__")) { + const rest = base.slice("mcp__".length); + const splitIdx = rest.indexOf("__"); + if (splitIdx >= 0) { + const server = rest.slice(0, splitIdx) || "custom"; + const toolName = rest.slice(splitIdx + 2) || "tool"; + return `mcp_${server}_${toolName}`; + } + return `mcp_custom_${rest || "tool"}`; + } + + if (base.startsWith("mcp_")) return base; + return `mcp_custom_${base}`; +} + +/** + * Parse formatted tool name: "mcp_server_tool" → { serverName, selectedTool } + */ +function parseToolName(formattedName) { + if (typeof formattedName !== "string" || !formattedName.startsWith("mcp_")) { + return { serverName: "custom", selectedTool: formattedName || "tool" }; + } + + const tail = formattedName.slice("mcp_".length); + const splitIdx = tail.indexOf("_"); + if (splitIdx < 0) { + return { serverName: "custom", selectedTool: tail || "tool" }; + } + + return { + serverName: tail.slice(0, splitIdx) || "custom", + selectedTool: tail.slice(splitIdx + 1) || "tool" + }; } /** @@ -235,15 +300,16 @@ function encodeMcpResult(selectedTool, resultContent) { } /** - * Encode ClientSideToolV2Result proto + * Encode ClientSideToolV2Result proto: { tool, mcp_result, call_id, model_call_id, tool_index } + * Represents the result of executing a tool */ -function encodeClientSideToolV2Result(toolCallId, modelCallId, selectedTool, resultContent) { +function encodeClientSideToolV2Result(toolCallId, modelCallId, selectedTool, resultContent, toolIndex = 1) { return concatArrays( encodeField(FIELD.CV2R_TOOL, WIRE_TYPE.VARINT, CLIENT_SIDE_TOOL_V2_MCP), encodeField(FIELD.CV2R_MCP_RESULT, WIRE_TYPE.LEN, encodeMcpResult(selectedTool, resultContent)), encodeField(FIELD.CV2R_CALL_ID, WIRE_TYPE.LEN, toolCallId), ...(modelCallId ? [encodeField(FIELD.CV2R_MODEL_CALL_ID, WIRE_TYPE.LEN, modelCallId)] : []), - encodeField(FIELD.CV2R_TOOL_INDEX, WIRE_TYPE.VARINT, 1) + encodeField(FIELD.CV2R_TOOL_INDEX, WIRE_TYPE.VARINT, toolIndex > 0 ? 
toolIndex : 1) ); } @@ -260,16 +326,17 @@ function encodeMcpParamsForCall(toolName, rawArgs, serverName) { } /** - * Encode ClientSideToolV2Call proto + * Encode ClientSideToolV2Call proto: { tool, mcp_params, call_id, name, raw_args, tool_index, model_call_id } + * Represents a tool call definition */ -function encodeClientSideToolV2Call(toolCallId, toolName, mcpToolName, rawArgs, modelCallId) { +function encodeClientSideToolV2Call(toolCallId, toolName, selectedTool, serverName, rawArgs, modelCallId, toolIndex = 1) { return concatArrays( encodeField(FIELD.CV2C_TOOL, WIRE_TYPE.VARINT, CLIENT_SIDE_TOOL_V2_MCP), - encodeField(FIELD.CV2C_MCP_PARAMS, WIRE_TYPE.LEN, encodeMcpParamsForCall(mcpToolName, rawArgs, "custom")), + encodeField(FIELD.CV2C_MCP_PARAMS, WIRE_TYPE.LEN, encodeMcpParamsForCall(selectedTool, rawArgs, serverName)), encodeField(FIELD.CV2C_CALL_ID, WIRE_TYPE.LEN, toolCallId), encodeField(FIELD.CV2C_NAME, WIRE_TYPE.LEN, toolName), encodeField(FIELD.CV2C_RAW_ARGS, WIRE_TYPE.LEN, rawArgs), - encodeField(FIELD.CV2C_TOOL_INDEX, WIRE_TYPE.VARINT, 1), + encodeField(FIELD.CV2C_TOOL_INDEX, WIRE_TYPE.VARINT, toolIndex > 0 ? toolIndex : 1), ...(modelCallId ? [encodeField(FIELD.CV2C_MODEL_CALL_ID, WIRE_TYPE.LEN, modelCallId)] : []) ); } @@ -282,23 +349,24 @@ export function encodeToolResult(toolResult) { const originalName = toolResult.tool_name || toolResult.name || ""; const toolName = formatToolName(originalName); const rawArgs = toolResult.raw_args || "{}"; - const resultContent = toolResult.result_content || ""; + const resultContent = toolResult.result_content || toolResult.result || ""; const { toolCallId, modelCallId } = parseToolId(toolResult.tool_call_id || ""); + const toolIndex = toolResult.tool_index || toolResult.index || 1; - // Derive mcpToolName: strip "mcp_" prefix → "custom_toolName" - const mcpToolName = toolName.startsWith("mcp_") ? 
toolName.slice(4) : originalName; + // Parse tool name to extract server and selected tool + const { serverName, selectedTool } = parseToolName(toolName); return concatArrays( encodeField(FIELD.TOOL_RESULT_CALL_ID, WIRE_TYPE.LEN, toolCallId), encodeField(FIELD.TOOL_RESULT_NAME, WIRE_TYPE.LEN, toolName), - encodeField(FIELD.TOOL_RESULT_INDEX, WIRE_TYPE.VARINT, toolResult.tool_index || 1), + encodeField(FIELD.TOOL_RESULT_INDEX, WIRE_TYPE.VARINT, toolIndex > 0 ? toolIndex : 1), ...(modelCallId ? [encodeField(FIELD.TOOL_RESULT_MODEL_CALL_ID, WIRE_TYPE.LEN, modelCallId)] : []), encodeField(FIELD.TOOL_RESULT_RAW_ARGS, WIRE_TYPE.LEN, rawArgs), encodeField(FIELD.TOOL_RESULT_RESULT, WIRE_TYPE.LEN, - encodeClientSideToolV2Result(toolCallId, modelCallId, mcpToolName, resultContent) + encodeClientSideToolV2Result(toolCallId, modelCallId, selectedTool, resultContent, toolIndex) ), encodeField(FIELD.TOOL_RESULT_TOOL_CALL, WIRE_TYPE.LEN, - encodeClientSideToolV2Call(toolCallId, toolName, mcpToolName, rawArgs, modelCallId) + encodeClientSideToolV2Call(toolCallId, toolName, selectedTool, serverName, rawArgs, modelCallId, toolIndex) ) ); } @@ -384,13 +452,71 @@ export function encodeRequest(messages, modelName, tools = [], reasoningEffort = const isAgentic = hasTools; const formattedMessages = []; const messageIds = []; + const normalizedMessages = []; - // Prepare messages + // Guardrail: split mixed assistant payload into separate assistant messages + // This prevents protobuf encoding errors when tool calls and results are in same message for (let i = 0; i < messages.length; i++) { const msg = messages[i]; + const hasToolCalls = Array.isArray(msg?.tool_calls) && msg.tool_calls.length > 0; + const hasToolResults = Array.isArray(msg?.tool_results) && msg.tool_results.length > 0; + + if (msg?.role === "assistant" && hasToolCalls && hasToolResults) { + log( + "ENCODE", + `normalizing mixed assistant tool payload at msg[${i}] (calls=${msg.tool_calls.length}, 
results=${msg.tool_results.length})` + ); + + // Keep assistant tool call message without embedded results + normalizedMessages.push({ + ...msg, + tool_results: [] + }); + + // Avoid inserting duplicate assistant tool-result message if next one already matches + const nextMsg = messages[i + 1]; + const nextHasToolResults = + nextMsg?.role === "assistant" && + Array.isArray(nextMsg?.tool_results) && + nextMsg.tool_results.length > 0; + const currentIds = new Set( + msg.tool_results.map(tr => tr?.tool_call_id).filter(id => typeof id === "string") + ); + const nextIds = new Set( + (nextMsg?.tool_results || []) + .map(tr => tr?.tool_call_id) + .filter(id => typeof id === "string") + ); + let sameIds = currentIds.size > 0 && currentIds.size === nextIds.size; + if (sameIds) { + for (const id of currentIds) { + if (!nextIds.has(id)) { + sameIds = false; + break; + } + } + } + + if (!(nextHasToolResults && sameIds)) { + normalizedMessages.push({ + role: "assistant", + content: "", + tool_results: msg.tool_results + }); + } + + continue; + } + + normalizedMessages.push(msg); + } + + // Prepare messages + for (let i = 0; i < normalizedMessages.length; i++) { + const msg = normalizedMessages[i]; const role = msg.role === "user" ? ROLE.USER : ROLE.ASSISTANT; const msgId = uuidv4(); - const isLast = i === messages.length - 1; + const isLast = i === normalizedMessages.length - 1; formattedMessages.push({ content: msg.content, @@ -719,6 +845,16 @@ export function extractTextFromResponse(payload) { try { const fields = decodeMessage(payload); + // Warn about unknown field numbers — may indicate a Cursor protocol update + for (const fieldNum of fields.keys()) { + if (!KNOWN_RESPONSE_FIELDS.has(fieldNum)) { + log( + "SCHEMA", + `Unknown response field #${fieldNum} detected. 
Schema v${PROTOBUF_SCHEMA_VERSION} may be outdated.` + ); + } + } + // Field 1: ClientSideToolV2Call if (fields.has(FIELD.TOOL_CALL)) { const toolCall = extractToolCall(fields.get(FIELD.TOOL_CALL)[0].value); @@ -731,7 +867,7 @@ export function extractTextFromResponse(payload) { // Field 2: StreamUnifiedChatResponse if (fields.has(FIELD.RESPONSE)) { const { text, thinking } = extractTextAndThinking(fields.get(FIELD.RESPONSE)[0].value); - + if (text || thinking) { return { text, error: null, toolCall: null, thinking }; } @@ -739,8 +875,15 @@ export function extractTextFromResponse(payload) { return { text: null, error: null, toolCall: null, thinking: null }; } catch (err) { - log("EXTRACT", `Error: ${err.message}`); - return { text: null, error: null, toolCall: null, thinking: null }; + log("EXTRACT", `Decode failed (schema v${PROTOBUF_SCHEMA_VERSION}): ${err.message}`); + return { + text: null, + error: null, + toolCall: null, + thinking: null, + raw: Buffer.from(payload).toString("base64"), + decodeError: err.message + }; } } diff --git a/open-sse/utils/error.js b/open-sse/utils/error.js index 31497dd..cf63787 100644 --- a/open-sse/utils/error.js +++ b/open-sse/utils/error.js @@ -1,4 +1,4 @@ -import { ERROR_TYPES, DEFAULT_ERROR_MESSAGES } from "../config/constants.js"; +import { ERROR_TYPES, DEFAULT_ERROR_MESSAGES } from "../config/runtimeConfig.js"; /** * Build OpenAI-compatible error response body diff --git a/open-sse/utils/ollamaTransform.js b/open-sse/utils/ollamaTransform.js index 33b77b7..b4fb6a6 100644 --- a/open-sse/utils/ollamaTransform.js +++ b/open-sse/utils/ollamaTransform.js @@ -49,7 +49,7 @@ export function transformToOllama(response, model) { const formattedCalls = toolCallsArr.map(tc => ({ function: { name: tc.function.name, - arguments: JSON.parse(tc.function.arguments || "{}") + arguments: (() => { try { return JSON.parse(tc.function.arguments || "{}"); } catch { return {}; } })() } })); const ollama = JSON.stringify({ @@ -75,6 +75,9 @@ export 
function transformToOllama(response, model) { } }); + if (!response.body) { + return new Response("", { status: response.status, headers: { "Content-Type": "application/x-ndjson" } }); + } return new Response(response.body.pipeThrough(transform), { headers: { "Content-Type": "application/x-ndjson", "Access-Control-Allow-Origin": "*" } }); diff --git a/open-sse/utils/proxyFetch.js b/open-sse/utils/proxyFetch.js index 87d7398..23dbdd1 100644 --- a/open-sse/utils/proxyFetch.js +++ b/open-sse/utils/proxyFetch.js @@ -1,10 +1,13 @@ +import { Readable } from "stream"; +import { MEMORY_CONFIG } from "../config/runtimeConfig.js"; + const isCloud = typeof caches !== "undefined" && typeof caches === "object"; const originalFetch = globalThis.fetch; const proxyDispatchers = new Map(); -// Constants -const DNS_CACHE = {}; +// DNS cache — use Map to avoid prototype pollution via malformed hostnames +const DNS_CACHE = new Map(); const MITM_BYPASS_HOSTS = ["cloudcode-pa.googleapis.com", "daily-cloudcode-pa.googleapis.com", "googleapis.com"]; const MITM_BYPASS_HEADER = "x-request-source"; const MITM_BYPASS_VALUE = "local"; @@ -22,7 +25,8 @@ function normalizeString(value) { * Resolve real IP using Google DNS (bypass system DNS) */ async function resolveRealIP(hostname) { - if (DNS_CACHE[hostname]) return DNS_CACHE[hostname]; + const cached = DNS_CACHE.get(hostname); + if (cached && Date.now() < cached.expiry) return cached.ip; try { const dns = await import("dns"); @@ -31,7 +35,7 @@ async function resolveRealIP(hostname) { resolver.setServers(GOOGLE_DNS_SERVERS); const resolve4 = promisify(resolver.resolve4.bind(resolver)); const addresses = await resolve4(hostname); - DNS_CACHE[hostname] = addresses[0]; + DNS_CACHE.set(hostname, { ip: addresses[0], expiry: Date.now() + MEMORY_CONFIG.dnsCacheTtlMs }); return addresses[0]; } catch (error) { console.warn(`[ProxyFetch] DNS resolve failed for ${hostname}:`, error.message); @@ -50,23 +54,27 @@ function shouldBypassMitmDns(url, options) 
{ headers[MITM_BYPASS_HEADER.charAt(0).toUpperCase() + MITM_BYPASS_HEADER.slice(1)] === MITM_BYPASS_VALUE; if (!hasLocalMarker) { - // Debug: log when bypass is not triggered - const hostname = new URL(url).hostname; - if (MITM_BYPASS_HOSTS.some(host => hostname.includes(host))) { - console.warn(`[ProxyFetch] MITM bypass NOT triggered for ${hostname} - missing header`); - } + try { + const hostname = new URL(url).hostname; + if (MITM_BYPASS_HOSTS.some(host => hostname.includes(host))) { + console.warn(`[ProxyFetch] MITM bypass NOT triggered for ${hostname} - missing header`); + } + } catch { /* invalid URL — skip debug log */ } return false; } - const hostname = new URL(url).hostname; - return MITM_BYPASS_HOSTS.some(host => hostname.includes(host)); + try { + const hostname = new URL(url).hostname; + return MITM_BYPASS_HOSTS.some(host => hostname.includes(host)); + } catch { return false; } } function shouldBypassByNoProxy(targetUrl, noProxyValue) { const noProxy = normalizeString(noProxyValue); if (!noProxy) return false; - const hostname = new URL(targetUrl).hostname.toLowerCase(); + let hostname; + try { hostname = new URL(targetUrl).hostname.toLowerCase(); } catch { return false; } const patterns = noProxy.split(",").map((p) => p.trim().toLowerCase()).filter(Boolean); return patterns.some((pattern) => { @@ -83,7 +91,8 @@ function getEnvProxyUrl(targetUrl) { const noProxy = process.env.NO_PROXY || process.env.no_proxy; if (shouldBypassByNoProxy(targetUrl, noProxy)) return null; - const protocol = new URL(targetUrl).protocol; + let protocol; + try { protocol = new URL(targetUrl).protocol; } catch { return null; } if (protocol === "https:") { return process.env.HTTPS_PROXY || process.env.https_proxy || @@ -132,6 +141,10 @@ async function getDispatcher(proxyUrl) { if (!normalized) return null; if (!proxyDispatchers.has(normalized)) { + // Evict oldest entry if max size reached + if (proxyDispatchers.size >= MEMORY_CONFIG.proxyDispatchersMaxSize) { + 
proxyDispatchers.delete(proxyDispatchers.keys().next().value); + } const { ProxyAgent } = await import("undici"); proxyDispatchers.set(normalized, new ProxyAgent({ uri: normalized })); } @@ -145,7 +158,7 @@ async function getDispatcher(proxyUrl) { async function createBypassRequest(parsedUrl, realIP, options) { const https = await import("https"); const net = await import("net"); - const { Readable } = require("stream"); + const { Readable } = await import("stream"); return new Promise((resolve, reject) => { const socket = new net.Socket(); diff --git a/open-sse/utils/requestLogger.js b/open-sse/utils/requestLogger.js index cc04540..010153d 100644 --- a/open-sse/utils/requestLogger.js +++ b/open-sse/utils/requestLogger.js @@ -44,7 +44,7 @@ async function createLogSession(sourceFormat, targetFormat, model) { } const timestamp = formatTimestamp(); - const safeModel = model.replace(/[/:]/g, "-"); + const safeModel = (model || "unknown").replace(/[/:]/g, "-"); const folderName = `${sourceFormat}_${targetFormat}_${safeModel}_${timestamp}`; const sessionPath = path.join(LOGS_DIR, folderName); diff --git a/open-sse/utils/sessionManager.js b/open-sse/utils/sessionManager.js index 35b3c25..7c08013 100644 --- a/open-sse/utils/sessionManager.js +++ b/open-sse/utils/sessionManager.js @@ -9,11 +9,24 @@ */ import crypto from "crypto"; +import { MEMORY_CONFIG } from "../config/runtimeConfig.js"; -// Runtime storage for session IDs (per connection/account) -// Key: connectionId (email or identifier), Value: sessionId +// Runtime storage: Key = connectionId, Value = { sessionId, lastUsed } const runtimeSessionStore = new Map(); +// Periodically evict entries that haven't been used within TTL +const cleanupInterval = setInterval(() => { + const now = Date.now(); + for (const [key, entry] of runtimeSessionStore) { + if (now - entry.lastUsed > MEMORY_CONFIG.sessionTtlMs) { + runtimeSessionStore.delete(key); + } + } +}, MEMORY_CONFIG.sessionCleanupIntervalMs); + +// Allow Node.js to 
exit even if interval is still active +if (cleanupInterval.unref) cleanupInterval.unref(); + /** * Get or create a session ID for the given connection. * @@ -30,22 +43,25 @@ const runtimeSessionStore = new Map(); */ export function deriveSessionId(connectionId) { if (!connectionId) { - // Fallback for requests without a connection identifier return generateBinaryStyleId(); } - // Check if we already have a session ID for this connection in this process run - if (runtimeSessionStore.has(connectionId)) { - return runtimeSessionStore.get(connectionId); + const existing = runtimeSessionStore.get(connectionId); + if (existing) { + existing.lastUsed = Date.now(); + return existing.sessionId; } - // Generate a new ID using the binary's exact logic - const newSessionId = generateBinaryStyleId(); + // Evict oldest entry if store exceeds max size (safety cap between cleanup cycles) + const MAX_SESSIONS = 1000; + if (runtimeSessionStore.size >= MAX_SESSIONS) { + const oldest = runtimeSessionStore.keys().next().value; + runtimeSessionStore.delete(oldest); + } - // Store it for future requests from this connection - runtimeSessionStore.set(connectionId, newSessionId); - - return newSessionId; + const sessionId = generateBinaryStyleId(); + runtimeSessionStore.set(connectionId, { sessionId, lastUsed: Date.now() }); + return sessionId; } /** diff --git a/open-sse/utils/stream.js b/open-sse/utils/stream.js index 4538035..138f2ee 100644 --- a/open-sse/utils/stream.js +++ b/open-sse/utils/stream.js @@ -6,7 +6,7 @@ import { parseSSELine, hasValuableContent, fixInvalidId, formatSSE } from "./str export { COLORS, formatSSE }; -const sharedDecoder = new TextDecoder(); +// sharedEncoder is stateless — safe to share across streams const sharedEncoder = new TextEncoder(); /** @@ -49,6 +49,9 @@ export function createSSEStream(options = {}) { let buffer = ""; let usage = null; + // Per-stream decoder with stream:true to correctly handle multi-byte chars split across chunks + const decoder = 
new TextDecoder("utf-8", { fatal: false }); + const state = mode === STREAM_MODE.TRANSLATE ? { ...initState(sourceFormat), provider, toolNameMap, model } : null; let totalContentLength = 0; @@ -61,7 +64,7 @@ export function createSSEStream(options = {}) { if (!ttftAt) { ttftAt = Date.now(); } - const text = sharedDecoder.decode(chunk, { stream: true }); + const text = decoder.decode(chunk, { stream: true }); buffer += text; reqLogger?.appendProviderChunk?.(text); @@ -159,10 +162,12 @@ export function createSSEStream(options = {}) { // Translate mode if (!trimmed) continue; - const parsed = parseSSELine(trimmed); + const parsed = parseSSELine(trimmed, targetFormat); if (!parsed) continue; - if (parsed && parsed.done) { + // For Ollama: done=true is the final chunk with finish_reason/usage, must translate + // For other formats: done=true is the [DONE] sentinel, skip + if (parsed && parsed.done && targetFormat !== FORMATS.OLLAMA) { const output = "data: [DONE]\n\n"; reqLogger?.appendConvertedChunk?.(output); controller.enqueue(sharedEncoder.encode(output)); @@ -251,7 +256,7 @@ export function createSSEStream(options = {}) { flush(controller) { trackPendingRequest(model, provider, connectionId, false); try { - const remaining = sharedDecoder.decode(); + const remaining = decoder.decode(); if (remaining) buffer += remaining; if (mode === STREAM_MODE.PASSTHROUGH) { diff --git a/open-sse/utils/streamHandler.js b/open-sse/utils/streamHandler.js index 5e90adc..10ac653 100644 --- a/open-sse/utils/streamHandler.js +++ b/open-sse/utils/streamHandler.js @@ -107,6 +107,9 @@ export function createDisconnectAwareStream(transformStream, streamController) { controller.enqueue(value); } catch (error) { streamController.handleError(error); + // Cleanup reader/writer to avoid orphaned streams + reader.cancel().catch(() => {}); + writer.abort().catch(() => {}); controller.error(error); } }, @@ -128,7 +131,7 @@ export function createDisconnectAwareStream(transformStream, 
streamController) { export function pipeWithDisconnect(providerResponse, transformStream, streamController) { const transformedBody = providerResponse.body.pipeThrough(transformStream); return createDisconnectAwareStream( - { readable: transformedBody, writable: { getWriter: () => ({ abort: () => {} }) } }, + { readable: transformedBody, writable: { getWriter: () => ({ abort: () => Promise.resolve() }) } }, streamController ); } diff --git a/open-sse/utils/streamHelpers.js b/open-sse/utils/streamHelpers.js index 50cfe06..a7a1918 100644 --- a/open-sse/utils/streamHelpers.js +++ b/open-sse/utils/streamHelpers.js @@ -1,8 +1,24 @@ import { FORMATS } from "../translator/formats.js"; // Parse SSE data line -export function parseSSELine(line) { - if (!line || line.charCodeAt(0) !== 100) return null; // 'd' = 100 +export function parseSSELine(line, format = null) { + if (!line) return null; + + // NDJSON format (Ollama): raw JSON lines without "data:" prefix + if (format === FORMATS.OLLAMA) { + const trimmed = line.trim(); + if (trimmed.startsWith("{")) { + try { + return JSON.parse(trimmed); + } catch (error) { + return null; + } + } + return null; + } + + // Standard SSE format: "data: {...}" + if (line.charCodeAt(0) !== 100) return null; // 'd' = 100 const data = line.slice(5).trim(); if (data === "[DONE]") return { done: true }; diff --git a/package.json b/package.json index 954972c..2b8373d 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "9router-app", - "version": "0.3.35", + "version": "0.3.51", "description": "9Router web dashboard", "private": true, "scripts": { @@ -15,6 +15,8 @@ "@monaco-editor/react": "^4.7.0", "@xyflow/react": "^12.10.1", "bcryptjs": "^3.0.3", + "better-sqlite3": "^12.6.2", + "confbox": "^0.2.4", "express": "^5.2.1", "fs": "^0.0.1-security", "http-proxy-middleware": "^3.0.5", @@ -28,6 +30,7 @@ "ora": "^9.1.0", "react": "19.2.4", "react-dom": "19.2.4", + "react-is": "^16.13.1", "recharts": "^3.7.0", "selfsigned": "^5.5.0", 
"socks-proxy-agent": "^8.0.5", diff --git a/public/providers/ollama-local.png b/public/providers/ollama-local.png new file mode 100644 index 0000000..8cd2cf1 Binary files /dev/null and b/public/providers/ollama-local.png differ diff --git a/public/providers/ollama.png b/public/providers/ollama.png new file mode 100644 index 0000000..8cd2cf1 Binary files /dev/null and b/public/providers/ollama.png differ diff --git a/public/providers/vertex-partner.png b/public/providers/vertex-partner.png new file mode 100644 index 0000000..00ee1a4 Binary files /dev/null and b/public/providers/vertex-partner.png differ diff --git a/public/providers/vertex.png b/public/providers/vertex.png new file mode 100644 index 0000000..00ee1a4 Binary files /dev/null and b/public/providers/vertex.png differ diff --git a/src/app/(dashboard)/dashboard/cli-tools/components/ClaudeToolCard.js b/src/app/(dashboard)/dashboard/cli-tools/components/ClaudeToolCard.js index 2db68c9..e518b9b 100644 --- a/src/app/(dashboard)/dashboard/cli-tools/components/ClaudeToolCard.js +++ b/src/app/(dashboard)/dashboard/cli-tools/components/ClaudeToolCard.js @@ -1,7 +1,7 @@ "use client"; import { useState, useEffect, useRef } from "react"; -import { Card, Button, ModelSelectModal, ManualConfigModal } from "@/shared/components"; +import { Card, Button, ModelSelectModal, ManualConfigModal, Tooltip } from "@/shared/components"; import Image from "next/image"; const CLOUD_URL = process.env.NEXT_PUBLIC_CLOUD_URL; @@ -31,6 +31,7 @@ export default function ClaudeToolCard({ const [modelAliases, setModelAliases] = useState({}); const [showManualConfigModal, setShowManualConfigModal] = useState(false); const [customBaseUrl, setCustomBaseUrl] = useState(""); + const [ccFilterNaming, setCcFilterNaming] = useState(false); const hasInitializedModels = useRef(false); const getConfigStatus = () => { @@ -64,6 +65,22 @@ export default function ClaudeToolCard({ if (isExpanded) fetchModelAliases(); }, [isExpanded]); + useEffect(() => { + 
fetch("/api/settings").then(r => r.json()).then(data => { + setCcFilterNaming(!!data.ccFilterNaming); + }).catch(() => {}); + }, []); + + const handleCcFilterNamingToggle = async (e) => { + const value = e.target.checked; + setCcFilterNaming(value); + await fetch("/api/settings", { + method: "PATCH", + headers: { "Content-Type": "application/json" }, + body: JSON.stringify({ ccFilterNaming: value }), + }).catch(() => {}); + }; + const fetchModelAliases = async () => { try { const res = await fetch("/api/models/alias"); @@ -319,6 +336,19 @@ export default function ClaudeToolCard({ {modelMappings[model.alias] && } ))} + + {/* CC Filter Naming */} +
+ Filter naming + arrow_forward + + + info + +
{message && ( diff --git a/src/app/(dashboard)/dashboard/profile/page.js b/src/app/(dashboard)/dashboard/profile/page.js index 4226137..63d985b 100644 --- a/src/app/(dashboard)/dashboard/profile/page.js +++ b/src/app/(dashboard)/dashboard/profile/page.js @@ -314,7 +314,7 @@ export default function ProfilePage() { } }; - const observabilityEnabled = settings.observabilityEnabled !== false; + const observabilityEnabled = settings.observabilityEnabled === true; return (
diff --git a/src/app/(dashboard)/dashboard/providers/[id]/page.js b/src/app/(dashboard)/dashboard/providers/[id]/page.js index e034df7..11869b2 100644 --- a/src/app/(dashboard)/dashboard/providers/[id]/page.js +++ b/src/app/(dashboard)/dashboard/providers/[id]/page.js @@ -34,6 +34,8 @@ export default function ProviderDetailPage() { const [selectedConnectionIds, setSelectedConnectionIds] = useState([]); const [bulkProxyPoolId, setBulkProxyPoolId] = useState("__none__"); const [bulkUpdatingProxy, setBulkUpdatingProxy] = useState(false); + const [providerStrategy, setProviderStrategy] = useState(null); // null = use global, "round-robin" = override + const [providerStickyLimit, setProviderStickyLimit] = useState(""); const { copied, copy } = useCopyToClipboard(); const providerInfo = providerNode @@ -75,14 +77,16 @@ export default function ProviderDetailPage() { const fetchConnections = useCallback(async () => { try { - const [connectionsRes, nodesRes, proxyPoolsRes] = await Promise.all([ + const [connectionsRes, nodesRes, proxyPoolsRes, settingsRes] = await Promise.all([ fetch("/api/providers", { cache: "no-store" }), fetch("/api/provider-nodes", { cache: "no-store" }), fetch("/api/proxy-pools?isActive=true", { cache: "no-store" }), + fetch("/api/settings", { cache: "no-store" }), ]); const connectionsData = await connectionsRes.json(); const nodesData = await nodesRes.json(); const proxyPoolsData = await proxyPoolsRes.json(); + const settingsData = settingsRes.ok ? 
await settingsRes.json() : {}; if (connectionsRes.ok) { const filtered = (connectionsData.connections || []).filter(c => c.provider === providerId); setConnections(filtered); @@ -90,6 +94,10 @@ export default function ProviderDetailPage() { if (proxyPoolsRes.ok) { setProxyPools(proxyPoolsData.proxyPools || []); } + // Load per-provider strategy override + const override = (settingsData.providerStrategies || {})[providerId] || {}; + setProviderStrategy(override.fallbackStrategy || null); + setProviderStickyLimit(override.stickyRoundRobinLimit != null ? String(override.stickyRoundRobinLimit) : "1"); if (nodesRes.ok) { let node = (nodesData.nodes || []).find((entry) => entry.id === providerId) || null; @@ -133,6 +141,49 @@ export default function ProviderDetailPage() { } }; + const saveProviderStrategy = async (strategy, stickyLimit) => { + try { + const settingsRes = await fetch("/api/settings", { cache: "no-store" }); + const settingsData = settingsRes.ok ? await settingsRes.json() : {}; + const current = settingsData.providerStrategies || {}; + + // Build override: null strategy means remove override, use global + const override = {}; + if (strategy) override.fallbackStrategy = strategy; + if (strategy === "round-robin" && stickyLimit !== "") { + override.stickyRoundRobinLimit = Number(stickyLimit) || 3; + } + + const updated = { ...current }; + if (Object.keys(override).length === 0) { + delete updated[providerId]; + } else { + updated[providerId] = override; + } + + await fetch("/api/settings", { + method: "PATCH", + headers: { "Content-Type": "application/json" }, + body: JSON.stringify({ providerStrategies: updated }), + }); + } catch (error) { + console.log("Error saving provider strategy:", error); + } + }; + + const handleRoundRobinToggle = (enabled) => { + const strategy = enabled ? "round-robin" : null; + const sticky = enabled ? 
(providerStickyLimit || "1") : providerStickyLimit; + if (enabled && !providerStickyLimit) setProviderStickyLimit("1"); + setProviderStrategy(strategy); + saveProviderStrategy(strategy, sticky); + }; + + const handleStickyLimitChange = (value) => { + setProviderStickyLimit(value); + saveProviderStrategy("round-robin", value); + }; + useEffect(() => { fetchConnections(); fetchAliases(); @@ -703,28 +754,27 @@ export default function ProviderDetailPage() {

Connections

- {!isCompatible && ( -
- {providerId === "iflow" && ( - - )} - -
- )} + {/* Round Robin toggle */} +
+ Round Robin + + {providerStrategy === "round-robin" && ( +
+ Sticky: + handleStickyLimitChange(e.target.value)} + placeholder="1" + className="w-14 px-2 py-1 text-xs border border-border rounded-md bg-background focus:outline-none focus:border-primary" + /> +
+ )} +
{connections.length === 0 ? ( @@ -750,6 +800,28 @@ export default function ProviderDetailPage() { ) : ( <> {connectionsList} + {!isCompatible && ( +
+ {providerId === "iflow" && ( + + )} + +
+ )} )}
@@ -1613,6 +1685,7 @@ function AddApiKeyModal({ isOpen, provider, providerName, isCompatible, isAnthro priority: formData.priority, proxyPoolId: formData.proxyPoolId === NONE_PROXY_POOL_VALUE ? null : formData.proxyPoolId, testStatus: isValid ? "active" : "unknown", + providerSpecificData: undefined }); } finally { setSaving(false); diff --git a/src/app/(dashboard)/dashboard/providers/page.js b/src/app/(dashboard)/dashboard/providers/page.js index 48af600..de6efa3 100644 --- a/src/app/(dashboard)/dashboard/providers/page.js +++ b/src/app/(dashboard)/dashboard/providers/page.js @@ -214,11 +214,10 @@ export default function ProvidersPage() { ); diff --git a/src/shared/components/Tooltip.js b/src/shared/components/Tooltip.js new file mode 100644 index 0000000..b7d0c0c --- /dev/null +++ b/src/shared/components/Tooltip.js @@ -0,0 +1,19 @@ +"use client"; + +export default function Tooltip({ text, children, position = "top" }) { + const posClass = { + top: "bottom-full left-1/2 -translate-x-1/2 mb-1.5", + bottom: "top-full left-1/2 -translate-x-1/2 mt-1.5", + left: "right-full top-1/2 -translate-y-1/2 mr-1.5", + right: "left-full top-1/2 -translate-y-1/2 ml-1.5", + }[position]; + + return ( +
+ {children} +
+ {text} +
+
+ ); +} diff --git a/src/shared/components/index.js b/src/shared/components/index.js index 56b3325..65f2bbc 100644 --- a/src/shared/components/index.js +++ b/src/shared/components/index.js @@ -25,6 +25,7 @@ export { default as KiroSocialOAuthModal } from "./KiroSocialOAuthModal"; export { default as CursorAuthModal } from "./CursorAuthModal"; export { default as IFlowCookieModal } from "./IFlowCookieModal"; export { default as SegmentedControl } from "./SegmentedControl"; +export { default as Tooltip } from "./Tooltip"; // Layouts export * from "./layouts"; diff --git a/src/shared/constants/config.js b/src/shared/constants/config.js index a680cf8..53a3e53 100644 --- a/src/shared/constants/config.js +++ b/src/shared/constants/config.js @@ -47,6 +47,8 @@ export const PROVIDER_ENDPOINTS = { openai: "https://api.openai.com/v1/chat/completions", anthropic: "https://api.anthropic.com/v1/messages", gemini: "https://generativelanguage.googleapis.com/v1beta/models", + ollama: "https://ollama.com/api/chat", + "ollama-local": "http://localhost:11434/api/chat", }; // Re-export from providers.js for backward compatibility diff --git a/src/shared/constants/providers.js b/src/shared/constants/providers.js index b5ba86d..f1dc0ee 100644 --- a/src/shared/constants/providers.js +++ b/src/shared/constants/providers.js @@ -15,6 +15,7 @@ export const OAUTH_PROVIDERS = { codex: { id: "codex", alias: "cx", name: "OpenAI Codex", icon: "code", color: "#3B82F6" }, github: { id: "github", alias: "gh", name: "GitHub Copilot", icon: "code", color: "#333333" }, cursor: { id: "cursor", alias: "cu", name: "Cursor IDE", icon: "edit_note", color: "#00D4AA" }, + // "kimi-coding": { id: "kimi-coding", alias: "kmc", name: "Kimi Coding", icon: "psychology", color: "#1E40AF", textIcon: "KC" }, kilocode: { id: "kilocode", alias: "kc", name: "Kilo Code", icon: "code", color: "#FF6B35", textIcon: "KC" }, cline: { id: "cline", alias: "cl", name: "Cline", icon: "smart_toy", color: "#5B9BD5", textIcon: "CL" }, 
}; @@ -47,7 +48,11 @@ export const APIKEY_PROVIDERS = { deepgram: { id: "deepgram", alias: "dg", name: "Deepgram", icon: "mic", color: "#13EF93", textIcon: "DG", website: "https://deepgram.com" }, assemblyai: { id: "assemblyai", alias: "aai", name: "AssemblyAI", icon: "record_voice_over", color: "#0062FF", textIcon: "AA", website: "https://assemblyai.com" }, nanobanana: { id: "nanobanana", alias: "nb", name: "NanoBanana", icon: "image", color: "#FFD700", textIcon: "NB", website: "https://nanobananaapi.ai" }, - chutes: { id: "chutes", alias: "ch", name: "Chutes AI", icon: "water_drop", color: "#5B6EF5", textIcon: "CH", website: "https://chutes.ai" }, + chutes: { id: "chutes", alias: "ch", name: "Chutes AI", icon: "water_drop", color: "#ffffffff", textIcon: "CH", website: "https://chutes.ai" }, + ollama: { id: "ollama", alias: "ollama", name: "Ollama Cloud", icon: "cloud", color: "#ffffffff", textIcon: "OL", website: "https://ollama.com" }, + "ollama-local": { id: "ollama-local", alias: "ollama-local", name: "Ollama Local", icon: "cloud", color: "#ffffffff", textIcon: "OL", website: "https://ollama.com" }, + vertex: { id: "vertex", alias: "vx", name: "Vertex AI", icon: "cloud", color: "#4285F4", textIcon: "VX", website: "https://cloud.google.com/vertex-ai" }, + "vertex-partner": { id: "vertex-partner", alias: "vxp", name: "Vertex Partner", icon: "cloud", color: "#34A853", textIcon: "VP", website: "https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-partner-models" }, }; export const OPENAI_COMPATIBLE_PREFIX = "openai-compatible-"; @@ -105,4 +110,11 @@ export const ID_TO_ALIAS = Object.values(AI_PROVIDERS).reduce((acc, p) => { }, {}); // Providers that support usage/quota API -export const USAGE_SUPPORTED_PROVIDERS = ["antigravity", "kiro", "github", "codex"]; +export const USAGE_SUPPORTED_PROVIDERS = [ + "claude", + "antigravity", + "kiro", + "github", + "codex", + "kimi-coding", +]; diff --git a/src/sse/handlers/chat.js 
b/src/sse/handlers/chat.js index 04eb51a..b8b3eb9 100644 --- a/src/sse/handlers/chat.js +++ b/src/sse/handlers/chat.js @@ -12,7 +12,7 @@ import { getModelInfo, getComboModels } from "../services/model.js"; import { handleChatCore } from "open-sse/handlers/chatCore.js"; import { errorResponse, unavailableResponse } from "open-sse/utils/error.js"; import { handleComboChat } from "open-sse/services/combo.js"; -import { HTTP_STATUS } from "open-sse/config/constants.js"; +import { HTTP_STATUS } from "open-sse/config/runtimeConfig.js"; import { detectFormatByEndpoint } from "open-sse/translator/formats.js"; import * as log from "../utils/logger.js"; import { updateProviderCredentials, checkAndRefreshToken } from "../services/tokenRefresh.js"; @@ -111,7 +111,7 @@ async function handleSingleModelChat(body, modelStr, clientRawRequest = null, re return handleComboChat({ body, models: comboModels, - handleSingleModel: (b, m) => handleSingleModelChat(b, m, clientRawRequest, request, apiKey, forceSourceFormat), + handleSingleModel: (b, m) => handleSingleModelChat(b, m, clientRawRequest, request, apiKey), log }); } @@ -132,12 +132,12 @@ async function handleSingleModelChat(body, modelStr, clientRawRequest = null, re const userAgent = request?.headers?.get("user-agent") || ""; // Try with available accounts (fallback on errors) - let excludeConnectionId = null; + const excludeConnectionIds = new Set(); let lastError = null; let lastStatus = null; while (true) { - const credentials = await getProviderCredentials(provider, excludeConnectionId, model); + const credentials = await getProviderCredentials(provider, excludeConnectionIds, model); // All accounts unavailable if (!credentials || credentials.allRateLimited) { @@ -147,7 +147,7 @@ async function handleSingleModelChat(body, modelStr, clientRawRequest = null, re log.warn("CHAT", `[${provider}/${model}] ${errorMsg} (${credentials.retryAfterHuman})`); return unavailableResponse(status, `[${provider}/${model}] ${errorMsg}`, 
credentials.retryAfter, credentials.retryAfterHuman); } - if (!excludeConnectionId) { + if (excludeConnectionIds.size === 0) { log.error("AUTH", `No credentials for provider: ${provider}`); return errorResponse(HTTP_STATUS.BAD_REQUEST, `No credentials for provider: ${provider}`); } @@ -156,8 +156,7 @@ async function handleSingleModelChat(body, modelStr, clientRawRequest = null, re } // Log account selection - const accountId = credentials.connectionId.slice(0, 8); - log.info("AUTH", `Using ${provider} account: ${accountId}...`); + log.info("AUTH", `\x1b[32mUsing ${provider} account: ${credentials.connectionName}\x1b[0m`); const refreshedCredentials = await checkAndRefreshToken(provider, credentials); @@ -172,6 +171,7 @@ async function handleSingleModelChat(body, modelStr, clientRawRequest = null, re } // Use shared chatCore + const chatSettings = await getSettings(); const result = await handleChatCore({ body: { ...body, model: `${provider}/${model}` }, modelInfo: { provider, model }, @@ -181,6 +181,7 @@ async function handleSingleModelChat(body, modelStr, clientRawRequest = null, re connectionId: credentials.connectionId, userAgent, apiKey, + ccFilterNaming: !!chatSettings.ccFilterNaming, // Detect source format by endpoint + body sourceFormatOverride: request?.url ? detectFormatByEndpoint(new URL(request.url).pathname, body) : null, onCredentialsRefreshed: async (newCreds) => { @@ -202,8 +203,8 @@ async function handleSingleModelChat(body, modelStr, clientRawRequest = null, re const { shouldFallback } = await markAccountUnavailable(credentials.connectionId, result.status, result.error, provider, model); if (shouldFallback) { - log.warn("AUTH", `Account ${accountId}... 
unavailable (${result.status}), trying fallback`); - excludeConnectionId = credentials.connectionId; + log.warn("AUTH", `Account ${credentials.connectionName} unavailable (${result.status}), trying fallback`); + excludeConnectionIds.add(credentials.connectionId); lastError = result.error; lastStatus = result.status; continue; diff --git a/src/sse/handlers/embeddings.js b/src/sse/handlers/embeddings.js index a8a68e0..4938094 100644 --- a/src/sse/handlers/embeddings.js +++ b/src/sse/handlers/embeddings.js @@ -9,7 +9,7 @@ import { getSettings } from "@/lib/localDb"; import { getModelInfo } from "../services/model.js"; import { handleEmbeddingsCore } from "open-sse/handlers/embeddingsCore.js"; import { errorResponse, unavailableResponse } from "open-sse/utils/error.js"; -import { HTTP_STATUS } from "open-sse/config/constants.js"; +import { HTTP_STATUS } from "open-sse/config/runtimeConfig.js"; import * as log from "../utils/logger.js"; import { updateProviderCredentials, checkAndRefreshToken } from "../services/tokenRefresh.js"; @@ -80,12 +80,12 @@ export async function handleEmbeddings(request) { } // Credential + fallback loop (mirrors handleChat) - let excludeConnectionId = null; + const excludeConnectionIds = new Set(); let lastError = null; let lastStatus = null; while (true) { - const credentials = await getProviderCredentials(provider, excludeConnectionId, model); + const credentials = await getProviderCredentials(provider, excludeConnectionIds, model); // All accounts unavailable if (!credentials || credentials.allRateLimited) { @@ -95,7 +95,7 @@ export async function handleEmbeddings(request) { log.warn("EMBEDDINGS", `[${provider}/${model}] ${errorMsg} (${credentials.retryAfterHuman})`); return unavailableResponse(status, `[${provider}/${model}] ${errorMsg}`, credentials.retryAfter, credentials.retryAfterHuman); } - if (!excludeConnectionId) { + if (excludeConnectionIds.size === 0) { log.error("AUTH", `No credentials for provider: ${provider}`); return 
errorResponse(HTTP_STATUS.BAD_REQUEST, `No credentials for provider: ${provider}`); } @@ -103,8 +103,7 @@ export async function handleEmbeddings(request) { return errorResponse(lastStatus || HTTP_STATUS.SERVICE_UNAVAILABLE, lastError || "All accounts unavailable"); } - const accountId = credentials.connectionId.slice(0, 8); - log.info("AUTH", `Using ${provider} account: ${accountId}...`); + log.info("AUTH", `\x1b[32mUsing ${provider} account: ${credentials.connectionName}\x1b[0m`); const refreshedCredentials = await checkAndRefreshToken(provider, credentials); @@ -131,8 +130,8 @@ export async function handleEmbeddings(request) { const { shouldFallback } = await markAccountUnavailable(credentials.connectionId, result.status, result.error, provider, model); if (shouldFallback) { - log.warn("AUTH", `Account ${accountId}... unavailable (${result.status}), trying fallback`); - excludeConnectionId = credentials.connectionId; + log.warn("AUTH", `Account ${credentials.connectionName} unavailable (${result.status}), trying fallback`); + excludeConnectionIds.add(credentials.connectionId); lastError = result.error; lastStatus = result.status; continue; diff --git a/src/sse/services/auth.js b/src/sse/services/auth.js index 3daf41b..15e055c 100644 --- a/src/sse/services/auth.js +++ b/src/sse/services/auth.js @@ -11,10 +11,14 @@ let selectionMutex = Promise.resolve(); * Get provider credentials from localDb * Filters out unavailable accounts and returns the selected account based on strategy * @param {string} provider - Provider name - * @param {string|null} excludeConnectionId - Connection ID to exclude (for retry with next account) + * @param {Set|string|null} excludeConnectionIds - Connection ID(s) to exclude (for retry with next account) * @param {string|null} model - Model name for per-model rate limit filtering */ -export async function getProviderCredentials(provider, excludeConnectionId = null, model = null) { +export async function getProviderCredentials(provider, 
excludeConnectionIds = null, model = null) { + // Normalize to Set for consistent handling + const excludeSet = excludeConnectionIds instanceof Set + ? excludeConnectionIds + : (excludeConnectionIds ? new Set([excludeConnectionIds]) : new Set()); // Acquire mutex to prevent race conditions const currentMutex = selectionMutex; let resolveMutex; @@ -27,7 +31,7 @@ export async function getProviderCredentials(provider, excludeConnectionId = nul const providerId = resolveProviderId(provider); const connections = await getProviderConnections({ provider: providerId, isActive: true }); - log.debug("AUTH", `${provider} | total connections: ${connections.length}, excludeId: ${excludeConnectionId || "none"}, model: ${model || "any"}`); + log.debug("AUTH", `${provider} | total connections: ${connections.length}, excludeIds: ${excludeSet.size > 0 ? [...excludeSet].join(",") : "none"}, model: ${model || "any"}`); if (connections.length === 0) { log.warn("AUTH", `No credentials for ${provider}`); @@ -36,14 +40,14 @@ export async function getProviderCredentials(provider, excludeConnectionId = nul // Filter out model-locked and excluded connections const availableConnections = connections.filter(c => { - if (excludeConnectionId && c.id === excludeConnectionId) return false; + if (excludeSet.has(c.id)) return false; if (isModelLockActive(c, model)) return false; return true; }); log.debug("AUTH", `${provider} | available: ${availableConnections.length}/${connections.length}`); connections.forEach(c => { - const excluded = excludeConnectionId && c.id === excludeConnectionId; + const excluded = excludeSet.has(c.id); const locked = isModelLockActive(c, model); if (excluded || locked) { const lockUntil = getEarliestModelLockUntil(c); @@ -72,11 +76,13 @@ export async function getProviderCredentials(provider, excludeConnectionId = nul } const settings = await getSettings(); - const strategy = settings.fallbackStrategy || "fill-first"; + // Per-provider strategy overrides global setting + 
const providerOverride = (settings.providerStrategies || {})[providerId] || {}; + const strategy = providerOverride.fallbackStrategy || settings.fallbackStrategy || "fill-first"; let connection; if (strategy === "round-robin") { - const stickyLimit = settings.stickyRoundRobinLimit || 3; + const stickyLimit = providerOverride.stickyRoundRobinLimit || settings.stickyRoundRobinLimit || 3; // Sort by lastUsed (most recent first) to find current candidate const byRecency = [...availableConnections].sort((a, b) => { @@ -178,7 +184,8 @@ export async function markAccountUnavailable(connectionId, status, errorText, pr }); const lockKey = Object.keys(lockUpdate)[0]; - log.warn("AUTH", `${connectionId.slice(0, 8)} locked ${lockKey} for ${Math.round(cooldownMs / 1000)}s [${status}]`); + const connName = conn?.displayName || conn?.name || conn?.email || connectionId.slice(0, 8); + log.warn("AUTH", `${connName} locked ${lockKey} for ${Math.round(cooldownMs / 1000)}s [${status}]`); if (provider && status && reason) { console.error(`❌ ${provider} [${status}]: ${reason}`); @@ -228,7 +235,8 @@ export async function clearAccountError(connectionId, currentConnection, model = } await updateProviderConnection(connectionId, clearObj); - log.info("AUTH", `Account ${connectionId.slice(0, 8)} cleared lock for model=${model || "__all"}`); + const connName = conn?.displayName || conn?.name || conn?.email || connectionId.slice(0, 8); + log.info("AUTH", `Account ${connName} cleared lock for model=${model || "__all"}`); } /** diff --git a/src/sse/services/tokenRefresh.js b/src/sse/services/tokenRefresh.js index ebcf896..759178a 100644 --- a/src/sse/services/tokenRefresh.js +++ b/src/sse/services/tokenRefresh.js @@ -19,7 +19,8 @@ import { getAccessToken as _getAccessToken, refreshTokenByProvider as _refreshTokenByProvider, formatProviderCredentials as _formatProviderCredentials, - getAllAccessTokens as _getAllAccessTokens + getAllAccessTokens as _getAllAccessTokens, + refreshKiroToken as 
_refreshKiroToken } from "open-sse/services/tokenRefresh.js"; export const TOKEN_EXPIRY_BUFFER_MS = BUFFER_MS; @@ -50,6 +51,9 @@ export const refreshGitHubToken = (refreshToken) => export const refreshCopilotToken = (githubAccessToken) => _refreshCopilotToken(githubAccessToken, log); +export const refreshKiroToken = (refreshToken, providerSpecificData) => + _refreshKiroToken(refreshToken, providerSpecificData, log); + export const getAccessToken = (provider, credentials) => _getAccessToken(provider, credentials, log); diff --git a/tests/unit/openai-to-claude.test.js b/tests/unit/openai-to-claude.test.js new file mode 100644 index 0000000..bf0a525 --- /dev/null +++ b/tests/unit/openai-to-claude.test.js @@ -0,0 +1,124 @@ +/** + * Unit tests for open-sse/translator/request/openai-to-claude.js + * + * Tests cover: + * - openaiToClaudeRequest() - OpenAI to Claude request translation + * - Response format handling (json_schema, json_object) + */ + +import { describe, it, expect } from "vitest"; +import { openaiToClaudeRequest } from "../../open-sse/translator/request/openai-to-claude.js"; + +describe("openaiToClaudeRequest", () => { + describe("response_format handling", () => { + it("should inject JSON schema instructions for json_schema type", () => { + const body = { + messages: [{ role: "user", content: "What is 2+2?" 
}],
+        response_format: {
+          type: "json_schema",
+          json_schema: {
+            name: "math_response",
+            schema: {
+              type: "object",
+              properties: {
+                answer: { type: "number" },
+                explanation: { type: "string" }
+              },
+              required: ["answer", "explanation"]
+            }
+          }
+        }
+      };
+
+      const result = openaiToClaudeRequest("claude-sonnet-4.5", body, false);
+
+      // Should have system array with instructions
+      expect(result.system).toBeDefined();
+      expect(Array.isArray(result.system)).toBe(true);
+
+      // Check that system prompt includes schema
+      const systemText = result.system
+        .filter(s => s.type === "text")
+        .map(s => s.text)
+        .join("\n");
+
+      expect(systemText).toContain("You must respond with valid JSON");
+      expect(systemText).toContain("\"answer\"");
+      expect(systemText).toContain("\"explanation\"");
+      expect(systemText).toContain("Respond ONLY with the JSON object");
+    });
+
+    it("should inject basic JSON instructions for json_object type", () => {
+      const body = {
+        messages: [{ role: "user", content: "Give me a JSON object" }],
+        response_format: {
+          type: "json_object"
+        }
+      };
+
+      const result = openaiToClaudeRequest("claude-sonnet-4.5", body, false);
+
+      // Should have system array with instructions
+      expect(result.system).toBeDefined();
+      expect(Array.isArray(result.system)).toBe(true);
+
+      const systemText = result.system
+        .filter(s => s.type === "text")
+        .map(s => s.text)
+        .join("\n");
+
+      expect(systemText).toContain("You must respond with valid JSON");
+      expect(systemText).toContain("Respond ONLY with a JSON object");
+    });
+
+    it("should not modify system prompt when response_format is missing", () => {
+      const body = {
+        messages: [{ role: "user", content: "Hello" }]
+      };
+
+      const result = openaiToClaudeRequest("claude-sonnet-4.5", body, false);
+
+      // Should have system but without JSON instructions
+      expect(result.system).toBeDefined();
+
+      const systemText = result.system
+        .filter(s => s.type === "text")
+        .map(s => s.text)
+        .join("\n");
+
+      // Should NOT contain JSON-specific instructions
+      expect(systemText).not.toContain("You must respond with valid JSON");
+    });
+
+    it("should preserve existing system messages when adding response_format", () => {
+      const body = {
+        messages: [
+          { role: "system", content: "You are a helpful math tutor." },
+          { role: "user", content: "What is 2+2?" }
+        ],
+        response_format: {
+          type: "json_schema",
+          json_schema: {
+            schema: {
+              type: "object",
+              properties: {
+                result: { type: "number" }
+              }
+            }
+          }
+        }
+      };
+
+      const result = openaiToClaudeRequest("claude-sonnet-4.5", body, false);
+
+      // Should preserve original system message
+      const systemText = result.system
+        .filter(s => s.type === "text")
+        .map(s => s.text)
+        .join("\n");
+
+      expect(systemText).toContain("You are a helpful math tutor");
+      expect(systemText).toContain("You must respond with valid JSON");
+    });
+  });
+});
\ No newline at end of file
diff --git a/tests/unit/translator-request-normalization.test.js b/tests/unit/translator-request-normalization.test.js
new file mode 100644
index 0000000..0a67a91
--- /dev/null
+++ b/tests/unit/translator-request-normalization.test.js
@@ -0,0 +1,118 @@
+import { describe, it, expect } from "vitest";
+
+import { FORMATS } from "../../open-sse/translator/formats.js";
+import { translateRequest } from "../../open-sse/translator/index.js";
+import { claudeToOpenAIRequest } from "../../open-sse/translator/request/claude-to-openai.js";
+import { filterToOpenAIFormat } from "../../open-sse/translator/helpers/openaiHelper.js";
+import { parseSSELine } from "../../open-sse/utils/streamHelpers.js";
+
+describe("request normalization", () => {
+  it("claudeToOpenAIRequest flattens text-only content arrays into string", () => {
+    const body = {
+      messages: [
+        {
+          role: "user",
+          content: [
+            { type: "text", text: "hi" },
+            { type: "text", text: "there" },
+          ],
+        },
+      ],
+    };
+
+    const result = claudeToOpenAIRequest("gpt-oss:120b", body, true);
+    expect(result.messages[0].content).toBe("hi\nthere");
+  });
+
+  it("claudeToOpenAIRequest preserves multimodal arrays", () => {
+    const body = {
+      messages: [
+        {
+          role: "user",
+          content: [
+            { type: "text", text: "describe" },
+            {
+              type: "image",
+              source: {
+                type: "base64",
+                media_type: "image/png",
+                data: "ZmFrZQ==",
+              },
+            },
+          ],
+        },
+      ],
+    };
+
+    const result = claudeToOpenAIRequest("gpt-4o", body, true);
+    expect(Array.isArray(result.messages[0].content)).toBe(true);
+  });
+
+  it("filterToOpenAIFormat flattens text-only arrays to string", () => {
+    const body = {
+      messages: [
+        {
+          role: "user",
+          content: [
+            { type: "text", text: "a" },
+            { type: "text", text: "b" },
+          ],
+        },
+      ],
+    };
+
+    const result = filterToOpenAIFormat(JSON.parse(JSON.stringify(body)));
+    expect(result.messages[0].content).toBe("a\nb");
+  });
+
+  it("translateRequest keeps /v1/messages Claude->OpenAI text payloads string-safe", () => {
+    const body = {
+      model: "ollama/gpt-oss:120b",
+      system: [{ type: "text", text: "You are helpful." }],
+      messages: [
+        {
+          role: "user",
+          content: [
+            { type: "text", text: "hello" },
+            { type: "text", text: "world" },
+          ],
+        },
+      ],
+      stream: true,
+    };
+
+    const result = translateRequest(
+      FORMATS.CLAUDE,
+      FORMATS.OPENAI,
+      "gpt-oss:120b",
+      JSON.parse(JSON.stringify(body)),
+      true,
+      null,
+      "ollama",
+    );
+
+    const userMessage = result.messages.find((m) => m.role === "user");
+    expect(typeof userMessage.content).toBe("string");
+    expect(userMessage.content).toBe("hello\nworld");
+  });
+
+  it("parseSSELine supports provider raw NDJSON stream lines", () => {
+    const raw = JSON.stringify({
+      model: "gpt-oss:120b",
+      message: { role: "assistant", content: "hello" },
+      done: false,
+    });
+
+    const parsed = parseSSELine(raw);
+    expect(parsed).toEqual({
+      model: "gpt-oss:120b",
+      message: { role: "assistant", content: "hello" },
+      done: false,
+    });
+  });
+
+  it("parseSSELine still supports SSE data lines", () => {
+    const parsed = parseSSELine('data: {"choices":[{"delta":{"content":"hi"}}]}');
+    expect(parsed.choices[0].delta.content).toBe("hi");
+  });
+});