Commit graph

267 commits

Author SHA1 Message Date
Jiayuan Zhang
3f9a30423d feat(session): add artifact storage and pre-emptive tool result truncation
Oversized tool results (>30% of context window) are now saved as artifacts
before being truncated in the session. The LLM sees a truncated version with
head+tail preservation and a marker pointing to the full artifact file,
which it can re-read on demand. This prevents information loss during
context window management.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 23:02:18 +08:00
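
The head+tail truncation described in 3f9a30423d might look roughly like the sketch below. The function name, the even head/tail split, and the marker wording are all assumptions made for illustration; the commit does not show the actual implementation.

```typescript
// Hypothetical helper: name, split ratio, and marker format are illustrative only.
function truncateHeadTail(
  text: string,
  maxChars: number,
  artifactPath: string,
): string {
  // Small results pass through untouched.
  if (text.length <= maxChars) return text;

  const headLen = Math.ceil(maxChars / 2);
  const tailLen = maxChars - headLen;
  const omitted = text.length - maxChars;

  // The marker tells the model where the full result was stored.
  const marker = `\n[... ${omitted} chars truncated; full result saved to ${artifactPath} ...]\n`;

  const head = text.slice(0, headLen);
  const tail = tailLen > 0 ? text.slice(text.length - tailLen) : "";
  return head + marker + tail;
}
```

Keeping both ends of the result preserves headers at the top and any final status or error lines at the bottom, which are often the most useful parts of a long tool output.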
Jiayuan Zhang
b412ca902b refactor(compaction): remove dead code and legacy count mode
- Remove compactMessagesWithSummary (~100 lines, never called; only
  the Chunked variant was used)
- Remove compactMessagesByCount, findSafeCompactionPoint, and all
  count-mode references (~90 lines)
- Narrow CompactionResult.reason to "tokens" | "summary" | "pruning"
- Narrow compactionMode to "tokens" | "summary" (was "count" | ...)
- Simplify session-manager: remove maxMessages/keepLast params,
  enable tool result pruning by default

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 21:37:23 +08:00
Jiayuan Zhang
92cf312843 refactor(compaction): remove pre-flight tool result pruning
Pre-flight compaction runs in-memory only (not persisted), so tool
result pruning in this path was wasted work — results were thrown away
after the LLM call. Post-turn compaction still handles pruning and
persists the results. Only Phase 2 (emergency message drop) remains
as a safety net in pre-flight.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 21:37:15 +08:00
Jiayuan Zhang
fbb0b11c6e fix(compaction): fix system prompt token estimation and reduce safety margin
- estimateSystemPromptTokens now uses estimateTokens() (chars/4) instead
  of chars/2, eliminating the 2x overestimate that caused pre-flight
  compaction to fire on every LLM call at small context windows
- ESTIMATION_SAFETY_MARGIN reduced from 1.5 to 1.2, increasing usable
  context from ~53% to ~73% before compaction triggers

At 200k context, effective usable tokens before compaction improved from
~86k to ~120k message tokens (39% increase).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 21:37:09 +08:00
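
To make the 2x overestimate in fbb0b11c6e concrete, here is a minimal sketch of the two heuristics side by side. Only the chars/4 and chars/2 ratios and the 1.5 and 1.2 margins come from the commit message; `estimateTokensLegacy` and `withSafetyMargin` are hypothetical names, and the real code may apply the margin differently.

```typescript
// Illustrative only: the repository's estimateTokens() may differ in detail.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// The old system-prompt heuristic (chars/2), kept here for comparison.
function estimateTokensLegacy(text: string): number {
  return Math.ceil(text.length / 2);
}

// Hypothetical helper: pad an estimate before comparing it to the context window.
function withSafetyMargin(tokens: number, margin: number): number {
  return Math.ceil(tokens * margin);
}
```

For an 8,000-character system prompt the legacy heuristic yields 4,000 tokens and the chars/4 heuristic yields 2,000, so the old path charged the prompt twice as much of the budget before any margin was applied.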
Jiayuan Zhang
40a2e8ae55 fix(context-window): prioritize config over model for context window resolution
resolveContextWindowInfo now uses config > model > default priority so
explicit --context-window flag overrides model defaults. Also adds
--context-window CLI option to the run command.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 21:37:02 +08:00
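
The config > model > default priority from 40a2e8ae55 can be sketched as follows. The input shape, the return shape, and the default value are assumptions; the real `resolveContextWindowInfo` signature is not shown in the commit.

```typescript
// Hypothetical shapes for illustration.
interface ContextWindowSources {
  configTokens?: number; // e.g. from a --context-window flag
  modelTokens?: number; // advertised by the model metadata
}

interface ContextWindowInfo {
  tokens: number;
  source: "config" | "model" | "default";
}

const DEFAULT_CONTEXT_TOKENS = 200_000; // assumed default, not taken from the repository

function resolveContextWindow(src: ContextWindowSources): ContextWindowInfo {
  // Explicit configuration always wins.
  if (src.configTokens !== undefined && src.configTokens > 0) {
    return { tokens: src.configTokens, source: "config" };
  }
  // Otherwise trust the model's own metadata.
  if (src.modelTokens !== undefined && src.modelTokens > 0) {
    return { tokens: src.modelTokens, source: "model" };
  }
  return { tokens: DEFAULT_CONTEXT_TOKENS, source: "default" };
}
```

Returning the source alongside the value makes it easy to log why a particular window size was chosen.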
Jiayuan Zhang
084657868f revert(agent): remove parallel tool execution patch, keep serial
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 20:43:37 +08:00
Jiayuan Zhang
b394b0ccf9 fix(skills): use python3 and inject skill directory path into prompt
- SKILL.md: python → python3 (macOS has no `python` binary)
- skills/index.ts: inject skill directory path so agent can resolve
  relative paths like scripts/recalc.py to absolute paths

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 19:53:18 +08:00
Jiayuan Zhang
691e33e71e fix(tools): auto-create group when custom groupId is provided
The LLM often invents custom groupId strings that don't exist in the
registry, causing "group not found" errors. The tool now auto-creates
the group instead, matching the behavior when `next` is provided.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 19:53:12 +08:00
Jiayuan Zhang
d162ba98a9 fix(agent): pass sessionId to tools for sub-agent session tracking
toolsOptions spread `options` which had sessionId undefined for
auto-generated sessions. This caused sessions_list and sessions_spawn
to fail with "No session ID available" — breaking sub-agent orchestration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 19:53:03 +08:00
Jiayuan Zhang
a254daff01 feat(agent): enable parallel tool execution via pi-agent-core patch
Replace sequential for+await tool dispatch with Promise.allSettled for
parallel execution. All tool_execution_start events emit immediately,
tools run concurrently, results are processed in original order.

Also fix run-log toolStartTimes to key by toolCallId instead of toolName
to prevent collisions with parallel same-name tools.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 18:47:53 +08:00
Jiayuan Zhang
c012bff246 fix(tools): show findings and full runId in sessions_list list view
Grouped runs now display findings for completed sub-agents (up to 4000
chars). Ungrouped runs increased truncation from 200 to 4000 chars. All
status lines include full runId for subsequent API queries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 18:47:44 +08:00
Jiayuan Zhang
02ed09b77b fix(tools): use boolean error flag in web_fetch and web_search error responses
Return error: true (boolean) with code field instead of error: "string_code"
to match ToolErrorPayload convention. Also update runner.ts formatRunLogToolSummary
to prefer details.code for error categorization.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 18:47:37 +08:00
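
A rough sketch of the error shape change in 02ed09b77b is shown below. The interface is an assumed approximation of `ToolErrorPayload` (the real type may carry more fields), and `toolError` and `errorCategory` are hypothetical helper names.

```typescript
// Assumed shape; only `error: true` and `code` are named in the commit message.
interface ToolErrorPayloadSketch {
  error: true;
  code: string;
  message: string;
}

function toolError(code: string, message: string): ToolErrorPayloadSketch {
  return { error: true, code, message };
}

// Prefer the machine-readable code, accepting the legacy string form as a fallback.
function errorCategory(details: { error?: unknown; code?: unknown }): string | undefined {
  if (typeof details.code === "string") return details.code;
  if (typeof details.error === "string") return details.error; // legacy: error: "string_code"
  return undefined;
}
```

With a boolean flag, callers can test `error === true` uniformly across tools and read the category from `code` instead of parsing the flag itself.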
Jiayuan Zhang
755ed5e9de feat(run-log): add result metadata to tool_end events
Enrich tool_end events with result_chars, result_summary, and
error_type fields. Since run-log.jsonl is append-only and never
compacted, this preserves tool result metadata that would otherwise
be lost when session.jsonl undergoes compaction.

New fields:
- result_chars: total character count of result content
- result_summary: short tool-specific summary (e.g. "10 results",
  "12.5KB", "finance/get_price_snapshot")
- error_type: error category for tool errors (e.g. "fetch_failed")

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 18:06:42 +08:00
Jiayuan Zhang
1c24dd2885 fix(credentials): add fallback to ~/.super-multica for custom data dirs
When SMC_DATA_DIR is set (e.g., for E2E tests), the credentials lookup
now falls back to ~/.super-multica/credentials.json5 if the custom
data dir doesn't have its own credentials file. This mirrors the
existing fallback pattern in auth-store.ts and removes the need for
the SMC_CREDENTIALS_PATH workaround in E2E tests.

Lookup order:
1. SMC_CREDENTIALS_PATH env var (explicit override)
2. {DATA_DIR}/credentials.json5 (current data dir)
3. ~/.super-multica/credentials.json5 (default location fallback)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 17:48:21 +08:00
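
The three-step lookup order listed in 1c24dd2885 can be sketched as a pure function. The function name and parameters are hypothetical; the environment and the file-existence check are injected so the sketch runs without touching the file system.

```typescript
// Sketch of the lookup order; names and signature are assumptions.
function resolveCredentialsPath(
  env: Record<string, string | undefined>,
  homeDir: string,
  exists: (path: string) => boolean,
): string {
  const defaultDir = `${homeDir}/.super-multica`;

  // 1. Explicit override wins.
  const override = env["SMC_CREDENTIALS_PATH"];
  if (override) return override;

  // 2. Credentials inside the current data dir, if the file is present.
  const dataDir = env["SMC_DATA_DIR"] ?? defaultDir;
  const inDataDir = `${dataDir}/credentials.json5`;
  if (exists(inDataDir)) return inDataDir;

  // 3. Fall back to the default location.
  return `${defaultDir}/credentials.json5`;
}
```

In an E2E run with `SMC_DATA_DIR` pointing at an empty temp directory, step 2 finds nothing and the lookup lands on the default file in the home directory.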
Jiayuan Zhang
75fac3a2d7 fix(auth): fallback to dev auth.json for E2E tests
web_search and data tools authenticate via auth.json (sid + deviceId).
When SMC_DATA_DIR is set (e.g. for E2E tests), the auth file may not
exist in the custom dir. Now getLocalAuth() falls back to
~/.super-multica-dev/auth.json, which is created by pnpm dev:local
Desktop login and valid for the dev backend (api-dev.copilothub.ai).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 16:37:26 +08:00
Jiayuan Zhang
a2c1379c1d feat(cli): add --run-log flag and session dir output for agent-driven E2E testing
Add --run-log CLI flag to enable structured run logging without env var.
Print session directory path to stderr when run-log is enabled so Coding
Agents can easily locate log files for analysis.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 16:03:40 +08:00
Jiayuan Zhang
239dc5a7c6 fix(agent): report accurate compaction metrics and add run-log observability
Compaction was reporting only 189 tokens removed for 6 messages because
Phase 1 (tool result pruning) hollowed out messages before Phase 2
(summary compaction) measured them. Now captures pre-pruning token count
and reports combined savings from both phases.

Also threads RunLog through SessionManager to emit tool_result_pruning
and compaction_detail events, and adds preflight pruning stats logging.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 15:42:04 +08:00
Jiayuan Zhang
dba0c32d74
Merge pull request #199 from multica-ai/forrestchang/skill-env-storage
feat(skills): implement per-skill .env files with auto-discovery
2026-02-15 14:46:32 +08:00
Jiayuan Zhang
99167b9837 fix(agent): re-validate tool pairing after preflight compaction
The transformContext pipeline ran sanitizeToolUseResultPairing() before
preflightCompact(), but compaction (pruneToolResults + compactMessagesTokenAware)
can break tool_use/tool_result pairing by dropping assistant messages while
keeping their tool_result blocks. This caused 400 errors from the Anthropic API:
"unexpected tool_use_id found in tool_result blocks".

Add a second sanitizeToolUseResultPairing() call after preflightCompact()
to repair any orphaned tool_result blocks created during compaction.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 14:25:48 +08:00
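
The repair step in 99167b9837 can be illustrated with a simplified message model. This is not the repository's `sanitizeToolUseResultPairing()`; it is a sketch that handles only the orphaned `tool_result` case named in the commit, with types reduced to the minimum.

```typescript
// Simplified message model; the real types are richer.
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string }
  | { type: "tool_result"; tool_use_id: string };

interface Message {
  role: "user" | "assistant";
  content: Block[];
}

function dropOrphanedToolResults(messages: Message[]): Message[] {
  const seenToolUseIds = new Set<string>();
  const out: Message[] = [];
  for (const msg of messages) {
    const content: Block[] = [];
    for (const block of msg.content) {
      if (block.type === "tool_use") seenToolUseIds.add(block.id);
      // A tool_result is kept only if its tool_use appeared earlier.
      if (block.type === "tool_result" && !seenToolUseIds.has(block.tool_use_id)) continue;
      content.push(block);
    }
    // Messages left empty by the repair are dropped entirely.
    if (content.length > 0) out.push({ ...msg, content });
  }
  return out;
}
```

Running a pass like this after compaction as well as before it means any assistant message dropped by compaction takes its dangling results with it.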
Jiayuan Zhang
57805cddb8 fix(ui): remove Cmd+B sidebar toggle shortcut
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 14:18:49 +08:00
Jiayuan Zhang
a4b7deac3e fix(skills): preserve all user files during bundled skill upgrades
Instead of only protecting .env files, use cpSync with force:true to
overlay bundle files onto the existing directory. This preserves any
user-created files (credentials.json, token.json, etc.) that don't
exist in the bundle, rather than deleting and re-copying the entire
directory.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 14:15:31 +08:00
Jiayuan Zhang
8848f09107 refactor(skills): remove hardcoded API key hints, use dynamic web search
Remove hardcoded service API key hints from getApiKeyHint() — skill-specific
hints should be discovered dynamically by the agent via web_search/web_fetch
at runtime. Only keep LLM provider hints which are system-level. Update
skill-creator instructions accordingly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 13:39:14 +08:00
Jiayuan Zhang
6f67bb77b8 feat(skills): expose ineligible skills in system prompt for auto-discovery
Add buildIneligibleSkillsSummary() to SkillManager that surfaces skills
with actionable issues (missing env vars, binaries) in the agent's system
prompt. Expand getApiKeyHint() with common service API providers. Update
buildSkillsSection() to guide the agent to suggest activating inactive
skills when they match user intent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 13:34:51 +08:00
Jiayuan Zhang
bd0b380e2e refactor(credentials): remove skills.env.json5 support
Remove centralized skills.env.json5 in favor of per-skill .env files.
Clean up CredentialManager by removing hasEnv/getEnv/getResolvedEnvSnapshot
methods and skills env loading. Update CLI credentials and skills commands.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 13:34:51 +08:00
Jiayuan Zhang
9f98ccca58 feat(skills): store API keys in per-skill .env files
Move skill environment variables from centralized skills.env.json5 to
per-skill .env files within each skill's directory. This makes credential
management more intuitive and self-contained.

- Fix parser to handle metadata.requires, always, os, skillKey, install
- Add minimal .env parser (dotenv.ts) and load .env at skill parse time
- Add env field to Skill type for per-skill environment variables
- Update eligibility checker to use skill.env instead of CredentialManager
- Preserve user .env files across bundled skill upgrades in loader

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 13:34:51 +08:00
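
A minimal .env parser of the kind mentioned in 9f98ccca58 could look like the sketch below. It is not the repository's `dotenv.ts`; it assumes only `KEY=VALUE` lines, `#` comments, blank lines, and matching surrounding quotes, with no escapes, multi-line values, or variable expansion.

```typescript
// Minimal sketch under the assumptions stated above.
function parseDotenv(source: string): Record<string, string> {
  const env: Record<string, string> = {};
  for (const rawLine of source.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;

    const eq = line.indexOf("=");
    if (eq <= 0) continue; // skip lines with no key or no '='

    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();

    // Strip one pair of matching single or double quotes.
    const quote = value.charAt(0);
    if (value.length >= 2 && (quote === '"' || quote === "'") && value.endsWith(quote)) {
      value = value.slice(1, -1);
    }
    env[key] = value;
  }
  return env;
}
```

Parsing at skill load time lets the eligibility checker read required variables from the skill's own directory without a central credentials store.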
Naiyuan Qing
430f2c177e refactor(ui): move LoadingIndicator into MessageList
- Move LoadingIndicator from ChatView into MessageList for consistent padding
- Add isLoading and hasPendingApprovals props to MessageList
- Adjust message spacing (my-1 → my-2) for better visual balance

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-15 11:03:59 +08:00
Naiyuan Qing
deb747a859 refactor(ui): unify loading indicator component
- Create LoadingIndicator component with "generating" and "streaming" variants
- Remove inline loading indicator from StreamingMarkdown (empty content returns empty fragment)
- Use unified LoadingIndicator in ChatView with consistent positioning
- Eliminates layout shift between different loading states

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-15 10:52:46 +08:00
Naiyuan Qing
c6ca5f3270 refactor(ui): unify container layout and adjust spacing
- Use container utility class consistently across chat components
- Change container max-width from 5xl to 4xl for better readability
- Adjust message bubble padding (p-3 -> p-2)
- Fix logout dropdown alignment and add destructive variant

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-15 10:47:59 +08:00
Naiyuan Qing
0c5de3c5f4 fix(ui): preserve newlines in chat input multiline text
- Use TipTap's getText({ blockSeparator: '\n' }) instead of doc.textContent
  to preserve newlines between paragraphs when submitting messages
- Add whitespace-pre-wrap CSS to user message bubbles to render newlines
- Add className prop support to StreamingMarkdown component

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-15 10:32:58 +08:00
yushen
4dba1cfdf0 refactor: unify API URL env var to MULTICA_API_URL
Replace scattered API_URL, MAIN_VITE_API_URL, and RENDERER_VITE_API_URL
with a single MULTICA_API_URL across all apps and packages.

- Desktop: use envPrefix to expose MULTICA_* to main process, rename
  RENDERER_VITE_API_URL → RENDERER_VITE_MULTICA_API_URL, remove
  MAIN_VITE_API_URL (now read directly via MULTICA_API_URL)
- Web: add .env.development with MULTICA_API_URL, enforce required check
  in next.config.ts, update .gitignore to allow .env.development
- Core: make MULTICA_API_URL required in api-client (no silent fallback)
- Scripts: pass MULTICA_API_URL in dev-local.sh for web process
- Turbo: update globalEnv from API_URL to MULTICA_API_URL
- Docs: update references to the new env var name

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 06:31:00 +08:00
Jiayuan Zhang
7a92f716d9
refactor(skills): remove unused SkillConfig.apiKey/env/primaryEnv (#192)
These fields were only checked during eligibility but never injected
at runtime via credentialManager.getEnv(). Remove the half-implemented
per-skill credential config to reduce confusion.

API key configuration remains supported via skills.env.json5 and
process.env.

Refs: MUL-246, MUL-255

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 03:22:02 +08:00
Jiayuan Zhang
058af56d47
fix(ui): show generating indicator while agent is processing (#191)
When the user sends a message and the agent hasn't started streaming
yet, the chat area showed no visual feedback. Now a "Generating..."
spinner appears between message send and the first streaming content,
matching the existing indicator style used in StreamingMarkdown.

Closes MUL-224

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 03:13:18 +08:00
Jiayuan Zhang
418282da15
Merge pull request #183 from multica-ai/forrestchang/optimize-office-skills
refactor(skills): deduplicate office module and merge PDF scripts
2026-02-15 02:05:04 +08:00
Jiayuan Zhang
6234ba6139 refactor(skills): deduplicate office/ module across docx, xlsx, pptx
Move the shared office/ directory (pack/unpack, validators, schemas,
soffice wrapper) to skills/_shared/office/ and replace the three
identical copies with symlinks. Update skill loader to dereference
symlinks during copy to managed directory, and skip _-prefixed
directories in the bundled skills scan.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 01:31:53 +08:00
Jiayuan Zhang
40fb94e3c3 feat(utils): add SMC_DATA_DIR env var to override root data directory
Allows isolating dev and prod data by setting SMC_DATA_DIR (e.g.
~/.super-multica-dev). Also fixes cron/store.ts which bypassed
DATA_DIR with a hardcoded path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 00:39:13 +08:00
yushen
1a9e915d49 fix(hub): skip text processing after MessageAggregator reset
Add an `active` flag so that message_update/message_end events are
ignored after reset() until the next message_start, preventing stale
accumulated text from being re-emitted as a block.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 07:49:20 +08:00
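
The `active` flag from 1a9e915d49 can be sketched with a reduced aggregator. Event names follow the commit message; the class name, the `emitted` array, and the choice to deactivate on `message_end` are assumptions.

```typescript
// Hypothetical reduction of the aggregator for illustration.
type AggregatorEvent =
  | { type: "message_start" }
  | { type: "message_update"; text: string }
  | { type: "message_end" };

class MessageAggregatorSketch {
  private active = false;
  private buffer = "";
  readonly emitted: string[] = [];

  handle(event: AggregatorEvent): void {
    if (event.type === "message_start") {
      this.active = true;
      this.buffer = "";
      return;
    }
    // After reset(), updates and ends are ignored until the next message_start.
    if (!this.active) return;

    if (event.type === "message_update") {
      this.buffer += event.text;
    } else {
      this.emitted.push(this.buffer);
      this.buffer = "";
      this.active = false;
    }
  }

  reset(): void {
    this.active = false;
    this.buffer = "";
  }
}
```

Without the guard, a `message_end` arriving after `reset()` would flush whatever text had accumulated since the reset as if it were a complete block.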
yushen
b8aa5ad7e6 refactor(core): convert MULTICA_API_URL constant to getApiBaseUrl() function
Lazy-read process.env at call time instead of module import time.
This ensures the env bridge in the Electron main process has time to
set process.env.MULTICA_API_URL before the first API request.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 07:30:36 +08:00
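
The difference between reading the variable at import time and at call time, as changed in b8aa5ad7e6, is sketched below. A plain object stands in for `process.env` so the example is self-contained; the error thrown on a missing value is an assumption, not confirmed behavior of this commit.

```typescript
// Stand-in for process.env.
const env: Record<string, string | undefined> = {};

// Eager: captured once, when the module is first evaluated.
const EAGER_API_URL = env["MULTICA_API_URL"];

// Lazy: read on every call, so later assignments are visible.
function getApiBaseUrl(): string {
  const url = env["MULTICA_API_URL"];
  if (!url) throw new Error("MULTICA_API_URL is not set"); // assumed behavior
  return url;
}

// Simulates the env bridge populating the variable after the module loaded.
env["MULTICA_API_URL"] = "https://api.example.test";
```

Here `EAGER_API_URL` stays `undefined` because it was read before the bridge ran, while `getApiBaseUrl()` sees the value set afterwards.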
Jiayuan Zhang
ef8b38899a fix(core): check for "toolCall" type in hasToolUse() to match pi-ai types
The hasToolUse() function was checking for "tool_use" (raw Anthropic format)
but pi-ai normalizes tool call blocks to type "toolCall". This made tool
narration non-functional in the ChannelManager (Desktop/embedded) path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 03:32:16 +08:00
Jiayuan Zhang
3c911807e1 chore(channels): remove unused Desktop Telegram channel plugin
The Telegram bot is handled entirely by the Gateway
(apps/gateway/telegram/). The Desktop channel plugin was never
registered (commented out in initChannels) and is dead code.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 02:28:00 +08:00
Jiayuan Zhang
dde8cc542a feat(telegram): show editable status message during tool execution
Instead of silently discarding tool narration or spamming separate
messages, send a single editable Telegram message that updates in-place
as the agent works. First tool narration sends a reply; subsequent
narrations edit the same message. The final answer is sent as a new
message.

- Add replyTextEditable/editText to ChannelOutboundAdapter interface
- Implement editFormatted + editable methods in Telegram plugin
- Track statusMessageId in ChannelManager, clear on agent_end/error
- Add sendOrEditStatus to Gateway TelegramService with same behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 02:14:42 +08:00
Jiayuan Zhang
81998e6309 fix(telegram): skip tool narration messages, only send final answer
When the agent uses tools (web search, etc.), it generates intermediate
narration text like "Let me search..." before each tool call. These were
being sent as separate Telegram messages, causing message spam. Now we
detect tool_use blocks in the message content and skip sending those
intermediate messages — only the final answer reaches the user.

Applied to both Desktop channel plugin and Gateway Telegram service.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 02:04:17 +08:00
Jiayuan Zhang
8270762d66 feat(telegram): convert Markdown tables to vertical list format
Telegram doesn't support HTML tables, so pipe-delimited Markdown tables
were rendered as hard-to-read plain text on mobile. This converts tables
to a vertical "Header: Value" format with bold first column before
sending to Telegram.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 01:56:21 +08:00
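
One possible reading of the table conversion in 8270762d66 is sketched below. It assumes a well-formed pipe table with a header row and a separator row, bolds the first cell of each row with Markdown asterisks, and renders the remaining cells as "Header: Value" lines. The actual output format and bold syntax used for Telegram are not shown in the commit.

```typescript
// Sketch only: no handling of escaped pipes or alignment markers beyond skipping the separator row.
function splitRow(line: string): string[] {
  return line
    .trim()
    .replace(/^\|/, "")
    .replace(/\|$/, "")
    .split("|")
    .map((cell) => cell.trim());
}

function tableToVerticalList(table: string): string {
  const lines = table
    .split("\n")
    .map((l) => l.trim())
    .filter((l) => l !== "");
  if (lines.length < 3) return table; // header + separator + at least one row

  const headers = splitRow(lines[0] ?? "");
  const rows = lines.slice(2).map(splitRow);

  return rows
    .map((cells) =>
      cells
        .map((cell, i) => (i === 0 ? `**${cell}**` : `${headers[i] ?? ""}: ${cell}`))
        .join("\n"),
    )
    .join("\n\n");
}
```

Each table row becomes a short block, which reads more naturally on a narrow mobile screen than a wrapped pipe-delimited line.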
Jiayuan Zhang
9189883710
Merge pull request #175 from multica-ai/forrestchang/telegram-arch
feat(gateway): Telegram QR-based connection flow with centralized bot
2026-02-14 01:45:46 +08:00
Jiayuan Zhang
c2a2c75b24
Merge pull request #174 from multica-ai/forrestchang/arch-analysis
test(core): migrate tests to strict mock policy with real implementations
2026-02-14 00:13:24 +08:00
Jiang Bohan
a86709a4cd chore: switch API host from api-dev.copilothub.ai to api.multica.ai
Update all backend API base URLs to use the production multica.ai domain.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 23:25:00 +08:00
Jiayuan Zhang
4cb8b93f93 refactor(desktop): hide manual Telegram bot creation, use official Gateway bot only
Remove the user-facing ability to create custom Telegram bots via BotFather.
Non-technical users should only need to message @multica_bot on Telegram.
- Disable telegramChannel plugin registration in initChannels()
- Remove ConnectStep from onboarding flow (Privacy → Provider → Start)
- Replace TelegramCard with simple text pointing to @multica_bot

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 22:20:39 +08:00
Jiayuan Zhang
f6b6c2909b test(core): migrate 5 tests from internal mocks to real implementations
- store.test.ts: use baseDir option instead of mocking paths.js
- session-file-repair.test.ts: remove write-lock mock, assert behavior
- announce-findings.test.ts: use real storage with temp dirs
- sessions-list.test.ts: use real registry with seed helper
- compaction.test.ts: mock only third-party pi-coding-agent, use real
  context-window internals

All tests exercise real code paths, improving confidence in actual
behavior per the strict mock policy.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 22:17:52 +08:00
Jiayuan Zhang
37e550200c refactor(core): add baseDir DI and test helpers for mock-free testing
Add AuthStoreOptions with baseDir to auth-profiles/store.ts functions,
add baseDir option to announce.ts readLatestAssistantReply, and add
seedSubagentRunForTests helper to registry.ts. These enable tests to
use real implementations with temp directories instead of mocking
internal modules.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 22:17:41 +08:00
Jiayuan Zhang
b6db23c4b3 feat(agent): add structured run logging for debugging
Introduce a RunLog system that records agent execution events as
structured JSONL to ~/.super-multica/sessions/{id}/run-log.jsonl.
Enable via MULTICA_RUN_LOG=1 env var or AgentOptions.enableRunLog.

Logs: run lifecycle, LLM calls, tool execution timing, context
overflow recovery, auth profile rotation, error classification,
and compaction events.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 21:38:02 +08:00
Jiayuan Zhang
27c3ba5682 feat(web): add Accept: text/markdown header for Cloudflare Markdown for Agents
Prefer markdown responses from servers that support Cloudflare's Markdown
for Agents feature, reducing token usage by ~80% when available. Non-supporting
servers fall back to HTML as before.

- Update Accept header to prefer text/markdown in web_fetch requests
- Add text/markdown content-type handling to skip HTML parsing pipeline
- Capture x-markdown-tokens response header in WebFetchResult
- Add extractMarkdownTitle() helper for native markdown title extraction

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 19:47:17 +08:00
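
The markdown handling in 27c3ba5682 might be sketched as follows. The exact Accept header value and the body of `extractMarkdownTitle()` are assumptions (the commit only names the helper and says text/markdown is preferred), and `isMarkdownResponse` is a hypothetical name. The title extractor here takes the first level-1 ATX heading and does not skip headings that appear inside fenced code.

```typescript
// Assumed header value: prefer markdown, fall back to HTML.
const ACCEPT_HEADER = "text/markdown, text/html;q=0.9, */*;q=0.8";

// Hypothetical implementation: the first "# Title" line wins.
function extractMarkdownTitle(markdown: string): string | undefined {
  for (const line of markdown.split(/\r?\n/)) {
    const match = /^#\s+(.+?)\s*#*\s*$/.exec(line);
    if (match) return match[1];
  }
  return undefined;
}

// Decide whether the HTML parsing pipeline can be skipped.
function isMarkdownResponse(contentType: string | null): boolean {
  return (contentType ?? "").toLowerCase().startsWith("text/markdown");
}
```

Servers that ignore the markdown preference simply answer with HTML, so the existing parsing path still applies to them.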