multica

Author	SHA1	Message	Date
Jiayuan Zhang	6e71598c2c	Merge pull request #215 from multica-ai/codex/docs-prune-and-regenerate-core-docs docs: prune stale docs and regenerate prioritized core docs	2026-02-17 01:26:21 +08:00
Jiayuan Zhang	fc8a813120	Merge pull request #214 from multica-ai/codex/chat-context-window-indicator feat(chat): add context window usage indicator	2026-02-17 00:55:09 +08:00
Jiayuan Zhang	ecb0cd392e	chore(docs): remove non-e2e documentation	2026-02-17 00:46:36 +08:00
Jiayuan Zhang	ec8b62cef1	feat(chat): add context window usage indicator	2026-02-17 00:38:17 +08:00
Jiayuan Zhang	909efb5dab	refactor(core): remove legacy subagent registry subsystem	2026-02-17 00:07:15 +08:00
Jiayuan Zhang	357bf326e0	fix(data): propagate errors so is_error is set correctly in run-log Previously the data tool caught all errors and returned them as normal tool results with error info in the JSON content. This meant pi-agent-core never saw an exception and always set isError=false in the run-log, even for rate limit errors (errCode 9001) and other API failures. Now errors propagate to pi-agent-core which sets isError=true and formats the error message for the LLM automatically. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-16 03:39:11 +08:00
Jiayuan Zhang	9c8be30d3d	fix(test): increase timeout for summary fallback artifact extraction test UC4 test times out in CI (5s default) because generateSummary's API provider layer takes longer to fail on slow CI runners. Increase to 15s. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-16 01:10:18 +08:00
Jiayuan Zhang	aada2916f4	fix(agent): clear timeout timer in delegate tool to prevent unhandled rejection The setTimeout in runSubagentTask was never cleared when childAgent.run() completed before the timeout. The dangling timer would later reject an unobserved promise, causing an unhandled promise rejection crash in Node.js v15+. Capture the timer and clear it in a .finally() block. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-16 01:09:21 +08:00
Jiayuan Zhang	f60551195a	chore(agent): remove old sessions_spawn/sessions_list tools and update references Delete sessions-spawn.ts, sessions-list.ts and their tests. Update CLI to remove waitForSubagents polling workaround (delegate is synchronous). Update UI, desktop IPC, SWE-bench, and system prompt tests to use the new delegate tool name. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-16 01:09:21 +08:00
Jiayuan Zhang	d3ef8ecc31	feat(agent): replace sessions_spawn with synchronous delegate tool Replace the async sessions_spawn/sessions_list sub-agent system with a single synchronous `delegate` tool. The new tool runs tasks in parallel via Promise.all with per-task timeout, returning combined results directly in the tool response. This eliminates the need for registry, announce queue, persistence, and Hub involvement. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-16 01:09:21 +08:00
Jiayuan Zhang	94ae88ed8b	Merge pull request #208 from multica-ai/forrestchang/compaction-audit Context window: 4-phase compaction improvements	2026-02-16 00:01:58 +08:00
Jiayuan Zhang	0bce493e10	fix(compaction): handle pi-agent-core toolResult format in truncation and pruning The pre-emptive truncation, tool result pruning, and summary fallback only checked for Anthropic-style `role: "user"` messages with `type: "tool_result"` blocks. The actual runtime uses pi-agent-core format with `role: "toolResult"`, `toolCallId`, and `toolName` on the message itself. This caused truncation and pruning to silently skip all tool results in real agent runs. Add handlers for the pi-agent-core format in all four affected modules: - session-manager.ts: check both "user" and "toolResult" roles - tool-result-truncation.ts: new handler for toolResult format - tool-result-pruning.ts: new processToolResultMessage() + updated loops - summary-fallback.ts: include "toolResult" in artifact ref extraction Verified via agent-driven E2E tests (5 test sessions, 6 artifacts). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 23:39:52 +08:00
Jiayuan Zhang	b15e1eeb2a	test(compaction): harden E2E integration tests for artifact pipeline - Add real user messages for bootstrap protection in pruning tests - Fix artifact directory path assertions (baseDir vs sessions/baseDir) - Add cross-phase tests (Phase 1 truncation → Phase 2 pruning) - Remove conditional assertion guards that could silently skip checks - All 30 E2E integration tests now pass with mandatory assertions Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 23:13:12 +08:00
Jiayuan Zhang	58f02a2080	fix(compaction): match artifact refs from both soft trim and truncation markers The extractArtifactRef regex only matched "Full result saved to" (from pre-emptive truncation) but not "Full result available at" (from soft trim). This caused hard clear to lose artifact references when preceded by soft trim in the same pruning pass. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 23:13:05 +08:00
Jiayuan Zhang	a1ac250e2b	feat(system-prompt): enhance report with truncation tracking and token estimates SectionReport now includes truncated/originalChars fields for budget-controlled sections. formatPromptReport shows estimated token count and truncation details. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 23:02:41 +08:00
Jiayuan Zhang	c3433871a6	feat(system-prompt): add bootstrap budget control for workspace and skills Workspace.md content is capped at 20k chars and skills prompt at 12k chars. Oversized content is intelligently truncated (head 70% + marker + tail 20%) with newline-boundary snapping. Inspired by OpenClaw's bootstrap budget system. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 23:02:34 +08:00
Jiayuan Zhang	5aa8a52784	feat(compaction): make pruning and summary artifact-aware Soft trim and hard clear now detect and preserve artifact references in their markers. Summary instructions include guidance to note artifact paths. Plain-text fallback extracts and lists all artifact references in a "Saved Artifacts" section. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 23:02:27 +08:00
Jiayuan Zhang	3f9a30423d	feat(session): add artifact storage and pre-emptive tool result truncation Oversized tool results (>30% of context window) are now saved as artifacts before being truncated in the session. The LLM sees a truncated version with head+tail preservation and a marker pointing to the full artifact file, which it can re-read on demand. This prevents information loss during context window management. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 23:02:18 +08:00
Jiayuan Zhang	b8fb671a4b	feat(agent): add reply context and responsiveness guidance to channel system prompt Add instructions for the agent to understand [Replying to: "..."] annotations and to send brief acknowledgments before tool calls when messages come from messaging channels. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 21:39:45 +08:00
Jiayuan Zhang	b412ca902b	refactor(compaction): remove dead code and legacy count mode - Remove compactMessagesWithSummary (~100 lines, never called; only the Chunked variant was used) - Remove compactMessagesByCount, findSafeCompactionPoint, and all count-mode references (~90 lines) - Narrow CompactionResult.reason to "tokens" \| "summary" \| "pruning" - Narrow compactionMode to "tokens" \| "summary" (was "count" \| ...) - Simplify session-manager: remove maxMessages/keepLast params, enable tool result pruning by default Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 21:37:23 +08:00
Jiayuan Zhang	92cf312843	refactor(compaction): remove pre-flight tool result pruning Pre-flight compaction runs in-memory only (not persisted), so tool result pruning in this path was wasted work — results were thrown away after the LLM call. Post-turn compaction still handles pruning and persists the results. Only Phase 2 (emergency message drop) remains as a safety net in pre-flight. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 21:37:15 +08:00
Jiayuan Zhang	fbb0b11c6e	fix(compaction): fix system prompt token estimation and reduce safety margin - estimateSystemPromptTokens now uses estimateTokens() (chars/4) instead of chars/2, eliminating the 2x overestimate that caused pre-flight compaction to fire on every LLM call at small context windows - ESTIMATION_SAFETY_MARGIN reduced from 1.5 to 1.2, increasing usable context from ~53% to ~73% before compaction triggers At 200k context, effective usable tokens before compaction improved from ~86k to ~120k message tokens (39% increase). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 21:37:09 +08:00
Jiayuan Zhang	40a2e8ae55	fix(context-window): prioritize config over model for context window resolution resolveContextWindowInfo now uses config > model > default priority so explicit --context-window flag overrides model defaults. Also adds --context-window CLI option to the run command. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 21:37:02 +08:00
Jiayuan Zhang	71f95d042a	fix(agent): prevent double-save of assistant message on abort Track the last assistant message saved by the message_end event handler and skip saving it again in the abort handler. This prevents the duplicate assistant entries in session.jsonl that caused the "tool_call_id is not found" bug. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 21:31:36 +08:00
Jiayuan Zhang	fa2616c390	fix(session): drop duplicate assistant messages in transcript repair When a session is aborted mid-tool-execution, the assistant message can be persisted twice (once by message_end, once by the abort handler). The repair logic failed to handle this: it generated a synthetic tool result for the first copy but deduplicated the result for the second, leaving an orphaned tool call that caused "tool_call_id is not found" errors on all subsequent API calls. Detect and remove duplicate assistant messages whose tool call IDs have all already been paired with results from an earlier copy. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 21:31:30 +08:00
Jiayuan Zhang	084657868f	revert(agent): remove parallel tool execution patch, keep serial Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 20:43:37 +08:00
Jiayuan Zhang	b394b0ccf9	fix(skills): use python3 and inject skill directory path into prompt - SKILL.md: python → python3 (macOS has no `python` binary) - skills/index.ts: inject skill directory path so agent can resolve relative paths like scripts/recalc.py to absolute paths Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 19:53:18 +08:00
Jiayuan Zhang	691e33e71e	fix(tools): auto-create group when custom groupId is provided LLM often invents custom groupId strings that don't exist in the registry, causing "group not found" errors. Now auto-creates the group instead, matching the behavior when `next` is provided. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 19:53:12 +08:00
Jiayuan Zhang	d162ba98a9	fix(agent): pass sessionId to tools for sub-agent session tracking toolsOptions spread `options` which had sessionId undefined for auto-generated sessions. This caused sessions_list and sessions_spawn to fail with "No session ID available" — breaking sub-agent orchestration. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 19:53:03 +08:00
Jiayuan Zhang	a254daff01	feat(agent): enable parallel tool execution via pi-agent-core patch Replace sequential for+await tool dispatch with Promise.allSettled for parallel execution. All tool_execution_start events emit immediately, tools run concurrently, results are processed in original order. Also fix run-log toolStartTimes to key by toolCallId instead of toolName to prevent collisions with parallel same-name tools. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 18:47:53 +08:00
Jiayuan Zhang	c012bff246	fix(tools): show findings and full runId in sessions_list list view Grouped runs now display findings for completed sub-agents (up to 4000 chars). Ungrouped runs increased truncation from 200 to 4000 chars. All status lines include full runId for subsequent API queries. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 18:47:44 +08:00
Jiayuan Zhang	02ed09b77b	fix(tools): use boolean error flag in web_fetch and web_search error responses Return error: true (boolean) with code field instead of error: "string_code" to match ToolErrorPayload convention. Also update runner.ts formatRunLogToolSummary to prefer details.code for error categorization. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 18:47:37 +08:00
Jiayuan Zhang	755ed5e9de	feat(run-log): add result metadata to tool_end events Enrich tool_end events with result_chars, result_summary, and error_type fields. Since run-log.jsonl is append-only and never compacted, this preserves tool result metadata that would otherwise be lost when session.jsonl undergoes compaction. New fields: - result_chars: total character count of result content - result_summary: short tool-specific summary (e.g. "10 results", "12.5KB", "finance/get_price_snapshot") - error_type: error category for tool errors (e.g. "fetch_failed") Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 18:06:42 +08:00
Jiayuan Zhang	1c24dd2885	fix(credentials): add fallback to ~/.super-multica for custom data dirs When SMC_DATA_DIR is set (e.g., for E2E tests), the credentials lookup now falls back to ~/.super-multica/credentials.json5 if the custom data dir doesn't have its own credentials file. This mirrors the existing fallback pattern in auth-store.ts and removes the need for the SMC_CREDENTIALS_PATH workaround in E2E tests. Lookup order: 1. SMC_CREDENTIALS_PATH env var (explicit override) 2. {DATA_DIR}/credentials.json5 (current data dir) 3. ~/.super-multica/credentials.json5 (default location fallback) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 17:48:21 +08:00
Jiayuan Zhang	a2c1379c1d	feat(cli): add --run-log flag and session dir output for agent-driven E2E testing Add --run-log CLI flag to enable structured run logging without env var. Print session directory path to stderr when run-log is enabled so Coding Agents can easily locate log files for analysis. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 16:03:40 +08:00
Jiayuan Zhang	239dc5a7c6	fix(agent): report accurate compaction metrics and add run-log observability Compaction was reporting only 189 tokens removed for 6 messages because Phase 1 (tool result pruning) hollowed out messages before Phase 2 (summary compaction) measured them. Now captures pre-pruning token count and reports combined savings from both phases. Also threads RunLog through SessionManager to emit tool_result_pruning and compaction_detail events, and adds preflight pruning stats logging. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 15:42:04 +08:00
Jiayuan Zhang	99167b9837	fix(agent): re-validate tool pairing after preflight compaction The transformContext pipeline ran sanitizeToolUseResultPairing() before preflightCompact(), but compaction (pruneToolResults + compactMessagesTokenAware) can break tool_use/tool_result pairing by dropping assistant messages while keeping their tool_result blocks. This caused 400 errors from the Anthropic API: "unexpected tool_use_id found in tool_result blocks". Add a second sanitizeToolUseResultPairing() call after preflightCompact() to repair any orphaned tool_result blocks created during compaction. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 14:25:48 +08:00
Jiayuan Zhang	a4b7deac3e	fix(skills): preserve all user files during bundled skill upgrades Instead of only protecting .env files, use cpSync with force:true to overlay bundle files onto the existing directory. This preserves any user-created files (credentials.json, token.json, etc.) that don't exist in the bundle, rather than deleting and re-copying the entire directory. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 14:15:31 +08:00
Jiayuan Zhang	8848f09107	refactor(skills): remove hardcoded API key hints, use dynamic web search Remove hardcoded service API key hints from getApiKeyHint() — skill-specific hints should be discovered dynamically by the agent via web_search/web_fetch at runtime. Only keep LLM provider hints which are system-level. Update skill-creator instructions accordingly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 13:39:14 +08:00
Jiayuan Zhang	6f67bb77b8	feat(skills): expose ineligible skills in system prompt for auto-discovery Add buildIneligibleSkillsSummary() to SkillManager that surfaces skills with actionable issues (missing env vars, binaries) in the agent's system prompt. Expand getApiKeyHint() with common service API providers. Update buildSkillsSection() to guide the agent to suggest activating inactive skills when they match user intent. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 13:34:51 +08:00
Jiayuan Zhang	bd0b380e2e	refactor(credentials): remove skills.env.json5 support Remove centralized skills.env.json5 in favor of per-skill .env files. Clean up CredentialManager by removing hasEnv/getEnv/getResolvedEnvSnapshot methods and skills env loading. Update CLI credentials and skills commands. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 13:34:51 +08:00
Jiayuan Zhang	9f98ccca58	feat(skills): store API keys in per-skill .env files Move skill environment variables from centralized skills.env.json5 to per-skill .env files within each skill's directory. This makes credential management more intuitive and self-contained. - Fix parser to handle metadata.requires, always, os, skillKey, install - Add minimal .env parser (dotenv.ts) and load .env at skill parse time - Add env field to Skill type for per-skill environment variables - Update eligibility checker to use skill.env instead of CredentialManager - Preserve user .env files across bundled skill upgrades in loader Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 13:34:51 +08:00
Jiayuan Zhang	7a92f716d9	refactor(skills): remove unused SkillConfig.apiKey/env/primaryEnv (#192) These fields were only checked during eligibility but never injected at runtime via credentialManager.getEnv(). Remove the half-implemented per-skill credential config to reduce confusion. API key configuration remains supported via skills.env.json5 and process.env. Refs: MUL-246, MUL-255 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 03:22:02 +08:00
Jiayuan Zhang	6234ba6139	refactor(skills): deduplicate office/ module across docx, xlsx, pptx Move the shared office/ directory (pack/unpack, validators, schemas, soffice wrapper) to skills/_shared/office/ and replace the three identical copies with symlinks. Update skill loader to dereference symlinks during copy to managed directory, and skip _-prefixed directories in the bundled skills scan. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 01:31:53 +08:00
yushen	b8aa5ad7e6	refactor(core): convert MULTICA_API_URL constant to getApiBaseUrl() function Lazy-read process.env at call time instead of module import time. This ensures the env bridge in the Electron main process has time to set process.env.MULTICA_API_URL before the first API request. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-14 07:30:36 +08:00
Jiayuan Zhang	ef8b38899a	fix(core): check for "toolCall" type in hasToolUse() to match pi-ai types The hasToolUse() function was checking for "tool_use" (raw Anthropic format) but pi-ai normalizes tool call blocks to type "toolCall". This made tool narration non-functional in the ChannelManager (Desktop/embedded) path. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-14 03:32:16 +08:00
Jiayuan Zhang	81998e6309	fix(telegram): skip tool narration messages, only send final answer When the agent uses tools (web search, etc.), it generates intermediate narration text like "Let me search..." before each tool call. These were being sent as separate Telegram messages, causing message spam. Now we detect tool_use blocks in the message content and skip sending those intermediate messages — only the final answer reaches the user. Applied to both Desktop channel plugin and Gateway Telegram service. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-14 02:04:17 +08:00
Jiayuan Zhang	c2a2c75b24	Merge pull request #174 from multica-ai/forrestchang/arch-analysis test(core): migrate tests to strict mock policy with real implementations	2026-02-14 00:13:24 +08:00
Jiayuan Zhang	f6b6c2909b	test(core): migrate 5 tests from internal mocks to real implementations - store.test.ts: use baseDir option instead of mocking paths.js - session-file-repair.test.ts: remove write-lock mock, assert behavior - announce-findings.test.ts: use real storage with temp dirs - sessions-list.test.ts: use real registry with seed helper - compaction.test.ts: mock only third-party pi-coding-agent, use real context-window internals All tests exercise real code paths, improving confidence in actual behavior per the strict mock policy. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 22:17:52 +08:00
Jiayuan Zhang	37e550200c	refactor(core): add baseDir DI and test helpers for mock-free testing Add AuthStoreOptions with baseDir to auth-profiles/store.ts functions, add baseDir option to announce.ts readLatestAssistantReply, and add seedSubagentRunForTests helper to registry.ts. These enable tests to use real implementations with temp directories instead of mocking internal modules. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 22:17:41 +08:00
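
Commit c3433871a6 describes truncating oversized prompt content as "head 70% + marker + tail 20%" with newline-boundary snapping. The sketch below illustrates that idea only; it is not the repository's implementation. The function name `truncateHeadTail`, the constants, and the marker text are all hypothetical, and the snapping rules (head cut moved back to the previous newline, tail cut moved forward to the next one) are an assumption about what "snapping" means here.

```typescript
// Sketch only: names and marker text are hypothetical, not from the repository.
const HEAD_RATIO = 0.7; // share of the budget kept from the start
const TAIL_RATIO = 0.2; // share of the budget kept from the end

function truncateHeadTail(text: string, maxChars: number): string {
  // Content within budget is returned unchanged.
  if (text.length <= maxChars) return text;

  let headEnd = Math.floor(maxChars * HEAD_RATIO);
  let tailStart = text.length - Math.floor(maxChars * TAIL_RATIO);

  // Assumed snapping rule: move the head cut back to just after the
  // last newline inside the head, so the head ends on a whole line.
  const headBreak = text.lastIndexOf("\n", headEnd - 1);
  if (headBreak > 0) headEnd = headBreak + 1;

  // Assumed snapping rule: move the tail cut forward to just after the
  // next newline, so the tail starts on a whole line.
  const tailBreak = text.indexOf("\n", tailStart);
  if (tailBreak !== -1 && tailBreak < text.length - 1) tailStart = tailBreak + 1;

  // Snapping only ever shrinks head and tail, so neither exceeds its share.
  const omitted = tailStart - headEnd;
  const marker = `[... ${omitted} chars truncated ...]\n`;
  return text.slice(0, headEnd) + marker + text.slice(tailStart);
}
```

With the 20k-char cap the commit mentions for Workspace.md, a call would look like `truncateHeadTail(workspaceText, 20_000)`, where `workspaceText` is a placeholder for the loaded file content.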


123 commits
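
Commit 1c24dd2885 lists a three-step credentials lookup order: the SMC_CREDENTIALS_PATH override, then {DATA_DIR}/credentials.json5, then ~/.super-multica/credentials.json5. A minimal sketch of that order follows, under stated assumptions: `resolveCredentialsPath`, `LookupInput`, and the injected `exists` predicate are hypothetical names chosen for illustration, and whether the real code checks that the override path exists is not stated in the log, so the sketch returns it as-is.

```typescript
// Sketch only: every name here is hypothetical, not the repository's API.
interface LookupInput {
  credentialsPathEnv?: string; // value of SMC_CREDENTIALS_PATH, if set
  dataDir: string; // current data dir (e.g. from SMC_DATA_DIR)
  homeDir: string; // user's home directory
  exists: (path: string) => boolean; // injected so the sketch stays pure
}

function resolveCredentialsPath(input: LookupInput): string {
  const { credentialsPathEnv, dataDir, homeDir, exists } = input;

  // 1. Explicit override wins (assumption: returned without an existence check).
  if (credentialsPathEnv) return credentialsPathEnv;

  // 2. Credentials file inside the current data dir, when present.
  const local = `${dataDir}/credentials.json5`;
  if (exists(local)) return local;

  // 3. Fall back to the default location.
  return `${homeDir}/.super-multica/credentials.json5`;
}
```

Passing `exists` as a parameter is a design choice for the example: it keeps the function free of file-system access, so the ordering can be checked with a temp-dir-free test.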