1018 commits

Author SHA1 Message Date
Jiayuan Zhang
45acb965ba docs: add SWE-bench section to CLAUDE.md
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 18:32:04 +08:00
Jiayuan Zhang
10c57c0f7a docs: add SWE-bench runner guide
Covers the full pipeline: dataset download, agent execution,
result analysis, and official Docker evaluation. Includes
runner options, output format, known limitations, and initial
benchmark results.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 18:30:58 +08:00
Jiayuan Zhang
90d374ffd5 feat(scripts): add SWE-bench runner for Multica agent evaluation
- download-dataset.py: fetches SWE-bench Lite/Verified/Full from HuggingFace
- run.ts: core runner that clones repos, runs Agent, collects git diff patches
- evaluate.sh: wrapper for official SWE-bench Docker evaluation harness
- analyze.ts: summarizes run results with per-repo and timing breakdowns

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 18:05:17 +08:00
Jiayuan Zhang
47f8e621c8
Merge pull request #201 from multica-ai/forrestchang/debug-agent-logs
fix(agent): report accurate compaction metrics and add run-log observability
2026-02-15 16:58:27 +08:00
Jiayuan Zhang
75fac3a2d7 fix(auth): fallback to dev auth.json for E2E tests
web_search and data tools authenticate via auth.json (sid + deviceId).
When SMC_DATA_DIR is set (e.g. for E2E tests), the auth file may not
exist in the custom dir. Now getLocalAuth() falls back to
~/.super-multica-dev/auth.json, which is created by pnpm dev:local
Desktop login and valid for the dev backend (api-dev.copilothub.ai).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 16:37:26 +08:00
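The fallback described in 75fac3a2d7 can be sketched as a lookup over candidate paths. This is an illustration only: `resolveAuthPath` and its injected `fileExists` predicate are hypothetical names, not the signature of `getLocalAuth()`, and the home directory used below is made up.

```typescript
// Hypothetical helper: return the first auth.json location that exists.
// `fileExists` is injected so the sketch runs without touching the filesystem.
function resolveAuthPath(
  dataDir: string,
  homeDir: string,
  fileExists: (path: string) => boolean,
): string | undefined {
  const candidates = [
    `${dataDir}/auth.json`, // custom directory taken from SMC_DATA_DIR
    `${homeDir}/.super-multica-dev/auth.json`, // dev fallback named in the commit
  ];
  return candidates.find(fileExists);
}

// E2E scenario: the custom data dir has no auth.json, the dev dir does.
const onDisk = new Set(["/home/me/.super-multica-dev/auth.json"]);
const resolvedAuthPath = resolveAuthPath(
  "/home/me/.super-multica-e2e",
  "/home/me",
  (path) => onDisk.has(path),
);
```

Ordering the candidates keeps the custom directory authoritative whenever it does contain an auth file; the dev file is consulted only as a last resort.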
Jiayuan Zhang
1ffa8b1389 docs: add SMC_DATA_DIR isolation for E2E test sessions
E2E tests now use ~/.super-multica-e2e to avoid polluting dev
(~/.super-multica-dev) or production (~/.super-multica) session data.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 16:29:39 +08:00
Jiayuan Zhang
a823e391b9 docs: add E2E testing workflow to CLAUDE.md and update guide with MULTICA_API_URL
Add agent-driven E2E testing section to CLAUDE.md so all team members'
Coding Agents automatically know how to run and analyze E2E tests.
Update guide with MULTICA_API_URL requirement discovered during testing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 16:23:37 +08:00
Jiayuan Zhang
496eda82d7 docs: add agent-driven E2E testing guide for Coding Agents
Comprehensive guide teaching Coding Agents how to perform automated E2E
testing by running the agent CLI with --run-log and analyzing structured
run-log events. Includes feature test playbooks, event reference, and
analysis patterns.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 16:03:47 +08:00
Jiayuan Zhang
a2c1379c1d feat(cli): add --run-log flag and session dir output for agent-driven E2E testing
Add --run-log CLI flag to enable structured run logging without env var.
Print session directory path to stderr when run-log is enabled so Coding
Agents can easily locate log files for analysis.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 16:03:40 +08:00
Jiayuan Zhang
239dc5a7c6 fix(agent): report accurate compaction metrics and add run-log observability
Compaction was reporting only 189 tokens removed for 6 messages because
Phase 1 (tool result pruning) hollowed out messages before Phase 2
(summary compaction) measured them. Now captures pre-pruning token count
and reports combined savings from both phases.

Also threads RunLog through SessionManager to emit tool_result_pruning
and compaction_detail events, and adds preflight pruning stats logging.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 15:42:04 +08:00
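The under-reporting fixed in 239dc5a7c6 comes down to where the baseline is measured. The sketch below uses made-up stand-ins (`estimateTokens`, `prunePhase`, `summaryPhase`, and a rough 4-characters-per-token estimate), none of which are taken from the repository; it only shows why the token count has to be captured before Phase 1 runs.

```typescript
// Hypothetical token estimate: roughly 4 characters per token.
const estimateTokens = (messages: string[]): number =>
  messages.reduce((sum, m) => sum + Math.ceil(m.length / 4), 0);

// Phase 1 stand-in: replace long tool results with a short placeholder.
const prunePhase = (messages: string[]): string[] =>
  messages.map((m) => (m.length > 40 ? "[pruned]" : m));

// Phase 2 stand-in: replace the whole history with a single summary.
const summaryPhase = (_messages: string[]): string[] => ["summary"];

const history = ["x".repeat(400), "y".repeat(400), "short question"];

const tokensBeforePruning = estimateTokens(history); // captured first
const pruned = prunePhase(history);
const tokensAfterPruning = estimateTokens(pruned);
const tokensAfterSummary = estimateTokens(summaryPhase(pruned));

// Measuring only Phase 2 under-reports: Phase 1 already hollowed out the messages.
const phase2OnlySavings = tokensAfterPruning - tokensAfterSummary;
// Combined savings use the pre-pruning baseline.
const combinedSavings = tokensBeforePruning - tokensAfterSummary;
```

With these inputs the Phase 2 only figure is 6 tokens while the combined figure is 202, which mirrors the kind of gap the commit message describes.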
Jiayuan Zhang
313f826d58
Merge pull request #200 from multica-ai/forrestchang/fix-telegram-conflict
fix(gateway): handle Telegram 409 conflict and add error resilience
2026-02-15 14:48:25 +08:00
Jiayuan Zhang
dba0c32d74
Merge pull request #199 from multica-ai/forrestchang/skill-env-storage
feat(skills): implement per-skill .env files with auto-discovery
2026-02-15 14:46:32 +08:00
Jiayuan Zhang
51741d5111
Merge pull request #198 from multica-ai/forrestchang/rm-cmd-b-shortcut
fix(ui): remove Cmd+B sidebar toggle shortcut
2026-02-15 14:42:12 +08:00
Jiayuan Zhang
99167b9837 fix(agent): re-validate tool pairing after preflight compaction
The transformContext pipeline ran sanitizeToolUseResultPairing() before
preflightCompact(), but compaction (pruneToolResults + compactMessagesTokenAware)
can break tool_use/tool_result pairing by dropping assistant messages while
keeping their tool_result blocks. This caused 400 errors from the Anthropic API:
"unexpected tool_use_id found in tool_result blocks".

Add a second sanitizeToolUseResultPairing() call after preflightCompact()
to repair any orphaned tool_result blocks created during compaction.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 14:25:48 +08:00
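The failure mode in 99167b9837 can be reproduced on a toy history. The `Block` and `Message` shapes and the `dropOrphanedToolResults` helper below are hypothetical simplifications, not the project's types or the body of `sanitizeToolUseResultPairing()`; the sketch only shows what "repairing orphaned tool_result blocks" could look like after compaction has run.

```typescript
// Hypothetical, simplified message shapes (not the project's real types).
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string }
  | { type: "tool_result"; tool_use_id: string };

interface Message {
  role: "user" | "assistant";
  content: Block[];
}

// Remove tool_result blocks whose tool_use no longer appears earlier in the
// history, then discard any message left with no content.
function dropOrphanedToolResults(messages: Message[]): Message[] {
  const seenToolUseIds = new Set<string>();
  const repaired: Message[] = [];
  for (const message of messages) {
    const content = message.content.filter((block) => {
      if (block.type === "tool_use") {
        seenToolUseIds.add(block.id);
        return true;
      }
      if (block.type === "tool_result") {
        return seenToolUseIds.has(block.tool_use_id);
      }
      return true;
    });
    if (content.length > 0) repaired.push({ ...message, content });
  }
  return repaired;
}

// Simulated post-compaction history: the assistant message holding
// tool_use "t1" was dropped, but its tool_result survived.
const compacted: Message[] = [
  { role: "user", content: [{ type: "tool_result", tool_use_id: "t1" }] },
  { role: "assistant", content: [{ type: "tool_use", id: "t2" }] },
  { role: "user", content: [{ type: "tool_result", tool_use_id: "t2" }] },
];

const repairedHistory = dropOrphanedToolResults(compacted);
```

Running the check after compaction rather than only before it is the point of the fix: a history that was valid going in can come out with unmatched `tool_result` blocks.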
Jiayuan Zhang
57805cddb8 fix(ui): remove Cmd+B sidebar toggle shortcut
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 14:18:49 +08:00
Jiayuan Zhang
a4b7deac3e fix(skills): preserve all user files during bundled skill upgrades
Instead of only protecting .env files, use cpSync with force:true to
overlay bundle files onto the existing directory. This preserves any
user-created files (credentials.json, token.json, etc.) that don't
exist in the bundle, rather than deleting and re-copying the entire
directory.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 14:15:31 +08:00
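The overlay behaviour relied on in a4b7deac3e can be checked in isolation with Node's `fs.cpSync`. The directory layout and file names below are invented for the demonstration and are not the loader's actual paths.

```typescript
import { cpSync, existsSync, mkdirSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Scratch directories standing in for a bundled skill and its installed copy.
const root = mkdtempSync(join(tmpdir(), "skill-upgrade-"));
const bundleDir = join(root, "bundle");
const installedDir = join(root, "installed");
mkdirSync(bundleDir);
mkdirSync(installedDir);

writeFileSync(join(bundleDir, "SKILL.md"), "v2"); // new bundled version
writeFileSync(join(installedDir, "SKILL.md"), "v1"); // old installed version
writeFileSync(join(installedDir, "token.json"), "{}"); // user-created file

// Overlay: bundled files overwrite same-named files; everything else is left alone.
cpSync(bundleDir, installedDir, { recursive: true, force: true });

const upgradedSkill = readFileSync(join(installedDir, "SKILL.md"), "utf8");
const userFileKept = existsSync(join(installedDir, "token.json"));
```

Because nothing is deleted first, files that exist only in the installed directory survive the upgrade. The trade-off is that files removed from a newer bundle also linger in the installed copy.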
Jiayuan Zhang
fe7c772219 fix(gateway): add process-level error handlers and graceful shutdown
Add unhandledRejection and uncaughtException handlers to prevent the
gateway from crashing on unexpected errors. Add SIGTERM/SIGINT handlers
for graceful shutdown via app.close().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 14:07:50 +08:00
Jiayuan Zhang
5741402a1a fix(telegram): handle 409 polling conflict and add global error boundary
Add bot.catch() to prevent unhandled errors from crashing the polling
loop, and catch the 409 "terminated by other getUpdates request" error
specifically when another bot instance is already running.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 14:07:49 +08:00
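The 409 case from 5741402a1a can be recognised from the Telegram Bot API error body alone. The `isPollingConflict` helper below is a hypothetical name, and how the error object reaches `bot.catch()` depends on the bot library in use, which the commit message does not spell out.

```typescript
// Shape of a failed Telegram Bot API response body.
interface TelegramApiError {
  ok: false;
  error_code: number;
  description: string;
}

// True when polling failed because another instance is already calling getUpdates.
function isPollingConflict(error: TelegramApiError): boolean {
  return error.error_code === 409 && error.description.includes("getUpdates");
}

const conflict: TelegramApiError = {
  ok: false,
  error_code: 409,
  description: "Conflict: terminated by other getUpdates request",
};
const badRequest: TelegramApiError = {
  ok: false,
  error_code: 400,
  description: "Bad Request: chat not found",
};

const conflictDetected = isPollingConflict(conflict);
const badRequestDetected = isPollingConflict(badRequest);
```

Separating this predicate from the generic error boundary lets the conflict be logged and handled deliberately while every other error still goes through the catch-all.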
Jiayuan Zhang
8848f09107 refactor(skills): remove hardcoded API key hints, use dynamic web search
Remove hardcoded service API key hints from getApiKeyHint() — skill-specific
hints should be discovered dynamically by the agent via web_search/web_fetch
at runtime. Only keep LLM provider hints which are system-level. Update
skill-creator instructions accordingly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 13:39:14 +08:00
Jiayuan Zhang
8004403e1b feat(skill-creator): add activation flow and API key onboarding instructions
Update skill-creator SKILL.md with proactive skill activation workflow:
guide users through API key setup, accept keys in chat, write .env files
automatically. Add sections for creating skills with env requirements and
.env file format reference.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 13:34:51 +08:00
Jiayuan Zhang
6f67bb77b8 feat(skills): expose ineligible skills in system prompt for auto-discovery
Add buildIneligibleSkillsSummary() to SkillManager that surfaces skills
with actionable issues (missing env vars, binaries) in the agent's system
prompt. Expand getApiKeyHint() with common service API providers. Update
buildSkillsSection() to guide the agent to suggest activating inactive
skills when they match user intent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 13:34:51 +08:00
Jiayuan Zhang
0678431a7d docs: update credential docs for per-skill .env files
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 13:34:51 +08:00
Jiayuan Zhang
bd0b380e2e refactor(credentials): remove skills.env.json5 support
Remove centralized skills.env.json5 in favor of per-skill .env files.
Clean up CredentialManager by removing hasEnv/getEnv/getResolvedEnvSnapshot
methods and skills env loading. Update CLI credentials and skills commands.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 13:34:51 +08:00
Jiayuan Zhang
9f98ccca58 feat(skills): store API keys in per-skill .env files
Move skill environment variables from centralized skills.env.json5 to
per-skill .env files within each skill's directory. This makes credential
management more intuitive and self-contained.

- Fix parser to handle metadata.requires, always, os, skillKey, install
- Add minimal .env parser (dotenv.ts) and load .env at skill parse time
- Add env field to Skill type for per-skill environment variables
- Update eligibility checker to use skill.env instead of CredentialManager
- Preserve user .env files across bundled skill upgrades in loader

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 13:34:51 +08:00
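A "minimal .env parser" of the kind 9f98ccca58 mentions might look like the sketch below. It is an assumption about the general shape only: `parseDotenv`, the supported syntax (KEY=VALUE lines, `#` comments, optional surrounding quotes), and the variable names are illustrative and not taken from `dotenv.ts`.

```typescript
// Minimal .env parser sketch: KEY=VALUE lines, '#' comments, optional quotes.
function parseDotenv(source: string): Record<string, string> {
  const env: Record<string, string> = {};
  for (const rawLine of source.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key or no '=': skip the line
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    const quote = value[0];
    // Strip one pair of matching quotes around the value, if present.
    if (value.length >= 2 && (quote === '"' || quote === "'") && value.endsWith(quote)) {
      value = value.slice(1, -1);
    }
    env[key] = value;
  }
  return env;
}

const parsed = parseDotenv('# example\nAPI_KEY="abc123"\nREGION = us-east\n\nBROKEN_LINE\n');
```

Here `parsed` holds `API_KEY` and `REGION`; the comment, the blank line, and the line without `=` are ignored. A parser this small deliberately leaves out multi-line values, escapes, and variable expansion.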
Jiayuan Zhang
358fcb3c0e
Merge pull request #197 from multica-ai/forrestchang/create-pr
feat(report): add code stats report generator
2026-02-15 13:09:52 +08:00
Naiyuan Qing
59f8802f7f
Merge pull request #196 from multica-ai/fix/chat-input-multiline
fix(ui): preserve newlines in chat input multiline text
2026-02-15 11:08:09 +08:00
Naiyuan Qing
430f2c177e refactor(ui): move LoadingIndicator into MessageList
- Move LoadingIndicator from ChatView into MessageList for consistent padding
- Add isLoading and hasPendingApprovals props to MessageList
- Adjust message spacing (my-1 → my-2) for better visual balance

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-15 11:03:59 +08:00
Naiyuan Qing
deb747a859 refactor(ui): unify loading indicator component
- Create LoadingIndicator component with "generating" and "streaming" variants
- Remove inline loading indicator from StreamingMarkdown (empty content returns empty fragment)
- Use unified LoadingIndicator in ChatView with consistent positioning
- Eliminates layout shift between different loading states

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-15 10:52:46 +08:00
Naiyuan Qing
c6ca5f3270 refactor(ui): unify container layout and adjust spacing
- Use container utility class consistently across chat components
- Change container max-width from 5xl to 4xl for better readability
- Adjust message bubble padding (p-3 -> p-2)
- Fix logout dropdown alignment and add destructive variant

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-15 10:47:59 +08:00
Naiyuan Qing
0c5de3c5f4 fix(ui): preserve newlines in chat input multiline text
- Use TipTap's getText({ blockSeparator: '\n' }) instead of doc.textContent
  to preserve newlines between paragraphs when submitting messages
- Add whitespace-pre-wrap CSS to user message bubbles to render newlines
- Add className prop support to StreamingMarkdown component

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-15 10:32:58 +08:00
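The difference 0c5de3c5f4 relies on can be modelled without the editor. The array below is a simplified stand-in for a document with one entry per paragraph node; it does not call TipTap, it only mimics the two serialisations the commit contrasts.

```typescript
// Simplified stand-in for an editor document: one string per paragraph node.
const paragraphs = ["first line", "second line"];

// Like reading doc.textContent: paragraph text is concatenated with nothing in between.
const flattened = paragraphs.join("");

// Like getText({ blockSeparator: "\n" }): a newline is placed between blocks.
const withNewlines = paragraphs.join("\n");
```

The preserved newlines only show up on screen if the rendering side keeps them, which is why the same commit adds `whitespace-pre-wrap` to the user message bubbles.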
LinYushen
339b586025
Merge pull request #195 from multica-ai/forrestchang/unify-multica-api-url
refactor: unify API URL env var to MULTICA_API_URL
2026-02-15 06:52:22 +08:00
yushen
276e9a5b25 fix(web): defer MULTICA_API_URL check to runtime in next.config
Move the env var read into the rewrites function so `next build`
succeeds without MULTICA_API_URL set (it is only needed at runtime).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 06:42:30 +08:00
yushen
4dba1cfdf0 refactor: unify API URL env var to MULTICA_API_URL
Replace scattered API_URL, MAIN_VITE_API_URL, and RENDERER_VITE_API_URL
with a single MULTICA_API_URL across all apps and packages.

- Desktop: use envPrefix to expose MULTICA_* to main process, rename
  RENDERER_VITE_API_URL → RENDERER_VITE_MULTICA_API_URL, remove
  MAIN_VITE_API_URL (now read directly via MULTICA_API_URL)
- Web: add .env.development with MULTICA_API_URL, enforce required check
  in next.config.ts, update .gitignore to allow .env.development
- Core: make MULTICA_API_URL required in api-client (no silent fallback)
- Scripts: pass MULTICA_API_URL in dev-local.sh for web process
- Turbo: update globalEnv from API_URL to MULTICA_API_URL
- Docs: update references to the new env var name

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 06:31:00 +08:00
yushen
25629f97ca fix(gateway): add build stage for workspace packages in Dockerfile
Add intermediate build stage to compile @multica/types, @multica/utils,
and @multica/core before the runtime stage so dist/ artifacts are
available. Also adds @multica/utils as an explicit gateway dependency.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 05:46:41 +08:00
Jiayuan Zhang
a7c1b42b31 feat(report): add code stats report generator
2026-02-15 04:32:30 +08:00
Jiayuan Zhang
0fba476b57
Merge pull request #194 from multica-ai/forrestchang/telegram-bot-menu
feat(telegram): add inline keyboard onboarding and menu button
2026-02-15 04:15:39 +08:00
Jiayuan Zhang
e9d54e94ab feat(telegram): add inline keyboard onboarding and menu button
Replace plain-text bot messages with HTML-formatted messages and
inline keyboard buttons. Users are now guided through connection
with interactive buttons (How to connect, What is Multica?, Check
status, Help, Reconnect) that edit messages in-place. Add
setChatMenuButton to show commands in the hamburger menu.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 04:11:56 +08:00
Jiayuan Zhang
f781d27177
docs: refactor README and clean up obsolete docs (#193)
* docs: slim down README and split into topic-specific docs

Add local full-stack development section (pnpm dev:local) and
move detailed content (credentials, CLI, skills/tools, time
injection, development guide) into separate docs/ files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: remove openclaw research notes and add doc index to README

Delete docs/channel/openclaw-research.md (1187-line research dump,
insights already absorbed into implementation). Expand README
documentation section with categorized links to all docs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: remove obsolete design proposals and plans

Delete auto-memory-refresh, cron-job-tool, and dashboard-design
docs that were never implemented. Remove Design Proposals section
from README.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 03:35:54 +08:00
Jiayuan Zhang
00aa2d26ef docs: remove obsolete design proposals and plans
Delete auto-memory-refresh, cron-job-tool, and dashboard-design
docs that were never implemented. Remove Design Proposals section
from README.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 03:32:47 +08:00
Jiayuan Zhang
a91e6b7a08 docs: remove openclaw research notes and add doc index to README
Delete docs/channel/openclaw-research.md (1187-line research dump,
insights already absorbed into implementation). Expand README
documentation section with categorized links to all docs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 03:26:51 +08:00
Jiayuan Zhang
18a6996c97 docs: slim down README and split into topic-specific docs
Add local full-stack development section (pnpm dev:local) and
move detailed content (credentials, CLI, skills/tools, time
injection, development guide) into separate docs/ files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 03:23:54 +08:00
Jiayuan Zhang
7a92f716d9
refactor(skills): remove unused SkillConfig.apiKey/env/primaryEnv (#192)
These fields were only checked during eligibility but never injected
at runtime via credentialManager.getEnv(). Remove the half-implemented
per-skill credential config to reduce confusion.

API key configuration remains supported via skills.env.json5 and
process.env.

Refs: MUL-246, MUL-255

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 03:22:02 +08:00
Jiayuan Zhang
058af56d47
fix(ui): show generating indicator while agent is processing (#191)
When the user sends a message and the agent hasn't started streaming
yet, the chat area showed no visual feedback. Now a "Generating..."
spinner appears between message send and the first streaming content,
matching the existing indicator style used in StreamingMarkdown.

Closes MUL-224

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 03:13:18 +08:00
Jiayuan Zhang
a131f3b7f5
Remove Multica App tab from Clients page (#190)
* feat(desktop): remove Multica App tab from Clients page

Only Telegram is currently available as a connection method.
Remove the unused "Multica App" tab, tabs UI, and related
components (QRCodeCard, DevicesCard, MulticaAppTab) to simplify
the page.

MUL-252

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(desktop): restore authorized devices list on Clients page

The devices list was accidentally removed along with the Multica App
tab. Add it back below the Telegram card.

MUL-252

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 03:04:53 +08:00
Jiayuan Zhang
a8ef4061cd
chore(ci): remove dependabot configuration (#189)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 02:57:31 +08:00
Jiayuan Zhang
8c5ff8c6ff
Merge pull request #186 from multica-ai/dependabot/github_actions/actions/cache-5
chore(deps): Bump actions/cache from 4 to 5
2026-02-15 02:40:44 +08:00
Jiayuan Zhang
828c32f5e5
Merge pull request #185 from multica-ai/dependabot/github_actions/actions/setup-node-6
chore(deps): Bump actions/setup-node from 4 to 6
2026-02-15 02:40:10 +08:00
Jiayuan Zhang
ab2f411ca7
Merge pull request #184 from multica-ai/dependabot/github_actions/actions/checkout-6
chore(deps): Bump actions/checkout from 4 to 6
2026-02-15 02:39:52 +08:00
dependabot[bot]
40be100c1c
chore(deps): Bump actions/cache from 4 to 5
Bumps [actions/cache](https://github.com/actions/cache) from 4 to 5.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](https://github.com/actions/cache/compare/v4...v5)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-02-14 18:15:23 +00:00
dependabot[bot]
2a7cbfa45d
chore(deps): Bump actions/setup-node from 4 to 6
Bumps [actions/setup-node](https://github.com/actions/setup-node) from 4 to 6.
- [Release notes](https://github.com/actions/setup-node/releases)
- [Commits](https://github.com/actions/setup-node/compare/v4...v6)

---
updated-dependencies:
- dependency-name: actions/setup-node
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-02-14 18:15:20 +00:00