Commit graph

1106 commits

Author SHA1 Message Date
Jiayuan Zhang
62feefb07d
Merge pull request #219 from multica-ai/codex/web-policy-hybrid-roadmap
docs(agent): add web tools policy optimization roadmap
2026-02-17 02:47:02 +08:00
Jiayuan Zhang
e28ecb9a91
Merge pull request #216 from multica-ai/codex/meta-skill-installer-e2e-skills-benchmark
feat(skills): add ClawHub meta installer and agent-driven E2E benchmark
2026-02-17 02:45:45 +08:00
Jiayuan Zhang
39fde8e4b0
Merge pull request #218 from multica-ai/codex/web-fetch-evidence-coverage
fix(agent): enforce web search fetch evidence coverage
2026-02-17 02:45:12 +08:00
Jiayuan Zhang
276e30626a docs(agent): add web tools policy optimization roadmap 2026-02-17 02:43:55 +08:00
Jiayuan Zhang
8a2b3e10f3 test(e2e): add natural Notion gap-discovery benchmark case 2026-02-17 02:37:29 +08:00
Jiayuan Zhang
4b7f0afb50 fix(agent): guard workaround and local skill mutation commands 2026-02-17 02:37:29 +08:00
Jiayuan Zhang
6fd4819280 fix(agent): surface installed skill ids in prompt 2026-02-17 02:37:29 +08:00
Jiayuan Zhang
50407918b9 test(e2e): add spotify capability-gap ux benchmark case 2026-02-17 02:37:29 +08:00
Jiayuan Zhang
7eb18f47fc fix(agent): enforce capability-gap skill recovery guidance 2026-02-17 02:37:29 +08:00
Jiayuan Zhang
2074aac49e feat(e2e): add clawhub skills benchmark suite 2026-02-17 02:37:29 +08:00
Jiayuan Zhang
0c1856b54b feat(skills): add clawhub meta skill with security gate 2026-02-17 02:37:29 +08:00
Jiayuan Zhang
850d55336a fix(agent): enforce sufficient search-fetch evidence 2026-02-17 02:08:15 +08:00
Jiayuan Zhang
eebbf93e8b
Merge pull request #217 from multica-ai/codex/queued-message-ux
feat(desktop): queue and collapse pending chat messages
2026-02-17 01:58:15 +08:00
Jiayuan Zhang
b5b65c6bae fix(agent): enforce cross-turn web fetch evidence 2026-02-17 01:48:53 +08:00
Jiayuan Zhang
a5901b7db8 fix(desktop): restore remove action in collapsed queue 2026-02-17 01:47:19 +08:00
Jiayuan Zhang
fe50519a92 fix(desktop): keep queued message bar hook order stable 2026-02-17 01:47:19 +08:00
Jiayuan Zhang
37a68fc5c0 feat(desktop): add collapsible queued message panel 2026-02-17 01:47:19 +08:00
Jiayuan Zhang
3c1fa3f349 fix(desktop): keep queued messages blocked during toolUse phases 2026-02-17 01:47:19 +08:00
Jiayuan Zhang
bfe0c82e87 feat(desktop): queue user messages while agent is busy 2026-02-17 01:47:19 +08:00
Jiayuan Zhang
6e71598c2c
Merge pull request #215 from multica-ai/codex/docs-prune-and-regenerate-core-docs
docs: prune stale docs and regenerate prioritized core docs
2026-02-17 01:26:21 +08:00
Jiayuan Zhang
61ea022c78 docs: restore minimal project context in readme and claude 2026-02-17 01:18:51 +08:00
Jiayuan Zhang
2447230ca7 docs: keep only workflow testing and process guidance 2026-02-17 01:15:22 +08:00
Jiayuan Zhang
88582fe050 docs(development): restore dev-local workflow docs 2026-02-17 01:11:21 +08:00
Jiayuan Zhang
6c90fbf169 fix(skills): restore bundled skill documentation 2026-02-17 01:00:13 +08:00
Jiayuan Zhang
fc8a813120
Merge pull request #214 from multica-ai/codex/chat-context-window-indicator
feat(chat): add context window usage indicator
2026-02-17 00:55:09 +08:00
Jiayuan Zhang
0ed46510ee docs: regenerate prioritized core documentation 2026-02-17 00:53:37 +08:00
Jiayuan Zhang
ce6291e9eb fix(agent): enforce web_fetch after successful web_search 2026-02-17 00:49:57 +08:00
Jiayuan Zhang
ecb0cd392e chore(docs): remove non-e2e documentation 2026-02-17 00:46:36 +08:00
Jiayuan Zhang
ec8b62cef1 feat(chat): add context window usage indicator 2026-02-17 00:38:17 +08:00
Jiayuan Zhang
db63369837
Merge pull request #213 from multica-ai/codex/remove-legacy-subagent-registry
refactor: remove legacy async subagent orchestration path
2026-02-17 00:30:19 +08:00
Jiayuan Zhang
a1bb77c162
Merge pull request #212 from multica-ai/codex/telegram-reply-context-queue-fix
fix(gateway): preserve Telegram reply context ordering
2026-02-17 00:23:41 +08:00
Jiayuan Zhang
4d339ab80f fix(gateway): preserve telegram reply context ordering 2026-02-17 00:20:46 +08:00
Jiayuan Zhang
db0f8b3f7b refactor(desktop): drop legacy subagent dashboard wiring 2026-02-17 00:07:23 +08:00
Jiayuan Zhang
909efb5dab refactor(core): remove legacy subagent registry subsystem 2026-02-17 00:07:15 +08:00
Jiayuan Zhang
292e2b9454
Merge pull request #211 from multica-ai/codex/telegram-agent-welcome
feat(telegram): send agent-generated welcome after connect and reconnect
2026-02-16 13:22:12 +08:00
Jiayuan Zhang
cf94bc32d2 fix(gateway): relax rpc param generic for typed sdk payloads 2026-02-16 12:33:45 +08:00
Jiayuan Zhang
4a2ef835fb feat(gateway): send agent-generated welcome after telegram connect 2026-02-16 12:24:30 +08:00
Jiayuan Zhang
43198d9dcc feat(core): add rpc to generate channel welcome messages 2026-02-16 12:24:24 +08:00
Jiayuan Zhang
47606ab84c
Merge pull request #210 from multica-ai/forrestchang/serial-rerun
finance e2e benchmark: translate cases, run analysis, fix data tool errors
2026-02-16 11:25:18 +08:00
Jiayuan Zhang
63e7541149 chore(benchmark): remove hardcoded output directories from case prompts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 11:17:24 +08:00
Jiayuan Zhang
357bf326e0 fix(data): propagate errors so is_error is set correctly in run-log
Previously the data tool caught all errors and returned them as normal
tool results with error info in the JSON content. This meant pi-agent-core
never saw an exception and always set isError=false in the run-log, even
for rate limit errors (errCode 9001) and other API failures.

Now errors propagate to pi-agent-core which sets isError=true and formats
the error message for the LLM automatically.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 03:39:11 +08:00
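Note on 357bf326e0: the difference between swallowing and propagating a tool error can be sketched as below. This is a minimal illustration, not the project's code. `fetchQuote`, `dataToolSwallowing`, `dataToolPropagating`, `runTool`, and `ToolRunRecord` are hypothetical names, and `runTool` is only a stand-in for whatever pi-agent-core actually does when it invokes a tool.

```typescript
// Hypothetical record shape for illustration; not the actual run-log schema.
interface ToolRunRecord {
  isError: boolean;
  content: string;
}

// Stand-in for an upstream API call that fails with a rate-limit error.
async function fetchQuote(_symbol: string): Promise<string> {
  throw new Error("rate limit exceeded (errCode 9001)");
}

// Behavior described as "previously": catch everything and return the
// failure as ordinary tool content, so the caller never sees an exception.
async function dataToolSwallowing(symbol: string): Promise<string> {
  try {
    return await fetchQuote(symbol);
  } catch (err) {
    return JSON.stringify({ error: String(err) });
  }
}

// Behavior described as "now": let the error propagate to the caller.
async function dataToolPropagating(symbol: string): Promise<string> {
  return fetchQuote(symbol);
}

// Assumed stand-in for the agent core: only a thrown error marks the
// run as failed.
async function runTool(
  tool: (symbol: string) => Promise<string>,
  symbol: string,
): Promise<ToolRunRecord> {
  try {
    return { isError: false, content: await tool(symbol) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { isError: true, content: message };
  }
}
```

Under these assumptions, `runTool(dataToolSwallowing, ...)` records `isError: false` even though the call failed, while `runTool(dataToolPropagating, ...)` records `isError: true` with the error message as content.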
Jiayuan Zhang
486b090577 chore(benchmark): translate finance e2e case prompts to English
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 02:39:35 +08:00
Jiayuan Zhang
c38a576b8f feat(benchmark): parallelize finance e2e runs 2026-02-16 02:15:55 +08:00
Jiayuan Zhang
b897719775 fix(benchmark): support long-horizon timeout control 2026-02-16 01:54:20 +08:00
Jiayuan Zhang
edc55390cf feat(benchmark): add finance e2e case suite 2026-02-16 01:29:24 +08:00
Jiayuan Zhang
faa54fc671
Merge pull request #209 from multica-ai/forrestchang/subagent-redesign
feat(agent): replace sessions_spawn with synchronous delegate tool
2026-02-16 01:22:14 +08:00
Jiayuan Zhang
9c8be30d3d fix(test): increase timeout for summary fallback artifact extraction test
UC4 test times out in CI (5s default) because generateSummary's API
provider layer takes longer to fail on slow CI runners. Increase to 15s.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 01:10:18 +08:00
Jiayuan Zhang
aada2916f4 fix(agent): clear timeout timer in delegate tool to prevent unhandled rejection
The setTimeout in runSubagentTask was never cleared when childAgent.run()
completed before the timeout. The dangling timer would later reject an
unobserved promise, causing an unhandled promise rejection crash in Node.js
v15+. Capture the timer and clear it in a .finally() block.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 01:09:21 +08:00
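Note on aada2916f4: one way to capture a timeout timer and clear it in `.finally()` is sketched below. The real `runSubagentTask` is not shown in the commit, so `withTimeout` is a hypothetical helper and its structure is an assumption.

```typescript
// Hypothetical helper; the structure of the real runSubagentTask is assumed.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    // Capture the timer handle so it can be cancelled later.
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${ms} ms`)),
      ms,
    );
    // Forward the outcome of the work, then clear the timer whether the
    // work succeeded or failed, so it never fires after the work is done.
    work.then(resolve, reject).finally(() => clearTimeout(timer));
  });
}
```

If the timeout fires first, the outer promise rejects and the later `clearTimeout` call is a harmless no-op. If the work settles first, the timer is cancelled and no stray rejection is produced afterwards.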
Jiayuan Zhang
f60551195a chore(agent): remove old sessions_spawn/sessions_list tools and update references
Delete sessions-spawn.ts, sessions-list.ts and their tests. Update CLI
to remove waitForSubagents polling workaround (delegate is synchronous).
Update UI, desktop IPC, SWE-bench, and system prompt tests to use the
new delegate tool name.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 01:09:21 +08:00
Jiayuan Zhang
d3ef8ecc31 feat(agent): replace sessions_spawn with synchronous delegate tool
Replace the async sessions_spawn/sessions_list sub-agent system with a
single synchronous `delegate` tool. The new tool runs tasks in parallel
via Promise.all with per-task timeout, returning combined results directly
in the tool response. This eliminates the need for registry, announce queue,
persistence, and Hub involvement.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 01:09:21 +08:00
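Note on d3ef8ecc31: the approach described above (parallel tasks via `Promise.all`, a per-task timeout, combined results returned directly) might look roughly like the following. This is a sketch under stated assumptions, not the actual `delegate` implementation. `DelegateTask`, `DelegateResult`, and `runTask` are hypothetical, and turning failures and timeouts into result records, so that one failing task does not reject the whole `Promise.all`, is a design choice made here for illustration.

```typescript
// Hypothetical types; the real delegate tool's schema is not shown in the commit.
interface DelegateTask {
  id: string;
  run: () => Promise<string>;
}

interface DelegateResult {
  id: string;
  ok: boolean;
  output: string;
}

// Run one task under a deadline and always resolve to a result record.
function runTask(task: DelegateTask, timeoutMs: number): Promise<DelegateResult> {
  return new Promise<DelegateResult>((resolve) => {
    const timer = setTimeout(
      () => resolve({ id: task.id, ok: false, output: `timed out after ${timeoutMs} ms` }),
      timeoutMs,
    );
    task
      .run()
      .then(
        (output) => resolve({ id: task.id, ok: true, output }),
        (err: unknown) => resolve({ id: task.id, ok: false, output: String(err) }),
      )
      // Clear the timer once the task settles so it cannot fire afterwards.
      .finally(() => clearTimeout(timer));
  });
}

// Start every task at once and wait for all of them; Promise.all keeps
// the results in the same order as the input tasks.
function delegate(tasks: DelegateTask[], timeoutMs: number): Promise<DelegateResult[]> {
  return Promise.all(tasks.map((task) => runTask(task, timeoutMs)));
}
```

Because the caller awaits a single combined array, no registry or polling step is needed in this sketch to collect sub-task output.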