628 commits

Author SHA1 Message Date
Bohan Jiang
b9300f3328
Merge pull request #113 from multica-ai/Bohan-J/dual-track-user-display-content-20260209
feat(desktop): model switching UI & pi-ai 0.52.9 upgrade
2026-02-09 19:59:34 +08:00
Jiang Bohan
368012bd66 chore(providers): remove Mistral from provider registry
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 19:20:21 +08:00
Jiang Bohan
a61e29469e fix(desktop): add overflow-y scroll to provider dropdown
The provider & model dropdown now scrolls internally instead of
causing the whole page to scroll when content exceeds viewport.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 19:20:16 +08:00
Jiang Bohan
5380b146b3 chore(deps): upgrade pi-ai and pi-agent-core to 0.52.9
Upgrade @mariozechner/pi-ai and @mariozechner/pi-agent-core from 0.50.3
to 0.52.9 to support latest models (claude-opus-4-6, o3, o3-mini).

Breaking type changes addressed:
- exactOptionalPropertyTypes: use conditional spread or `| undefined`
- TOOL_PROFILES removed: strip all profile references from CLI
- AgentMessage union requires timestamp: cast test fixtures
- AsyncAgent.id → sessionId
- Add explicit callback parameter types for SDK event handlers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 19:13:38 +08:00
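The `exactOptionalPropertyTypes` item in 5380b146b3 refers to the TypeScript compiler option under which an optional property `x?: T` no longer accepts an explicit `undefined`. Below is a minimal sketch of the two workarounds the commit names. The `RunOptions` and `LooseRunOptions` types and both helper functions are hypothetical and not taken from the repository.

```typescript
// Hypothetical types for illustration; not from the repository.
interface RunOptions {
  provider: string;
  modelId?: string; // may be absent, but not explicitly `undefined`
}

interface LooseRunOptions {
  provider: string;
  modelId?: string | undefined; // alternative: widen the declared type
}

// Workaround 1: conditional spread, so the key is omitted when there is no value.
function buildOptions(provider: string, modelId?: string): RunOptions {
  return {
    provider,
    ...(modelId !== undefined ? { modelId } : {}),
  };
}

// Workaround 2: accept `| undefined` in the property type.
function buildLooseOptions(provider: string, modelId?: string): LooseRunOptions {
  return { provider, modelId };
}
```

The conditional spread keeps the stricter type, at the cost of slightly noisier object literals. Widening to `| undefined` is shorter but lets the key exist with no value.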
Naiyuan Qing
f7ddcec78d
Merge pull request #112 from multica-ai/NevilleQingNY/telegram-msg-issue
fix(channels): correct reply routing, ack lifecycle, and whisper detection
2026-02-09 18:31:58 +08:00
Naiyuan Qing
7b885673da docs(skills): update whisper skill with correct transcription priority
Clarify that local whisper is the primary provider (free, offline),
OpenAI API is the fallback, and the skill only activates when both
are unavailable. Add setup instructions noting no restart is required.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 18:28:03 +08:00
Naiyuan Qing
1bd25c5aec fix(media): don't cache whisper binary detection failure
Only cache successful whisper binary lookups. When whisper is not found,
leave the cache empty so subsequent calls re-check PATH. This allows
the app to detect a newly installed whisper without requiring a restart.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 18:27:06 +08:00
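The caching rule described in 1bd25c5aec (remember hits, never remember misses) can be sketched as follows. `createBinaryDetector` and the injected `lookup` callback are hypothetical stand-ins for whatever PATH search the app actually performs.

```typescript
// Hypothetical sketch; `lookup` stands in for the real PATH search.
type Lookup = () => string | null;

function createBinaryDetector(lookup: Lookup): () => string | null {
  let cached: string | null = null; // only ever holds a successful result

  return () => {
    if (cached !== null) return cached; // hit: skip the lookup entirely
    const found = lookup();
    if (found !== null) cached = found; // cache success only
    return found; // a miss is returned but not remembered
  };
}
```

Because a miss leaves `cached` empty, the next call runs `lookup` again, which is how a binary installed after startup would be picked up without a restart.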
Naiyuan Qing
c27f4e66b5 docs(channels): update README with route queue pattern and ack lifecycle
Document the new channel system design: FIFO pendingRoutes queue,
activeRoute/activeAcks state, agent_start/agent_end lifecycle,
InboundDebouncer, typing/reaction lifecycle, and UI metadata stripping.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 18:27:06 +08:00
Naiyuan Qing
ec67dd6706 fix(ui): strip channel metadata prefixes from user messages in display
Add stripUserMetadata() to remove timestamp envelopes and media type
labels ([Voice Message] Transcript:, [Image] Description:, etc.) that
are injected for LLM context but should not appear in the desktop UI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 18:27:06 +08:00
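As a rough illustration of the stripping described in ec67dd6706, here is a sketch under stated assumptions. The two media labels are quoted from the commit message. The timestamp envelope format (a leading bracketed ISO-style date) is a guess, and `stripUserMetadataSketch` is a hypothetical name, not the repository's `stripUserMetadata()`.

```typescript
// ASSUMPTION: the envelope looks like "[2026-02-09 18:27] "; the real format is not
// given in the commit message.
const TIMESTAMP_ENVELOPE = /^\[\d{4}-\d{2}-\d{2}[^\]]*\]\s*/;

// The two labels quoted in the commit message; the commit's "etc." implies more exist.
const MEDIA_LABEL = /^\[(?:Voice Message|Image)\]\s+(?:Transcript|Description):\s*/;

// Remove LLM-only prefixes so the desktop UI shows just the user's content.
function stripUserMetadataSketch(text: string): string {
  return text.replace(TIMESTAMP_ENVELOPE, "").replace(MEDIA_LABEL, "");
}
```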
Naiyuan Qing
f9d1e5c809 fix(channels): correct reply routing and ack lifecycle for rapid-fire messages
Replace global lastRoute-based reply targeting with a FIFO route queue
(pendingRoutes) that snapshots route + ack targets at each debouncer flush.
Use agent_start/agent_end lifecycle instead of message_end to ensure stable
routing across multi-turn runs. Track per-message 👀 acks in ackBuffer →
activeAcks for precise cleanup. Two-gate typing stop: only stop when all
queued runs complete.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 18:27:06 +08:00
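One possible reading of the route queue in f9d1e5c809 is sketched below. The names `pendingRoutes`, `activeRoute`, `activeAcks`, and `ackBuffer` come from the commit message. The `Route` shape, the `RouteQueue` class, and its method names are assumptions made for illustration.

```typescript
// Hypothetical shapes; only the field names pendingRoutes/activeRoute/activeAcks
// are taken from the commit message.
interface Route { chatId: string }
interface RouteSnapshot { route: Route; acks: string[] } // acks: IDs of messages carrying 👀

class RouteQueue {
  private pendingRoutes: RouteSnapshot[] = [];
  private activeRoute: Route | null = null;
  private activeAcks: string[] = [];

  // At each debouncer flush: snapshot where the eventual reply must go.
  onFlush(route: Route, ackBuffer: string[]): void {
    this.pendingRoutes.push({ route, acks: [...ackBuffer] });
  }

  // agent_start: the oldest queued snapshot becomes the active one (FIFO).
  onAgentStart(): void {
    const next = this.pendingRoutes.shift();
    if (next) {
      this.activeRoute = next.route;
      this.activeAcks = next.acks;
    }
  }

  // agent_end: hand back the acks to clean up; typing may stop only when
  // this run has ended AND no further runs are queued.
  onAgentEnd(): { acksToRemove: string[]; stopTyping: boolean } {
    const acksToRemove = this.activeAcks;
    this.activeRoute = null;
    this.activeAcks = [];
    return { acksToRemove, stopTyping: this.pendingRoutes.length === 0 };
  }

  replyTarget(): Route | null {
    return this.activeRoute;
  }
}
```

Taking the snapshot at flush time is what keeps a reply bound to the conversation that triggered it, even if messages from another chat arrive while the agent is still running.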
Jiang Bohan
28a14a4beb chore(providers): update model registry with latest valid models
Update model lists against pi-ai 0.50.3 getModel() validation:
- Anthropic: add claude-opus-4-1, claude-sonnet-4-0
- OpenAI: add gpt-5.2, gpt-5-mini, gpt-4.1, gpt-4.1-mini
- Google: update to gemini-2.5-pro/flash/2.0-flash
- xAI: add grok-4 as default
- Groq: remove invalid mixtral-8x7b-32768
- OpenRouter: update to claude-sonnet-4-5

Note: pi-ai upgrade to 0.52.9 (for claude-opus-4-6, o3 etc.)
requires separate effort due to breaking type changes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 17:58:03 +08:00
Bohan Jiang
4a3a9101c2
Merge pull request #111 from multica-ai/Bohan-J/dual-track-user-display-content-20260209
fix(channels): suppress heartbeat ack in outbound replies
2026-02-09 16:57:55 +08:00
Jiang Bohan
2349868e8e feat(desktop): add model switching to provider dropdown
Add model selector section below the provider grid in the dropdown.
Shows available models for the current provider with active model
indicator. Clicking a model calls setProvider with the modelId.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 16:56:25 +08:00
Jiang Bohan
96f83c0bc6 fix(channels): suppress heartbeat ack in outbound replies
2026-02-09 16:53:41 +08:00
Bohan Jiang
be312cd3e0
Merge pull request #110 from multica-ai/Bohan-J/dual-track-user-display-content-20260209
fix(agent): separate display content from agent user turns
2026-02-09 16:41:31 +08:00
Jiang Bohan
d5090441da fix(agent): use user message content type for displayContent
2026-02-09 16:37:33 +08:00
Jiang Bohan
ff0694175e Merge branch 'main' of https://github.com/multica-ai/super-multica
2026-02-09 16:25:30 +08:00
Bohan Jiang
a12907df86
Merge pull request #109 from multica-ai/Bohan-J/fix-chat-error-catch
fix(agent): only emit agent_error for LLM provider 401
2026-02-09 16:07:13 +08:00
Jiang Bohan
a8e7a803c9 fix(agent): separate display content from agent user turns
2026-02-09 15:54:10 +08:00
Jiang Bohan
7294e76929 fix(agent): only emit agent_error for auth issues, not runtime 400 errors
Previously all agent errors (including 400 invalid_request_error) were
emitted as agent_error events, triggering the UI error banner and
interrupting the chat flow. Now only auth-related errors (401, no API key)
emit agent_error so the "Configure" banner appears. Runtime errors like
400 are still shown as chat messages but don't block the agent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 15:53:41 +08:00
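The split described in 7294e76929 (auth failures raise the banner, other failures stay inline) might look roughly like this. The `AgentFailure` shape, the `Outcome` union, and `classifyFailure` are hypothetical. Only the 401 / missing-key versus 400 distinction comes from the commit message.

```typescript
// Hypothetical error shape; not the repository's types.
interface AgentFailure {
  status?: number;
  missingApiKey?: boolean;
  message: string;
}

type Outcome =
  | { kind: "agent_error"; message: string }   // drives the "Configure" banner
  | { kind: "chat_message"; message: string }; // shown inline; agent keeps running

function classifyFailure(err: AgentFailure): Outcome {
  const isAuth = err.status === 401 || err.missingApiKey === true;
  return isAuth
    ? { kind: "agent_error", message: err.message }
    : { kind: "chat_message", message: err.message };
}
```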
Jiang Bohan
2fa8c383fb fix(agent): save original provider ID instead of alias in session meta
setProvider() was saving the alias-resolved provider (e.g. "anthropic"
instead of "claude-code") to session metadata. On restart, this caused
the wrong provider to be selected. Now saves the original providerId
so the exact user selection is preserved.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 15:45:11 +08:00
Jiang Bohan
15f6604a16 fix(agent): persist provider selection across restarts
Move session metadata loading before provider resolution so that
the stored provider from a previous setProvider() call is used
in the fallback chain (options > session meta > credentials > default)
instead of always falling back to "kimi-coding".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 15:41:56 +08:00
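The fallback chain in 15f6604a16 (options > session meta > credentials > default) can be expressed with nullish coalescing. The `ProviderSources` shape and `resolveProvider` are hypothetical. The precedence order and the `"kimi-coding"` default are taken from the commit message.

```typescript
// Hypothetical input shape; each field is the provider suggested by one source.
interface ProviderSources {
  optionsProvider?: string;
  sessionMetaProvider?: string;
  credentialsProvider?: string;
}

const DEFAULT_PROVIDER = "kimi-coding";

// First defined source wins, in the order named by the commit message.
function resolveProvider(src: ProviderSources): string {
  return (
    src.optionsProvider ??
    src.sessionMetaProvider ??
    src.credentialsProvider ??
    DEFAULT_PROVIDER
  );
}
```

This only works if session metadata is loaded before the chain is evaluated, which is the ordering fix the commit describes.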
Jiang Bohan
5c7d913128 fix(desktop): externalize grammy in Vite Electron build
grammy was not in rollupOptions.external, causing the channels IPC
handlers to fail at runtime with 'No handler registered'.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 15:08:11 +08:00
Jiang Bohan
19fed71f09 fix(heartbeat): bypass empty-file check for cron-triggered wakes
Cron reminders were silently skipped when heartbeat.md had no actionable
content. Now cron: and exec-event reasons both bypass the empty-file
guard so scheduled reminders always reach the agent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 15:08:04 +08:00
Jiang Bohan
d6993000ca Merge branch 'main' of https://github.com/multica-ai/super-multica
# Conflicts:
#	apps/desktop/electron/electron-env.d.ts
2026-02-09 14:34:48 +08:00
Bohan Jiang
94d6e15117
Merge pull request #101 from multica-ai/fix/credentials-reload
fix(agent): reload credentials dynamically
2026-02-09 14:33:40 +08:00
Naiyuan Qing
f6ad045e02
Merge pull request #106 from multica-ai/feat/telegram-channel
feat(channels): add Telegram channel integration with desktop UI
2026-02-09 14:31:21 +08:00
Jiang Bohan
5716932903 fix(agent): resolve merge conflict in runner.ts for credentials reload
Merge main's run mutex + soft error return with branch's refreshAuthState(),
keeping getApiKey defensive throw as defense-in-depth.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 14:30:57 +08:00
Naiyuan Qing
0895d42d3b Merge remote-tracking branch 'origin/main' into feat/telegram-channel
# Conflicts:
#	apps/desktop/src/hooks/use-local-chat.ts
#	packages/sdk/src/actions/stream.ts
#	packages/ui/src/components/chat-view.tsx
#	src/agent/async-agent.ts
#	src/agent/events.ts
2026-02-09 14:28:06 +08:00
Jiang Bohan
9d6468acca Merge branch 'Bohan-J/rm-comm-style' into main
2026-02-09 14:25:44 +08:00
Jiang Bohan
22e225c6a8 refactor(profile): remove Communication Style UI and programmatic API
Style is now solely managed by the agent editing soul.md directly,
removing the need for UI controls, IPC handlers, and typed constants.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 14:25:38 +08:00
Bohan Jiang
5c26f92caa
Merge pull request #108 from multica-ai/Bohan-J/add-time-awareness
fix(agent): improve time awareness with timestamped turns
2026-02-09 14:14:31 +08:00
Jiang Bohan
59ae49e73f docs(readme): add time awareness flow section
2026-02-09 14:05:22 +08:00
Jiang Bohan
667f3e533e fix(agent): improve time awareness with timestamped turns
2026-02-09 14:05:22 +08:00
Bohan Jiang
b7085b2bf5
Merge pull request #107 from multica-ai/fix/new-user-onboarding
fix(desktop): new user onboarding — show errors and configure dialog in Chat
2026-02-09 13:54:48 +08:00
Jiang Bohan
ed681a96bf feat(desktop): add Configure button in chat error banner
When the agent fails due to missing API key, the error banner now
shows a "Configure" button that opens the same ApiKeyDialog (or
OAuthDialog) used on the home page. After successful configuration
the error clears and the user can immediately start chatting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 13:51:55 +08:00
Jiang Bohan
8d32a06b5c fix(agent): validate API key before calling PiAgentCore.prompt()
getApiKey errors thrown inside PiAgentCore's internal async context
result in UnhandledPromiseRejection instead of propagating to the
caller. Return a graceful error early so AsyncAgent can emit it
through the subscriber mechanism to the UI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 13:51:45 +08:00
Naiyuan Qing
23905daaa1 Merge remote-tracking branch 'origin/main' into feat/telegram-channel
# Conflicts:
#	apps/desktop/electron/electron-env.d.ts
#	apps/desktop/electron/ipc/index.ts
#	apps/desktop/electron/preload.ts
#	apps/desktop/src/App.tsx
#	apps/desktop/src/pages/layout.tsx
#	src/agent/async-agent.ts
#	src/agent/runner.ts
#	src/hub/hub.ts
2026-02-09 13:44:08 +08:00
Naiyuan Qing
43d11a6e5d fix(channels): address code review issues
- Fix double useChannels() instantiation: call once in ChannelsPage,
  pass as props to TelegramCard
- Mask bot tokens in channels:getConfig before sending to renderer
- Add input validation (isValidId, token length) on all IPC handlers
- Fix stopAccount() to clean up typingTimer, lastRoute, aggregator,
  and debouncer when stopping the account they belong to
- Add try/catch to stopChannel/startChannel in useChannels hook
- Consistent return type { ok, error? } on channels:stop handler
- Add tooltip hint on disabled Remove button

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 13:05:34 +08:00
Naiyuan Qing
c99675b6e4 feat(desktop): add Channels configuration page with Telegram support
Add IPC handlers, preload API, useChannels hook, and Channels page UI.
Users can save/remove Telegram bot tokens and start/stop bots directly
from the desktop app with immediate effect and persistence across restarts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 12:50:24 +08:00
Naiyuan Qing
0542c82fe6 feat(channels): add credential write and per-account lifecycle control
Add setChannelAccountConfig/removeChannelAccountConfig to CredentialManager
for persisting channel tokens. Make ChannelManager.startAccount public and
add stopAccount for individual account lifecycle control via IPC.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 12:50:15 +08:00
Naiyuan Qing
6a02fd29be fix(ui): adjust chat input padding, icon, and button size
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 11:43:49 +08:00
Naiyuan Qing
0500dc1d53 feat(channels): add inbound debouncer, ACK reactions, and sequentialize
- InboundDebouncer: batches rapid-fire messages from the same conversation
  into a single agent.write() call (500ms idle, 2s hard cap)
- ACK reactions: add 👀 emoji on message receipt, remove on completion
  (addReaction/removeReaction on ChannelOutboundAdapter interface)
- Grammy sequentialize middleware: ensures same-chat updates are processed
  in order, preventing race conditions on shared state

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 11:43:42 +08:00
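The debounce timing in 0500dc1d53 (500 ms idle, 2 s hard cap) reduces to a single deadline calculation. The sketch below assumes the hard cap is measured from the first message of a batch. `flushDeadline` is a hypothetical helper, not the repository's `InboundDebouncer`.

```typescript
// Values from the commit message.
const IDLE_MS = 500;
const HARD_CAP_MS = 2000;

// Given the arrival times (ms) of the first and latest message in a batch,
// return the time at which the batch should be flushed to the agent.
// ASSUMPTION: the hard cap counts from the first message of the batch.
function flushDeadline(firstAt: number, lastAt: number): number {
  return Math.min(lastAt + IDLE_MS, firstAt + HARD_CAP_MS);
}
```

Each new message pushes the idle deadline out by 500 ms, but the cap keeps a steady stream of messages from delaying the flush indefinitely.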
Naiyuan Qing
614d2cfd88 fix(channels): address code review issues
Critical:
- describe-video: add mkdir for MEDIA_CACHE_DIR before ffmpeg write
- telegram: check bot ID (not is_bot) for reply-to detection in groups

Important:
- telegram: check @mention in caption for media messages in groups
- hub: add .catch() to channelManager.startAll()
- describe-image: add 20MB file size check to prevent OOM
- async-agent: remove dead writeWithImages, refactor with enqueue()
- manager: lazy agent subscription via ensureSubscribed() to handle
  late agent availability and agent replacement

Suggestions:
- telegram-format: escape quotes in link URLs to prevent HTML breakout
- transcribe: catch API errors and return null (match local fallback)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 11:22:17 +08:00
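The "escape quotes in link URLs" item in 614d2cfd88 guards against a URL closing the `href` attribute early. A minimal sketch of that kind of escaping follows. `escapeHtmlAttr` and `renderLink` are hypothetical helpers and not the repository's telegram-format code.

```typescript
// Escape characters that could terminate an attribute value or open a tag.
// `&` is replaced first so the entities produced below are not double-escaped.
function escapeHtmlAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build an anchor tag whose URL and label cannot break out of the markup.
function renderLink(text: string, url: string): string {
  return `<a href="${escapeHtmlAttr(url)}">${escapeHtmlAttr(text)}</a>`;
}
```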
Naiyuan Qing
49623b4779 docs(channels): add system overview and update media handling docs
- Create docs/channels/README.md: plugin architecture, adapters, lastRoute
  pattern, message flow, configuration, and new plugin guide
- Update media-handling.md: local whisper priority in tables, rewrite
  fallback section, remove completed items from future work
- Add @see doc references in types.ts, telegram.ts, manager.ts,
  transcribe.ts, describe-image.ts, describe-video.ts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 11:03:49 +08:00
Naiyuan Qing
30e9041084 fix(chunker): add table awareness and increase Telegram chunk limit
- Add isInsideTable() to BlockChunker: prevents breaking Markdown tables
  in the middle (table rows lose header context when split across messages)
- Set Telegram chunkerConfig maxChars to 4000 (was default 2000; Telegram
  API limit is 4096, leaving room for HTML formatting overhead)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 11:03:39 +08:00
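The table check described in 30e9041084 could be approximated as below. This is a hypothetical version that works on an array of lines and only recognizes pipe-prefixed Markdown rows. The repository's `isInsideTable()` may use a different signature and detection rule.

```typescript
// A line counts as a table row if it starts with a pipe (after trimming).
const isTableRow = (line: string | undefined): boolean =>
  line !== undefined && line.trim().startsWith("|");

// A chunk break placed before `lines[breakBeforeLine]` is inside a table when
// the lines on both sides of the break are table rows.
function isInsideTable(lines: string[], breakBeforeLine: number): boolean {
  return isTableRow(lines[breakBeforeLine - 1]) && isTableRow(lines[breakBeforeLine]);
}
```

A chunker could skip any candidate break for which this returns `true` and fall back to the nearest break outside the table.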
Naiyuan Qing
db214b25ca feat(media): add image/video description and local whisper priority
- Add describe-image.ts: OpenAI Vision API (gpt-4o-mini) image description
- Add describe-video.ts: ffmpeg frame extraction + Vision API description
- Rewrite transcribe.ts: local whisper/whisper-cli → OpenAI API → null
- Update manager.ts routeMedia(): all media converted to text before agent
  - Image: describeImage() → text (was: raw ImageContent via writeWithImages)
  - Video: describeVideo() → text (was: file path info only)
  - Audio: unchanged (but underlying transcribeAudio now tries local first)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 11:03:31 +08:00
Naiyuan Qing
4e5780692e feat(media): transcribe audio via Whisper API before reaching agent
Move audio transcription from agent-driven (exec + local whisper) to
Manager-layer processing via OpenAI Whisper API. Voice messages are
now transcribed automatically before the agent sees them, so the
agent only receives text. Local whisper skill remains as fallback
when API key is not configured. Also changed default model from
turbo to base for faster first-time experience.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 10:06:11 +08:00
Naiyuan Qing
922a3b2bb7 feat(skills): add whisper audio transcription skill
Bundled skill that enables the agent to transcribe audio files using
OpenAI Whisper CLI. Uses anyBins requirement so the skill is only
visible when whisper is installed. Includes brew and uv install specs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 09:44:49 +08:00
Naiyuan Qing
23da5a35ff feat(channels): route media messages through agent
Add writeWithImages() to AsyncAgent for passing images directly to
the LLM via ImageContent. Extend Agent.run() to accept optional
images parameter. Update ChannelManager.routeIncoming() to download
media files and forward them: images as ImageContent to the LLM,
audio/video/document as file paths for agent-driven processing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 09:44:39 +08:00