Commit graph

23 commits

Author SHA1 Message Date
Naiyuan Qing
7b885673da docs(skills): update whisper skill with correct transcription priority
Clarify that local whisper is the primary provider (free, offline),
OpenAI API is the fallback, and the skill only activates when both
are unavailable. Add setup instructions noting no restart is required.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 18:28:03 +08:00
Naiyuan Qing
f9d1e5c809 fix(channels): correct reply routing and ack lifecycle for rapid-fire messages
Replace global lastRoute-based reply targeting with a FIFO route queue
(pendingRoutes) that snapshots route + ack targets at each debouncer flush.
Use agent_start/agent_end lifecycle instead of message_end to ensure stable
routing across multi-turn runs. Track per-message 👀 acks in ackBuffer →
activeAcks for precise cleanup. Two-gate typing stop: only stop when all
queued runs complete.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 18:27:06 +08:00
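The FIFO routing described in f9d1e5c809 can be pictured with a minimal TypeScript sketch. Only `pendingRoutes` and the `agent_start`/`agent_end` event names come from the commit message. The `Route` shape, the `RouteQueue` class, its method names, and the reading of "two-gate" as "no active run and no queued run" are assumptions made for illustration, not the project's actual code.

```typescript
// Hypothetical shapes; only `pendingRoutes` is named in the commit message.
interface Route {
  channelId: string;
  conversationId: string;
  ackMessageIds: string[]; // messages that currently carry a 👀 reaction
}

class RouteQueue {
  private pendingRoutes: Route[] = [];
  private active: Route | null = null;

  /** Called at each debouncer flush: snapshot where the reply should go. */
  enqueue(route: Route): void {
    this.pendingRoutes.push({
      channelId: route.channelId,
      conversationId: route.conversationId,
      ackMessageIds: route.ackMessageIds.slice(), // copy, so later mutation cannot leak in
    });
  }

  /** agent_start: the oldest snapshot becomes the route for the whole run. */
  onAgentStart(): Route | null {
    this.active = this.pendingRoutes.shift() ?? null;
    return this.active;
  }

  /** agent_end: release the route and hand it back so its acks can be removed. */
  onAgentEnd(): Route | null {
    const finished = this.active;
    this.active = null;
    return finished;
  }

  /** Typing may stop only when nothing is running and nothing is queued. */
  canStopTyping(): boolean {
    return this.active === null && this.pendingRoutes.length === 0;
  }
}
```

Because the route is fixed at `agent_start` and released at `agent_end`, a second message that arrives mid-run cannot redirect the reply of the first one, which is the failure mode a single global `lastRoute` allows.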
Jiang Bohan
96f83c0bc6 fix(channels): suppress heartbeat ack in outbound replies
2026-02-09 16:53:41 +08:00
Naiyuan Qing
0895d42d3b Merge remote-tracking branch 'origin/main' into feat/telegram-channel
# Conflicts:
#	apps/desktop/src/hooks/use-local-chat.ts
#	packages/sdk/src/actions/stream.ts
#	packages/ui/src/components/chat-view.tsx
#	src/agent/async-agent.ts
#	src/agent/events.ts
2026-02-09 14:28:06 +08:00
Naiyuan Qing
43d11a6e5d fix(channels): address code review issues
- Fix double useChannels() instantiation: call once in ChannelsPage,
  pass as props to TelegramCard
- Mask bot tokens in channels:getConfig before sending to renderer
- Add input validation (isValidId, token length) on all IPC handlers
- Fix stopAccount() to clean up typingTimer, lastRoute, aggregator,
  and debouncer when stopping the account they belong to
- Add try/catch to stopChannel/startChannel in useChannels hook
- Consistent return type { ok, error? } on channels:stop handler
- Add tooltip hint on disabled Remove button

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 13:05:34 +08:00
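As an illustration of the token masking mentioned in 43d11a6e5d, here is a minimal sketch. The commit does not say how tokens are masked, so the `maskToken` name, the choice of `*` as the mask character, and the number of trailing characters left visible are all assumptions.

```typescript
/**
 * Mask a secret for display, keeping only the last `visible` characters.
 * Hypothetical helper: the name and the masking format are not from the source.
 */
function maskToken(token: string, visible: number = 4): string {
  if (token.length <= visible) {
    // Too short to reveal anything safely: mask the whole value.
    return "*".repeat(token.length);
  }
  return "*".repeat(token.length - visible) + token.slice(-visible);
}
```

Masking in the main process, before the config crosses the IPC boundary, means the renderer never holds the full token even if the UI later logs or serializes its state.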
Naiyuan Qing
0542c82fe6 feat(channels): add credential write and per-account lifecycle control
Add setChannelAccountConfig/removeChannelAccountConfig to CredentialManager
for persisting channel tokens. Make ChannelManager.startAccount public and
add stopAccount for individual account lifecycle control via IPC.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 12:50:15 +08:00
Naiyuan Qing
0500dc1d53 feat(channels): add inbound debouncer, ACK reactions, and sequentialize
- InboundDebouncer: batches rapid-fire messages from the same conversation
  into a single agent.write() call (500ms idle, 2s hard cap)
- ACK reactions: add 👀 emoji on message receipt, remove on completion
  (addReaction/removeReaction on ChannelOutboundAdapter interface)
- Grammy sequentialize middleware: ensures same-chat updates are processed
  in order, preventing race conditions on shared state

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 11:43:42 +08:00
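The batching rule in 0500dc1d53 (flush after 500ms of silence, or 2s after the first message at the latest) can be sketched as follows. This is not the project's `InboundDebouncer`: the class name, method names, and newline-joined output are assumptions, and the sketch takes explicit timestamps instead of using timers so that the rule itself is easy to follow and check.

```typescript
interface Batch {
  texts: string[];
  firstAt: number; // ms timestamp of the first message in the batch
  lastAt: number; // ms timestamp of the most recent message
}

// Hypothetical, timer-free model of the debounce rule.
class DebounceSketch {
  private batches = new Map<string, Batch>();
  private idleMs: number;
  private maxWaitMs: number;

  constructor(idleMs: number = 500, maxWaitMs: number = 2000) {
    this.idleMs = idleMs;
    this.maxWaitMs = maxWaitMs;
  }

  /** Record a message for a conversation at time `now` (ms). */
  push(conversationId: string, text: string, now: number): void {
    const batch = this.batches.get(conversationId);
    if (batch) {
      batch.texts.push(text);
      batch.lastAt = now;
    } else {
      this.batches.set(conversationId, { texts: [text], firstAt: now, lastAt: now });
    }
  }

  /** Remove and return every batch that is idle long enough or has hit the hard cap. */
  flushDue(now: number): Map<string, string> {
    const flushed = new Map<string, string>();
    this.batches.forEach((batch, id) => {
      const idle = now - batch.lastAt >= this.idleMs;
      const capped = now - batch.firstAt >= this.maxWaitMs;
      if (idle || capped) {
        flushed.set(id, batch.texts.join("\n"));
        this.batches.delete(id);
      }
    });
    return flushed;
  }
}
```

The hard cap matters for users who keep typing: without it, a steady stream of messages less than 500ms apart would postpone the flush indefinitely.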
Naiyuan Qing
614d2cfd88 fix(channels): address code review issues
Critical:
- describe-video: add mkdir for MEDIA_CACHE_DIR before ffmpeg write
- telegram: check bot ID (not is_bot) for reply-to detection in groups

Important:
- telegram: check @mention in caption for media messages in groups
- hub: add .catch() to channelManager.startAll()
- describe-image: add 20MB file size check to prevent OOM
- async-agent: remove dead writeWithImages, refactor with enqueue()
- manager: lazy agent subscription via ensureSubscribed() to handle
  late agent availability and agent replacement

Suggestions:
- telegram-format: escape quotes in link URLs to prevent HTML breakout
- transcribe: catch API errors and return null (match local fallback)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 11:22:17 +08:00
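The "HTML breakout" fix for link URLs in 614d2cfd88 can be illustrated with a small sketch. The helper names `escapeHtml` and `renderLink` are hypothetical and the project's telegram-format module may be structured differently. The four entities used are the named entities that Telegram's HTML parse mode accepts.

```typescript
/** Escape characters that could close an attribute value or open a tag. */
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;") // must run first so later entities are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

/** Render a link for Telegram's HTML parse mode (hypothetical helper). */
function renderLink(label: string, url: string): string {
  return `<a href="${escapeHtml(url)}">${escapeHtml(label)}</a>`;
}
```

Without the `&quot;` replacement, a URL containing `"` would end the `href` attribute early and let the remainder of the URL be parsed as markup.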
Naiyuan Qing
49623b4779 docs(channels): add system overview and update media handling docs
- Create docs/channels/README.md: plugin architecture, adapters, lastRoute
  pattern, message flow, configuration, and new plugin guide
- Update media-handling.md: local whisper priority in tables, rewrite
  fallback section, remove completed items from future work
- Add @see doc references in types.ts, telegram.ts, manager.ts,
  transcribe.ts, describe-image.ts, describe-video.ts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 11:03:49 +08:00
Naiyuan Qing
30e9041084 fix(chunker): add table awareness and increase Telegram chunk limit
- Add isInsideTable() to BlockChunker: prevents breaking Markdown tables
  in the middle (table rows lose header context when split across messages)
- Set Telegram chunkerConfig maxChars to 4000 (was default 2000; Telegram
  API limit is 4096, leaving room for HTML formatting overhead)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 11:03:39 +08:00
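A rough idea of the table check added in 30e9041084, as a minimal sketch. The commit names `isInsideTable()` on `BlockChunker` but not its signature, so the line-based signature and the `isTableRow` heuristic below are assumptions.

```typescript
/** Heuristic: a Markdown table row starts and ends with a pipe. */
function isTableRow(line: string): boolean {
  return /^\s*\|.*\|\s*$/.test(line);
}

/**
 * True when splitting immediately before `lineIndex` would separate two
 * adjacent table rows, leaving the second chunk without its header.
 */
function isInsideTable(lines: string[], lineIndex: number): boolean {
  if (lineIndex <= 0 || lineIndex >= lines.length) return false;
  return isTableRow(lines[lineIndex - 1] ?? "") && isTableRow(lines[lineIndex] ?? "");
}
```

A chunker can use such a check to reject a candidate break point and move it back to the line before the table starts, or forward to the line after it ends.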
Naiyuan Qing
db214b25ca feat(media): add image/video description and local whisper priority
- Add describe-image.ts: OpenAI Vision API (gpt-4o-mini) image description
- Add describe-video.ts: ffmpeg frame extraction + Vision API description
- Rewrite transcribe.ts: local whisper/whisper-cli → OpenAI API → null
- Update manager.ts routeMedia(): all media converted to text before agent
  - Image: describeImage() → text (was: raw ImageContent via writeWithImages)
  - Video: describeVideo() → text (was: file path info only)
  - Audio: unchanged (but underlying transcribeAudio now tries local first)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 11:03:31 +08:00
Naiyuan Qing
4e5780692e feat(media): transcribe audio via Whisper API before reaching agent
Move audio transcription from agent-driven (exec + local whisper) to
Manager-layer processing via OpenAI Whisper API. Voice messages are
now transcribed automatically before the agent sees them, so the
agent only receives text. Local whisper skill remains as fallback
when API key is not configured. Also changed default model from
turbo to base for faster first-time experience.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 10:06:11 +08:00
Naiyuan Qing
23da5a35ff feat(channels): route media messages through agent
Add writeWithImages() to AsyncAgent for passing images directly to
the LLM via ImageContent. Extend Agent.run() to accept optional
images parameter. Update ChannelManager.routeIncoming() to download
media files and forward them: images as ImageContent to the LLM,
audio/video/document as file paths for agent-driven processing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 09:44:39 +08:00
Naiyuan Qing
78738e89bf feat(telegram): detect and download media messages
Add handlers for voice, audio, photo, video, and document messages.
Each handler emits a ChannelMessage with media attachment metadata.
Implement downloadMedia() to fetch files from Telegram API and save
to the local media cache directory.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 09:44:30 +08:00
Naiyuan Qing
020d132260 feat(channels): add media attachment types and cache directory
Add ChannelMediaAttachment type with support for audio, image, video,
and document media types. Extend ChannelMessage with optional media
field and ChannelPlugin with optional downloadMedia method.
Add MEDIA_CACHE_DIR path for downloaded media files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 09:44:21 +08:00
Naiyuan Qing
00c80b55c4 Revert "feat(telegram): handle voice and audio messages with <media:audio> placeholder"
This reverts commit 48f8302ebf.
2026-02-09 08:38:50 +08:00
Naiyuan Qing
48f8302ebf feat(telegram): handle voice and audio messages with <media:audio> placeholder
Forward voice messages and audio files to the agent as <media:audio>
placeholder text. In groups, only process voice/audio that replies to
the bot. Includes caption text if present.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 08:37:57 +08:00
Naiyuan Qing
ceb960c390 feat(channels): add typing indicators and Telegram HTML formatting
- Add sendTyping to ChannelOutboundAdapter (optional per platform)
- Implement typing lifecycle in ChannelManager (5s interval, cleanup on message_end/error/clear)
- Convert Markdown to Telegram HTML subset (bold, italic, code, links, blockquotes)
- Fallback to plain text on HTML parse errors

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 08:37:34 +08:00
Naiyuan Qing
9bb1fd6e12 refactor(channels): rewrite ChannelManager with lastRoute pattern
Replace per-conversation agent creation with single Hub agent model.
Messages from channels are routed to the existing Hub agent via
agent.write(), and replies are sent back through the lastRoute context.
Desktop and Gateway paths call clearLastRoute() so channel replies
stop when the user switches input surface.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 19:35:25 +08:00
Naiyuan Qing
287e6d5c4f fix(channels): catch telegram bot polling errors gracefully
Handle 409 conflict (another bot instance running) with a clear error
message instead of an unhandled promise rejection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 16:07:27 +08:00
Naiyuan Qing
112ae6cac9 refactor(channels): read config from credentials.json5 instead of separate file
Move channel configuration into the existing credentials.json5 under a
`channels` section, matching OpenClaw's single-config-file approach.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 16:00:44 +08:00
Naiyuan Qing
971d68b605 feat(channels): add ChannelManager and Telegram plugin
ChannelManager orchestrates channel lifecycles and routes messages to per-conversation Agents.
Telegram plugin uses grammy for long polling with group @mention detection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 15:47:41 +08:00
Naiyuan Qing
5d63727a04 feat(channels): add channel plugin system with types, registry, and config
Introduces the extensible channel plugin architecture for messaging platform integrations.
- ChannelPlugin interface with config, gateway, and outbound adapters
- Plugin registry with register/get/list operations
- Config loader for ~/.super-multica/channels.json5

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 15:47:36 +08:00