multica/docs/channels/README.md
Naiyuan Qing 49623b4779 docs(channels): add system overview and update media handling docs
- Create docs/channels/README.md: plugin architecture, adapters, lastRoute
  pattern, message flow, configuration, and new plugin guide
- Update media-handling.md: local whisper priority in tables, rewrite
  fallback section, remove completed items from future work
- Add @see doc references in types.ts, telegram.ts, manager.ts,
  transcribe.ts, describe-image.ts, describe-video.ts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 11:03:49 +08:00

Channel System

The Channel system connects external messaging platforms (Telegram today; Discord and others are planned) to the Hub's agent. Each platform is a plugin that translates platform-specific APIs into a unified interface.

For media handling details (audio transcription, image/video description), see media-handling.md. For message flow across all three I/O paths (Desktop / Web / Channel), see message-paths.md.

Architecture

┌─────────────────────────────────────────────────────────────┐
│  credentials.json5                                          │
│  { channels: { telegram: { default: { botToken } } } }     │
└──────────────────────┬──────────────────────────────────────┘
                       │ loadChannelsConfig()
                       ▼
┌─────────────────────────────────────────────────────────────┐
│  Channel Manager (manager.ts)                               │
│                                                             │
│  startAll() → iterate plugins → startAccount() per account  │
│  subscribeToAgent() → listen for AI replies                 │
│                                                             │
│  Incoming: routeIncoming() → routeMedia() → agent.write()  │
│  Outgoing: lastRoute → aggregator → plugin.outbound.*()    │
└──────────┬──────────────────────────────────────────────────┘
           │
           ▼
┌─────────────────────────────────────────────────────────────┐
│  Plugin Registry (registry.ts)                              │
│  registerChannel(plugin) / listChannels() / getChannel(id)  │
└──────────┬──────────────────────────────────────────────────┘
           │
           ▼
┌─────────────────────────────────────────────────────────────┐
│  Channel Plugins (e.g. telegram.ts)                         │
│                                                             │
│  config    — resolve account credentials                    │
│  gateway   — receive messages (polling / webhook)           │
│  outbound  — send replies back to platform                  │
│  downloadMedia() — download media files to local disk       │
└─────────────────────────────────────────────────────────────┘

Plugin Interface

Each channel plugin implements ChannelPlugin (defined in types.ts):

interface ChannelPlugin {
  readonly id: string;                          // "telegram", "discord", etc.
  readonly meta: { name: string; description: string };
  readonly chunkerConfig?: BlockChunkerConfig;  // override text chunking per platform
  readonly config: ChannelConfigAdapter;        // credential resolution
  readonly gateway: ChannelGatewayAdapter;      // receive messages
  readonly outbound: ChannelOutboundAdapter;    // send replies
  downloadMedia?(fileId: string, accountId: string): Promise<string>;  // optional
}

Three Adapters

| Adapter  | Role                                       | Key Methods                                        |
|----------|--------------------------------------------|----------------------------------------------------|
| config   | Resolve credentials from credentials.json5 | listAccountIds(), resolveAccount(), isConfigured() |
| gateway  | Receive inbound messages from the platform | start(accountId, config, onMessage, signal)        |
| outbound | Send replies back to the platform          | sendText(), replyText(), sendTyping?()             |
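
To make the three adapters concrete, here is a minimal sketch of an in-memory plugin. Only the adapter and method names come from this document. The parameter and return types, the fields of ChannelMessage and DeliveryContext, and the "echo" plugin itself are assumptions for illustration; the real definitions live in types.ts and may differ.

```typescript
// Hypothetical shapes: the real definitions live in types.ts and may differ.
interface ChannelMessage { conversationId: string; messageId: string; text: string }
interface DeliveryContext { accountId: string; conversationId: string; replyToMessageId?: string }
type OnMessage = (message: ChannelMessage) => void;

interface EchoAccount { botToken: string }
type ChannelsSection = { [channelId: string]: { [accountId: string]: EchoAccount } | undefined };

// Log of outbound calls so the sketch can be inspected without a real platform.
const outboundLog: string[] = [];

const echoChannel = {
  id: "echo",
  meta: { name: "Echo", description: "In-memory demo channel" },

  // config: resolve credentials for this channel from the channels section.
  config: {
    listAccountIds: (channels: ChannelsSection): string[] =>
      Object.keys(channels["echo"] ?? {}),
    resolveAccount: (channels: ChannelsSection, accountId: string): EchoAccount | undefined =>
      channels["echo"]?.[accountId],
    isConfigured: (account: EchoAccount | undefined): boolean =>
      Boolean(account?.botToken),
  },

  // gateway: a real one would poll or open a webhook; this one delivers a
  // single canned message unless shutdown was already requested.
  gateway: {
    async start(accountId: string, _config: EchoAccount, onMessage: OnMessage, signal: AbortSignal): Promise<void> {
      if (signal.aborted) return;
      onMessage({ conversationId: `${accountId}:chat-1`, messageId: "m1", text: "hello" });
    },
  },

  // outbound: record what would have been sent to the platform.
  outbound: {
    async sendText(ctx: DeliveryContext, text: string): Promise<void> {
      outboundLog.push(`send:${ctx.conversationId}:${text}`);
    },
    async replyText(ctx: DeliveryContext, text: string): Promise<void> {
      outboundLog.push(`reply:${ctx.replyToMessageId ?? "?"}:${text}`);
    },
  },
};
```

The object is structurally close to ChannelPlugin as listed above, which is why it is written as a plain object literal rather than a class.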

downloadMedia (optional)

Platforms that support media (voice, image, video, document) implement downloadMedia() to download files to ~/.super-multica/cache/media/ with UUID filenames. The Manager calls this before processing media.
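
As a rough sketch of the caching step, the helper below builds a UUID filename inside the cache directory and writes the downloaded bytes there. The function names, the choice to keep the source file's extension, and the local MEDIA_CACHE_DIR constant (assumed to mirror the one in src/shared/paths.ts) are all assumptions; fetching the bytes from the platform is left out.

```typescript
import { randomUUID } from "node:crypto";
import { mkdir, writeFile } from "node:fs/promises";
import { homedir } from "node:os";
import { extname, join } from "node:path";

// Assumed to mirror MEDIA_CACHE_DIR from src/shared/paths.ts.
const MEDIA_CACHE_DIR = join(homedir(), ".super-multica", "cache", "media");

// Build a collision-free target path: <cacheDir>/<uuid><original extension>.
// Keeping the extension is an assumption made for this sketch.
function mediaCachePath(sourceName: string, cacheDir: string = MEDIA_CACHE_DIR): string {
  return join(cacheDir, `${randomUUID()}${extname(sourceName)}`);
}

// Write already-downloaded bytes to the cache and return the local path,
// which is what downloadMedia() resolves to.
async function saveToCache(
  bytes: Uint8Array,
  sourceName: string,
  cacheDir: string = MEDIA_CACHE_DIR,
): Promise<string> {
  const target = mediaCachePath(sourceName, cacheDir);
  await mkdir(cacheDir, { recursive: true });
  await writeFile(target, bytes);
  return target;
}
```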

Message Flow

Inbound (Platform → Agent)

User sends message in Telegram
  → grammy long-polling → onMessage callback
    → ChannelManager.routeIncoming()
      1. Update lastRoute (reply target)
      2. Start typing indicator
      3. If media: routeMedia() → download → transcribe/describe → text
      4. agent.write(text)

All media is converted to text before the agent sees it. See media-handling.md for details.
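
The four inbound steps can be sketched as a single function with its dependencies injected. This is an illustration of the ordering only, not the code in manager.ts: the InboundMessage and RouteDeps shapes are hypothetical, and joining a caption with the media description is an assumption.

```typescript
// All names below are illustrative; the real logic lives in manager.ts.
type MediaKind = "audio" | "image" | "video";

interface InboundMessage {
  pluginId: string;
  conversationId: string;
  text?: string;
  media?: { kind: MediaKind; fileId: string };
}

interface RouteDeps {
  setLastRoute(route: { pluginId: string; conversationId: string }): void;
  sendTyping(conversationId: string): void;
  downloadMedia(fileId: string): Promise<string>; // resolves to a local file path
  mediaToText(kind: MediaKind, path: string): Promise<string>; // transcribe or describe
  agentWrite(text: string): void;
}

async function routeIncoming(msg: InboundMessage, deps: RouteDeps): Promise<void> {
  // 1. Remember where the reply should go.
  deps.setLastRoute({ pluginId: msg.pluginId, conversationId: msg.conversationId });
  // 2. Show a typing indicator while the message is processed.
  deps.sendTyping(msg.conversationId);
  // 3. Convert media to text so the agent only ever sees text.
  let text = msg.text ?? "";
  if (msg.media) {
    const path = await deps.downloadMedia(msg.media.fileId);
    const described = await deps.mediaToText(msg.media.kind, path);
    text = text ? `${text}\n${described}` : described;
  }
  // 4. Hand the text to the agent.
  deps.agentWrite(text);
}
```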

Outbound (Agent → Platform)

Agent produces reply
  → agent.subscribe() in ChannelManager
    → Check: if (!lastRoute) return   // not from a channel, skip
    → message_start → create MessageAggregator
    → message_update → feed text to aggregator
    → message_end → aggregator flushes final block
      → Aggregator emits BlockReply chunks
        → Block 0: plugin.outbound.replyText()   // Telegram reply format
        → Block N: plugin.outbound.sendText()     // follow-up messages

The MessageAggregator buffers streaming LLM output and splits it into blocks at natural text boundaries (paragraphs, code blocks). This is necessary because messaging platforms cannot consume raw streaming deltas.
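
To illustrate the buffering idea, here is a simplified aggregator that splits on blank lines but never inside an open code fence, plus a helper that applies the "block 0 replies, later blocks send" rule. It is a sketch under assumptions, not the MessageAggregator in src/hub/message-aggregator.ts: the class and function names are hypothetical, and size limits such as those a BlockChunkerConfig might carry are not modeled.

```typescript
const FENCE = "`".repeat(3); // Markdown code fence marker

class ParagraphAggregator {
  private buffer = "";

  /** Feed one streaming delta; returns any blocks it completed. */
  push(delta: string): string[] {
    this.buffer += delta;
    const blocks: string[] = [];
    let from = 0;
    for (;;) {
      const idx = this.buffer.indexOf("\n\n", from);
      if (idx === -1) break;
      const candidate = this.buffer.slice(0, idx);
      const fenceCount = candidate.split(FENCE).length - 1;
      if (fenceCount % 2 === 1) {
        // Inside an open code block: keep scanning past this blank line.
        from = idx + 2;
        continue;
      }
      if (candidate.trim() !== "") blocks.push(candidate.trim());
      this.buffer = this.buffer.slice(idx + 2);
      from = 0;
    }
    return blocks;
  }

  /** Called at message_end: emit whatever is still buffered. */
  flush(): string[] {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest === "" ? [] : [rest];
  }
}

interface BlockSink {
  replyText(text: string): void;
  sendText(text: string): void;
}

// Block 0 goes out as a reply to the user's message, later blocks as follow-ups.
function deliver(index: number, text: string, out: BlockSink): void {
  if (index === 0) out.replyText(text);
  else out.sendText(text);
}
```

Splitting only at paragraph boundaries keeps each delivered message readable on its own, and holding a code block together avoids sending half of a snippet as a separate chat message.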

lastRoute Pattern

lastRoute records which channel conversation most recently sent a message:

  • Channel message arrives → lastRoute is set to that plugin + conversation
  • Desktop/Web message arrives → clearLastRoute() is called
  • Agent replies → if lastRoute is set, reply goes to that channel; otherwise skipped

This ensures replies go back to the originating channel. Desktop and Web always receive agent events independently via their own mechanisms (IPC / Gateway).
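
The three rules above amount to a small piece of state. The sketch below shows it in isolation; the Route fields and the RouteTracker class are hypothetical names chosen for illustration, and only lastRoute and clearLastRoute() come from this document.

```typescript
// Hypothetical shape of a reply target.
interface Route {
  pluginId: string;
  accountId: string;
  conversationId: string;
}

class RouteTracker {
  private lastRoute: Route | null = null;

  // A channel message arrived: remember where it came from.
  setLastRoute(route: Route): void {
    this.lastRoute = route;
  }

  // A Desktop or Web message arrived: channel replies should stop.
  clearLastRoute(): void {
    this.lastRoute = null;
  }

  // Where an agent reply should be delivered, or null to skip channel delivery.
  replyTarget(): Route | null {
    return this.lastRoute;
  }
}
```

Because only the most recent route is kept, a reply produced after a Desktop or Web message is never forwarded to a channel conversation that happened to be active earlier.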

Configuration

Channel credentials are stored in ~/.super-multica/credentials.json5 under the channels key:

{
  channels: {
    telegram: {
      default: {
        botToken: "123456:ABC-DEF..."
      }
    },
    // discord: { default: { botToken: "..." } },
  }
}

Each channel ID maps to accounts (keyed by account ID, typically "default"). The config adapter for each plugin knows how to extract and validate its credentials.
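
As an illustration of "extract and validate", the sketch below resolves a Telegram account from an already-parsed credentials object. Parsing the JSON5 file is out of scope here, the Credentials type is a guess at the parsed shape, and the only validation shown (a non-empty botToken string) is an assumption about what the real config adapter checks.

```typescript
// Hypothetical shape of the parsed credentials.json5 content.
interface TelegramAccount {
  botToken: string;
}
type AccountMap = Record<string, unknown>;
interface Credentials {
  channels?: Record<string, AccountMap | undefined>;
}

// List the account IDs configured for one channel, e.g. ["default"].
function listAccountIds(creds: Credentials, channelId: string): string[] {
  return Object.keys(creds.channels?.[channelId] ?? {});
}

// Return a validated account, or undefined if it is missing or malformed.
function resolveTelegramAccount(
  creds: Credentials,
  accountId: string = "default",
): TelegramAccount | undefined {
  const raw = creds.channels?.["telegram"]?.[accountId];
  if (typeof raw !== "object" || raw === null) return undefined;
  const token = (raw as { botToken?: unknown }).botToken;
  return typeof token === "string" && token.trim() !== "" ? { botToken: token } : undefined;
}

function isConfigured(account: TelegramAccount | undefined): boolean {
  return account !== undefined;
}
```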

Adding a New Plugin

  1. Create src/channels/plugins/<name>.ts implementing ChannelPlugin
  2. Register it in src/channels/index.ts:
    import { <name>Channel } from "./plugins/<name>.js";
    registerChannel(<name>Channel);
    
  3. Add the config shape to the channels section of credentials.json5
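
For orientation, the registry that step 2 relies on can be pictured as a map keyed by plugin ID. The sketch below reuses the three function names from registry.ts, but the bodies, the trimmed-down plugin type, and the decision to reject duplicate IDs are assumptions.

```typescript
// Trimmed-down stand-in for ChannelPlugin; only the fields the registry needs.
interface RegisteredPlugin {
  readonly id: string;
  readonly meta: { name: string; description: string };
}

const registry = new Map<string, RegisteredPlugin>();

function registerChannel(plugin: RegisteredPlugin): void {
  // Rejecting duplicates is an assumption; the real registry may overwrite instead.
  if (registry.has(plugin.id)) {
    throw new Error(`Channel "${plugin.id}" is already registered`);
  }
  registry.set(plugin.id, plugin);
}

function listChannels(): RegisteredPlugin[] {
  return Array.from(registry.values());
}

function getChannel(id: string): RegisteredPlugin | undefined {
  return registry.get(id);
}
```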

Implementation Checklist

  • config adapter: parse credentials from credentials.json5
  • gateway adapter: connect to platform, normalize messages to ChannelMessage
  • outbound adapter: sendText, replyText, optional sendTyping
  • downloadMedia (if platform supports media): download to MEDIA_CACHE_DIR
  • Group filtering: only respond to messages directed at the bot
  • Graceful shutdown: respect the AbortSignal passed to gateway.start()
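
The last two checklist items can be sketched together: a filter that decides whether a group message is directed at the bot, and a receive loop that stops once the AbortSignal fires. The RawMessage shape, both function names, and the mention-matching rule are assumptions for illustration; a real plugin would also need platform-specific cases such as bot commands.

```typescript
// Hypothetical, platform-neutral view of an incoming message.
interface RawMessage {
  chatType: "private" | "group";
  text: string;
  replyToBot?: boolean;
}

// Private chats always count; group messages only when they reply to the bot
// or contain an exact @mention (trailing punctuation is ignored).
function isDirectedAtBot(msg: RawMessage, botUsername: string): boolean {
  if (msg.chatType === "private" || msg.replyToBot === true) return true;
  const mention = `@${botUsername}`.toLowerCase();
  return msg.text
    .split(/\s+/)
    .some((token) => token.replace(/[^\w@]+$/, "").toLowerCase() === mention);
}

// Receive loop in the spirit of gateway.start(): it checks the signal before
// each fetch and before each dispatch, so shutdown takes effect promptly.
async function pollUntilAborted(
  fetchBatch: () => Promise<RawMessage[]>,
  onMessage: (msg: RawMessage) => void,
  botUsername: string,
  signal: AbortSignal,
): Promise<void> {
  while (!signal.aborted) {
    const batch = await fetchBatch();
    for (const msg of batch) {
      if (signal.aborted) return;
      if (isDirectedAtBot(msg, botUsername)) onMessage(msg);
    }
  }
}
```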

File Map

| File                                    | Role                                                                        |
|-----------------------------------------|-----------------------------------------------------------------------------|
| src/channels/types.ts                   | All type definitions (ChannelPlugin, ChannelMessage, DeliveryContext, etc.) |
| src/channels/manager.ts                 | ChannelManager: bridges plugins to the Hub's agent                          |
| src/channels/registry.ts                | Plugin registry (registerChannel, listChannels, getChannel)                 |
| src/channels/config.ts                  | Load channel config from credentials.json5                                  |
| src/channels/index.ts                   | Bootstrap: register built-in plugins, re-export public API                  |
| src/channels/plugins/telegram.ts        | Telegram plugin (grammy, long polling)                                      |
| src/channels/plugins/telegram-format.ts | Markdown → Telegram HTML converter                                          |
| src/media/transcribe.ts                 | Audio transcription (local whisper → OpenAI API)                            |
| src/media/describe-image.ts             | Image description (OpenAI Vision API)                                       |
| src/media/describe-video.ts             | Video description (ffmpeg frame + Vision API)                               |
| src/shared/paths.ts                     | MEDIA_CACHE_DIR path constant                                               |
| src/hub/message-aggregator.ts           | Streaming text → block chunking for channel delivery                        |

Current Plugins

| Plugin   | Platform | Transport    | Library |
|----------|----------|--------------|---------|
| telegram | Telegram | Long polling | grammy  |

Planned: Discord, Feishu, LINE, etc.