chore(docs): remove non-e2e documentation
Parent: 292e2b9454
Commit: ecb0cd392e
47 changed files with 0 additions and 11844 deletions
@@ -1,75 +0,0 @@
# Desktop Login Integration

## Login flow

```
Desktop: the user clicks "Log in"
↓
Start a local HTTP server (random port, e.g. 54321)
↓
Open the browser → http://localhost:3000/api/desktop/session?port=54321&platform=web
↓
Web redirects → /login?next=...
↓
The user logs in; the page calls /api/v1/auth/login (proxied to api-dev.copilothub.ai)
↓
Login succeeds; callback → http://127.0.0.1:54321/callback?sid=xxx&user=xxx
↓
Desktop saves the result to ~/.super-multica/auth.json
```
||||
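The two URLs in the flow above can be sketched as a pair of pure helpers. This is only an illustration: the function names `buildSessionUrl` and `parseCallback` are hypothetical, and the encoding of the `user` parameter is not specified here, so it is kept as an opaque string.

```typescript
// Sketch only: helper names are hypothetical; URL shapes are copied from the flow above.

/** Build the URL that Desktop opens in the browser. */
function buildSessionUrl(webOrigin: string, port: number): string {
  const url = new URL("/api/desktop/session", webOrigin);
  url.searchParams.set("port", String(port));
  url.searchParams.set("platform", "web");
  return url.toString();
}

interface CallbackParams {
  sid: string;
  user: string; // assumption: treated as an opaque string
}

/** Extract `sid` and `user` from the callback request; null if the request is malformed. */
function parseCallback(requestUrl: string, port: number): CallbackParams | null {
  const url = new URL(requestUrl, `http://127.0.0.1:${port}`);
  if (url.pathname !== "/callback") return null;
  const sid = url.searchParams.get("sid");
  const user = url.searchParams.get("user");
  if (!sid || !user) return null;
  return { sid, user };
}
```

The local server would call something like `parseCallback(req.url, port)` on each request and ignore anything that returns `null`.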
## Frontend logic

### Web

- Port: **3000**
- Login API: `/api/v1/auth/login` (proxied to the backend via Next.js rewrites)
- Callback after a successful login: `http://127.0.0.1:{port}/callback?sid=xxx&user=xxx`

### Desktop

- Click "Log in" → start the local server → open the browser
- Receive the callback → save the result to a local file

## Storage

**Path:** `~/.super-multica/auth.json`

After a successful Desktop login, the SID and user info are stored in a local file:

```json
{
  "sid": "session-id-from-backend",
  "user": {
    "uid": "user-id",
    "name": "User Name",
    "email": "user@example.com"
  }
}
```

Subsequent requests can read `sid` from this file to authenticate.
||||
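A reader of this file may want to validate its shape before trusting `sid`. The following is a minimal sketch under that assumption; `AuthFile` and `parseAuthFile` are hypothetical names, and the fields mirror the JSON example above.

```typescript
// Sketch only: type and function names are hypothetical.
interface AuthUser {
  uid: string;
  name: string;
  email: string;
}

interface AuthFile {
  sid: string;
  user: AuthUser;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

/** Parse the raw contents of auth.json; returns null if the shape is unexpected. */
function parseAuthFile(raw: string): AuthFile | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (!isRecord(data)) return null;
  const sid = data["sid"];
  const user = data["user"];
  if (typeof sid !== "string" || !isRecord(user)) return null;
  const uid = user["uid"];
  const name = user["name"];
  const email = user["email"];
  if (typeof uid !== "string" || typeof name !== "string" || typeof email !== "string") {
    return null;
  }
  return { sid, user: { uid, name, email } };
}
```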
## Logout

**The backend only needs to return an error; the frontend handles logout automatically.**

When the frontend receives an authentication error, it:

1. Calls `auth:clear` to clear local data
2. Redirects to the login page
||||
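The two logout steps above can be sketched as one small handler. Everything here is an assumption made for illustration: the status code that signals an authentication error (401), the `clearAuth` callback standing in for the `auth:clear` call, and the `navigate` callback standing in for the app's router.

```typescript
// Sketch only: dependencies are injected so the example stays framework-neutral.
interface LogoutDeps {
  clearAuth: () => void; // stands in for invoking `auth:clear`
  navigate: (path: string) => void; // stands in for the app's router
}

/**
 * Assumption: the backend reports an authentication error as HTTP 401.
 * Returns true when the logout path was taken.
 */
function handleResponseStatus(status: number, deps: LogoutDeps): boolean {
  if (status !== 401) return false;
  deps.clearAuth(); // step 1: clear local data
  deps.navigate("/login"); // step 2: go to the login page
  return true;
}
```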
## Local debugging

```bash
# 1. Start Web (Next.js rewrites automatically proxy /api/* to api-dev.copilothub.ai)
pnpm dev:web

# 2. Start Desktop
pnpm dev:desktop
```

During local debugging, Next.js rewrites (configured in `apps/web/next.config.ts`) automatically proxy `/api/*` requests to the backend specified by `MULTICA_API_URL`.

## References

- **Cap** - https://github.com/CapSoftware/Cap
@@ -1,288 +0,0 @@
|
|||
# Channel System
|
||||
|
||||
The Channel system connects external messaging platforms (Telegram, Discord, etc.) to the Hub's agent. Each platform is a **plugin** that translates platform-specific APIs into a unified interface.
|
||||
|
||||
> For media handling details (audio transcription, image/video description), see [media-handling.md](./media-handling.md).
|
||||
> For message flow across all three I/O paths (Desktop / Web / Channel), see [message-paths.md](../message-paths.md).
|
||||
|
||||
## Architecture
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────┐
|
||||
│ credentials.json5 │
|
||||
│ { channels: { telegram: { default: { botToken } } } } │
|
||||
└──────────────────────┬──────────────────────────────────────┘
|
||||
│ loadChannelsConfig()
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────────┐
|
||||
│ Channel Manager (manager.ts) │
|
||||
│ │
|
||||
│ startAll() → iterate plugins → startAccount() per account │
|
||||
│ ensureSubscribed() → listen for agent lifecycle events │
|
||||
│ │
|
||||
│ Incoming: │
|
||||
│ routeIncoming() → 👀 ack + debouncer → agent.write() │
|
||||
│ Outgoing: │
|
||||
│ activeRoute → aggregator → plugin.outbound.*() │
|
||||
│ │
|
||||
│ State: │
|
||||
│ pendingRoutes[] ─(FIFO)→ activeRoute + activeAcks │
|
||||
│ ackBuffer[] ─(snapshot on flush)→ pendingRoutes[].acks │
|
||||
└──────────┬──────────────────────────────────────────────────┘
|
||||
│
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────────┐
|
||||
│ InboundDebouncer (inbound-debouncer.ts) │
|
||||
│ 500ms idle window / 2000ms hard cap per conversationId │
|
||||
│ Each flush → snapshot route + acks → agent.write() │
|
||||
└──────────┬──────────────────────────────────────────────────┘
|
||||
│
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────────┐
|
||||
│ Plugin Registry (registry.ts) │
|
||||
│ registerChannel(plugin) / listChannels() / getChannel(id) │
|
||||
└──────────┬──────────────────────────────────────────────────┘
|
||||
│
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────────┐
|
||||
│ Channel Plugins (e.g. telegram.ts) │
|
||||
│ │
|
||||
│ config — resolve account credentials │
|
||||
│ gateway — receive messages (polling / webhook) │
|
||||
│ outbound — send replies, typing, reactions (👀 ack) │
|
||||
│ downloadMedia() — download media files to local disk │
|
||||
└─────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
## Plugin Interface
|
||||
|
||||
Each channel plugin implements `ChannelPlugin` (defined in `types.ts`):
|
||||
|
||||
```typescript
interface ChannelPlugin {
  readonly id: string;                         // "telegram", "discord", etc.
  readonly meta: { name: string; description: string };
  readonly chunkerConfig?: BlockChunkerConfig; // override text chunking per platform
  readonly config: ChannelConfigAdapter;       // credential resolution
  readonly gateway: ChannelGatewayAdapter;     // receive messages
  readonly outbound: ChannelOutboundAdapter;   // send replies
  downloadMedia?(fileId: string, accountId: string): Promise<string>; // optional
}
```
|
||||
|
||||
### Three Adapters
|
||||
|
||||
| Adapter | Role | Key Methods |
|---------|------|-------------|
| **config** | Resolve credentials from `credentials.json5` | `listAccountIds()`, `resolveAccount()`, `isConfigured()` |
| **gateway** | Receive inbound messages from the platform | `start(accountId, config, onMessage, signal)` |
| **outbound** | Send replies back to the platform | `sendText()`, `replyText()`, `sendTyping?()`, `addReaction?()`, `removeReaction?()` |
|
||||
|
||||
### downloadMedia (optional)
|
||||
|
||||
Platforms that support media (voice, image, video, document) implement `downloadMedia()` to download files to `~/.super-multica/cache/media/` with UUID filenames. The Manager calls this before processing media.
|
||||
|
||||
## Message Flow
|
||||
|
||||
### Inbound (Platform → Agent)
|
||||
|
||||
```
|
||||
User sends message in Telegram
|
||||
→ grammy long-polling → onMessage callback
|
||||
→ ChannelManager.routeIncoming()
|
||||
1. Update lastRoute (reply target)
|
||||
2. Start typing indicator (repeats every 5s)
|
||||
3. Add 👀 reaction to this message (ack)
|
||||
4. Push ack route to ackBuffer
|
||||
5. If media: routeMedia() → download → transcribe/describe → text
|
||||
6. Push text into InboundDebouncer
|
||||
|
||||
InboundDebouncer (per conversationId):
|
||||
┌─ 500ms idle window: wait for more messages
|
||||
│ If another message arrives within 500ms, reset timer and append
|
||||
│ If 2000ms since first message, force-flush immediately
|
||||
└─ On flush:
|
||||
1. Snapshot lastRoute → route
|
||||
2. Snapshot ackBuffer → acks, clear buffer
|
||||
3. Push { route, acks } to pendingRoutes queue
|
||||
4. Call agent.write(combinedText)
|
||||
```
|
||||
|
||||
All media is converted to text before the agent sees it. See [media-handling.md](./media-handling.md) for details.
|
||||
|
||||
### Outbound (Agent → Platform)
|
||||
|
||||
```
|
||||
agent.write() queued → agent.run() starts
|
||||
→ agent_start event
|
||||
1. Shift entry from pendingRoutes queue
|
||||
2. Set activeRoute = entry.route (stable for entire run)
|
||||
3. Set activeAcks = entry.acks
|
||||
|
||||
→ message_start (assistant)
|
||||
1. Create MessageAggregator wired to activeRoute
|
||||
→ message_update (assistant)
|
||||
1. Feed text deltas to aggregator
|
||||
→ message_end (assistant)
|
||||
1. Aggregator flushes final block, then null out
|
||||
(May repeat if agent does multi-turn tool calls)
|
||||
|
||||
→ Aggregator emits BlockReply chunks:
|
||||
Block 0: plugin.outbound.replyText() // reply to original message
|
||||
Block N: plugin.outbound.sendText() // follow-up messages
|
||||
|
||||
→ agent_end event
|
||||
1. Remove 👀 from all activeAcks messages
|
||||
2. Clear activeRoute and activeAcks
|
||||
3. If pendingRoutes is empty → stop typing
|
||||
If more pending → keep typing for next run
|
||||
```
|
||||
|
||||
The **MessageAggregator** buffers streaming LLM output and splits it into blocks at natural text boundaries (paragraphs, code blocks). This is necessary because messaging platforms cannot consume raw streaming deltas.
|
||||
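The buffering idea can be sketched as follows. This is not the code in `src/hub/message-aggregator.ts`; it is a simplified stand-in that splits only at blank lines outside fenced code blocks, and the class name `BlockAggregator` and its callback shape are assumptions.

```typescript
// Sketch only: a simplified aggregator, not the real MessageAggregator.
const FENCE = "`".repeat(3); // Markdown code fence marker

type EmitBlock = (text: string, index: number) => void;

class BlockAggregator {
  private buffer = "";
  private index = 0;
  private readonly emit: EmitBlock;

  constructor(emit: EmitBlock) {
    this.emit = emit;
  }

  /** Feed one streaming delta; emits every completed paragraph block. */
  push(delta: string): void {
    this.buffer += delta;
    let cut = this.findBoundary();
    while (cut !== -1) {
      this.emitBlock(this.buffer.slice(0, cut));
      this.buffer = this.buffer.slice(cut).replace(/^\n+/, "");
      cut = this.findBoundary();
    }
  }

  /** Emit whatever is left, e.g. when message_end arrives. */
  flush(): void {
    this.emitBlock(this.buffer);
    this.buffer = "";
  }

  private emitBlock(text: string): void {
    const trimmed = text.trim();
    if (trimmed.length > 0) this.emit(trimmed, this.index++);
  }

  /** Index of the first blank line that is not inside a code fence, or -1. */
  private findBoundary(): number {
    let from = 0;
    for (;;) {
      const i = this.buffer.indexOf("\n\n", from);
      if (i === -1) return -1;
      const fencesBefore = this.buffer.slice(0, i).split(FENCE).length - 1;
      if (fencesBefore % 2 === 0) return i; // an even count means we are outside a fence
      from = i + 1;
    }
  }
}
```

With this shape, the caller can map block index 0 to `replyText()` and every later index to `sendText()`, matching the outbound flow above.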
|
||||
## Route Queue Pattern
|
||||
|
||||
The channel system uses a FIFO queue to correctly route replies when multiple messages arrive while the agent is busy. This solves the "reply-to mismatch" problem where rapid-fire messages would cause replies to target the wrong original message.
|
||||
|
||||
### State Fields
|
||||
|
||||
| Field | Type | Purpose |
|-------|------|---------|
| `lastRoute` | `LastRoute \| null` | Where the most recent channel message came from. Updated on every incoming message. |
| `pendingRoutes` | `{ route, acks }[]` | FIFO queue of snapshotted routes, one per debouncer flush. Dequeued on `agent_start`. |
| `activeRoute` | `LastRoute \| null` | Route for the currently running agent. Set on `agent_start`, cleared on `agent_end`. Stable across all turns within one run. |
| `ackBuffer` | `LastRoute[]` | Accumulates 👀 ack targets between debouncer flushes. Snapshotted and cleared on each flush. |
| `activeAcks` | `LastRoute[]` | All messages with 👀 in the current run. Cleaned up on `agent_end`. |
|
||||
|
||||
### Lifecycle
|
||||
|
||||
```
|
||||
Message A arrives → lastRoute = A, ackBuffer = [A], 👀 on A
|
||||
Message B arrives (50ms) → lastRoute = B, ackBuffer = [A, B], 👀 on B
|
||||
─── 500ms idle ───
|
||||
Debouncer flushes → pendingRoutes.push({ route: B, acks: [A, B] })
|
||||
ackBuffer = [], agent.write("A\nB")
|
||||
|
||||
Message C arrives → lastRoute = C, ackBuffer = [C], 👀 on C
|
||||
─── 500ms idle ───
|
||||
Debouncer flushes → pendingRoutes.push({ route: C, acks: [C] })
|
||||
ackBuffer = [], agent.write("C")
|
||||
|
||||
agent_start (run 1) → activeRoute = B, activeAcks = [A, B]
|
||||
(agent processes "A\nB", replies to message B)
|
||||
agent_end (run 1) → remove 👀 from A and B, pendingRoutes still has 1 → keep typing
|
||||
|
||||
agent_start (run 2) → activeRoute = C, activeAcks = [C]
|
||||
(agent processes "C", replies to message C)
|
||||
agent_end (run 2) → remove 👀 from C, pendingRoutes empty → stop typing
|
||||
```
|
||||
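The lifecycle above can be replayed with a small state holder. This is a sketch of the pattern only, not the code in `manager.ts`: the `Route` type is a stand-in for `LastRoute`, and the class and method names are hypothetical.

```typescript
// Sketch only: models the five state fields and the four transitions.
interface Route {
  messageId: string; // stand-in for the real LastRoute fields
}

interface PendingEntry {
  route: Route;
  acks: Route[];
}

class RouteQueue {
  lastRoute: Route | null = null;
  activeRoute: Route | null = null;
  activeAcks: Route[] = [];
  private ackBuffer: Route[] = [];
  private pendingRoutes: PendingEntry[] = [];

  /** Every incoming message updates lastRoute and records an ack target. */
  onIncoming(route: Route): void {
    this.lastRoute = route;
    this.ackBuffer.push(route);
  }

  /** Debouncer flush: snapshot route and acks, then clear the buffer. */
  onFlush(): void {
    if (this.lastRoute === null) return;
    this.pendingRoutes.push({ route: this.lastRoute, acks: this.ackBuffer });
    this.ackBuffer = [];
  }

  /** agent_start: dequeue the oldest snapshot and pin it for the whole run. */
  onAgentStart(): void {
    const entry = this.pendingRoutes.shift();
    if (entry === undefined) return;
    this.activeRoute = entry.route;
    this.activeAcks = entry.acks;
  }

  /** agent_end: report which acks to remove and whether typing should stop. */
  onAgentEnd(): { clearAcks: Route[]; stopTyping: boolean } {
    const clearAcks = this.activeAcks;
    this.activeRoute = null;
    this.activeAcks = [];
    return { clearAcks, stopTyping: this.pendingRoutes.length === 0 };
  }
}
```

Pinning the snapshot at `agent_start` is what keeps run 1 replying to message B even though message C has already moved `lastRoute`.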
|
||||
### Why agent_start / agent_end (not message_end)
|
||||
|
||||
In multi-turn agent runs (e.g. when the agent uses tools), `message_end` fires once per assistant message — potentially multiple times per `agent.run()`. Using `message_end` for state management would:
|
||||
- Clear `activeRoute` mid-run, causing the next turn's aggregator to pick up the wrong route
|
||||
- Remove 👀 too early (before the agent is actually done)
|
||||
- Stop typing between tool-call turns
|
||||
|
||||
`agent_start` and `agent_end` fire exactly once per `agent.run()`, making them the correct lifecycle boundaries.
|
||||
|
||||
### lastRoute vs activeRoute
|
||||
|
||||
- **`lastRoute`** — global, updated on every incoming message. Used for: typing indicators, error reporting, creating aggregators when no activeRoute exists.
|
||||
- **`activeRoute`** — per-run, set from queue on `agent_start`. Used for: reply targeting via aggregator. Guarantees that a run's reply goes to the correct message even if new messages arrive during processing.
|
||||
|
||||
Desktop and Web always receive agent events independently via their own mechanisms (IPC / Gateway). `clearLastRoute()` is called when a desktop/web message arrives to prevent channel forwarding.
|
||||
|
||||
## Inbound Debouncer
|
||||
|
||||
The `InboundDebouncer` (`inbound-debouncer.ts`) batches rapid-fire messages from the same conversation into a single `agent.write()` call. This prevents the agent from processing incomplete thoughts when users send multiple short messages quickly.
|
||||
|
||||
**Parameters:**
|
||||
- `delayMs` (default 500ms) — idle window: how long to wait after each message before flushing
|
||||
- `maxWaitMs` (default 2000ms) — hard cap: max time since first message before force-flushing
|
||||
|
||||
**Behavior:**
- Messages that arrive within 500ms of the previous one are combined with newlines, until the 2000ms hard cap forces a flush
- Messages more than 500ms apart get independent flushes and separate agent runs
- No busy-awareness: each flush is independent of the agent's state
- Each flush snapshots the route (`lastRoute` + `ackBuffer`) and pushes it onto the `pendingRoutes` queue
|
||||
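The interaction between the idle window and the hard cap is easiest to see on a timeline. The sketch below is a pure simulation for a single conversation, not the timer-based `InboundDebouncer` itself; `simulateFlushes` and the `Incoming` type are hypothetical names.

```typescript
// Sketch only: replays timestamps instead of using real timers.
interface Incoming {
  at: number; // arrival time in ms, ascending
  text: string;
}

/** Returns the combined text of each flush for one conversation. */
function simulateFlushes(
  messages: Incoming[],
  delayMs: number = 500,
  maxWaitMs: number = 2000,
): string[] {
  const flushes: string[] = [];
  let batch: string[] = [];
  let firstAt = 0; // arrival time of the first message in the current batch
  let flushAt = 0; // time at which the current batch would flush
  for (const msg of messages) {
    // The pending batch has already flushed by the time this message arrives.
    if (batch.length > 0 && msg.at >= flushAt) {
      flushes.push(batch.join("\n"));
      batch = [];
    }
    if (batch.length === 0) firstAt = msg.at;
    batch.push(msg.text);
    // Idle window restarts on each message, but never past the hard cap.
    flushAt = Math.min(msg.at + delayMs, firstAt + maxWaitMs);
  }
  if (batch.length > 0) flushes.push(batch.join("\n"));
  return flushes;
}
```

For example, messages at 0ms, 50ms, and 1000ms produce two flushes, while a steady stream with 400ms gaps is cut at the 2000ms mark.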
|
||||
## Typing and Reaction Lifecycle
|
||||
|
||||
### Typing Indicator
|
||||
- **Start:** `routeIncoming()` — starts a 5s repeating interval (Telegram requires re-sending "typing" every 5s)
|
||||
- **Stop:** `agent_end` — only if `pendingRoutes` is empty (all queued runs complete). If runs remain queued, typing persists.
|
||||
- **Also stops on:** `clearLastRoute()` (desktop/web message), `stopAccount()`, `stopAll()`, `agent_error`
|
||||
|
||||
### 👀 Ack Reaction
|
||||
- **Add:** `routeIncoming()` — immediately on each message, before debouncing
|
||||
- **Track:** pushed to `ackBuffer`, then snapshotted into `pendingRoutes[].acks` on debouncer flush, then moved to `activeAcks` on `agent_start`
|
||||
- **Remove:** `agent_end` — iterates `activeAcks` and removes 👀 from each message
|
||||
- **Also removed on:** `agent_error`
|
||||
|
||||
This ensures every queued message shows 👀 while waiting, and all 👀 are cleaned up precisely when the agent finishes processing that batch.
|
||||
|
||||
## Configuration
|
||||
|
||||
Channel credentials are stored in `~/.super-multica/credentials.json5` under the `channels` key:
|
||||
|
||||
```json5
{
  channels: {
    telegram: {
      default: {
        botToken: "123456:ABC-DEF..."
      }
    },
    // discord: { default: { botToken: "..." } },
  }
}
```
|
||||
|
||||
Each channel ID maps to accounts (keyed by account ID, typically `"default"`). The config adapter for each plugin knows how to extract and validate its credentials.
|
||||
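Assuming the file has already been parsed into a plain object, the channel → account → credentials nesting can be walked like this. The helper names `accountIdsFor` and `telegramBotToken` are hypothetical and are not the real config adapter methods.

```typescript
// Sketch only: operates on an already-parsed `channels` object.
type AccountConfig = Record<string, unknown>;
type ChannelsConfig = Record<string, Record<string, AccountConfig> | undefined>;

/** List the account IDs configured for one channel (empty if the channel is absent). */
function accountIdsFor(channels: ChannelsConfig, channelId: string): string[] {
  return Object.keys(channels[channelId] ?? {});
}

/** Read the Telegram bot token for an account; null if missing or not a string. */
function telegramBotToken(channels: ChannelsConfig, accountId: string = "default"): string | null {
  const accounts = channels["telegram"];
  const account = accounts === undefined ? undefined : accounts[accountId];
  const token = account === undefined ? undefined : account["botToken"];
  return typeof token === "string" && token.length > 0 ? token : null;
}
```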
|
||||
## Adding a New Plugin
|
||||
|
||||
1. Create `src/channels/plugins/<name>.ts` implementing `ChannelPlugin`
2. Register it in `src/channels/index.ts`:
   ```typescript
   import { <name>Channel } from "./plugins/<name>.js";
   registerChannel(<name>Channel);
   ```
3. Add the config shape to the `channels` section of `credentials.json5`
|
||||
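For orientation, a registry with the three functions named in this document could look like the sketch below. It is an assumption about the shape, not the contents of `registry.ts`: the `MinimalPlugin` type is a cut-down stand-in for `ChannelPlugin`, and rejecting duplicate IDs is a design choice made for the example.

```typescript
// Sketch only: a Map-backed registry keyed by plugin id.
interface MinimalPlugin {
  readonly id: string;
  readonly meta: { name: string; description: string };
}

const registry = new Map<string, MinimalPlugin>();

/** Register a plugin. Assumption: duplicate ids are treated as a programming error. */
function registerChannel(plugin: MinimalPlugin): void {
  if (registry.has(plugin.id)) {
    throw new Error(`Channel already registered: ${plugin.id}`);
  }
  registry.set(plugin.id, plugin);
}

/** All registered plugins, in registration order. */
function listChannels(): MinimalPlugin[] {
  return Array.from(registry.values());
}

/** Look up one plugin by id. */
function getChannel(id: string): MinimalPlugin | undefined {
  return registry.get(id);
}
```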
|
||||
### Implementation Checklist
|
||||
|
||||
- [ ] `config` adapter: parse credentials from `credentials.json5`
|
||||
- [ ] `gateway` adapter: connect to platform, normalize messages to `ChannelMessage`
|
||||
- [ ] `outbound` adapter: `sendText`, `replyText`, optional `sendTyping`, `addReaction`, `removeReaction`
|
||||
- [ ] `downloadMedia` (if platform supports media): download to `MEDIA_CACHE_DIR`
|
||||
- [ ] Group filtering: only respond to messages directed at the bot
|
||||
- [ ] Graceful shutdown: respect the `AbortSignal` passed to `gateway.start()`
|
||||
|
||||
## File Map
|
||||
|
||||
| File | Role |
|------|------|
| `src/channels/types.ts` | All type definitions (`ChannelPlugin`, `ChannelMessage`, `DeliveryContext`, etc.) |
| `src/channels/manager.ts` | `ChannelManager`: bridges plugins to the Hub's agent; route queue, typing/ack lifecycle |
| `src/channels/inbound-debouncer.ts` | `InboundDebouncer`: batches rapid-fire messages per conversationId |
| `src/channels/registry.ts` | Plugin registry (`registerChannel`, `listChannels`, `getChannel`) |
| `src/channels/config.ts` | Load channel config from `credentials.json5` |
| `src/channels/index.ts` | Bootstrap: register built-in plugins, re-export public API |
| `src/channels/plugins/telegram.ts` | Telegram plugin (grammy, long polling) |
| `src/channels/plugins/telegram-format.ts` | Markdown → Telegram HTML converter |
| `src/media/transcribe.ts` | Audio transcription (local whisper → OpenAI API) |
| `src/media/describe-image.ts` | Image description (OpenAI Vision API) |
| `src/media/describe-video.ts` | Video description (ffmpeg frame + Vision API) |
| `src/shared/paths.ts` | `MEDIA_CACHE_DIR` path constant |
| `src/hub/message-aggregator.ts` | Streaming text → block chunking for channel delivery |
| `packages/ui/src/components/message-list.tsx` | UI rendering with `stripUserMetadata()` for clean display |
|
||||
|
||||
## Current Plugins
|
||||
|
||||
| Plugin | Platform | Transport | Library |
|--------|----------|-----------|---------|
| `telegram` | Telegram | Long polling | grammy |
|
||||
|
||||
Planned: Discord, Feishu, LINE, etc.
|
||||
|
|
@@ -1,161 +0,0 @@
|
|||
# Channel Media Handling
|
||||
|
||||
How multimedia messages (voice, image, video, document) from messaging platforms are processed before reaching the Agent.
|
||||
|
||||
## Core Principle
|
||||
|
||||
All media is converted to text before the Agent sees it. The Agent only ever receives plain text via `agent.write()`.
|
||||
|
||||
```
|
||||
Platform message (voice/image/video/doc)
|
||||
→ Plugin: detect type + download file
|
||||
→ Manager: convert to text (API transcription / vision description)
|
||||
→ Agent receives text via agent.write()
|
||||
```
|
||||
|
||||
## Reference Architecture (OpenClaw)
|
||||
|
||||
OpenClaw supports 6 platforms (Telegram, Discord, LINE, Signal, iMessage, Slack). All share the same media processing pipeline.
|
||||
|
||||
### Per-Platform Layer (different for each platform)
|
||||
|
||||
Each platform detects media type using its own API:
|
||||
|
||||
| Platform | Detection Method |
|----------|-----------------|
| Telegram | `msg.voice`, `msg.audio`, `msg.photo`, `msg.video`, `msg.document` |
| Discord | `attachment.content_type` MIME prefix (`audio/`, `image/`, `video/`) |
| LINE | `message.type` field (`"audio"`, `"image"`, `"video"`, `"file"`) |
| Signal | `attachment.contentType` MIME prefix |
| iMessage | `attachment.mime_type` MIME prefix |
| Slack | Any file attachment (MIME-based detection happens later) |
|
||||
|
||||
Each platform downloads the file using its own API, saves to local disk, and tags it:
|
||||
- `<media:audio>` for voice/audio
|
||||
- `<media:image>` for images
|
||||
- `<media:video>` for video
|
||||
- `<media:document>` for files
|
||||
|
||||
### Shared Layer (`applyMediaUnderstanding()`)
|
||||
|
||||
One function handles all conversions, called automatically before the Agent sees the message:
|
||||
|
||||
1. Reads local file path + MIME type
|
||||
2. Selects conversion method based on type:
|
||||
- **audio** → transcription (whisper local / OpenAI API / Groq / Deepgram / Google)
|
||||
- **image** → vision model description (Gemini / OpenAI / Anthropic)
|
||||
- **video** → vision model description
|
||||
3. Replaces placeholder with formatted text:
|
||||
- Audio: `[Audio]\nTranscript:\n<transcribed text>`
|
||||
- Image: `[Image]\nDescription:\n<description text>`
|
||||
4. If conversion fails (no provider configured), the raw placeholder stays in the message
|
||||
|
||||
### Transcription Provider Priority
|
||||
|
||||
Auto-detection order:
|
||||
1. sherpa-onnx-offline (local)
|
||||
2. whisper-cli / whisper.cpp (local)
|
||||
3. whisper Python CLI (local)
|
||||
4. gemini CLI (local)
|
||||
5. API providers: OpenAI → Groq → Deepgram → Google
|
||||
|
||||
### Skill Integration
|
||||
|
||||
Whisper skills declare requirements in `SKILL.md` metadata:
|
||||
```yaml
|
||||
requires:
|
||||
bins: ["whisper"] # must exist in PATH
|
||||
```
|
||||
|
||||
If the binary is missing, the skill is filtered out — the Agent never sees it. If present, the Agent can use it for transcription.
|
||||
|
||||
---
|
||||
|
||||
## Our Implementation
|
||||
|
||||
All media is converted to text in the Manager layer (`routeMedia()`) before reaching the Agent, matching OpenClaw's `applyMediaUnderstanding()` pattern.
|
||||
|
||||
### Architecture
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────┐
|
||||
│ Platform Plugin (e.g. telegram.ts) │
|
||||
│ │
|
||||
│ bot.on("message:voice") → detect type │
|
||||
│ bot.api.getFile() → download to local disk │
|
||||
│ Emit ChannelMessage with media attachment │
|
||||
└──────────────────┬──────────────────────────────────┘
|
||||
│
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────┐
|
||||
│ Channel Manager (manager.ts → routeMedia()) │
|
||||
│ │
|
||||
│ Download file via plugin.downloadMedia() │
|
||||
│ audio → transcribeAudio() → text │
|
||||
│ image → describeImage() → text │
|
||||
│ video → describeVideo() (ffmpeg frame + vision) → text │
|
||||
│ document → file path info │
|
||||
│ All results → agent.write(text) │
|
||||
└──────────────────┬──────────────────────────────────┘
|
||||
│
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────┐
|
||||
│ Agent receives plain text only │
|
||||
│ e.g. "[Voice Message]\nTranscript: ..." │
|
||||
│ e.g. "[Image]\nDescription: ..." │
|
||||
│ e.g. "[Video]\nDescription: ..." │
|
||||
└─────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Media Processing Modules
|
||||
|
||||
| Type | Module | Method | API |
|------|--------|--------|-----|
| audio | `src/media/transcribe.ts` | `transcribeAudio()` | Local whisper/whisper-cli → OpenAI Whisper API (`whisper-1`) |
| image | `src/media/describe-image.ts` | `describeImage()` | OpenAI Vision API (`gpt-4o-mini`) |
| video | `src/media/describe-video.ts` | `describeVideo()` | ffmpeg frame extraction + Vision API |
| document | (inline in manager) | - | File path info only |
|
||||
|
||||
### Agent Output Format
|
||||
|
||||
| Type | Success | No API Key |
|------|---------|------------|
| audio | `[Voice Message]\nTranscript: <text>` | `[audio message received]\nFile: <path>` |
| image | `[Image]\nDescription: <text>` | `[image message received]\nFile: <path>` |
| video | `[Video]\nDescription: <text>` | `[video message received]\nFile: <path>` |
| document | `[document message received]\nFile: <path>` | same |
|
||||
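The output table can be read as a single formatting function. The sketch below is an illustration under that reading; `formatMediaText` is a hypothetical name, and `converted` is `null` whenever no transcription or description is available.

```typescript
// Sketch only: reproduces the strings from the output table.
type MediaType = "audio" | "image" | "video" | "document";

/** Build the text the agent receives for one media attachment. */
function formatMediaText(type: MediaType, filePath: string, converted: string | null): string {
  if (type !== "document" && converted !== null) {
    switch (type) {
      case "audio":
        return `[Voice Message]\nTranscript: ${converted}`;
      case "image":
        return `[Image]\nDescription: ${converted}`;
      case "video":
        return `[Video]\nDescription: ${converted}`;
    }
  }
  // Documents, and any type without a conversion result, fall back to the file path.
  return `[${type} message received]\nFile: ${filePath}`;
}
```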
|
||||
### Audio Transcription Priority
|
||||
|
||||
`transcribeAudio()` tries providers in order, matching OpenClaw's local-first approach:
|
||||
|
||||
1. **Local whisper/whisper-cli**: free, no network round trip, works offline. Detected via `which` and cached.
2. **OpenAI Whisper API** (`whisper-1`): requires an API key in `credentials.json5`.
3. **null**: no provider available. The placeholder stays in the message and the agent responds naturally (e.g. suggests installing whisper).
|
||||
|
||||
### Whisper Skill (Agent Fallback)
|
||||
|
||||
The `skills/whisper/SKILL.md` skill is a secondary safety net. If transcription returned null (no local binary, no API key), the agent receives a placeholder with the file path. If whisper is installed, the skill tells the agent how to transcribe it via the exec tool.
|
||||
|
||||
### File Map
|
||||
|
||||
| File | Role |
|------|------|
| `src/channels/types.ts` | `ChannelMediaAttachment`, `ChannelMessage.media`, `ChannelPlugin.downloadMedia` |
| `src/channels/plugins/telegram.ts` | Detect voice/audio/photo/video/document + download via the grammY API |
| `src/channels/manager.ts` | `routeMedia()`: download, convert, `agent.write(text)` |
| `src/media/transcribe.ts` | Audio → text (local whisper → OpenAI Whisper API) |
| `src/media/describe-image.ts` | Image → text via OpenAI Vision API (gpt-4o-mini) |
| `src/media/describe-video.ts` | Video → extract frame (ffmpeg) → text via Vision API |
| `src/shared/paths.ts` | `MEDIA_CACHE_DIR` (`~/.super-multica/cache/media/`) |
| `skills/whisper/SKILL.md` | Local whisper CLI fallback skill |
|
||||
|
||||
### Future Work
|
||||
|
||||
| Task | Scope |
|------|-------|
| Groq / Deepgram fallback for audio | `src/media/transcribe.ts` |
| Multi-provider vision support (Gemini, Anthropic) | `src/media/describe-image.ts` |
| Document text extraction (PDF, DOCX) | `src/media/` |
| Media cache cleanup (delete old files) | `src/shared/` |
| Outbound media (send images/audio back to channels) | `types.ts`, plugins |
|
||||
30
docs/cli.md
@@ -1,30 +0,0 @@
|
|||
# CLI
|
||||
|
||||
```bash
|
||||
multica # Interactive mode
|
||||
multica run "prompt" # Single prompt
|
||||
multica chat --profile my-agent # Use profile
|
||||
multica --session abc123 # Continue session
|
||||
multica session list # List sessions
|
||||
multica profile list # List profiles
|
||||
multica skills list # List skills
|
||||
multica help # Show help
|
||||
```
|
||||
|
||||
Short alias: `mu`
|
||||
|
||||
## Sessions
|
||||
|
||||
Sessions persist to `~/.super-multica/sessions/<id>/` with JSONL message history and JSON metadata. Context windows are automatically managed with token-aware compaction.
|
||||
|
||||
## Profiles
|
||||
|
||||
Profiles define agent identity, personality, and memory in `~/.super-multica/agent-profiles/<id>/`.
|
||||
|
||||
```bash
|
||||
multica profile new my-agent # Create profile
|
||||
multica profile list # List all
|
||||
multica profile edit my-agent # Open in file manager
|
||||
```
|
||||
|
||||
Profile files: `soul.md`, `user.md`, `workspace.md`, `memory.md`, `memory/*.md`
|
||||
|
|
@@ -1,338 +0,0 @@
|
|||
# Client Streaming Protocol
|
||||
|
||||
How clients receive real-time agent events via WebSocket (Gateway mode) or IPC (Desktop mode), and what data structures to use for rendering.
|
||||
|
||||
## Transport Overview
|
||||
|
||||
```
|
||||
Gateway mode (Web App):
|
||||
Client ←──WebSocket──→ Gateway ←──→ Hub ←──→ Agent
|
||||
|
||||
Desktop mode (Electron):
|
||||
Renderer ←──IPC──→ Main Process (Hub + Agent)
|
||||
```
|
||||
|
||||
Both transports deliver the same logical events. The client receives a `StreamPayload` envelope containing an event, and routes it to the store for rendering.
|
||||
|
||||
## StreamPayload Envelope
|
||||
|
||||
Every real-time event arrives wrapped in a `StreamPayload`:
|
||||
|
||||
```ts
|
||||
interface StreamPayload {
|
||||
streamId: string; // groups events belonging to the same assistant turn
|
||||
agentId: string; // which agent produced this event
|
||||
event: AgentEvent | CompactionEvent;
|
||||
}
|
||||
```
|
||||
|
||||
In Gateway mode, these arrive as Socket.IO messages with `action = "stream"`. In Desktop IPC mode, they arrive as `localChat:event` messages with the same structure.
|
||||
|
||||
## Event Types
|
||||
|
||||
### 1. Message Lifecycle Events (AgentEvent)
|
||||
|
||||
These events represent an LLM response being generated in real time.
|
||||
|
||||
#### `message_start`
|
||||
|
||||
A new assistant message has begun streaming.
|
||||
|
||||
```json
|
||||
{
|
||||
"streamId": "019abc12-...",
|
||||
"agentId": "019def34-...",
|
||||
"event": {
|
||||
"type": "message_start",
|
||||
"message": {
|
||||
"role": "assistant",
|
||||
"content": []
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Client action:** Create a new empty assistant message bubble. Use `streamId` as the message ID for subsequent updates.
|
||||
|
||||
#### `message_update`
|
||||
|
||||
Partial content has arrived for the current message.
|
||||
|
||||
```json
|
||||
{
|
||||
"streamId": "019abc12-...",
|
||||
"agentId": "019def34-...",
|
||||
"event": {
|
||||
"type": "message_update",
|
||||
"message": {
|
||||
"role": "assistant",
|
||||
"content": [
|
||||
{ "type": "text", "text": "Here is the partial response so far..." },
|
||||
{ "type": "thinking", "thinking": "Let me consider..." }
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Client action:** Replace the message's `content` array with the new snapshot. Each update contains the full accumulated content, not a delta.
|
||||
|
||||
#### `message_end`
|
||||
|
||||
The assistant message is complete.
|
||||
|
||||
```json
|
||||
{
|
||||
"streamId": "019abc12-...",
|
||||
"agentId": "019def34-...",
|
||||
"event": {
|
||||
"type": "message_end",
|
||||
"message": {
|
||||
"role": "assistant",
|
||||
"content": [
|
||||
{ "type": "text", "text": "Final complete response." }
|
||||
],
|
||||
"stopReason": "end_turn"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Client action:** Finalize the message. Mark streaming as complete. Extract `stopReason` if needed.
|
||||
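The three client actions can be sketched as one pure update function. The point to notice is that `message_update` replaces the content with the latest snapshot instead of appending to it. The type and function names below are assumptions for illustration and are trimmed down from the event shapes shown above.

```typescript
// Sketch only: a pure reducer for the three message lifecycle events.
interface Block {
  type: string;
  text?: string;
  thinking?: string;
}

interface LifecycleEvent {
  type: "message_start" | "message_update" | "message_end";
  message: { role: "assistant"; content: Block[]; stopReason?: string };
}

interface StreamedMessage {
  id: string; // the streamId doubles as the message id
  agentId: string;
  content: Block[];
  stopReason?: string;
}

interface StreamState {
  messages: StreamedMessage[];
  streamingIds: Set<string>;
}

function applyLifecycle(
  state: StreamState,
  streamId: string,
  agentId: string,
  event: LifecycleEvent,
): StreamState {
  const streamingIds = new Set(state.streamingIds);
  if (event.type === "message_start") {
    streamingIds.add(streamId);
    const created: StreamedMessage = { id: streamId, agentId, content: [] };
    return { messages: [...state.messages, created], streamingIds };
  }
  const isEnd = event.type === "message_end";
  const snapshot = event.message.content;
  const stopReason = event.message.stopReason;
  if (isEnd) streamingIds.delete(streamId);
  const messages = state.messages.map((m): StreamedMessage => {
    if (m.id !== streamId) return m;
    const updated: StreamedMessage = { ...m, content: snapshot }; // replace, never append
    if (isEnd && stopReason !== undefined) updated.stopReason = stopReason;
    return updated;
  });
  return { messages, streamingIds };
}
```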
|
||||
### 2. Tool Execution Events (AgentEvent)
|
||||
|
||||
These events track tool calls made by the assistant during a turn.
|
||||
|
||||
#### `tool_execution_start`
|
||||
|
||||
The agent has begun executing a tool.
|
||||
|
||||
```json
|
||||
{
|
||||
"streamId": "019abc12-...",
|
||||
"agentId": "019def34-...",
|
||||
"event": {
|
||||
"type": "tool_execution_start",
|
||||
"toolCallId": "toolu_01ABC...",
|
||||
"toolName": "Bash",
|
||||
"args": { "command": "ls -la" }
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Client action:** Create a tool result message with `toolStatus: "running"`. Display a spinner or loading indicator.
|
||||
|
||||
#### `tool_execution_end`
|
||||
|
||||
The tool has finished executing.
|
||||
|
||||
```json
|
||||
{
|
||||
"streamId": "019abc12-...",
|
||||
"agentId": "019def34-...",
|
||||
"event": {
|
||||
"type": "tool_execution_end",
|
||||
"toolCallId": "toolu_01ABC...",
|
||||
"result": "file1.txt\nfile2.txt\n",
|
||||
"isError": false
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Client action:** Update the matching tool result message. Set `toolStatus` to `"success"` or `"error"` based on `isError`. Render `result` as the tool output.
|
||||
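The start and end events pair up through `toolCallId`. A minimal sketch of that pairing follows; the `ToolMessage` type and `applyToolEvent` function are hypothetical names, and the field names follow the event payloads above.

```typescript
// Sketch only: tracks tool results by toolCallId.
type ToolStatus = "running" | "success" | "error" | "interrupted";

interface ToolMessage {
  toolCallId: string;
  toolName: string;
  toolArgs: Record<string, unknown>;
  toolStatus: ToolStatus;
  result?: unknown;
  isError?: boolean;
}

type ToolEvent =
  | { type: "tool_execution_start"; toolCallId: string; toolName: string; args: Record<string, unknown> }
  | { type: "tool_execution_end"; toolCallId: string; result: unknown; isError: boolean };

function applyToolEvent(tools: ToolMessage[], event: ToolEvent): ToolMessage[] {
  if (event.type === "tool_execution_start") {
    const started: ToolMessage = {
      toolCallId: event.toolCallId,
      toolName: event.toolName,
      toolArgs: event.args,
      toolStatus: "running",
    };
    return [...tools, started];
  }
  const { toolCallId, result, isError } = event;
  return tools.map((t): ToolMessage =>
    t.toolCallId !== toolCallId
      ? t
      : { ...t, toolStatus: isError ? "error" : "success", result, isError },
  );
}
```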
|
||||
### 3. Compaction Events (CompactionEvent)
|
||||
|
||||
These events notify the client when context window compaction occurs. They use a synthetic `streamId` of `compaction:{agentId}` and do not belong to any message stream.
|
||||
|
||||
#### `compaction_start`
|
||||
|
||||
Context compaction has begun. The agent is removing old messages to free up context window space.
|
||||
|
||||
```json
|
||||
{
|
||||
"streamId": "compaction:019def34-...",
|
||||
"agentId": "019def34-...",
|
||||
"event": {
|
||||
"type": "compaction_start"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Client action:** Show a compaction indicator (e.g., "Compacting context...").
|
||||
|
||||
#### `compaction_end`
|
||||
|
||||
Compaction is complete. Includes statistics about what was removed.
|
||||
|
||||
```json
|
||||
{
|
||||
"streamId": "compaction:019def34-...",
|
||||
"agentId": "019def34-...",
|
||||
"event": {
|
||||
"type": "compaction_end",
|
||||
"removed": 24,
|
||||
"kept": 8,
|
||||
"tokensRemoved": 45000,
|
||||
"tokensKept": 12000,
|
||||
"reason": "tokens"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
| Field | Type | Description |
|-------|------|-------------|
| `removed` | `number` | Number of messages removed |
| `kept` | `number` | Number of messages retained |
| `tokensRemoved` | `number?` | Estimated tokens freed (absent in count mode) |
| `tokensKept` | `number?` | Estimated tokens remaining (absent in count mode) |
| `reason` | `string` | What triggered compaction: `"tokens"`, `"count"`, or `"summary"` |
|
||||
|
||||
**Client action:** Hide the compaction indicator. Optionally display a toast or inline notice with the stats.
|
||||
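Because compaction events use the synthetic `compaction:{agentId}` stream ID, a client can recognize them before touching any message state. The sketch below shows one way to do that; `isCompactionStream`, `applyCompaction`, and the state shape are assumptions for illustration.

```typescript
// Sketch only: handles the two compaction events separately from message streams.
interface CompactionStats {
  removed: number;
  kept: number;
  tokensRemoved?: number;
  tokensKept?: number;
  reason: string;
}

type CompactionEvent =
  | { type: "compaction_start" }
  | ({ type: "compaction_end" } & CompactionStats);

interface CompactionState {
  compacting: boolean;
  lastCompaction: CompactionStats | null;
}

/** True when the payload belongs to the synthetic compaction stream of this agent. */
function isCompactionStream(streamId: string, agentId: string): boolean {
  return streamId === `compaction:${agentId}`;
}

function applyCompaction(state: CompactionState, event: CompactionEvent): CompactionState {
  if (event.type === "compaction_start") {
    return { compacting: true, lastCompaction: state.lastCompaction };
  }
  const stats: CompactionStats = { removed: event.removed, kept: event.kept, reason: event.reason };
  // Token counts are absent in count mode, so copy them only when present.
  if (event.tokensRemoved !== undefined) stats.tokensRemoved = event.tokensRemoved;
  if (event.tokensKept !== undefined) stats.tokensKept = event.tokensKept;
  return { compacting: false, lastCompaction: stats };
}
```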
|
||||
## Content Block Types
|
||||
|
||||
Message content is an array of `ContentBlock`, which is a union of:
|
||||
|
||||
```ts
|
||||
// Plain text
|
||||
interface TextContent {
|
||||
type: "text";
|
||||
text: string;
|
||||
}
|
||||
|
||||
// LLM reasoning (extended thinking)
|
||||
interface ThinkingContent {
|
||||
type: "thinking";
|
||||
thinking: string;
|
||||
}
|
||||
|
||||
// Tool invocation (appears in assistant messages)
|
||||
interface ToolCall {
|
||||
type: "toolCall";
|
||||
id: string;
|
||||
name: string;
|
||||
arguments: Record<string, unknown>;
|
||||
}
|
||||
|
||||
// Image content (appears in user messages)
|
||||
interface ImageContent {
|
||||
type: "image";
|
||||
source: { type: "base64"; media_type: string; data: string };
|
||||
}
|
||||
```
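
Because every block carries a `type` discriminant, a renderer can narrow the union with a `switch`. The sketch below is only an illustration: `previewText` is a hypothetical helper (not part of `@multica/sdk`), and the types are re-declared locally so the snippet stands alone.

```ts
// Local copies of the block types documented above, so the sketch is self-contained.
interface TextContent { type: "text"; text: string }
interface ThinkingContent { type: "thinking"; thinking: string }
interface ToolCall { type: "toolCall"; id: string; name: string; arguments: Record<string, unknown> }
interface ImageContent { type: "image"; source: { type: "base64"; media_type: string; data: string } }
type ContentBlock = TextContent | ThinkingContent | ToolCall | ImageContent;

// Hypothetical helper: collapse a block array into a one-line preview string.
function previewText(blocks: ContentBlock[]): string {
  const parts: string[] = [];
  for (const block of blocks) {
    switch (block.type) {
      case "text":
        parts.push(block.text);
        break;
      case "thinking":
        // Reasoning is assumed to be rendered elsewhere, so it is skipped here.
        break;
      case "toolCall":
        parts.push(`[tool: ${block.name}]`);
        break;
      case "image":
        parts.push(`[image: ${block.source.media_type}]`);
        break;
    }
  }
  return parts.join(" ");
}
```

The placeholder strings for tool calls and images are arbitrary choices; a real UI would render dedicated components instead.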
|
||||
|
||||
## Client-Side Store Structure
|
||||
|
||||
The recommended Zustand store shape for rendering:
|
||||
|
||||
```ts
|
||||
interface Message {
|
||||
id: string;
|
||||
role: "user" | "assistant" | "toolResult";
|
||||
content: ContentBlock[];
|
||||
agentId: string;
|
||||
stopReason?: string;
|
||||
// Tool result fields (role === "toolResult" only)
|
||||
toolCallId?: string;
|
||||
toolName?: string;
|
||||
toolArgs?: Record<string, unknown>;
|
||||
toolStatus?: "running" | "success" | "error" | "interrupted";
|
||||
isError?: boolean;
|
||||
}
|
||||
|
||||
interface CompactionStats {
|
||||
removed: number;
|
||||
kept: number;
|
||||
tokensRemoved?: number;
|
||||
tokensKept?: number;
|
||||
reason: string;
|
||||
}
|
||||
|
||||
interface MessagesState {
|
||||
messages: Message[];
|
||||
streamingIds: Set<string>; // IDs of messages currently streaming
|
||||
compacting: boolean; // true while compaction is in progress
|
||||
lastCompaction: CompactionStats | null; // stats from most recent compaction
|
||||
}
|
||||
```
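
The two compaction fields can be driven by a pair of pure state transitions. This is a minimal sketch, assuming the store exposes something like `startCompaction` and `endCompaction`; the function bodies and the `describeCompaction` helper are illustrative and are not taken from `@multica/store`.

```ts
interface CompactionStats {
  removed: number;
  kept: number;
  tokensRemoved?: number;
  tokensKept?: number;
  reason: string;
}

// Only the compaction-related slice of MessagesState is modeled here.
interface CompactionState {
  compacting: boolean;
  lastCompaction: CompactionStats | null;
}

// compaction_start: raise the flag, keep the previous stats.
function startCompaction(state: CompactionState): CompactionState {
  return { ...state, compacting: true };
}

// compaction_end: lower the flag and record the new stats.
function endCompaction(state: CompactionState, stats: CompactionStats): CompactionState {
  return { ...state, compacting: false, lastCompaction: stats };
}

// Hypothetical helper for a toast or inline notice.
function describeCompaction(stats: CompactionStats): string {
  const base = `Compacted ${stats.removed} messages, kept ${stats.kept}`;
  // Token estimates are optional (absent in count mode).
  return stats.tokensRemoved === undefined
    ? base
    : `${base} (~${stats.tokensRemoved} tokens freed)`;
}
```

Keeping the transitions pure makes them easy to wrap in a Zustand `set` call or to unit-test in isolation.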
|
||||
|
||||
## Event Routing Pseudocode
|
||||
|
||||
```ts
|
||||
function handleStreamEvent(payload: StreamPayload) {
|
||||
const { streamId, agentId, event } = payload;
|
||||
|
||||
switch (event.type) {
|
||||
case "message_start":
|
||||
store.startStream(streamId, agentId);
|
||||
break;
|
||||
case "message_update":
|
||||
store.appendStream(streamId, event.message.content);
|
||||
break;
|
||||
case "message_end":
|
||||
store.endStream(streamId, event.message.content, event.message.stopReason);
|
||||
break;
|
||||
case "tool_execution_start":
|
||||
store.startToolExecution(agentId, event.toolCallId, event.toolName, event.args);
|
||||
break;
|
||||
case "tool_execution_end":
|
||||
store.endToolExecution(event.toolCallId, event.result, event.isError);
|
||||
break;
|
||||
case "compaction_start":
|
||||
store.startCompaction();
|
||||
break;
|
||||
case "compaction_end":
|
||||
store.endCompaction({
|
||||
removed: event.removed,
|
||||
kept: event.kept,
|
||||
tokensRemoved: event.tokensRemoved,
|
||||
tokensKept: event.tokensKept,
|
||||
reason: event.reason,
|
||||
});
|
||||
break;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Message History via RPC
|
||||
|
||||
Clients can also fetch historical messages using the `getAgentMessages` RPC method. See [rpc.md](./rpc.md) for details.
|
||||
|
||||
The response returns `AgentMessage[]` which must be normalized into the `Message` format above. Key differences from streaming:
|
||||
|
||||
- Historical messages don't have `toolStatus` — infer it from `isError` (`"error"` or `"success"`).
|
||||
- Historical messages may have `content` as a plain `string` instead of `ContentBlock[]` — normalize by wrapping in `[{ type: "text", text: content }]`.
|
||||
- Tool arguments are not stored on `toolResult` messages — build a lookup map from assistant `ToolCall` blocks by `toolCallId` to reconstruct `toolArgs`.
|
||||
|
||||
## SDK Imports
|
||||
|
||||
All types are available from `@multica/sdk`:
|
||||
|
||||
```ts
|
||||
import {
|
||||
StreamAction,
|
||||
type StreamPayload,
|
||||
type AgentEvent,
|
||||
type CompactionEvent,
|
||||
type CompactionStartEvent,
|
||||
type CompactionEndEvent,
|
||||
type ContentBlock,
|
||||
type TextContent,
|
||||
type ThinkingContent,
|
||||
type ToolCall,
|
||||
type ImageContent,
|
||||
} from "@multica/sdk";
|
||||
```
|
||||
|
||||
Store types are available from `@multica/store`:
|
||||
|
||||
```ts
|
||||
import {
|
||||
useMessagesStore,
|
||||
type Message,
|
||||
type CompactionStats,
|
||||
type ToolStatus,
|
||||
} from "@multica/store";
|
||||
```
|
||||
|
|
@ -1,64 +0,0 @@
|
|||
# Credentials & LLM Providers
|
||||
|
||||
## Setup
|
||||
|
||||
```bash
|
||||
multica credentials init
|
||||
```
|
||||
|
||||
Creates:
|
||||
- `~/.super-multica/credentials.json5` — LLM providers + tools
|
||||
|
||||
Example `credentials.json5`:
|
||||
|
||||
```json5
|
||||
{
|
||||
version: 1,
|
||||
llm: {
|
||||
provider: "openai",
|
||||
providers: {
|
||||
openai: { apiKey: "sk-xxx", model: "gpt-4o" }
|
||||
}
|
||||
},
|
||||
tools: {
|
||||
brave: { apiKey: "brv-..." }
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Skill API Keys
|
||||
|
||||
Skill-specific API keys are stored in `.env` files within each skill's directory:
|
||||
|
||||
```
|
||||
~/.super-multica/skills/<skill-id>/.env
|
||||
```
|
||||
|
||||
Example for the `earnings-analysis` skill:
|
||||
|
||||
```bash
|
||||
# ~/.super-multica/skills/earnings-analysis/.env
|
||||
FINANCIAL_DATASETS_API_KEY=your-key-here
|
||||
```
|
||||
|
||||
Skills declare their required environment variables in `SKILL.md` frontmatter:
|
||||
|
||||
```yaml
|
||||
metadata:
|
||||
requires:
|
||||
env:
|
||||
- FINANCIAL_DATASETS_API_KEY
|
||||
```
|
||||
|
||||
The `.env` file is preserved across skill upgrades and is never committed to version control.
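
As an illustration of how the declared requirements relate to the `.env` file, the sketch below checks which required variables are missing. It is not the loader Multica uses: `parseEnv` and `missingEnv` are hypothetical helpers, and the parser only handles simple `KEY=value` lines (no quoting or multi-line values).

```ts
// Parse simple KEY=value lines; comments and blank lines are ignored.
function parseEnv(text: string): Map<string, string> {
  const vars = new Map<string, string>();
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key before "=", or no "=" at all
    vars.set(line.slice(0, eq).trim(), line.slice(eq + 1).trim());
  }
  return vars;
}

// Return the required variable names that are absent or empty.
function missingEnv(required: string[], envText: string): string[] {
  const vars = parseEnv(envText);
  return required.filter((name) => !vars.get(name));
}
```

A caller would read `~/.super-multica/skills/<skill-id>/.env` and pass its text together with the `requires.env` list from the skill's frontmatter.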
|
||||
|
||||
## LLM Providers
|
||||
|
||||
**OAuth Providers** (external CLI login):
|
||||
- `claude-code` — requires `claude login`
|
||||
- `openai-codex` — requires `codex login`
|
||||
|
||||
**API Key Providers** (configure in `credentials.json5`):
|
||||
- `anthropic`, `openai`, `kimi-coding`, `google`, `groq`, `mistral`, `xai`, `openrouter`
|
||||
|
||||
Check status: `/provider` in interactive mode
|
||||
|
|
@ -1,82 +0,0 @@
|
|||
# Development Guide
|
||||
|
||||
## Dev Commands
|
||||
|
||||
```bash
|
||||
pnpm dev # Desktop app (recommended)
|
||||
pnpm dev:desktop # Desktop only, without package watching
|
||||
pnpm dev:gateway # Gateway only
|
||||
pnpm dev:web # Web app only
|
||||
pnpm dev:all # Gateway + Web
|
||||
|
||||
pnpm build # Production build (turbo-orchestrated)
|
||||
pnpm typecheck # Type check all packages
|
||||
pnpm test # Run tests
|
||||
pnpm test:watch # Watch mode
|
||||
pnpm test:coverage # With v8 coverage
|
||||
```
|
||||
|
||||
## Local Full-Stack Development
|
||||
|
||||
`pnpm dev:local` starts Gateway + Desktop + Web together with isolated data directories.
|
||||
|
||||
**Setup:**
|
||||
|
||||
1. Copy `.env.example` to `.env` at the repo root
|
||||
2. Fill in `TELEGRAM_BOT_TOKEN` (get from [@BotFather](https://t.me/BotFather))
|
||||
3. Run `pnpm dev:local`
|
||||
|
||||
| Service | Address | Notes |
|---------|---------|-------|
| Gateway | `http://localhost:4000` | Telegram long-polling mode |
| Web | `http://localhost:3000` | OAuth login flow |
| Desktop | n/a | Connects to local Gateway + Web |
|
||||
|
||||
Data is stored in `~/.super-multica-dev` and `~/Documents/Multica-dev`, isolated from production.
|
||||
|
||||
```bash
|
||||
pnpm dev:local:archive # Archive dev data and start fresh
|
||||
```
|
||||
|
||||
## Environment Configuration
|
||||
|
||||
**Desktop** (`apps/desktop/.env.*`):
|
||||
|
||||
| Variable | Description |
|----------|-------------|
| `MAIN_VITE_GATEWAY_URL` | WebSocket Gateway URL for remote device pairing |
| `MAIN_VITE_WEB_URL` | Web app URL for OAuth login redirect |
|
||||
|
||||
**Web** (`apps/web/next.config.ts`):
|
||||
|
||||
| Variable | Description |
|----------|-------------|
| `MULTICA_API_URL` | Backend API URL (required, no default) |
|
||||
|
||||
**Build for different environments:**
|
||||
|
||||
```bash
|
||||
# Desktop
|
||||
pnpm --filter @multica/desktop build # Production (.env.production)
|
||||
pnpm --filter @multica/desktop build:staging # Staging (.env.staging)
|
||||
|
||||
# Web (Vercel)
|
||||
# Set MULTICA_API_URL in Vercel Dashboard → Settings → Environment Variables
|
||||
```
|
||||
|
||||
See `apps/desktop/.env.example` for the full variable reference.
|
||||
|
||||
## Monorepo Workflow
|
||||
|
||||
| Command | Purpose |
|---------|---------|
| `pnpm dev` | Full dev mode: watches the `core`, `types`, and `utils` packages |
| `pnpm dev:desktop` | Desktop only: skips package watching |
|
||||
|
||||
**When modifying packages:**
|
||||
|
||||
1. Edit code in `packages/core`, `packages/types`, or `packages/utils`
|
||||
2. Terminal shows `[core] ESM ⚡️ Build success` (~100ms)
|
||||
3. Restart Desktop to apply changes (Ctrl+C, then `pnpm dev`)
|
||||
|
||||
> **Why restart?** Electron main process does not support hot reload — this is an Electron limitation, not ours.
|
||||
|
|
@ -1,235 +0,0 @@
|
|||
# Exec Approval Protocol
|
||||
|
||||
Human-in-the-loop command execution approval for the `exec` tool. When an agent attempts to run a shell command that doesn't pass safety checks, the Hub requests approval from the connected client before proceeding.
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
```
|
||||
Agent (exec tool) Hub Gateway Client (UI)
|
||||
| | | |
|
||||
|-- onApprovalNeeded -->| | |
|
||||
| |-- evaluateCommandSafety() |
|
||||
| |-- requiresApproval()? |
|
||||
| | | |
|
||||
| |== exec-approval-request =============> |
|
||||
| | | |-- show UI
|
||||
| | | |-- user decides
|
||||
| | <== resolveExecApproval RPC ==========|
|
||||
| | | |
|
||||
| <-- approved/denied -| | |
|
||||
| | | |
|
||||
```
|
||||
|
||||
1. The **Agent** calls the `exec` tool with a shell command.
|
||||
2. The `exec` tool invokes the `onApprovalNeeded` callback (injected by the Hub).
|
||||
3. The **Hub** evaluates the command through a 4-layer safety engine.
|
||||
4. If approval is needed, the Hub sends an `exec-approval-request` message to the Client via the Gateway.
|
||||
5. The **Client** displays the approval UI and the user makes a decision.
|
||||
6. The Client calls the `resolveExecApproval` RPC with the decision.
|
||||
7. The Hub resolves the pending promise and the command is either executed or denied.
|
||||
|
||||
## Safety Evaluation
|
||||
|
||||
Before requesting approval, the Hub evaluates the command through 4 layers:
|
||||
|
||||
| Layer | Description | Example |
|-------|-------------|---------|
| **Allowlist** | Glob patterns of pre-approved commands | `git **`, `pnpm **` |
| **Shell syntax** | Detects dangerous shell constructs | `\|&`, `` ` ` ``, `$()`, `;` |
| **Safe binaries** | ~40 known-safe commands (no file-path args) | `ls`, `cat`, `git status` |
| **Dangerous patterns** | 25+ regex patterns for risky commands | `rm -rf`, `sudo`, `curl \| sh` |
|
||||
|
||||
The result is a risk level: `"safe"`, `"needs-review"`, or `"dangerous"`.
|
||||
|
||||
### Configuration
|
||||
|
||||
Stored in profile config (`~/.super-multica/agent-profiles/{profileId}/config.json`):
|
||||
|
||||
```json
|
||||
{
|
||||
"execApproval": {
|
||||
"security": "allowlist",
|
||||
"ask": "on-miss",
|
||||
"timeoutMs": 60000,
|
||||
"askFallback": "deny",
|
||||
"allowlist": [
|
||||
{ "pattern": "git **" },
|
||||
{ "pattern": "pnpm **" }
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
| Field | Values | Default | Description |
|-------|--------|---------|-------------|
| `security` | `"deny"` \| `"allowlist"` \| `"full"` | `"allowlist"` | `deny` blocks all exec, `full` allows all, `allowlist` requires a matching allowlist entry |
| `ask` | `"off"` \| `"on-miss"` \| `"always"` | `"on-miss"` | `off` never asks, `on-miss` asks when the allowlist misses, `always` always asks |
| `timeoutMs` | number (ms) | `60000` | How long to wait for a client decision before `askFallback` applies |
| `askFallback` | `"deny"` \| `"allowlist"` \| `"full"` | `"deny"` | How the request is resolved on timeout |
| `allowlist` | array of entries | `[]` | Pre-approved command patterns |
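
One plausible reading of how `security` and `ask` combine is sketched below. This is an interpretation of the table, not the Hub's actual logic: the `decide` function is hypothetical, it ignores the evaluated risk level entirely, and the precedence between `security: "full"` and `ask: "always"` is an assumption.

```ts
type Security = "deny" | "allowlist" | "full";
type Ask = "off" | "on-miss" | "always";
type Outcome = "allow" | "deny" | "ask";

// Hypothetical combination of the two settings; allowlistHit says whether
// the command matched one of the configured allowlist patterns.
function decide(security: Security, ask: Ask, allowlistHit: boolean): Outcome {
  if (security === "deny") return "deny"; // exec is blocked outright
  if (ask === "always") return "ask"; // assumption: "always" wins over "full"
  if (security === "full") return "allow";
  // security === "allowlist" from here on
  if (allowlistHit) return "allow";
  return ask === "on-miss" ? "ask" : "deny";
}
```

With the defaults (`"allowlist"` and `"on-miss"`), a command that misses the allowlist yields `"ask"`, which corresponds to the approval request described in the next section.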
|
||||
|
||||
## WebSocket Protocol
|
||||
|
||||
### Step 1: Approval Request (Hub → Client)
|
||||
|
||||
When a command requires approval, the Hub sends a push message with action `exec-approval-request`:
|
||||
|
||||
```json
|
||||
{
|
||||
"id": "019444a0-0000-7000-8000-000000000001",
|
||||
"from": "<hubDeviceId>",
|
||||
"to": "<clientDeviceId>",
|
||||
"action": "exec-approval-request",
|
||||
"payload": {
|
||||
"approvalId": "019444a0-1234-7abc-8000-abcdef123456",
|
||||
"agentId": "019444a0-5678-7def-8000-123456abcdef",
|
||||
"command": "rm -rf /tmp/test-data",
|
||||
"cwd": "/Users/alice/projects/my-app",
|
||||
"riskLevel": "dangerous",
|
||||
"riskReasons": [
|
||||
"Matches dangerous pattern: rm with -r or -f flags",
|
||||
"Uses recursive/force deletion flags"
|
||||
],
|
||||
"expiresAtMs": 1738700060000
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
#### Payload Fields
|
||||
|
||||
| Field | Type | Description |
|-------|------|-------------|
| `approvalId` | `string` | Unique ID for this approval request (UUIDv7). Must be included in the response. |
| `agentId` | `string` | Session ID of the agent that initiated the command. |
| `command` | `string` | The shell command to be executed. |
| `cwd` | `string?` | Working directory for the command. Optional. |
| `riskLevel` | `"safe" \| "needs-review" \| "dangerous"` | Evaluated risk level. |
| `riskReasons` | `string[]` | Human-readable reasons for the risk assessment. |
| `expiresAtMs` | `number` | Unix timestamp (ms) when this request expires. After this time, the Hub auto-resolves based on `askFallback`. |
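
Since `expiresAtMs` is an absolute timestamp, a client can derive a countdown for the approval dialog. The helpers below are a hypothetical sketch; they assume the client clock is roughly in sync with the Hub.

```ts
// Milliseconds left before the Hub auto-resolves; never negative.
function remainingMs(expiresAtMs: number, nowMs: number = Date.now()): number {
  return Math.max(0, expiresAtMs - nowMs);
}

// Round up so the label does not show "0s" while time is still left.
function formatCountdown(ms: number): string {
  return `${Math.ceil(ms / 1000)}s`;
}
```

Passing `nowMs` explicitly keeps the function deterministic in tests; in a UI it would be called on a timer with the default argument.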
|
||||
|
||||
### Step 2: User Decision (Client → Hub)
|
||||
|
||||
The client sends a standard RPC request with method `resolveExecApproval`:
|
||||
|
||||
```json
|
||||
{
|
||||
"id": "019444a0-0000-7000-8000-000000000002",
|
||||
"from": "<clientDeviceId>",
|
||||
"to": "<hubDeviceId>",
|
||||
"action": "request",
|
||||
"payload": {
|
||||
"requestId": "client-req-001",
|
||||
"method": "resolveExecApproval",
|
||||
"params": {
|
||||
"approvalId": "019444a0-1234-7abc-8000-abcdef123456",
|
||||
"decision": "allow-once"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
#### Decision Values
|
||||
|
||||
| Decision | Effect |
|----------|--------|
| `"allow-once"` | Allow this command to execute. No persistent change. |
| `"allow-always"` | Allow and add the command's binary to the profile allowlist (e.g., `rm **`). Future commands from the same binary will auto-approve. |
| `"deny"` | Block the command. The agent receives a denial message. |
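
To make the `allow-always` example concrete, the sketch below derives a pattern such as `rm **` from a command string. It is a simplification under stated assumptions: `allowAlwaysPattern` is a hypothetical name, the binary is taken to be the first whitespace-separated token, and the Hub's real derivation may treat paths, environment prefixes, or quoting differently.

```ts
// Derive "<binary> **" from a command line, or null for an empty command.
function allowAlwaysPattern(command: string): string | null {
  const binary = command.trim().split(/\s+/)[0];
  return binary ? `${binary} **` : null;
}
```

For the request shown in Step 1, this yields `rm **`, which is why `allow-always` deserves a clear warning in the approval UI: it pre-approves every later invocation of that binary.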
|
||||
|
||||
### Step 3: RPC Response (Hub → Client)
|
||||
|
||||
**Success** — the approval was found and resolved:
|
||||
|
||||
```json
|
||||
{
|
||||
"id": "019444a0-0000-7000-8000-000000000003",
|
||||
"from": "<hubDeviceId>",
|
||||
"to": "<clientDeviceId>",
|
||||
"action": "response",
|
||||
"payload": {
|
||||
"requestId": "client-req-001",
|
||||
"ok": true,
|
||||
"payload": {
|
||||
"ok": true
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Error** — the approval was not found (already resolved or expired):
|
||||
|
||||
```json
|
||||
{
|
||||
"id": "019444a0-0000-7000-8000-000000000004",
|
||||
"from": "<hubDeviceId>",
|
||||
"to": "<clientDeviceId>",
|
||||
"action": "response",
|
||||
"payload": {
|
||||
"requestId": "client-req-001",
|
||||
"ok": false,
|
||||
"error": {
|
||||
"code": "NOT_FOUND",
|
||||
"message": "Approval request not found or already resolved"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Timeout Behavior
|
||||
|
||||
If the client does not respond within `timeoutMs` (default: 60 seconds), the Hub resolves the approval automatically based on the `askFallback` configuration:
|
||||
|
||||
| `askFallback` | Behavior on timeout |
|---------------|---------------------|
| `"deny"` (default) | Command is denied (fail-closed). |
| `"full"` | Command is allowed. |
| `"allowlist"` | Command is allowed only if it matched the allowlist; otherwise denied. |
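
The table maps directly onto a small function. This is a sketch only: `resolveOnTimeout` is a hypothetical name, and `matchedAllowlist` stands for whatever the Hub recorded when it evaluated the command.

```ts
type AskFallback = "deny" | "allowlist" | "full";

// Decide the outcome once timeoutMs has elapsed without a client response.
function resolveOnTimeout(fallback: AskFallback, matchedAllowlist: boolean): "allow" | "deny" {
  switch (fallback) {
    case "full":
      return "allow";
    case "allowlist":
      return matchedAllowlist ? "allow" : "deny";
    case "deny":
      return "deny";
  }
}
```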
|
||||
|
||||
## SDK Types
|
||||
|
||||
All protocol types are exported from `@multica/sdk`:
|
||||
|
||||
```ts
|
||||
import {
|
||||
ExecApprovalRequestAction, // "exec-approval-request"
|
||||
type ApprovalDecision, // "allow-once" | "allow-always" | "deny"
|
||||
type ExecApprovalRequestPayload,
|
||||
type ResolveExecApprovalParams,
|
||||
type ResolveExecApprovalResult,
|
||||
} from "@multica/sdk";
|
||||
```
|
||||
|
||||
## Client Implementation Guide
|
||||
|
||||
A minimal client handling exec approvals:
|
||||
|
||||
```ts
import { GatewayClient, ExecApprovalRequestAction } from "@multica/sdk";
import type { ExecApprovalRequestPayload, ApprovalDecision } from "@multica/sdk";

// Assumed to exist elsewhere in the app: a connected client, the Hub's
// device ID, and a function that renders the approval dialog.
declare const client: GatewayClient;
declare const hubDeviceId: string;
declare function showApprovalUI(payload: ExecApprovalRequestPayload): void;

// Listen for approval requests pushed by the Hub
client.onMessage((msg) => {
  if (msg.action === ExecApprovalRequestAction) {
    const payload = msg.payload as ExecApprovalRequestPayload;
    showApprovalUI(payload);
  }
});

// When the user makes a decision, send it back to the Hub
async function respondToApproval(approvalId: string, decision: ApprovalDecision) {
  const result = await client.request(hubDeviceId, "resolveExecApproval", {
    approvalId,
    decision,
  });
  // result.ok === true if resolved successfully
  return result;
}
```
|
||||
|
||||
## Error Handling
|
||||
|
||||
The system is designed to be **fail-closed**:
|
||||
|
||||
- If sending the approval request to the client fails → command is denied.
|
||||
- If the client disconnects before responding → timeout fires, command follows `askFallback` (default: deny).
|
||||
- If the RPC response references an unknown `approvalId` → `NOT_FOUND` error returned, no side effects.
|
||||
- If the agent is closed while an approval is pending → all pending approvals for that agent are auto-denied.
|
||||
265
docs/memo.md
|
|
@ -1,265 +0,0 @@
|
|||
# Multica Memo
|
||||
|
||||
**Multiplexed Information & Computing Agent**
|
||||
|
||||
---
|
||||
|
||||
## What is Multica
|
||||
|
||||
Multica is an always-on AI agent that pulls real data, runs real computation, and takes real action on behalf of users.
|
||||
|
||||
It is not a chatbot. It is not a search engine. It is not an analytics dashboard. It is an **autonomous employee** that works 24/7 — monitoring, analyzing, and acting within user-defined authorization boundaries.
|
||||
|
||||
Users interact with Multica through natural conversation. They can ask for immediate analysis, or tell the agent to run recurring tasks in the background. The same interface handles both modes — no separate workflow builder, no configuration forms. You talk to it like you'd talk to a team member.
|
||||
|
||||
---
|
||||
|
||||
## Core Insight
|
||||
|
||||
The value chain of knowledge work is: **Data → Analysis → Decision → Action**.
|
||||
|
||||
Existing AI products truncate this chain. ChatGPT and Claude stop at conversation. Perplexity stops at search. BI dashboards stop at visualization. Each one hands the remaining work back to the human.
|
||||
|
||||
Multica completes the full chain:
|
||||
|
||||
- **Data**: Pulls structured data from multiple sources through a unified `data` tool, backed by Multica's centralized data infrastructure. Users never configure API keys or deal with data providers.
|
||||
- **Analysis**: Runs actual computation — Python, statistical models, charts — not just text summaries. The agent writes and executes code to derive quantitative insights.
|
||||
- **Decision**: Applies domain-specific analytical frameworks encoded as Skills to evaluate the data and form actionable conclusions.
|
||||
- **Action**: Executes real-world actions (trade, send email, update records) within a tiered authorization model that the user controls.
|
||||
|
||||
---
|
||||
|
||||
## Architecture
|
||||
|
||||
### One Tool, Infinite Domains
|
||||
|
||||
Multica's extensibility model is designed for horizontal scaling across verticals without agent-side complexity growth.
|
||||
|
||||
```
|
||||
Finance Legal Medical ...
|
||||
┌──────────┐ ┌──────────┐ ┌──────────┐
|
||||
Skills │ Earnings │ │ Case │ │ Literature│
|
||||
(Markdown) │ Screening │ │ Contract │ │ Drug │
|
||||
│ Macro │ │ Compliance│ │ Clinical │
|
||||
└─────┬─────┘ └─────┬─────┘ └─────┬────┘
|
||||
│ │ │
|
||||
┌─────┴───────────────┴────────────────┴────┐
|
||||
Tool │ data(query, domain) │
|
||||
(single) └─────────────────┬──────────────────────────┘
|
||||
│
|
||||
┌─────────────────┴──────────────────────────┐
|
||||
Backend │ Multica Data Service │
|
||||
│ routing / caching / normalization │
|
||||
├─────────┬───────────┬───────────┬──────────┤
|
||||
│ Polygon │ FRED │ PubMed │ Court- │
|
||||
│ SEC │ NewsAPI │ OpenFDA │ listener │
|
||||
└─────────┴───────────┴───────────┴──────────┘
|
||||
```
|
||||
|
||||
**One `data` tool** serves all verticals. Adding a new domain means adding backend source adapters and writing Skill markdown files. The agent engine, tool set, and product surface remain unchanged.
|
||||
|
||||
**Skills encode domain expertise, not data plumbing.** A Skill is a Markdown file that teaches the agent an analytical workflow: what data to request, how to process it, what to look for, how to present findings. Domain experts can author Skills without writing code.
|
||||
|
||||
**Multica proxies all data access.** Users never register for third-party data APIs. Multica's backend handles authentication, rate limiting, caching, and normalization. This simplifies the user experience and creates a natural monetization layer.
|
||||
|
||||
### Foreground + Background, One Interface
|
||||
|
||||
```
|
||||
User in conversation:
|
||||
|
||||
"Analyze TSLA" → Immediate execution
|
||||
"Send me a market briefing every morning" → Agent schedules cron task
|
||||
"Alert me if NVDA drops below 100" → Agent sets event trigger
|
||||
"Cancel the morning briefing" → Agent removes cron task
|
||||
```
|
||||
|
||||
The agent manages its own background tasks through existing tools (`cron`, `exec`). There is no separate workflow configuration UI. Conversation is the control plane.
|
||||
|
||||
Background tasks run persistently, independent of the app being open. Results are delivered through the user's preferred channel (email, Slack, Telegram, push notification, or in-app).
|
||||
|
||||
### Tiered Action Authorization
|
||||
|
||||
The agent's ability to take action is governed by a user-controlled trust gradient:
|
||||
|
||||
| Level | Behavior | Example |
|-------|----------|---------|
| 0: Read-only | Pull data, analyze, report | Generate earnings analysis |
| 1: Notify | Detect signal, alert user | "TSLA broke your stop-loss level" |
| 2: Confirm | Propose action, wait for approval | "Sell 50% TSLA position? [Confirm]" |
| 3: Autonomous | Execute within preset rules, notify after | Auto-rebalance portfolio within mandate |
|
||||
|
||||
Each action type can be independently configured. Users start conservative and escalate trust as they build confidence in the agent. Authorization constraints include per-action limits, daily caps, and scope restrictions.
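
A rough sketch of how such a trust gradient might be checked is shown below. Everything in it is hypothetical: the `ActionPolicy` fields, the `authorize` function, and in particular the rule that a Level 3 action exceeding its limits falls back to confirmation are illustrative assumptions, not a described implementation.

```ts
type Level = 0 | 1 | 2 | 3;
type Verdict = "report" | "notify" | "confirm" | "execute";

interface ActionPolicy {
  level: Level;
  perActionLimit: number; // hypothetical: maximum size of a single action
  dailyCap: number; // hypothetical: maximum total across one day
}

// Map a policy and a proposed action to what the agent may do.
function authorize(policy: ActionPolicy, amount: number, usedToday: number): Verdict {
  switch (policy.level) {
    case 0:
      return "report";
    case 1:
      return "notify";
    case 2:
      return "confirm";
    case 3: {
      const withinLimits =
        amount <= policy.perActionLimit && usedToday + amount <= policy.dailyCap;
      // Assumption: outside the preset rules, fall back to asking the user.
      return withinLimits ? "execute" : "confirm";
    }
  }
}
```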
|
||||
|
||||
---
|
||||
|
||||
## Product
|
||||
|
||||
### Form Factor
|
||||
|
||||
**Web-first** for distribution, with desktop and mobile for persistent background operation.
|
||||
|
||||
The primary interface is conversational — but output is structured. When the agent produces an analysis, it renders as a formatted report with charts, tables, and data citations, not a chat bubble. Reports are exportable (PDF, Excel).
|
||||
|
||||
The secondary interface is **the user's inbox**. Background tasks deliver results via email or messaging. Many users will interact with Multica more through their email than through the app itself.
|
||||
|
||||
### User Experience
|
||||
|
||||
A new user's first 24 hours:
|
||||
|
||||
1. Sign up (web, 30 seconds)
|
||||
2. Tell the agent which stocks/sectors they follow
|
||||
3. Next morning: first market briefing arrives in their inbox
|
||||
4. Open the app, ask a follow-up question about something in the briefing
|
||||
5. Tell the agent "do this every morning"
|
||||
|
||||
**Time to first value: < 24 hours, zero configuration, zero learning curve.**
|
||||
|
||||
### Cross-Domain Composition
|
||||
|
||||
The most powerful use cases combine multiple domains in a single workflow:
|
||||
|
||||
> "We're evaluating an acquisition of a gene-editing company. Give me a full due diligence report."
|
||||
>
|
||||
> Agent combines:
|
||||
> - `data(query, "finance")` → Target's financials, valuation comps
|
||||
> - `data(query, "legal")` → Patent portfolio, regulatory filings
|
||||
> - `data(query, "medical")` → Clinical pipeline, trial results
|
||||
> - `exec` → Python analysis, charts, risk scoring
|
||||
> - Output: Integrated due diligence report spanning finance + IP + science
|
||||
|
||||
One `data` tool, three domains, agent orchestrates autonomously.
|
||||
|
||||
---
|
||||
|
||||
## Go-to-Market
|
||||
|
||||
### First Vertical: Finance
|
||||
|
||||
Finance is the right starting point because:
|
||||
|
||||
- **Data accessibility**: Abundant free and commercial APIs (market data, filings, macro indicators)
|
||||
- **Willingness to pay**: Finance professionals value time; current tools (Bloomberg terminal: $24k/year) prove the market pays for information advantage
|
||||
- **Quantitative output**: The agent's ability to compute (not just chat) is most visible in finance — ratios, models, charts, backtests
|
||||
- **Recurring workflows**: Daily briefings, portfolio monitoring, earnings tracking — these drive retention naturally
|
||||
|
||||
### Target User
|
||||
|
||||
Individual investors, independent financial advisors, small fund analysts (< $50M AUM). They currently cobble together Yahoo Finance + SEC EDGAR + Excel + maybe Python scripts. A full company analysis takes them half a day.
|
||||
|
||||
Multica does it in 2 minutes.
|
||||
|
||||
### Distribution
|
||||
|
||||
| Channel | Approach |
|
||||
|---------|----------|
|
||||
| Twitter/X FinTwit | Real analysis examples as content — the output IS the demo |
|
||||
| YouTube | "AI analyst built my morning briefing in 2 minutes" |
|
||||
| Finance newsletters (Substack) | Weekly analysis pieces generated by Multica, attributed |
|
||||
| Reddit (r/investing, r/SecurityAnalysis) | High-quality analysis posts, organic |
|
||||
| Finance KOLs | Free Pro accounts, let them showcase their own output |
|
||||
|
||||
### Growth Loop
|
||||
|
||||
```
|
||||
Free daily briefing (user signs up, picks stocks)
|
||||
↓
|
||||
Briefing arrives next morning (immediate value)
|
||||
↓
|
||||
User shares briefing excerpt on social media
|
||||
↓
|
||||
Report footer: "Generated with Multica"
|
||||
↓
|
||||
New user sees it → signs up
|
||||
```
|
||||
|
||||
The output is inherently shareable. Every analysis report is a marketing asset.
|
||||
|
||||
### Pricing
|
||||
|
||||
| Tier | Price | Includes |
|
||||
|------|-------|---------|
|
||||
| Free | $0/mo | 5 analyses/month, 1 daily briefing, delayed data |
|
||||
| Pro | $29/mo | Unlimited analyses, custom briefings, real-time data, export, action (Level 0-2) |
|
||||
| Team | $79/user/mo | Shared workspace, collaborative Skills, API access |
|
||||
| Enterprise | Custom | Private deployment, custom data sources, autonomous actions (Level 3), SLA |
|
||||
|
||||
---
|
||||
|
||||
## Roadmap
|
||||
|
||||
### Phase 0→1: Finance MVP (8 weeks)
|
||||
|
||||
| Week | Deliverable |
|
||||
|------|-------------|
|
||||
| 1-2 | `data` tool backend + 2 sources (market data, macro) |
|
||||
| 3-4 | 3 finance Skills (company analysis, screening, macro briefing) |
|
||||
| 5-6 | Email channel (agent sends results, receives instructions) |
|
||||
| 7-8 | Web app (conversation + report rendering + task management) |
|
||||
|
||||
**Launch artifact**: "Sign up, pick 3 stocks, get your first AI briefing tomorrow morning."
|
||||
|
||||
### Phase 1→10: Deepen and Expand (months 3-12)
|
||||
|
||||
**Months 3-6 — Deepen finance:**
|
||||
- More data sources (SEC filings, alternative data, earnings call transcripts)
|
||||
- More Skills (DCF modeling, options analysis, sector comparison, portfolio review)
|
||||
- Portfolio binding (user connects brokerage, agent gives personalized analysis)
|
||||
- Event triggers (price alerts, earnings surprises, insider trading signals)
|
||||
- Action capability (Level 1-2: trade proposals with confirmation)
|
||||
|
||||
**Months 6-12 — Adjacent verticals:**
|
||||
- Finance + Legal (M&A due diligence, SEC compliance, patent analysis)
|
||||
- Finance + Macro (policy impact, central bank analysis, geopolitical risk)
|
||||
- Open Skill authoring (users create and share their own Skills)
|
||||
|
||||
### Phase 10→100: Platform (year 2+)
|
||||
|
||||
**Skill Ecosystem:**
|
||||
|
||||
```
|
||||
multica.ai/skills/
|
||||
├── @multica/ Official Skills (free)
|
||||
├── @analyst-pro/ Community contributor (free/paid)
|
||||
├── @hedgefund-x/ Enterprise private Skills
|
||||
└── @lawfirm-y/ Vertical-specific paid Skills
|
||||
```
|
||||
|
||||
- Anyone can publish a Skill (it's a Markdown file)
|
||||
- Enterprises deploy private Skills for their teams
|
||||
- Paid Skills: creator sets price, Multica takes platform fee
|
||||
|
||||
**Data Marketplace:**
|
||||
- Third-party data providers plug into Multica's backend
|
||||
- Premium data sources available to paying users
|
||||
- Multica becomes the distribution channel for data providers
|
||||
|
||||
**Multi-vertical expansion:**
|
||||
- Each new vertical = backend source adapters + domain Skills
|
||||
- Agent engine unchanged
|
||||
- Same authorization model, same product surface
|
||||
|
||||
---
|
||||
|
||||
## Defensibility
|
||||
|
||||
| Layer | Moat |
|
||||
|-------|------|
|
||||
| Data infrastructure | Aggregated, normalized, cached — hard to replicate per-source |
|
||||
| Skill ecosystem | Network effects: more Skills → more users → more Skill creators |
|
||||
| User data | Portfolio history, preference patterns, analysis history — switching cost |
|
||||
| Trust calibration | User's authorization levels and constraints are personalized over time |
|
||||
| Domain compounding | Cross-vertical composition (finance + legal + medical) is uniquely enabled by the unified `data` tool architecture |
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
Multica is an always-on AI agent that completes the full knowledge work chain: data → analysis → decision → action.
|
||||
|
||||
It starts in finance — where data is accessible, users pay, and quantitative output is the clearest differentiator — with a daily briefing that delivers value in < 24 hours.
|
||||
|
||||
It scales horizontally through a unified `data` tool + Skill architecture that adds new verticals without changing the agent engine.
|
||||
|
||||
It builds a platform moat through a Skill ecosystem where domain experts encode their workflows as shareable, composable Markdown files.
|
||||
|
||||
The product is not a tool you open. It's an employee that works while you sleep.
|
||||
|
|
@ -1,232 +0,0 @@
|
|||
# Message Paths — Desktop / Web / Channel
|
||||
|
||||
Three independent paths deliver messages to and from the Hub's agent.
|
||||
All three share the same `AsyncAgent` instance — they are just different I/O surfaces.
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
```
|
||||
Desktop (Electron IPC) Web (WebSocket via Gateway) Channel (Bot API, e.g. Telegram)
|
||||
│ │ │
|
||||
▼ ▼ ▼
|
||||
localChat:send IPC client.send → Gateway WS plugin.gateway (polling/webhook)
|
||||
│ │ │
|
||||
▼ ▼ ▼
|
||||
hub.ts / ipc/hub.ts hub.ts / onMessage manager.ts / routeIncoming
|
||||
clearLastRoute() clearLastRoute() set lastRoute
|
||||
│ │ │
|
||||
└────────────────► agent.write(text) ◄──────────────────────────────┘
|
||||
│
|
||||
▼
|
||||
AsyncAgent.run()
|
||||
│
|
||||
┌────────────┴────────────────┐
|
||||
▼ ▼
|
||||
agent.subscribe() agent.read()
|
||||
(multi-consumer) (single-consumer iterable)
|
||||
│ │
|
||||
┌────────┴────────┐ ▼
|
||||
▼ ▼ hub.ts / consumeAgent()
|
||||
Desktop IPC Channel Manager │
|
||||
(ipc/hub.ts) (manager.ts) ▼
|
||||
│ │ Gateway WS → Web client
|
||||
▼ ▼
|
||||
localChat:event Bot API reply
|
||||
→ renderer (via lastRoute)
|
||||
```
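
The difference between the multi-consumer `agent.subscribe()` surface and the single-consumer `agent.read()` surface in the diagram above can be sketched roughly as follows. This is a simplified, synchronous illustration only: `MiniAgent`, `emit`, and `take` are hypothetical names, and the real `AsyncAgent` is asynchronous and may behave differently.

```typescript
// Hypothetical, simplified model of the two output surfaces (not the real AsyncAgent).
type AgentEvent = { type: string; text?: string };
type Listener = (event: AgentEvent) => void;

class MiniAgent {
  private listeners = new Set<Listener>();
  private queue: AgentEvent[] = [];

  // Multi-consumer: every subscriber sees every event.
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Single-consumer: each queued event is handed out exactly once.
  take(): AgentEvent | undefined {
    return this.queue.shift();
  }

  // Called by the (assumed) agent runtime whenever an event is produced.
  emit(event: AgentEvent): void {
    this.queue.push(event);
    this.listeners.forEach((listener) => listener(event));
  }
}

const agent = new MiniAgent();
const desktopSeen: string[] = [];
const channelSeen: string[] = [];
agent.subscribe((e) => desktopSeen.push(e.type)); // stands in for Desktop IPC
agent.subscribe((e) => channelSeen.push(e.type)); // stands in for Channel Manager
agent.emit({ type: "message_start" });
agent.emit({ type: "message_end" });
// A single reader (standing in for consumeAgent) drains the queue.
const webSeen = [agent.take(), agent.take()].map((e) => e?.type);
```

Both subscribers observe both events, while the queue behind `take()` is consumed once, which is why only one reader should sit on the `read()` side.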
|
||||
|
||||
---
|
||||
|
||||
## Path 1: Desktop (Electron IPC)
|
||||
|
||||
### Send (User → Agent)
|
||||
|
||||
```
|
||||
Renderer: sendMessage(text)
|
||||
→ IPC: localChat:send
|
||||
→ ipc/hub.ts handler
|
||||
→ hub.channelManager.clearLastRoute() // reply stays in desktop
|
||||
→ agent.write(text)
|
||||
```
|
||||
|
||||
**File**: `apps/desktop/electron/ipc/hub.ts` — `localChat:send` handler (line ~373)
|
||||
|
||||
### Receive (Agent → User)
|
||||
|
||||
```
|
||||
Agent runs LLM
|
||||
→ pi-agent-core fires AgentEvent
|
||||
→ Agent.subscribeAll() → AsyncAgent channel + subscribers
|
||||
→ agent.subscribe() callback in ipc/hub.ts
|
||||
→ Filter: assistant messages + tool_execution + passthrough (compaction, agent_error)
|
||||
→ IPC: mainWindow.webContents.send('localChat:event', { agentId, streamId, event })
|
||||
→ Renderer: use-local-chat.ts onEvent callback
|
||||
→ chat.handleStream(payload)
|
||||
```
|
||||
|
||||
**Files**:
|
||||
- `apps/desktop/electron/ipc/hub.ts` — `localChat:subscribe` handler (line ~248)
|
||||
- `apps/desktop/src/hooks/use-local-chat.ts` — `onEvent` listener (line ~54)
|
||||
- `packages/hooks/src/use-chat.ts` — `handleStream()` (line ~133)
|
||||
|
||||
### Error Handling
|
||||
|
||||
```
|
||||
Agent.run() throws / returns error
|
||||
→ AsyncAgent.write() catch block
|
||||
→ channel.send(legacy Message) // for read() consumers (Web)
|
||||
→ agent.emitMulticaEvent({ type: "agent_error", error }) // for subscribe() consumers
|
||||
→ ipc/hub.ts subscriber → passthrough event → localChat:event
|
||||
→ use-local-chat.ts → chat.setError() + setIsLoading(false)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Path 2: Web (WebSocket via Gateway)
|
||||
|
||||
### Send (User → Agent)
|
||||
|
||||
```
|
||||
Web app: sendMessage(text)
|
||||
→ GatewayClient.send(hubId, "message", { agentId, content })
|
||||
→ Socket.io → Gateway server → routes to Hub device
|
||||
→ hub.ts / onMessage handler
|
||||
→ channelManager.clearLastRoute() // reply stays in gateway
|
||||
→ agentSenders.set(agentId, deviceId)
|
||||
→ agent.write(content)
|
||||
```
|
||||
|
||||
**File**: `src/hub/hub.ts` — `onMessage` handler (line ~154)
|
||||
|
||||
### Receive (Agent → User)
|
||||
|
||||
```
|
||||
Agent runs LLM
|
||||
→ pi-agent-core fires AgentEvent
|
||||
→ Agent.subscribeAll() → AsyncAgent channel + subscribers
|
||||
→ agent.read() consumed by hub.ts / consumeAgent()
|
||||
→ Filter: assistant messages + tool_execution + passthrough (compaction, agent_error)
|
||||
→ client.send(targetDeviceId, StreamAction, { streamId, agentId, event })
|
||||
→ Socket.io → Gateway → routes to Web client device
|
||||
→ GatewayClient.onMessage callback
|
||||
→ use-gateway-chat.ts → chat.handleStream(payload)
|
||||
```
|
||||
|
||||
**Files**:
|
||||
- `src/hub/hub.ts` — `consumeAgent()` (line ~314)
|
||||
- `packages/hooks/src/use-gateway-chat.ts` — `onMessage` listener (line ~50)
|
||||
- `packages/hooks/src/use-chat.ts` — `handleStream()` (line ~133)
|
||||
|
||||
### Error Handling
|
||||
|
||||
```
|
||||
Agent.run() throws / returns error
|
||||
→ AsyncAgent.write() catch block
|
||||
→ channel.send(legacy Message) // consumed by consumeAgent() → sent as "message" action
|
||||
→ agent.emitMulticaEvent({ type: "agent_error", error })
|
||||
→ read() → consumeAgent() → passthrough event → StreamAction
|
||||
→ GatewayClient → use-gateway-chat.ts → chat.setError() + setIsLoading(false)
|
||||
```
|
||||
|
||||
**Note**: Legacy error Messages also reach the Web client as `"message"` action (a plain text fallback). The `agent_error` event provides structured error info for proper UI rendering.
|
||||
|
||||
---
|
||||
|
||||
## Path 3: Channel (Bot API, e.g. Telegram)
|
||||
|
||||
### Send (User → Agent)
|
||||
|
||||
```
|
||||
User sends message in Telegram
|
||||
  → grammY long-polling receives Update
|
||||
→ plugin.gateway.start() callback: onMessage(channelMessage)
|
||||
→ ChannelManager.routeIncoming()
|
||||
→ Set lastRoute = { plugin, deliveryCtx } // reply goes back to Telegram
|
||||
→ agent.write(text) // same as desktop/web
|
||||
```
|
||||
|
||||
**File**: `src/channels/manager.ts` — `routeIncoming()` (line ~233)
|
||||
|
||||
### Receive (Agent → User)
|
||||
|
||||
```
|
||||
Agent runs LLM
|
||||
→ pi-agent-core fires AgentEvent
|
||||
→ Agent.subscribeAll() → AsyncAgent channel + subscribers
|
||||
→ agent.subscribe() callback in ChannelManager.subscribeToAgent()
|
||||
→ Check: if (!lastRoute) return // no active channel route, skip
|
||||
→ Filter: only assistant messages
|
||||
→ message_start → createAggregator() // MessageAggregator buffers/chunks text
|
||||
→ message_update → aggregator.handleEvent()
|
||||
→ message_end → aggregator.handleEvent() → null aggregator
|
||||
→ Aggregator emits text blocks
|
||||
→ Block 0: plugin.outbound.replyText(deliveryCtx, text) // Telegram reply
|
||||
→ Block N: plugin.outbound.sendText(deliveryCtx, text) // follow-up messages
|
||||
```
|
||||
|
||||
**Files**:
|
||||
- `src/channels/manager.ts` — `subscribeToAgent()` (line ~151), `createAggregator()` (line ~205)
|
||||
- `src/hub/message-aggregator.ts` — text chunking/buffering logic
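
The "Block 0 is a reply, Block N is a follow-up" delivery described above can be illustrated with a small sketch. It assumes a finished message that is split by a size limit; the actual `MessageAggregator` works on streamed events and its chunking rules are not shown in this document. `chunkText`, `deliver`, the `Outbound` interface shape, and the string `ctx` are all hypothetical.

```typescript
// Hypothetical sketch: split a message into blocks, send block 0 as a reply
// and later blocks as follow-up messages.
interface Outbound {
  replyText(ctx: string, text: string): void;
  sendText(ctx: string, text: string): void;
}

// Split on a size limit, preferring to break at the last newline within range.
function chunkText(text: string, maxLen: number): string[] {
  if (maxLen <= 0) throw new Error("maxLen must be positive");
  const blocks: string[] = [];
  let rest = text;
  while (rest.length > maxLen) {
    const cut = rest.lastIndexOf("\n", maxLen);
    const end = cut > 0 ? cut : maxLen;
    blocks.push(rest.slice(0, end));
    rest = rest.slice(end).replace(/^\n/, ""); // drop the newline we broke on
  }
  if (rest.length > 0) blocks.push(rest);
  return blocks;
}

function deliver(outbound: Outbound, ctx: string, text: string, maxLen: number): void {
  chunkText(text, maxLen).forEach((block, index) => {
    if (index === 0) outbound.replyText(ctx, block);
    else outbound.sendText(ctx, block);
  });
}

// Usage with a fake outbound adapter that records calls.
const sent: string[] = [];
const fakeOutbound: Outbound = {
  replyText: (_ctx, text) => { sent.push(`reply:${text}`); },
  sendText: (_ctx, text) => { sent.push(`send:${text}`); },
};
deliver(fakeOutbound, "chat-1", "first line\nsecond line", 12);
```

Keeping the reply/follow-up decision in one place means a plugin only has to implement the two outbound calls.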
|
||||
|
||||
### Error Handling
|
||||
|
||||
```
|
||||
Agent.run() throws / returns error
|
||||
→ AsyncAgent.write() catch block
|
||||
→ agent.emitMulticaEvent({ type: "agent_error", error })
|
||||
→ subscribe() → ChannelManager subscriber
|
||||
→ if lastRoute exists:
|
||||
          → plugin.outbound.sendText(deliveryCtx, `[Error] ${errorMsg}`)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Comparison Table
|
||||
|
||||
| Aspect | Desktop (IPC) | Web (WebSocket) | Channel (Bot API) |
|
||||
|---------------------|------------------------|---------------------------|--------------------------|
|
||||
| **Transport** | Electron IPC | Socket.io via Gateway | Bot API (HTTP) |
|
||||
| **Send entry** | `localChat:send` | `client.send` → Gateway | `routeIncoming` |
|
||||
| **Receive method** | `agent.subscribe()` | `agent.read()` (iterable) | `agent.subscribe()` |
|
||||
| **Consumer** | ipc/hub.ts subscriber | hub.ts `consumeAgent()` | manager.ts subscriber |
|
||||
| **Frontend hook** | `use-local-chat.ts` | `use-gateway-chat.ts` | N/A (Bot API) |
|
||||
| **State hook** | `use-chat.ts` | `use-chat.ts` | N/A |
|
||||
| **Reply routing** | Always (IPC channel) | `agentSenders` Map | `lastRoute` pattern |
|
||||
| **clearLastRoute** | Yes (on send) | Yes (on send) | No (sets lastRoute) |
|
||||
| **Error display** | `agent_error` → UI | `agent_error` → UI | `agent_error` → Bot text |
|
||||
| **Tool results** | Rendered in UI | Rendered in UI | Skipped (text only) |
|
||||
| **Text chunking** | No (full stream) | No (full stream) | Yes (MessageAggregator) |
|
||||
|
||||
---
|
||||
|
||||
## lastRoute Pattern
|
||||
|
||||
`lastRoute` tracks which channel most recently sent a message. When the agent replies:
|
||||
- If `lastRoute` is set → reply goes to that channel (e.g. Telegram)
|
||||
- If `lastRoute` is null → reply goes to Desktop/Web only (via their own mechanisms)
|
||||
|
||||
**Clearing**: Desktop and Web both call `channelManager.clearLastRoute()` before `agent.write()`, so channel replies stop when the user switches to desktop/web.
|
||||
|
||||
**Setting**: `routeIncoming()` sets `lastRoute` when a channel message arrives.
|
||||
|
||||
Desktop and Web always receive agent events regardless of `lastRoute` — they use their own independent delivery mechanisms (IPC subscribe / Gateway read).
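
The set/clear behaviour can be summarised in a minimal sketch. `RouteTracker`, `replyTarget`, and the `Route` shape are hypothetical and exist only to illustrate the pattern; the real `ChannelManager` stores a plugin and delivery context.

```typescript
// Hypothetical sketch of the lastRoute pattern; names are illustrative only.
type Route = { plugin: string; chatId: string };

class RouteTracker {
  private lastRoute: Route | null = null;

  // Channel path: remember where the message came from.
  routeIncoming(route: Route): void {
    this.lastRoute = route;
  }

  // Desktop/Web path: stop forwarding replies to any channel.
  clearLastRoute(): void {
    this.lastRoute = null;
  }

  // Consulted for each assistant reply; null means "no channel delivery".
  replyTarget(): Route | null {
    return this.lastRoute;
  }
}

const tracker = new RouteTracker();
tracker.routeIncoming({ plugin: "telegram", chatId: "42" });
const afterChannelMessage = tracker.replyTarget();
tracker.clearLastRoute(); // the user typed in Desktop or Web
const afterDesktopMessage = tracker.replyTarget();
```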
|
||||
|
||||
---
|
||||
|
||||
## Event Filtering
|
||||
|
||||
All three paths filter raw agent events. Only these are forwarded to consumers:
|
||||
|
||||
| Event Type | Desktop | Web | Channel |
|
||||
|-------------------------|---------|-----|---------|
|
||||
| `message_start` | assistant only | assistant only | assistant only |
|
||||
| `message_update` | assistant only | assistant only | assistant only |
|
||||
| `message_end` | assistant only | assistant only | assistant only |
|
||||
| `tool_execution_start` | Yes | Yes | No |
|
||||
| `tool_execution_end` | Yes | Yes | No |
|
||||
| `compaction_start` | Yes (passthrough) | Yes (passthrough) | No |
|
||||
| `compaction_end` | Yes (passthrough) | Yes (passthrough) | No |
|
||||
| `agent_error` | Yes (passthrough) | Yes (passthrough) | Yes (→ text) |
|
||||
| User message events | Filtered out | Filtered out | Filtered out |
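
The table can be read as a single predicate per surface. The sketch below is one possible encoding, assuming each raw event carries a `type` and, for message events, a `role`; `shouldForward`, `Surface`, and `RawEvent` are hypothetical names and not taken from the codebase.

```typescript
// Hypothetical encoding of the filtering table above.
type Surface = "desktop" | "web" | "channel";
type RawEvent = { type: string; role?: "assistant" | "user" };

const MESSAGE_EVENTS = ["message_start", "message_update", "message_end"];
const TOOL_EVENTS = ["tool_execution_start", "tool_execution_end"];
const COMPACTION_EVENTS = ["compaction_start", "compaction_end"];

function shouldForward(surface: Surface, event: RawEvent): boolean {
  // Message events: assistant only, on every surface.
  if (MESSAGE_EVENTS.indexOf(event.type) !== -1) return event.role === "assistant";
  // agent_error reaches every surface (the channel renders it as text).
  if (event.type === "agent_error") return true;
  // Channels skip tool and compaction events.
  if (surface === "channel") return false;
  return (
    TOOL_EVENTS.indexOf(event.type) !== -1 ||
    COMPACTION_EVENTS.indexOf(event.type) !== -1
  );
}
```

For example, `shouldForward("channel", { type: "tool_execution_start" })` is `false`, while the same event is forwarded on `"desktop"` and `"web"`.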
|
||||
|
|
@ -1,497 +0,0 @@
|
|||
# Mobile Development Guide
|
||||
|
||||
Complete lifecycle guide for developing, testing, and publishing the Expo React Native app — from first line of code to App Store / Google Play.
|
||||
|
||||
## Overview
|
||||
|
||||
```
|
||||
Phase 1: Environment Setup You are here if starting fresh
|
||||
↓
|
||||
Phase 2: Development & Testing Daily work loop
|
||||
↓
|
||||
Phase 3: Pre-Release Preparation Before your first submission
|
||||
↓
|
||||
Phase 4: Build & Submit Ship to stores
|
||||
↓
|
||||
Phase 5: Post-Launch Maintain and update
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phase 1: Environment Setup
|
||||
|
||||
### 1.1 Required Software
|
||||
|
||||
| Tool | Purpose | Install |
|
||||
|------|---------|---------|
|
||||
| **Node.js** (LTS) | JS runtime | `brew install node` or [nodejs.org](https://nodejs.org) |
|
||||
| **pnpm** | Package manager | `corepack enable && corepack prepare pnpm@latest --activate` |
|
||||
| **Xcode** | iOS build toolchain | Mac App Store (free) |
|
||||
| **Xcode Command Line Tools** | Compilers, simulators | `xcode-select --install` |
|
||||
| **CocoaPods** | iOS dependency manager | `sudo gem install cocoapods` |
|
||||
| **Android Studio** | Android emulator + SDK (optional, iOS-first) | [developer.android.com](https://developer.android.com/studio) |
|
||||
| **EAS CLI** | Expo build & submit | `npm install -g eas-cli` |
|
||||
| **Expo CLI** | Dev server | Bundled with `npx expo` |
|
||||
|
||||
### 1.2 Xcode First-Time Setup
|
||||
|
||||
1. Open Xcode at least once to accept the license and install components
|
||||
2. **Add your Apple ID** (free account is enough for development):
|
||||
- Xcode → Settings → Accounts → `+` → Apple ID
|
||||
- This creates a "Personal Team" for free code signing
|
||||
3. Verify simulators are installed:
|
||||
- Xcode → Settings → Components → download an iOS Simulator runtime
|
||||
|
||||
### 1.3 iPhone First-Time Setup (for Real Device Testing)
|
||||
|
||||
1. **Enable Developer Mode** (required on iOS 16+):
|
||||
- Settings → Privacy & Security → Developer Mode → ON
|
||||
- Device will restart
|
||||
2. Connect iPhone to Mac via USB/USB-C cable
|
||||
3. When prompted "Trust This Computer?" → tap Trust
|
||||
|
||||
### 1.4 Project Setup
|
||||
|
||||
```bash
|
||||
# Install dependencies
|
||||
pnpm install
|
||||
|
||||
# Generate native project files (creates ios/ and android/ directories)
|
||||
npx expo prebuild
|
||||
|
||||
# Initialize EAS configuration (creates eas.json)
|
||||
eas build:configure
|
||||
```
|
||||
|
||||
### 1.5 Expo Account
|
||||
|
||||
```bash
|
||||
# Create account at expo.dev, then:
|
||||
eas login
|
||||
eas whoami # verify
|
||||
```
|
||||
|
||||
**No paid accounts needed at this stage.** Free Apple ID + free Expo account is enough for development.
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: Development & Testing
|
||||
|
||||
### 2.1 Running on iOS Simulator
|
||||
|
||||
```bash
|
||||
# Start the app in iOS simulator (no real device needed)
|
||||
npx expo run:ios
|
||||
```
|
||||
|
||||
- Fastest iteration loop — code changes hot-reload instantly
|
||||
- Good for: UI layout, navigation, business logic, API calls
|
||||
- **Cannot test**: camera, barcode scanner, real push notifications, biometrics
|
||||
|
||||
### 2.2 Running on Real iPhone
|
||||
|
||||
```bash
|
||||
# Connect iPhone via USB, then:
|
||||
npx expo run:ios --device
|
||||
```
|
||||
|
||||
Expo CLI will:
|
||||
1. Detect your connected device
|
||||
2. Sign the app with your Personal Team (free Apple ID)
|
||||
3. Build, install, and launch the app
|
||||
|
||||
**First time only**: After installation, go to:
|
||||
- Settings → General → VPN & Device Management → Trust your developer certificate
|
||||
|
||||
#### Free Signing Limitations
|
||||
|
||||
| Limitation | Detail |
|
||||
|-----------|--------|
|
||||
| 7-day expiry | App stops launching after 7 days — just re-run `npx expo run:ios --device` |
|
||||
| 3 devices max | Can register up to 3 test devices per Apple ID |
|
||||
| Some entitlements unavailable | Push notifications, Apple Pay, iCloud require paid account |
|
||||
| Cannot distribute to others | Only works on your own registered devices |
|
||||
|
||||
**Camera, barcode scanner, GPS, sensors all work fine with free signing.**
|
||||
|
||||
### 2.3 Daily Development Workflow
|
||||
|
||||
```
|
||||
First time (or after native config changes):
|
||||
npx expo prebuild Generate/update native projects
|
||||
npx expo run:ios --device Build and install on device
|
||||
|
||||
Every day after that:
|
||||
npx expo start --dev-client Start dev server only (no rebuild)
|
||||
→ Open the app on device It connects automatically
|
||||
→ Edit code, save Hot-reload updates instantly
|
||||
```
|
||||
|
||||
**When do you need to rebuild?**
|
||||
|
||||
| Change | Rebuild needed? |
|
||||
|--------|----------------|
|
||||
| JS/TS code, React components | No — hot-reload |
|
||||
| Styles, images, assets | No — hot-reload |
|
||||
| Added new Expo SDK module | **Yes** — `npx expo prebuild && npx expo run:ios --device` |
|
||||
| Changed `app.json` permissions | **Yes** — rebuild |
|
||||
| Updated native dependency | **Yes** — rebuild |
|
||||
| Upgraded Expo SDK version | **Yes** — rebuild |
|
||||
|
||||
### 2.4 Testing Native Features (Camera, Scanner)
|
||||
|
||||
| Feature | Simulator | Real Device |
|
||||
|---------|-----------|-------------|
|
||||
| Camera preview | Not available | Works |
|
||||
| Barcode / QR scan | Not available | Works |
|
||||
| GPS location | Simulated location via Xcode menu | Real GPS |
|
||||
| Push notifications | Not available | Requires paid Apple Developer account |
|
||||
| Haptic feedback | Not available | Works |
|
||||
| Device sensors (accelerometer, gyroscope) | Not available | Works |
|
||||
|
||||
For camera/scanner features, **always test on a real device**.
|
||||
|
||||
### 2.5 Debugging Tools
|
||||
|
||||
#### Developer Menu
|
||||
|
||||
Press `m` in the terminal (or shake the device) to open:
|
||||
- Toggle Performance Monitor
|
||||
- Toggle Element Inspector
|
||||
- Open React Native DevTools
|
||||
|
||||
#### React Native DevTools
|
||||
|
||||
The primary debugging tool (the default debugger since RN 0.76, replacing the earlier Chrome DevTools-based flow):
|
||||
|
||||
| Tab | Use |
|
||||
|-----|-----|
|
||||
| Console | View logs, execute JS in app context |
|
||||
| Sources | Set breakpoints, step through code |
|
||||
| Network | Inspect API requests (Expo only) |
|
||||
| Components | Inspect React component tree and props |
|
||||
| Profiler | Measure render performance |
|
||||
|
||||
#### VS Code Integration
|
||||
|
||||
Install the **Expo Tools** extension for:
|
||||
- Breakpoint debugging directly in VS Code
|
||||
- `app.json` / `app.config.ts` IntelliSense
|
||||
|
||||
#### Native Crash Debugging
|
||||
|
||||
For crashes in native modules (not JS):
|
||||
- **iOS**: Open Xcode → Window → Devices and Simulators → View Device Logs
|
||||
- **Android**: `adb logcat` in terminal
|
||||
|
||||
---
|
||||
|
||||
## Phase 3: Pre-Release Preparation
|
||||
|
||||
**This is when you need to start spending money.**
|
||||
|
||||
### 3.1 Accounts & Fees
|
||||
|
||||
| Platform | Cost | Registration Time | Required For |
|
||||
|----------|------|-------------------|--------------|
|
||||
| **Apple Developer Program** | $99/year | 1-2 days review | App Store distribution |
|
||||
| **Google Play Console** | $25 one-time | Days to weeks review | Play Store distribution |
|
||||
| **Expo Account** | Free tier sufficient | Instant | EAS Build & Submit |
|
||||
|
||||
Register early — account review takes time, especially Google.
|
||||
|
||||
### 3.2 App Configuration
|
||||
|
||||
Update `app.json` or `app.config.ts`:
|
||||
|
||||
```jsonc
|
||||
{
|
||||
"name": "Multica",
|
||||
"slug": "multica",
|
||||
"version": "1.0.0",
|
||||
"ios": {
|
||||
"bundleIdentifier": "com.multica.app",
|
||||
"buildNumber": "1", // increment each submission
|
||||
"infoPlist": {
|
||||
"NSCameraUsageDescription": "Used to scan QR codes and take photos",
|
||||
"NSPhotoLibraryUsageDescription": "Used to save scanned images"
|
||||
}
|
||||
},
|
||||
"android": {
|
||||
"package": "com.multica.app",
|
||||
"versionCode": 1, // increment each submission
|
||||
"permissions": ["CAMERA"]
|
||||
},
|
||||
"icon": "./assets/icon.png", // 1024x1024 PNG, no transparency
|
||||
"splash": {
|
||||
"image": "./assets/splash.png"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### 3.3 EAS Build Profiles
|
||||
|
||||
`eas.json`:
|
||||
|
||||
```json
|
||||
{
|
||||
"cli": { "version": ">= 10.0.0" },
|
||||
"build": {
|
||||
"development": {
|
||||
"developmentClient": true,
|
||||
"distribution": "internal"
|
||||
},
|
||||
"preview": {
|
||||
"distribution": "internal"
|
||||
},
|
||||
"production": {}
|
||||
},
|
||||
"submit": {
|
||||
"production": {}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### 3.4 App Signing & Credentials
|
||||
|
||||
#### iOS
|
||||
|
||||
EAS auto-manages credentials (recommended):
|
||||
- Distribution Certificate
|
||||
- Provisioning Profile
|
||||
- Or create manually in [Apple Developer Portal](https://developer.apple.com)
|
||||
|
||||
#### Android
|
||||
|
||||
- EAS auto-generates Keystore, stored securely on EAS servers
|
||||
- **Back up your Keystore** — losing it means you can never update the published app
|
||||
- Play Store requires AAB (Android App Bundle) format
|
||||
|
||||
### 3.5 Required Assets
|
||||
|
||||
| Asset | Spec |
|
||||
|-------|------|
|
||||
| **App Icon** | 1024x1024 PNG, no alpha/transparency (iOS) |
|
||||
| **Splash Screen** | Platform-appropriate sizes |
|
||||
| **iOS Screenshots** | 6.7", 6.5", 5.5" iPhone sizes + iPad (if universal) |
|
||||
| **Android Screenshots** | 2-8 screenshots |
|
||||
|
||||
### 3.6 Required Metadata
|
||||
|
||||
#### Both Platforms
|
||||
|
||||
| Item | Notes |
|
||||
|------|-------|
|
||||
| **Privacy Policy URL** | Publicly accessible. Must disclose data collection, third-party sharing, AI usage, deletion rights |
|
||||
| **App Description** | Short (≤80 chars for Google) + full description |
|
||||
| **Support URL** | Where users can get help |
|
||||
| **Account Deletion** | If app has registration, must support in-app account + data deletion |
|
||||
|
||||
#### Apple App Store Connect
|
||||
|
||||
| Item | Details |
|
||||
|------|---------|
|
||||
| Privacy Nutrition Labels | Data collection practices per category |
|
||||
| App Review Information | Reviewer contact info, demo/test account |
|
||||
| Content Rating | Age classification |
|
||||
| Export Compliance | Encryption usage declaration |
|
||||
| Info.plist Permission Strings | Clear purpose description for each permission |
|
||||
|
||||
#### Google Play Console
|
||||
|
||||
| Item | Details |
|
||||
|------|---------|
|
||||
| Data Safety Form | Required even if no data is collected |
|
||||
| Content Rating Questionnaire | IARC rating |
|
||||
| Target Audience | Must declare if targeting children |
|
||||
| First Upload | Must upload AAB manually (Google API limitation) |
|
||||
|
||||
---
|
||||
|
||||
## Phase 4: Build & Submit
|
||||
|
||||
### 4.1 Production Build
|
||||
|
||||
```bash
|
||||
# iOS
|
||||
eas build --platform ios --profile production
|
||||
|
||||
# Android
|
||||
eas build --platform android --profile production
|
||||
|
||||
# Both platforms
|
||||
eas build --platform all --profile production
|
||||
```
|
||||
|
||||
Builds run on Expo's cloud build service (EAS Build), so no local Xcode or Android Studio is needed for production builds.
|
||||
|
||||
### 4.2 Submit to Apple App Store
|
||||
|
||||
```bash
|
||||
eas submit --platform ios
|
||||
```
|
||||
|
||||
This uploads the build to **App Store Connect / TestFlight**. Then:
|
||||
|
||||
1. Log into [App Store Connect](https://appstoreconnect.apple.com)
|
||||
2. Select the uploaded build
|
||||
3. Associate it with a version
|
||||
4. Fill in all metadata, screenshots, privacy nutrition labels
|
||||
5. Submit for App Review
|
||||
|
||||
### 4.3 Submit to Google Play Store
|
||||
|
||||
```bash
|
||||
eas submit --platform android
|
||||
```
|
||||
|
||||
**First time**: Must upload AAB manually in [Play Console](https://play.google.com/console).
|
||||
|
||||
After initial upload:
|
||||
1. Navigate to Production → Create new release
|
||||
2. Upload AAB or use the EAS-submitted build
|
||||
3. Fill in description, screenshots, data safety form
|
||||
4. Submit for review
|
||||
|
||||
### 4.4 Auto-Submit (Optional)
|
||||
|
||||
Build and submit in one step:
|
||||
|
||||
```bash
|
||||
eas build --platform all --profile production --auto-submit
|
||||
```
|
||||
|
||||
### 4.5 App Review
|
||||
|
||||
| | Apple | Google |
|
||||
|---|---|---|
|
||||
| Review time | Typically 24-48 hours | Hours to 7 days |
|
||||
| Common rejections | Incomplete features, misleading screenshots, missing privacy policy, unclear permission strings | Data safety form mismatch, policy violations |
|
||||
| After rejection | Fix issues, resubmit | Fix issues, resubmit |
|
||||
|
||||
---
|
||||
|
||||
## Phase 5: Post-Launch
|
||||
|
||||
### 5.1 OTA Updates (No Re-Review)
|
||||
|
||||
For JS/asset-only changes, push updates without going through App Review:
|
||||
|
||||
```bash
|
||||
eas update --branch production
|
||||
```
|
||||
|
||||
- Instant delivery to users — no store review
|
||||
- Only works for JavaScript and asset changes
|
||||
- **Native code changes still require a new build + review**
|
||||
|
||||
### 5.2 Version Bumping
|
||||
|
||||
For each new store submission:
|
||||
- iOS: increment `buildNumber` in `app.json`
|
||||
- Android: increment `versionCode` in `app.json`
|
||||
- Bump `version` for user-visible version changes
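
As a rough illustration of these rules, the helper below bumps the per-submission counters and optionally the user-visible version. It is a sketch, assuming the field layout of the `app.json` example earlier in this guide (`buildNumber` as a string, `versionCode` as a number); `bumpForSubmission` and `AppConfig` are hypothetical and not part of Expo or EAS.

```typescript
// Hypothetical helper; field names follow the app.json example in this guide.
interface AppConfig {
  version: string;
  ios: { buildNumber: string };
  android: { versionCode: number };
}

// Returns a new config; the input object is left untouched.
function bumpForSubmission(config: AppConfig, newVersion?: string): AppConfig {
  return {
    ...config,
    version: newVersion ?? config.version,
    ios: { ...config.ios, buildNumber: String(Number(config.ios.buildNumber) + 1) },
    android: { ...config.android, versionCode: config.android.versionCode + 1 },
  };
}

const current: AppConfig = {
  version: "1.0.0",
  ios: { buildNumber: "1" },
  android: { versionCode: 1 },
};
const next = bumpForSubmission(current, "1.1.0");
```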
|
||||
|
||||
### 5.3 CI/CD Automation
|
||||
|
||||
Create `.eas/workflows/build-and-submit.yml` to auto-build and submit on push to main.
|
||||
|
||||
#### Google Service Account Key (Automated Android Submissions)
|
||||
|
||||
1. EAS dashboard → Credentials → Android
|
||||
2. Click Application identifier → Service Credentials
|
||||
3. Add Google Service Account Key
|
||||
|
||||
---
|
||||
|
||||
## Quick Reference
|
||||
|
||||
### Common Commands
|
||||
|
||||
```bash
|
||||
# Development
|
||||
npx expo prebuild # Generate native projects
|
||||
npx expo run:ios # Run on iOS simulator
|
||||
npx expo run:ios --device # Run on connected iPhone
|
||||
npx expo start --dev-client # Start dev server (after initial install)
|
||||
|
||||
# Building
|
||||
eas build --platform ios --profile development # Dev build (for device testing)
|
||||
eas build --platform ios --profile production # Production build
|
||||
eas build --platform all --profile production # Both platforms
|
||||
|
||||
# Submitting
|
||||
eas submit --platform ios # Submit to App Store
|
||||
eas submit --platform android # Submit to Play Store
|
||||
|
||||
# OTA Updates
|
||||
eas update --branch production # Push JS update to users
|
||||
```
|
||||
|
||||
### Cost Summary
|
||||
|
||||
| Phase | Cost |
|
||||
|-------|------|
|
||||
| Development + local testing | **Free** (free Apple ID + Xcode) |
|
||||
| EAS cloud builds | Free tier: 30 iOS + 30 Android builds/month |
|
||||
| App Store submission | **$99/year** (Apple Developer Program) |
|
||||
| Play Store submission | **$25 one-time** (Google Play Console) |
|
||||
|
||||
---
|
||||
|
||||
## Master Checklist
|
||||
|
||||
### Development Phase
|
||||
- [ ] Install Node.js, pnpm, Xcode, EAS CLI
|
||||
- [ ] Add Apple ID to Xcode (Settings → Accounts)
|
||||
- [ ] Enable Developer Mode on iPhone
|
||||
- [ ] Run `npx expo prebuild`
|
||||
- [ ] Test on simulator: `npx expo run:ios`
|
||||
- [ ] Test on real device: `npx expo run:ios --device`
|
||||
- [ ] Trust developer certificate on device
|
||||
- [ ] Verify camera/scanner functionality on real device
|
||||
|
||||
### Pre-Release Phase
|
||||
- [ ] Register Apple Developer Program ($99/year)
|
||||
- [ ] Register Google Play Console ($25)
|
||||
- [ ] Configure `app.json` (bundleIdentifier, permissions, icon, splash)
|
||||
- [ ] Configure `eas.json` build profiles
|
||||
- [ ] Prepare app icon (1024x1024 PNG)
|
||||
- [ ] Prepare splash screen
|
||||
- [ ] Take App Store screenshots (all required sizes)
|
||||
- [ ] Write and host privacy policy URL
|
||||
- [ ] Write app description (short + full)
|
||||
- [ ] Set up support URL
|
||||
- [ ] Implement in-app account deletion (if registration exists)
|
||||
|
||||
### Submission Phase
|
||||
- [ ] Run `eas build --platform all --profile production`
|
||||
- [ ] iOS: `eas submit --platform ios`
|
||||
- [ ] iOS: Fill metadata + privacy labels in App Store Connect
|
||||
- [ ] iOS: Submit for App Review
|
||||
- [ ] Android: Upload first AAB manually in Play Console
|
||||
- [ ] Android: `eas submit --platform android`
|
||||
- [ ] Android: Fill data safety form + metadata in Play Console
|
||||
- [ ] Android: Submit for review
|
||||
- [ ] Wait for review approval → app goes live
|
||||
|
||||
### Post-Launch Phase
|
||||
- [ ] Set up `eas update` for OTA updates
|
||||
- [ ] Set up CI/CD workflow (optional)
|
||||
- [ ] Configure Google Service Account Key for automated Android submissions (optional)
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
- [Expo: Getting Started](https://docs.expo.dev/get-started/introduction/)
|
||||
- [Expo: Development Builds](https://docs.expo.dev/develop/development-builds/introduction/)
|
||||
- [Expo: Local App Development](https://docs.expo.dev/guides/local-app-development/)
|
||||
- [Expo: Debugging Tools](https://docs.expo.dev/debugging/tools/)
|
||||
- [Expo: Submit to App Stores](https://docs.expo.dev/deploy/submit-to-app-stores/)
|
||||
- [Expo: EAS Submit](https://docs.expo.dev/submit/introduction/)
|
||||
- [Expo: EAS Update](https://docs.expo.dev/eas-update/introduction/)
|
||||
- [Apple App Review Guidelines](https://developer.apple.com/app-store/review/guidelines/)
|
||||
- [Apple App Privacy Details](https://developer.apple.com/app-store/app-privacy-details/)
|
||||
- [Google Play Data Safety](https://support.google.com/googleplay/android-developer/answer/10787469)
|
||||
- [Google Play Developer Policy Center](https://play.google/developer-content-policy/)
|
||||
|
|
@ -1,315 +0,0 @@
|
|||
# Package Management Guide
|
||||
|
||||
## Overview
|
||||
|
||||
Super Multica uses **pnpm workspaces** for monorepo management. This document covers package management, dependency handling, and merge conflict resolution.
|
||||
|
||||
---
|
||||
|
||||
## Directory Structure
|
||||
|
||||
```
|
||||
super-multica/
|
||||
├── apps/ # Deployable applications
|
||||
│ ├── cli/ # @multica/cli
|
||||
│ ├── desktop/ # @multica/desktop (Electron)
|
||||
│ ├── gateway/ # @multica/gateway (NestJS WebSocket)
|
||||
│ ├── server/ # @multica/server (NestJS REST)
|
||||
│ ├── web/ # @multica/web (Next.js)
|
||||
│ └── mobile/ # @multica/mobile (React Native)
|
||||
│
|
||||
├── packages/ # Shared libraries
|
||||
│ ├── core/ # @multica/core (agent, hub, channels)
|
||||
│ ├── sdk/ # @multica/sdk (gateway client)
|
||||
│ ├── ui/ # @multica/ui (shared components)
|
||||
│ ├── store/ # @multica/store (Zustand)
|
||||
│ ├── hooks/ # @multica/hooks (React hooks)
|
||||
│ ├── types/ # @multica/types (TypeScript types)
|
||||
│ └── utils/ # @multica/utils (utility functions)
|
||||
│
|
||||
├── skills/ # Bundled agent skills
|
||||
├── pnpm-workspace.yaml # Workspace definition
|
||||
├── pnpm-lock.yaml # Lockfile (auto-generated)
|
||||
└── .npmrc # pnpm configuration
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Key Configuration Files
|
||||
|
||||
### pnpm-workspace.yaml
|
||||
|
||||
Defines which directories are workspace packages:
|
||||
|
||||
```yaml
|
||||
packages:
|
||||
- "apps/*"
|
||||
- "packages/*"
|
||||
```
|
||||
|
||||
### .npmrc
|
||||
|
||||
**Required configuration for Electron packaging:**
|
||||
|
||||
```ini
|
||||
shamefully-hoist=true
|
||||
```
|
||||
|
||||
**Why?** electron-builder requires all dependencies to be hoisted to the root `node_modules`. Without this, Electron builds will fail with "Cannot find module" errors.
|
||||
|
||||
### pnpm-lock.yaml
|
||||
|
||||
- Auto-generated lockfile
|
||||
- **Never manually edit**
|
||||
- Always regenerate on conflicts
|
||||
|
||||
---
|
||||
|
||||
## Common Commands
|
||||
|
||||
### Install Dependencies
|
||||
|
||||
```bash
|
||||
# Install all workspace dependencies
|
||||
pnpm install
|
||||
|
||||
# Clean install (after changing .npmrc or major updates)
|
||||
rm -rf node_modules apps/*/node_modules packages/*/node_modules
|
||||
rm pnpm-lock.yaml
|
||||
pnpm install
|
||||
```
|
||||
|
||||
### Add Dependencies
|
||||
|
||||
```bash
|
||||
# Add to root (shared dev tools)
|
||||
pnpm add -D typescript -w
|
||||
|
||||
# Add to specific package
|
||||
pnpm add lodash --filter @multica/core
|
||||
|
||||
# Add dev dependency to specific package
|
||||
pnpm add -D vitest --filter @multica/core
|
||||
|
||||
# Add workspace dependency (internal package)
|
||||
pnpm add @multica/utils --filter @multica/core --workspace
|
||||
```
|
||||
|
||||
### Update Dependencies
|
||||
|
||||
```bash
|
||||
# Update all
|
||||
pnpm update --recursive
|
||||
|
||||
# Update specific package
|
||||
pnpm update lodash --filter @multica/core
|
||||
|
||||
# Interactive update
|
||||
pnpm update --interactive --recursive
|
||||
```
|
||||
|
||||
### Run Scripts
|
||||
|
||||
```bash
|
||||
# Run script in specific package
|
||||
pnpm --filter @multica/desktop dev
|
||||
pnpm --filter @multica/core build
|
||||
|
||||
# Run script in all packages
|
||||
pnpm --recursive run build
|
||||
|
||||
# Run script in root
|
||||
pnpm multica --help
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Workspace Dependencies
|
||||
|
||||
### Internal References
|
||||
|
||||
Use `workspace:*` for internal dependencies:
|
||||
|
||||
```json
|
||||
{
|
||||
"name": "@multica/desktop",
|
||||
"dependencies": {
|
||||
"@multica/core": "workspace:*",
|
||||
"@multica/ui": "workspace:*",
|
||||
"@multica/utils": "workspace:*"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Dependency Direction
|
||||
|
||||
```
|
||||
apps/ → depends on → packages/
|
||||
packages/ui → depends on → packages/core
|
||||
packages/core → depends on → packages/types, packages/utils
|
||||
|
||||
❌ Circular dependencies are forbidden
|
||||
```
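
A circular dependency can be caught with a depth-first search over the workspace graph. The sketch below is one way to do it, assuming the graph has already been collected from each `package.json`; `findCycle` and `DepGraph` are hypothetical names, and the sample graph only mirrors the direction listed above.

```typescript
// Hypothetical check for circular workspace dependencies.
type DepGraph = Record<string, string[]>;

// Depth-first search; returns the first cycle found as a path, or null.
function findCycle(graph: DepGraph): string[] | null {
  const state: Record<string, "visiting" | "done"> = {};
  const path: string[] = [];

  function visit(node: string): string[] | null {
    if (state[node] === "done") return null;
    if (state[node] === "visiting") {
      // The node is already on the current path, so the path loops back to it.
      return path.slice(path.indexOf(node)).concat(node);
    }
    state[node] = "visiting";
    path.push(node);
    for (const dep of graph[node] ?? []) {
      const cycle = visit(dep);
      if (cycle) return cycle;
    }
    path.pop();
    state[node] = "done";
    return null;
  }

  for (const node of Object.keys(graph)) {
    const cycle = visit(node);
    if (cycle) return cycle;
  }
  return null;
}

const workspace: DepGraph = {
  "@multica/desktop": ["@multica/core", "@multica/ui"],
  "@multica/ui": ["@multica/core"],
  "@multica/core": ["@multica/types", "@multica/utils"],
  "@multica/types": [],
  "@multica/utils": [],
};
const ok = findCycle(workspace);
// Introducing utils → ui creates a loop through core.
const broken = findCycle({ ...workspace, "@multica/utils": ["@multica/ui"] });
```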
|
||||
|
||||
### Catalog (Shared Versions)
|
||||
|
||||
`pnpm-workspace.yaml` defines shared versions:
|
||||
|
||||
```yaml
|
||||
catalog:
|
||||
react: "19.2.3"
|
||||
typescript: "^5.9.3"
|
||||
```
|
||||
|
||||
Use in package.json:
|
||||
|
||||
```json
|
||||
{
|
||||
"dependencies": {
|
||||
"react": "catalog:"
|
||||
}
|
||||
}
|
||||
```
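
Conceptually, a `catalog:` specifier is replaced with the version listed for that package name in the workspace catalog. pnpm does this resolution itself; the sketch below only illustrates the mapping and is not pnpm's implementation. `resolveSpecifier` and the `Catalog` type are hypothetical.

```typescript
// Hypothetical illustration of how a "catalog:" specifier maps to a version.
type Catalog = Record<string, string>;

function resolveSpecifier(name: string, specifier: string, catalog: Catalog): string {
  // Anything other than "catalog:" is an ordinary version range and is kept as-is.
  if (specifier !== "catalog:") return specifier;
  const version = catalog[name];
  if (version === undefined) throw new Error(`No catalog entry for ${name}`);
  return version;
}

const catalog: Catalog = { react: "19.2.3", typescript: "^5.9.3" };
const resolvedReact = resolveSpecifier("react", "catalog:", catalog);
const resolvedLodash = resolveSpecifier("lodash", "^4.17.21", catalog);
```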
|
||||
|
||||
---
|
||||
|
||||
## Branch Merge & Conflicts
|
||||
|
||||
### High-Conflict Files
|
||||
|
||||
| File | Conflict Type | Resolution Strategy |
|
||||
|------|---------------|---------------------|
|
||||
| `pnpm-lock.yaml` | Auto-generated | **Always regenerate** |
|
||||
| `*/package.json` | Version/deps | Manual merge |
|
||||
| `pnpm-workspace.yaml` | Catalog versions | Manual merge |
|
||||
| `turbo.json` | Pipeline config | Manual merge |
|
||||
|
||||
### Resolving pnpm-lock.yaml Conflicts
|
||||
|
||||
**Never manually resolve `pnpm-lock.yaml` conflicts.** It's a machine-generated file with complex checksums.
|
||||
|
||||
```bash
|
||||
# 1. Accept either version (doesn't matter which)
|
||||
git checkout --theirs pnpm-lock.yaml
|
||||
# or
|
||||
git checkout --ours pnpm-lock.yaml
|
||||
|
||||
# 2. Delete and regenerate
|
||||
rm pnpm-lock.yaml
|
||||
pnpm install
|
||||
|
||||
# 3. Stage the new lockfile
|
||||
git add pnpm-lock.yaml
|
||||
|
||||
# 4. Continue with merge
|
||||
git merge --continue
|
||||
# or
|
||||
git commit
|
||||
```
|
||||
|
||||
### Standard Merge Workflow
|
||||
|
||||
```bash
|
||||
# 1. Fetch and merge
|
||||
git fetch origin main
|
||||
git merge origin/main
|
||||
|
||||
# 2. If conflicts in pnpm-lock.yaml:
|
||||
git checkout --theirs pnpm-lock.yaml
|
||||
rm pnpm-lock.yaml
|
||||
pnpm install
|
||||
git add pnpm-lock.yaml
|
||||
|
||||
# 3. Resolve other conflicts manually
|
||||
# Edit conflicted files...
|
||||
git add <resolved-files>
|
||||
|
||||
# 4. Complete merge
|
||||
git commit
|
||||
|
||||
# 5. Verify build
|
||||
pnpm build
|
||||
pnpm test
|
||||
```
|
||||
|
||||
### After Major Merges
|
||||
|
||||
Always verify:
|
||||
|
||||
```bash
|
||||
pnpm install # Ensure deps are correct
|
||||
pnpm build # Verify build works
|
||||
pnpm test # Run tests
|
||||
pnpm typecheck # Check types
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### "Cannot find module" in Electron Build
|
||||
|
||||
**Cause:** electron-builder can't find hoisted dependencies.
|
||||
|
||||
**Solution:**
|
||||
|
||||
```bash
|
||||
# Ensure .npmrc has:
|
||||
grep -qx 'shamefully-hoist=true' .npmrc 2>/dev/null || echo 'shamefully-hoist=true' >> .npmrc
|
||||
|
||||
# Clean reinstall
|
||||
rm -rf node_modules apps/*/node_modules packages/*/node_modules
|
||||
rm pnpm-lock.yaml
|
||||
pnpm install
|
||||
```
|
||||
|
||||
### Workspace Protocol Not Resolved
|
||||
|
||||
**Cause:** The package is missing from `pnpm-workspace.yaml`, or the dependency name does not exactly match the target package's `name` field.
|
||||
|
||||
**Solution:**
|
||||
|
||||
```bash
|
||||
# Check pnpm-workspace.yaml includes the package
|
||||
# Ensure package name matches exactly
|
||||
pnpm install
|
||||
```
|
||||
|
||||
### Peer Dependency Warnings
|
||||
|
||||
**Cause:** Missing peer dependencies.
|
||||
|
||||
**Solution:**
|
||||
|
||||
```bash
|
||||
# Usually safe to ignore, but if causing issues:
|
||||
pnpm add <missing-peer> --filter <package>
|
||||
```
|
||||
|
||||
### Build Order Issues
|
||||
|
||||
**Cause:** Turborepo not building dependencies first.
|
||||
|
||||
**Solution:** Check `turbo.json` has correct `dependsOn`:
|
||||
|
||||
```json
|
||||
{
|
||||
"tasks": {
|
||||
"build": {
|
||||
"dependsOn": ["^build"]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Best Practices
|
||||
|
||||
1. **Always use pnpm** — Don't mix npm/yarn
|
||||
2. **Commit lockfile** — Always commit `pnpm-lock.yaml` changes
|
||||
3. **Don't edit lockfile manually** — Regenerate on conflicts
|
||||
4. **Use `workspace:*`** — For internal dependencies
|
||||
5. **Use catalog:** — For shared version management
|
||||
6. **Clean install after .npmrc changes** — Delete node_modules and lockfile
|
||||
7. **Verify after merge** — Run build and tests
|
||||
File diff suppressed because it is too large
365
docs/rpc.md
|
|
@ -1,365 +0,0 @@
|
|||
# Hub RPC Protocol
|
||||
|
||||
The Hub exposes an RPC (Remote Procedure Call) interface over the Gateway WebSocket transport. Clients can invoke methods on the Hub and receive structured responses, all routed through the same Gateway message layer used for regular chat.
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
```
|
||||
Client (SDK) Gateway (WebSocket) Hub
|
||||
| | |
|
||||
|-- send(RequestAction) ------->|-- route to Hub ----------->|
|
||||
| | |-- dispatch(method, params)
|
||||
| | |-- handler executes
|
||||
|<-- receive(ResponseAction) ---|<-- route to Client --------|
|
||||
| | |
|
||||
```
|
||||
|
||||
1. The **Client** calls `client.request(hubDeviceId, method, params)`.
|
||||
2. The SDK generates a `requestId` (UUIDv7), wraps it into a `RequestPayload`, and sends a message with `action = "request"` to the Hub via the Gateway.
|
||||
3. The **Gateway** routes the message to the Hub's socket (standard device-to-device routing).
|
||||
4. The **Hub** detects `action === "request"` in its `onMessage` handler and delegates to `RpcDispatcher.dispatch()`.
|
||||
5. The dispatcher looks up the registered handler for the given `method` and invokes it.
|
||||
6. The Hub sends back a message with `action = "response"` containing either a success or error payload, addressed to the original sender.
|
||||
7. The **Client SDK** intercepts incoming `"response"` messages in its `RECEIVE` listener, matches by `requestId`, and resolves (or rejects) the corresponding `Promise`.
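
Steps 4 to 6 can be sketched as follows. This is a simplified illustration, not the Hub's actual code: the names `RpcDispatcher`, `RpcError`, `RpcHandler`, `register`, and `dispatch` come from this document, but their signatures and the `RpcResponse` type are assumptions.

```typescript
// Hypothetical sketch; the real classes in src/hub/rpc/ may differ.
type RpcHandler = (params: unknown) => unknown | Promise<unknown>;

class RpcError extends Error {
  readonly code: string;
  constructor(code: string, message: string) {
    super(message);
    this.name = "RpcError";
    this.code = code;
    // Keeps `instanceof RpcError` working even when compiled to ES5.
    Object.setPrototypeOf(this, RpcError.prototype);
  }
}

type RpcResponse =
  | { requestId: string; ok: true; payload: unknown }
  | { requestId: string; ok: false; error: { code: string; message: string } };

class RpcDispatcher {
  private readonly handlers = new Map<string, RpcHandler>();

  register(method: string, handler: RpcHandler): void {
    this.handlers.set(method, handler);
  }

  // Always resolves: failures become error payloads rather than rejections,
  // so the Hub can send exactly one "response" message per request.
  async dispatch(requestId: string, method: string, params?: unknown): Promise<RpcResponse> {
    const handler = this.handlers.get(method);
    if (!handler) {
      return {
        requestId,
        ok: false,
        error: { code: "METHOD_NOT_FOUND", message: `Unknown method: ${method}` },
      };
    }
    try {
      return { requestId, ok: true, payload: await handler(params) };
    } catch (err) {
      if (err instanceof RpcError) {
        return { requestId, ok: false, error: { code: err.code, message: err.message } };
      }
      // Anything that is not an RpcError falls back to the catch-all code.
      const message = err instanceof Error ? err.message : String(err);
      return { requestId, ok: false, error: { code: "RPC_ERROR", message } };
    }
  }
}
```

Returning error payloads instead of throwing keeps transport code free of per-method error handling; it only has to forward whatever `dispatch` resolves with.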
|
||||
|
||||
## Message Format
|
||||
|
||||
All RPC messages use the standard `RoutedMessage` envelope:
|
||||
|
||||
```ts
|
||||
interface RoutedMessage<T> {
|
||||
id: string; // UUIDv7 message ID
|
||||
uid: string | null;
|
||||
from: string; // sender deviceId
|
||||
to: string; // recipient deviceId
|
||||
action: string; // "request" or "response"
|
||||
payload: T;
|
||||
}
|
||||
```
|
||||
|
||||
### Request Payload
|
||||
|
||||
```ts
|
||||
interface RequestPayload<T = unknown> {
|
||||
requestId: string; // UUIDv7, generated by the SDK
|
||||
method: string; // RPC method name
|
||||
params?: T; // method-specific parameters
|
||||
}
|
||||
```
|
||||
|
||||
### Response Payload (Success)
|
||||
|
||||
```ts
|
||||
interface ResponseSuccessPayload<T = unknown> {
|
||||
requestId: string; // matches the request
|
||||
ok: true;
|
||||
payload: T; // method-specific result
|
||||
}
|
||||
```
|
||||
|
||||
### Response Payload (Error)
|
||||
|
||||
```ts
|
||||
interface ResponseErrorPayload {
|
||||
requestId: string; // matches the request
|
||||
ok: false;
|
||||
error: {
|
||||
code: string; // machine-readable error code
|
||||
message: string; // human-readable description
|
||||
};
|
||||
}
|
||||
```
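
Because `ok` is a literal `true` or `false`, the two payload shapes form a discriminated union that TypeScript can narrow. The sketch below restates the interfaces from above and adds a hypothetical `unwrapResponse` helper, which is an illustration and not an SDK export.

```typescript
interface ResponseSuccessPayload<T = unknown> {
  requestId: string;
  ok: true;
  payload: T;
}

interface ResponseErrorPayload {
  requestId: string;
  ok: false;
  error: { code: string; message: string };
}

type ResponsePayload<T = unknown> = ResponseSuccessPayload<T> | ResponseErrorPayload;

// Hypothetical helper: returns the result on success, otherwise throws an
// Error whose message carries both the error code and the description.
function unwrapResponse<T>(response: ResponsePayload<T>): T {
  if (response.ok) {
    return response.payload; // narrowed to ResponseSuccessPayload<T>
  }
  throw new Error(`${response.error.code}: ${response.error.message}`);
}
```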
|
||||
|
||||
## Error Codes
|
||||
|
||||
| Code | Description |
|
||||
|---|---|
|
||||
| `METHOD_NOT_FOUND` | The requested RPC method does not exist. |
|
||||
| `INVALID_PARAMS` | Missing or malformed parameters. |
|
||||
| `AGENT_NOT_FOUND` | No session file found for the given agent ID. |
|
||||
| `RPC_ERROR` | Catch-all for unexpected errors. |
|
||||
|
||||
## Client SDK Usage
|
||||
|
||||
The `GatewayClient` provides a `request()` method that handles the full request/response lifecycle:
|
||||
|
||||
```ts
|
||||
request<T = unknown>(
|
||||
to: string, // target deviceId (Hub's deviceId)
|
||||
method: string, // RPC method name
|
||||
params?: unknown, // method parameters
|
||||
timeout?: number, // timeout in ms (default: 10000)
|
||||
): Promise<T>
|
||||
```
|
||||
|
||||
The method:
|
||||
- Generates a `requestId` internally.
|
||||
- Sends a `RequestPayload` via the Gateway.
|
||||
- Returns a `Promise` that resolves with the response payload on success, or rejects with an `Error` on failure or timeout.
|
||||
- Automatically cleans up pending requests on disconnect.
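
One way to implement this lifecycle is a map of pending requests keyed by `requestId`. The sketch below is an assumption about how such bookkeeping could look; `PendingRequests` and its method names are hypothetical and do not describe the SDK's internals.

```typescript
// Hypothetical bookkeeping for in-flight RPC requests.
interface PendingEntry {
  resolve: (value: unknown) => void;
  reject: (reason: Error) => void;
  timer: ReturnType<typeof setTimeout>;
}

interface IncomingResponse {
  requestId: string;
  ok: boolean;
  payload?: unknown;
  error?: { code: string; message: string };
}

class PendingRequests {
  private readonly pending = new Map<string, PendingEntry>();

  // Registers a request; the promise settles via settle(), timeout, or rejectAll().
  track<T>(requestId: string, timeoutMs = 10000): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const timer = setTimeout(() => {
        this.pending.delete(requestId);
        reject(new Error(`RPC timed out after ${timeoutMs}ms`));
      }, timeoutMs);
      this.pending.set(requestId, {
        resolve: (value) => resolve(value as T),
        reject,
        timer,
      });
    });
  }

  // Called for each incoming "response" message; returns false for unknown IDs.
  settle(response: IncomingResponse): boolean {
    const entry = this.pending.get(response.requestId);
    if (!entry) return false;
    clearTimeout(entry.timer);
    this.pending.delete(response.requestId);
    if (response.ok) {
      entry.resolve(response.payload);
    } else {
      entry.reject(new Error(response.error?.message ?? "RPC failed"));
    }
    return true;
  }

  // Called on disconnect so that no caller waits until the timeout.
  rejectAll(reason: string): void {
    this.pending.forEach((entry) => {
      clearTimeout(entry.timer);
      entry.reject(new Error(reason));
    });
    this.pending.clear();
  }

  get size(): number {
    return this.pending.size;
  }
}
```

Clearing the timer in both `settle` and `rejectAll` matters: a leftover timer would otherwise fire later and keep the process alive.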
|
||||
|
||||
### Example
|
||||
|
||||
```ts
|
||||
import { GatewayClient, type GetAgentMessagesResult } from "@multica/sdk";
|
||||
|
||||
const client = new GatewayClient({
|
||||
url: "http://localhost:3000",
|
||||
deviceId: "my-client",
|
||||
deviceType: "client",
|
||||
});
|
||||
|
||||
client.connect();
|
||||
|
||||
client.onRegistered(async () => {
|
||||
try {
|
||||
const result = await client.request<GetAgentMessagesResult>(
|
||||
"hub-device-id",
|
||||
"getAgentMessages",
|
||||
{ agentId: "019abc12-...", offset: 0, limit: 20 },
|
||||
);
|
||||
console.log(`Total: ${result.total}, returned: ${result.messages.length}`);
|
||||
} catch (err) {
|
||||
console.error("RPC failed:", err instanceof Error ? err.message : err);
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
## Available RPC Methods
|
||||
|
||||
### `getAgentMessages`
|
||||
|
||||
Retrieves the message history for a given agent session. Works for both active and closed agents as long as the session file exists on disk.
|
||||
|
||||
**Parameters:**
|
||||
|
||||
```ts
|
||||
interface GetAgentMessagesParams {
|
||||
agentId: string; // required - the agent/session ID
|
||||
offset?: number; // starting index (default: 0)
|
||||
limit?: number; // max messages to return (default: 50)
|
||||
}
|
||||
```
|
||||
|
||||
**Response:**
|
||||
|
||||
```ts
|
||||
interface GetAgentMessagesResult {
|
||||
messages: AgentMessage[]; // array of messages
|
||||
total: number; // total message count in the session
|
||||
offset: number; // the offset used
|
||||
limit: number; // the limit used
|
||||
}
|
||||
```
|
||||
|
||||
Each `AgentMessage` in the array is one of:
|
||||
|
||||
- **UserMessage** (`role: "user"`) - User input (text or multimodal content).
|
||||
- **AssistantMessage** (`role: "assistant"`) - LLM response, may contain `TextContent`, `ThinkingContent`, or `ToolCall` blocks. Includes `usage` (token counts and costs), `model`, `provider`, and `stopReason`.
|
||||
- **ToolResultMessage** (`role: "toolResult"`) - Result of a tool invocation, with `toolCallId`, `toolName`, `content`, and `isError`.
|
||||
|
||||
**Example request:**
|
||||
|
||||
```ts
|
||||
const result = await client.request<GetAgentMessagesResult>(
|
||||
hubDeviceId,
|
||||
"getAgentMessages",
|
||||
{ agentId: "019abc12-3def-7000-8000-000000000001", offset: 0, limit: 10 },
|
||||
);
|
||||
```
|
||||
|
||||
**Example success response payload:**
|
||||
|
||||
```json
|
||||
{
|
||||
"requestId": "019abc12-...",
|
||||
"ok": true,
|
||||
"payload": {
|
||||
"messages": [
|
||||
{ "role": "user", "content": "Hello", "timestamp": 1700000000000 },
|
||||
{
|
||||
"role": "assistant",
|
||||
"content": [{ "type": "text", "text": "Hi! How can I help?" }],
|
||||
"model": "claude-sonnet-4-20250514",
|
||||
"provider": "anthropic",
|
||||
"usage": { "input": 10, "output": 15, "totalTokens": 25 },
|
||||
"stopReason": "end_turn",
|
||||
"timestamp": 1700000001000
|
||||
}
|
||||
],
|
||||
"total": 42,
|
||||
"offset": 0,
|
||||
"limit": 10
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Example error response payload:**
|
||||
|
||||
```json
|
||||
{
|
||||
"requestId": "019abc12-...",
|
||||
"ok": false,
|
||||
"error": {
|
||||
"code": "AGENT_NOT_FOUND",
|
||||
"message": "No session found for agent: 019abc12-bad-id"
|
||||
}
|
||||
}
|
||||
```
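
Since the result echoes `offset` and `limit` and reports `total`, a caller can page through a long session. The following is a minimal sketch: `fetchAllMessages` and the `RequestFn` type are hypothetical, and `RequestFn` only mirrors the `request()` signature shown earlier so the example stays self-contained.

```typescript
interface GetAgentMessagesResult<M = unknown> {
  messages: M[];
  total: number;
  offset: number;
  limit: number;
}

// Minimal shape this sketch needs; in practice it would forward to client.request().
type RequestFn = <T>(to: string, method: string, params?: unknown) => Promise<T>;

// Fetches every message of a session by requesting consecutive pages.
async function fetchAllMessages<M = unknown>(
  request: RequestFn,
  hubDeviceId: string,
  agentId: string,
  pageSize = 50,
): Promise<M[]> {
  const all: M[] = [];
  let offset = 0;
  while (true) {
    const page = await request<GetAgentMessagesResult<M>>(hubDeviceId, "getAgentMessages", {
      agentId,
      offset,
      limit: pageSize,
    });
    all.push(...page.messages);
    offset += page.messages.length;
    // Stop when everything is fetched, or when the Hub returns an empty page
    // (which guards against looping forever if `total` changes mid-run).
    if (page.messages.length === 0 || offset >= page.total) break;
  }
  return all;
}
```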
|
||||
|
||||
### `getHubInfo`
|
||||
|
||||
Returns Hub status information. No parameters required.
|
||||
|
||||
**Response:**
|
||||
|
||||
```ts
|
||||
interface GetHubInfoResult {
|
||||
hubId: string; // Hub device ID
|
||||
url: string; // Current Gateway URL
|
||||
connectionState: string; // "disconnected" | "connecting" | "connected" | "registered"
|
||||
agentCount: number; // Number of active agents
|
||||
}
|
||||
```
|
||||
|
||||
**Example:**
|
||||
|
||||
```ts
|
||||
const info = await client.request<GetHubInfoResult>(hubDeviceId, "getHubInfo");
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### `listAgents`
|
||||
|
||||
Lists all active agents. No parameters required.
|
||||
|
||||
**Response:**
|
||||
|
||||
```ts
|
||||
interface ListAgentsResult {
|
||||
agents: { id: string; closed: boolean }[];
|
||||
}
|
||||
```
|
||||
|
||||
**Example:**
|
||||
|
||||
```ts
|
||||
const result = await client.request<ListAgentsResult>(hubDeviceId, "listAgents");
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### `createAgent`
|
||||
|
||||
Creates a new agent or restores an existing one.
|
||||
|
||||
**Parameters:**
|
||||
|
||||
```ts
|
||||
interface CreateAgentParams {
|
||||
id?: string; // optional - reuse existing session ID
|
||||
}
|
||||
```
|
||||
|
||||
**Response:**
|
||||
|
||||
```ts
|
||||
interface CreateAgentResult {
|
||||
id: string; // the created/restored agent session ID
|
||||
}
|
||||
```
|
||||
|
||||
**Example:**
|
||||
|
||||
```ts
|
||||
const result = await client.request<CreateAgentResult>(hubDeviceId, "createAgent");
|
||||
// or with specific ID:
|
||||
const restored = await client.request<CreateAgentResult>(hubDeviceId, "createAgent", { id: "existing-id" });
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### `deleteAgent`
|
||||
|
||||
Closes and removes an agent.
|
||||
|
||||
**Parameters:**
|
||||
|
||||
```ts
|
||||
interface DeleteAgentParams {
|
||||
id: string; // required - agent ID to delete
|
||||
}
|
||||
```
|
||||
|
||||
**Response:**
|
||||
|
||||
```ts
|
||||
interface DeleteAgentResult {
|
||||
ok: boolean; // true if agent was found and deleted
|
||||
}
|
||||
```
|
||||
|
||||
**Example:**
|
||||
|
||||
```ts
|
||||
const result = await client.request<DeleteAgentResult>(hubDeviceId, "deleteAgent", { id: "019abc12-..." });
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### `updateGateway`
|
||||
|
||||
Reconnects the Hub to a different Gateway URL.
|
||||
|
||||
**Parameters:**
|
||||
|
||||
```ts
|
||||
interface UpdateGatewayParams {
|
||||
url: string; // required - new Gateway URL
|
||||
}
|
||||
```
|
||||
|
||||
**Response:**
|
||||
|
||||
```ts
|
||||
interface UpdateGatewayResult {
|
||||
url: string; // the new URL
|
||||
connectionState: string; // connection state after reconnect
|
||||
}
|
||||
```
|
||||
|
||||
**Example:**
|
||||
|
||||
```ts
|
||||
const result = await client.request<UpdateGatewayResult>(hubDeviceId, "updateGateway", { url: "http://localhost:4000" });
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Adding New RPC Methods
|
||||
|
||||
1. Create a handler file in `src/hub/rpc/handlers/`:
|
||||
|
||||
```ts
|
||||
// src/hub/rpc/handlers/my-method.ts
|
||||
import { RpcError, type RpcHandler } from "../dispatcher.js";
|
||||
|
||||
export function createMyMethodHandler(): RpcHandler {
|
||||
return (params: unknown) => {
|
||||
if (!params || typeof params !== "object") {
|
||||
throw new RpcError("INVALID_PARAMS", "params must be an object");
|
||||
}
|
||||
// ... validate and handle
|
||||
return { /* result */ };
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
2. Register it in `src/hub/hub.ts` constructor:
|
||||
|
||||
```ts
|
||||
this.rpc.register("myMethod", createMyMethodHandler());
|
||||
```
|
||||
|
||||
3. (Optional) Add typed params/result interfaces in `packages/sdk/src/actions/rpc.ts` and export them from `packages/sdk/src/actions/index.ts` for client-side type safety.
|
||||
|
|
@ -1,19 +0,0 @@
|
|||
# Skills & Tools
|
||||
|
||||
## Skills
|
||||
|
||||
Skills extend agent functionality via `SKILL.md` files. See [Skills Documentation](../packages/core/src/agent/skills/README.md).
|
||||
|
||||
```bash
|
||||
multica skills list # List skills
|
||||
multica skills add owner/repo # Install from GitHub
|
||||
multica skills status # Check status
|
||||
```
|
||||
|
||||
Built-in: `commit`, `code-review`, `skill-creator`
|
||||
|
||||
## Tools
|
||||
|
||||
Available tools: `read`, `write`, `edit`, `glob`, `exec`, `process`, `web_fetch`, `web_search`, `memory_search`, `sessions_spawn`
|
||||
|
||||
See [Tools Documentation](../packages/core/src/agent/tools/README.md) for details.
|
||||
|
|
@ -1,253 +0,0 @@
|
|||
# SWE-bench: Agent Coding Benchmark
|
||||
|
||||
Run and evaluate the Multica agent against [SWE-bench](https://www.swebench.com/), the standard benchmark for AI coding agents. SWE-bench tasks are real GitHub issues from open-source Python projects — the agent must read the issue, explore the codebase, and produce a patch that fixes the bug.
|
||||
|
||||
## Quick Start
|
||||
|
||||
```bash
|
||||
# 1. Download dataset (requires: pip install datasets)
|
||||
python scripts/swe-bench/download-dataset.py --dataset lite --limit 5
|
||||
|
||||
# 2. Run the agent
|
||||
npx tsx scripts/swe-bench/run.ts --limit 5
|
||||
|
||||
# 3. Analyze results
|
||||
npx tsx scripts/swe-bench/analyze.ts
|
||||
```
|
||||
|
||||
## Scripts
|
||||
|
||||
```
|
||||
scripts/swe-bench/
|
||||
├── download-dataset.py # Download from HuggingFace → JSONL
|
||||
├── run.ts # Core runner: Agent API → git diff → predictions
|
||||
├── evaluate.sh # Official Docker evaluation harness wrapper
|
||||
├── analyze.ts # Summarize run results
|
||||
└── .gitignore # Ignores downloaded datasets and output files
|
||||
```
|
||||
|
||||
## Pipeline
|
||||
|
||||
```
|
||||
┌──────────────────┐
|
||||
HuggingFace ──download──► JSONL ──┤ For each task: │
|
||||
│ 1. git clone │
|
||||
│ 2. git checkout │
|
||||
│ 3. Agent.run() │
|
||||
│ 4. git diff │
|
||||
└────────┬─────────┘
|
||||
│
|
||||
predictions.jsonl (SWE-bench format)
|
||||
│
|
||||
┌───────────────┴───────────────┐
|
||||
│ swebench.harness (Docker) │
|
||||
│ Apply patch → run tests │
|
||||
│ → pass/fail verdict │
|
||||
└───────────────────────────────┘
|
||||
```
|
||||
|
||||
## Dataset Variants
|
||||
|
||||
| Variant | Size | HuggingFace ID | Recommended For |
|
||||
|---------|------|----------------|-----------------|
|
||||
| **Lite** | 300 tasks | `princeton-nlp/SWE-bench_Lite` | Quick iteration, development |
|
||||
| **Verified** | 500 tasks | `princeton-nlp/SWE-bench_Verified` | Official benchmarking, leaderboard |
|
||||
| **Full** | ~2294 tasks | `princeton-nlp/SWE-bench` | Comprehensive evaluation |
|
||||
|
||||
```bash
|
||||
# Download specific variant
|
||||
python scripts/swe-bench/download-dataset.py --dataset verified
|
||||
python scripts/swe-bench/download-dataset.py --dataset lite --limit 20
|
||||
```
|
||||
|
||||
## Runner Options
|
||||
|
||||
```bash
|
||||
npx tsx scripts/swe-bench/run.ts [options]
|
||||
|
||||
Options:
|
||||
--dataset PATH JSONL dataset path (default: scripts/swe-bench/lite.jsonl)
|
||||
--provider NAME LLM provider (default: kimi-coding)
|
||||
--model NAME Model override
|
||||
--limit N Max tasks to run (default: all)
|
||||
--offset N Skip first N tasks (default: 0)
|
||||
--output PATH Output predictions JSONL (default: scripts/swe-bench/predictions.jsonl)
|
||||
--workdir PATH Repo clone directory (default: /tmp/swe-bench)
|
||||
--timeout MS Per-task timeout (default: 300000 = 5min)
|
||||
--instance ID Run a single instance
|
||||
--debug Enable debug logging
|
||||
```
|
||||
|
||||
### Examples
|
||||
|
||||
```bash
|
||||
# Run 10 tasks with Anthropic Claude
|
||||
npx tsx scripts/swe-bench/run.ts --limit 10 --provider anthropic
|
||||
|
||||
# Run a specific instance
|
||||
npx tsx scripts/swe-bench/run.ts --instance "django__django-16379"
|
||||
|
||||
# Resume from task 50 with longer timeout
|
||||
npx tsx scripts/swe-bench/run.ts --offset 50 --limit 10 --timeout 600000
|
||||
|
||||
# Compare providers (run separately, different output files)
|
||||
npx tsx scripts/swe-bench/run.ts --provider kimi-coding --output scripts/swe-bench/pred-kimi.jsonl
|
||||
npx tsx scripts/swe-bench/run.ts --provider anthropic --output scripts/swe-bench/pred-claude.jsonl
|
||||
```
|
||||
|
||||
## How the Agent Solves Tasks
|
||||
|
||||
For each task, the runner:
|
||||
|
||||
1. **Clones the repository** to `/tmp/swe-bench/<instance_id>/` and checks out `base_commit`
|
||||
2. **Creates an Agent** with a focused system prompt and restricted tools (coding only — no web, no cron, no sessions)
|
||||
3. **Runs the agent** with the issue description as the prompt
|
||||
4. **Collects `git diff`** as the patch after the agent finishes
|
||||
5. **Appends** the prediction to `predictions.jsonl` in SWE-bench format
|
||||
|
||||
The agent has access to:
|
||||
- `read`, `write`, `edit` — file operations
|
||||
- `exec`, `process` — shell commands (for exploring code, running tests)
|
||||
- `glob` — file search
|
||||
|
||||
Tools explicitly denied: `web_fetch`, `web_search`, `cron`, `data`, `sessions_spawn`, `sessions_list`, `memory_search`, `send_file`.
|
||||
|
||||
## Output Files
|
||||
|
||||
After a run, two files are produced:
|
||||
|
||||
### `predictions.jsonl` — SWE-bench format
|
||||
|
||||
```json
|
||||
{"instance_id": "astropy__astropy-12907", "model_patch": "diff --git a/...", "model_name_or_path": "multica-kimi-coding"}
|
||||
```
|
||||
|
||||
This file is the input to the official evaluation harness.
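
Before handing the file to the harness, it can help to validate it. The sketch below assumes only the three fields shown in the example line; `parsePredictions` is a hypothetical helper, and keeping the last record per `instance_id` is a design choice for re-run tasks, not documented runner behavior.

```typescript
interface Prediction {
  instance_id: string;
  model_patch: string;
  model_name_or_path: string;
}

// Parses predictions JSONL, rejects malformed lines, and keeps the last
// record for each instance_id (so a re-run overrides an earlier attempt).
function parsePredictions(jsonl: string): Prediction[] {
  const byId = new Map<string, Prediction>();
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    const record = JSON.parse(trimmed) as Partial<Prediction>;
    if (
      typeof record.instance_id !== "string" ||
      typeof record.model_patch !== "string" ||
      typeof record.model_name_or_path !== "string"
    ) {
      throw new Error(`Malformed prediction line: ${trimmed.slice(0, 80)}`);
    }
    byId.set(record.instance_id, record as Prediction);
  }
  return Array.from(byId.values());
}
```

A check like this also catches lines from a `*.results.jsonl` file that were concatenated into the predictions by mistake, because those records have no `model_patch` field.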
|
||||
|
||||
### `predictions.results.jsonl` — detailed run metrics
|
||||
|
||||
```json
|
||||
{
|
||||
"instance_id": "astropy__astropy-12907",
|
||||
"success": true,
|
||||
"patch": "diff --git a/...",
|
||||
"error": null,
|
||||
"duration_ms": 141892,
|
||||
"session_id": "019c60c7-52ac-702a-9b9c-dc53c0daea6b"
|
||||
}
|
||||
```
|
||||
|
||||
## Analyzing Results
|
||||
|
||||
```bash
|
||||
# Summary report
|
||||
npx tsx scripts/swe-bench/analyze.ts
|
||||
|
||||
# Or specify a results file
|
||||
npx tsx scripts/swe-bench/analyze.ts scripts/swe-bench/pred-kimi.results.jsonl
|
||||
```
|
||||
|
||||
Output includes:
|
||||
- Patch rate (how many tasks produced a diff)
|
||||
- Duration statistics (avg/min/max)
|
||||
- Error breakdown
|
||||
- Per-repository stats
|
||||
- Slowest tasks
|
||||
|
||||
### Run-Log Analysis
|
||||
|
||||
Each agent session writes a structured `run-log.jsonl` to `~/.super-multica/sessions/<session-id>/`. This captures every LLM call, tool invocation, and timing:
|
||||
|
||||
```bash
|
||||
# Find a session's run log
|
||||
cat ~/.super-multica/sessions/<session-id>/run-log.jsonl | head -5
|
||||
|
||||
# Quick stats from a run log
|
||||
cat ~/.super-multica/sessions/<session-id>/run-log.jsonl | python3 -c "
|
||||
import json, sys
|
||||
events = [json.loads(l) for l in sys.stdin if l.strip()]
|
||||
tools = [e for e in events if e['event'] == 'tool_start']
|
||||
llm_ms = sum(e.get('duration_ms', 0) for e in events if e['event'] == 'llm_result')
|
||||
print(f'LLM time: {llm_ms/1000:.1f}s | Tool calls: {len(tools)}')
|
||||
"
|
||||
```
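
The same statistics can be computed in TypeScript, which may be convenient inside `analyze.ts`-style tooling. This is a sketch that assumes only the fields used by the Python snippet above (`event`, `duration_ms`, and the `tool_start` and `llm_result` event names); `summarizeRunLog` is a hypothetical helper.

```typescript
interface RunLogEvent {
  event: string;
  duration_ms?: number;
}

// Summarizes the text of a run-log.jsonl file: total LLM time and tool-call count.
function summarizeRunLog(jsonl: string): { llmSeconds: number; toolCalls: number } {
  const events = jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as RunLogEvent);

  const toolCalls = events.filter((e) => e.event === "tool_start").length;
  const llmMs = events
    .filter((e) => e.event === "llm_result")
    .reduce((sum, e) => sum + (e.duration_ms ?? 0), 0);

  return { llmSeconds: llmMs / 1000, toolCalls };
}
```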
|
||||
|
||||
## Official Evaluation (Docker)
|
||||
|
||||
The runner produces patches, but **only the official SWE-bench harness determines pass/fail** by applying the patch and running the project's test suite.
|
||||
|
||||
### Prerequisites
|
||||
|
||||
- Docker running (at least 120GB storage, 16GB RAM, 8 CPU cores)
|
||||
- `pip install swebench`
|
||||
|
||||
### Run Evaluation
|
||||
|
||||
```bash
|
||||
# Using the wrapper script
|
||||
bash scripts/swe-bench/evaluate.sh
|
||||
|
||||
# Or directly
|
||||
python -m swebench.harness.run_evaluation \
|
||||
--dataset_name princeton-nlp/SWE-bench_Lite \
|
||||
--predictions_path scripts/swe-bench/predictions.jsonl \
|
||||
--max_workers 4 \
|
||||
--run_id multica
|
||||
```
|
||||
|
||||
Results are written to `logs/` and `evaluation_results/`.
|
||||
|
||||
## Known Limitations and Improvements
|
||||
|
||||
### Current Limitations
|
||||
|
||||
1. **No Docker isolation for agent execution**: The agent runs on the host, so `pip install` and other commands affect the system Python. SWE-bench standard practice is to run each task in a Docker container.
|
||||
|
||||
2. **`SMC_DATA_DIR` timing**: Setting `SMC_DATA_DIR` at runtime doesn't affect `DATA_DIR` (resolved at module import time). Sessions currently write to `~/.super-multica/sessions/`. To isolate, set the env var before the process starts:
|
||||
```bash
|
||||
SMC_DATA_DIR=~/.swe-bench-eval npx tsx scripts/swe-bench/run.ts --limit 5
|
||||
```
|
||||
|
||||
3. **Sequential execution**: Tasks run one at a time. For large-scale runs, launch multiple processes with `--offset`/`--limit` to parallelize:
|
||||
```bash
|
||||
# Run 4 workers in parallel
|
||||
npx tsx scripts/swe-bench/run.ts --offset 0 --limit 75 --output pred-0.jsonl &
|
||||
npx tsx scripts/swe-bench/run.ts --offset 75 --limit 75 --output pred-1.jsonl &
|
||||
npx tsx scripts/swe-bench/run.ts --offset 150 --limit 75 --output pred-2.jsonl &
|
||||
npx tsx scripts/swe-bench/run.ts --offset 225 --limit 75 --output pred-3.jsonl &
|
||||
wait
|
||||
cat pred-[0-3].jsonl > predictions.jsonl  # pred-*.jsonl would also match pred-N.results.jsonl
|
||||
```
|
||||
|
||||
4. **Repo cloning per instance**: Each instance clones the full repo. For repos with many tasks (e.g., astropy, django), a shared clone with `git worktree` would be faster.
|
||||
|
||||
### Potential Improvements
|
||||
|
||||
- **Docker-per-task**: Run each agent in a Docker container matching the SWE-bench environment spec (correct Python version, pre-installed dependencies)
|
||||
- **Shared repo pool**: Clone each unique repo once, use `git worktree` for per-task isolation
|
||||
- **Cost tracking**: Parse run-log token counts for per-task and aggregate cost estimates
|
||||
- **Multi-turn retries**: If the agent produces no patch, retry with feedback
|
||||
- **System prompt tuning**: The current prompt is minimal; more detailed guidance (e.g., "search for related test files to understand expected behavior") could improve solve rate
|
||||
|
||||
## Related Benchmarks
|
||||
|
||||
| Benchmark | Focus | Notes |
|
||||
|-----------|-------|-------|
|
||||
| [SWE-bench Verified](https://openai.com/index/introducing-swe-bench-verified/) | Bug fixing (Python) | Gold standard, 500 human-verified tasks |
|
||||
| [SWE-bench Multilingual](https://github.com/SWE-bench/SWE-bench) | Bug fixing (7 languages) | Java, TS, JS, Go, Rust, C, C++ |
|
||||
| [Terminal-Bench](https://www.tbench.ai/) | CLI workflows | Multi-step sandboxed terminal tasks |
|
||||
| [Aider Polyglot](https://aider.chat/docs/leaderboards/) | Code editing | 225 Exercism exercises, 6 languages |
|
||||
| [DPAI Arena](https://www.jetbrains.com/) | Full dev workflow | JetBrains: patch, test, review, analysis |
|
||||
| [HumanEval](https://github.com/openai/human-eval) | Function generation | 164 Python function tasks, largely saturated |
|
||||
|
||||
## Initial Results (kimi-coding, 3 tasks)
|
||||
|
||||
First run on 3 SWE-bench Lite tasks (all astropy):
|
||||
|
||||
| Task | Status | Duration | LLM Time | Tools | Fix |
|
||||
|------|--------|----------|----------|-------|-----|
|
||||
| `astropy__astropy-12907` | PATCHED | 141.9s | 125.1s | 30 | `_cstack`: `= 1` → `= right` |
|
||||
| `astropy__astropy-14182` | PATCHED | 192.0s | 166.9s | 56 | Added `header_rows` param to RST writer |
|
||||
| `astropy__astropy-14365` | PATCHED | 65.7s | 49.6s | 23 | `re.compile()` + `re.IGNORECASE` |
|
||||
|
||||
3/3 tasks produced patches. Formal evaluation pending (requires Docker harness).
|
||||
|
|
@ -1,36 +0,0 @@
|
|||
# Time Injection Design
|
||||
|
||||
Super Multica uses **message-level timestamp injection** for time awareness.
|
||||
Instead of placing dynamic time text in the system prompt, user turns are stamped at runtime.
|
||||
|
||||
```mermaid
|
||||
flowchart TD
|
||||
A[Incoming turn] --> B{Entry point}
|
||||
B -->|Desktop/Gateway/Cron/Subagent| C[AsyncAgent.write]
|
||||
B -->|Heartbeat poll| D[AsyncAgent.write injectTimestamp=false]
|
||||
C --> E{Already stamped or has 'Current time:'?}
|
||||
E -->|Yes| F[Keep original message]
|
||||
E -->|No| G[Prefix: [DOW YYYY-MM-DD HH:mm TZ]]
|
||||
D --> H[Keep original heartbeat prompt]
|
||||
F --> I[Agent.run]
|
||||
G --> I
|
||||
H --> I
|
||||
I --> J[LLM receives final turn text]
|
||||
```
|
||||
|
||||
## Injection Matrix
|
||||
|
||||
| Path | Runtime call | Timestamp injected? | Notes |
|
||||
| --- | --- | --- | --- |
|
||||
| Desktop direct chat | `agent.write(content)` | Yes | Default behavior |
|
||||
| Gateway/remote chat | `agent.write(content)` | Yes | Same entry path as desktop |
|
||||
| `sessions_spawn` child task | `childAgent.write(task)` | Yes | Child turn gets current time context |
|
||||
| Cron `agent-turn` payload | `agent.write(cronMessage)` | Yes (guarded) | Skips if message already carries `Current time:` |
|
||||
| Heartbeat runner | `agent.write(prompt, { injectTimestamp: false })` | No | Prevents heartbeat prompt matching from breaking |
|
||||
| Internal orchestration | `writeInternal(...)` | No | Uses separate internal run path |
|
||||
|
||||
## Why This Design
|
||||
|
||||
- Keeps system prompt cache-stable (no per-turn date churn in system prompt text)
|
||||
- Gives the model an explicit "now" reference on each user turn
|
||||
- Uses guardrails to avoid double-stamping and heartbeat regressions
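
The guarded prefixing step in the flowchart can be sketched as below. This is an illustration under assumptions, not the actual `AsyncAgent.write` code: the function names, the stamp regex, and the use of an IANA time zone name for the `TZ` part are all hypothetical.

```typescript
// Matches a leading stamp such as "[Mon 2024-01-15 09:30 UTC]".
const STAMP_PATTERN = /^\[[A-Z][a-z]{2} \d{4}-\d{2}-\d{2} \d{2}:\d{2} [^\]]+\]/;

// Formats a date as [DOW YYYY-MM-DD HH:mm TZ] in the given IANA time zone.
function formatStamp(now: Date, timeZone: string): string {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    weekday: "short",
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
    hour: "2-digit",
    minute: "2-digit",
    hourCycle: "h23",
  }).formatToParts(now);
  const get = (type: string): string => parts.find((p) => p.type === type)?.value ?? "";
  const time = `${get("hour").padStart(2, "0")}:${get("minute").padStart(2, "0")}`;
  return `[${get("weekday")} ${get("year")}-${get("month")}-${get("day")} ${time} ${timeZone}]`;
}

// Prefixes a user turn with the current time unless injection is disabled
// (heartbeat path) or the message already carries time context.
function injectTimestamp(message: string, now: Date, timeZone: string, enabled = true): string {
  if (!enabled) return message;
  if (STAMP_PATTERN.test(message) || message.includes("Current time:")) return message;
  return `${formatStamp(now, timeZone)} ${message}`;
}
```

Because the guard inspects the message text itself, calling the function twice on the same turn leaves it unchanged, which is what prevents double-stamping.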
|
||||