diff --git a/README.md b/README.md index c6fd6d28..b72a0a72 100644 --- a/README.md +++ b/README.md @@ -35,22 +35,31 @@ skills/ # Bundled skills (commit, code-review) pnpm install ``` -### Credentials Configuration +### Development -The Agent reads credentials from JSON5 files (no `.env` required). +```bash +# Desktop app (recommended for local development) +pnpm dev -Create empty templates: +# Gateway + Web app (for remote/mobile clients) +pnpm dev:gateway # Start Gateway on :3000 +pnpm dev:web # Start Web app on :3001 +pnpm dev:all # Start both Gateway and Web app +``` + +The Desktop app runs a standalone Hub with embedded Agent Engine - no Gateway required for local use. + +### Credentials ```bash multica credentials init ``` -This creates: +Creates: +- `~/.super-multica/credentials.json5` — LLM providers + tools +- `~/.super-multica/skills.env.json5` — skill/plugin API keys -- `~/.super-multica/credentials.json5` — core config (LLM providers + built-in tools) -- `~/.super-multica/skills.env.json5` — dynamic keys (skills / plugins / integrations) - -Example `credentials.json5` (OpenAI): +Example `credentials.json5`: ```json5 { @@ -58,11 +67,7 @@ Example `credentials.json5` (OpenAI): llm: { provider: "openai", providers: { - openai: { - apiKey: "sk-xxx", - baseUrl: "https://api.openai.com/v1", - model: "gpt-4o" - } + openai: { apiKey: "sk-xxx", model: "gpt-4o" } } }, tools: { @@ -71,372 +76,92 @@ Example `credentials.json5` (OpenAI): } ``` -Example `skills.env.json5` (dynamic keys): - -```json5 -{ - env: { - LINEAR_API_KEY: "lin-...", - SLACK_BOT_TOKEN: "xoxb-..." - } -} -``` - -Start services directly (no `source .env`): - -```bash -multica dev # Start desktop app -multica run "hello" # Run CLI mode -``` - -Optional overrides: - -- `SMC_CREDENTIALS_PATH` — custom path for `credentials.json5` -- `SMC_SKILLS_ENV_PATH` — custom path for `skills.env.json5` - ### LLM Providers -Super Multica supports multiple LLM providers with two authentication methods: - -**OAuth Providers** (use external CLI login): -- `claude-code` — Claude Code OAuth (requires `claude login`) -- `openai-codex` — OpenAI Codex OAuth (requires `codex login`) +**OAuth Providers** (external CLI login): +- `claude-code` — requires `claude login` +- `openai-codex` — requires `codex login` **API Key Providers** (configure in `credentials.json5`): - `anthropic`, `openai`, `kimi-coding`, `google`, `groq`, `mistral`, `xai`, `openrouter` -#### Check Provider Status +Check status: `/provider` in interactive mode + +## CLI ```bash -# In interactive mode -/provider - -# Output shows all providers with status -🔌 Provider Status - -Current: kimi-coding - -Available Providers: - ID Name Auth Status - ────────────────────────────────────────────────────────────────────── - ✓ claude-code Claude Code (OAuth) OAuth ready - ✗ openai-codex Codex (OAuth) OAuth not logged in - ✓ kimi-coding Kimi Code API Key configured (current) - ... +multica # Interactive mode +multica run "prompt" # Single prompt +multica chat --profile my-agent # Use profile +multica --session abc123 # Continue session +multica session list # List sessions +multica profile list # List profiles +multica skills list # List skills +multica help # Show help ``` -#### Using OAuth Providers - -```bash -# 1. Install and login to Claude Code -npm install -g @anthropic-ai/claude-code -claude login - -# 2. Start multica with claude-code provider -multica --provider claude-code -``` - -#### Using API Key Providers - -Add your API key to `~/.super-multica/credentials.json5`: - -```json5 -{ - llm: { - provider: "openai", - providers: { - openai: { apiKey: "sk-xxx" } - } - } -} -``` - -### Configuration Priority - -Each setting is resolved in order (first match wins): - -1. **CLI argument** — `--provider`, `--model`, `--api-key`, `--base-url` -2. **Credentials file** — `credentials.json5` (`llm.provider` + `llm.providers[provider]`) -3. **Session metadata** — restored from previous session -4. **Default** — `kimi-coding` provider with `kimi-k2-thinking` model - -## Multica CLI - -The unified CLI provides access to all agent features through a single command. - -```bash -# Interactive mode (default) -multica -multica chat -multica chat --profile my-agent - -# Run a single prompt -multica run "hello" -multica run --session demo "remember my name is Alice" - -# Session management -multica session list -multica session show abc12345 -multica session delete abc12345 - -# Continue a session -multica --session abc12345 -multica run --session abc12345 "what did I say?" - -# Override provider/model -multica run --provider openai --model gpt-4o-mini "hi" - -# Use an agent profile -multica chat --profile my-agent - -# Set thinking level -multica run --thinking high "solve this complex problem" - -# Development servers -multica dev # Start desktop app (default) -multica dev gateway # Gateway only (:3000) - for remote clients -multica dev web # Web app only (:3001) -multica dev all # Start gateway + web - -# Help -multica help -multica run --help -multica session --help -``` - -Short alias: `mu` (same as `multica`) +Short alias: `mu` ## Sessions -Sessions persist conversation history to `~/.super-multica/sessions//`. Each session includes: +Sessions persist to `~/.super-multica/sessions//` with JSONL message history and JSON metadata. Context windows are automatically managed with token-aware compaction. -- `session.jsonl` - Message history in JSONL format -- `meta.json` - Session metadata (provider, model, thinking level) +## Profiles -Sessions use UUIDv7 for IDs by default, providing time-ordered unique identifiers. - -### Context Window Management - -The agent automatically manages context windows to prevent token overflow: - -- **Token-aware compaction** - Tracks token usage and compacts when approaching limits -- **Compaction modes**: `tokens` (default), `count` (legacy), `summary` (LLM-generated) -- **Configurable safety margins** - Ensures space for responses -- **Minimum message preservation** - Keeps recent context intact - -## Agent Profiles - -Agent profiles define identity, personality, tools, and memory for an agent. Profiles are stored as markdown files in `~/.super-multica/agent-profiles//`. - -### Profile CLI +Profiles define agent identity, personality, and memory in `~/.super-multica/agent-profiles//`. ```bash -# Create a new profile with default templates -multica profile new my-agent - -# List all profiles -multica profile list - -# Show profile contents -multica profile show my-agent - -# Open profile directory in file manager -multica profile edit my-agent - -# Delete a profile -multica profile delete my-agent +multica profile new my-agent # Create profile +multica profile list # List all +multica profile edit my-agent # Open in file manager ``` -### Profile Structure - -Each profile contains: - -- `identity.md` - Agent name and role -- `soul.md` - Personality and behavioral constraints -- `tools.md` - Tool usage instructions -- `memory.md` - Persistent knowledge -- `bootstrap.md` - Initial conversation context +Profile files: `soul.md`, `user.md`, `workspace.md`, `memory.md`, `memory/*.md` ## Skills -Skills are modular capabilities that extend agent functionality through `SKILL.md` definition files. For complete documentation, see [Skills System Documentation](./src/agent/skills/README.md). - -### Key Features - -- **Two-source loading** - Global skills (`~/.super-multica/skills/`) and profile-specific skills -- **GitHub installation** - `pnpm skills:cli add owner/repo` to install from GitHub -- **Slash command invocation** - `/skill-name args` in interactive mode -- **Eligibility filtering** - Auto-filter by platform, binaries, and environment -- **Hot reload** - File watcher for development - -### Quick Start +Skills extend agent functionality via `SKILL.md` files. See [Skills Documentation](./src/agent/skills/README.md). ```bash -# List all skills -multica skills list - -# Install skills from GitHub -multica skills add anthropics/skills - -# Check skill status with diagnostics -multica skills status -multica skills status pdf -v - -# Install skill dependencies -multica skills install nano-pdf - -# Remove installed skills -multica skills remove skills +multica skills list # List skills +multica skills add owner/repo # Install from GitHub +multica skills status # Check status ``` -### Built-in Skills +Built-in: `commit`, `code-review`, `skill-creator` -Located in `/skills/`: +## Tools -- **commit** - Git commit helper following conventional commits -- **code-review** - Code review assistance -- **skill-creator** - Create and manage custom skills (meta-skill for self-extension) +Available tools: `read`, `write`, `edit`, `glob`, `exec`, `process`, `web_fetch`, `web_search`, `memory_search`, `sessions_spawn` -### Creating Custom Skills - -The agent can create new skills to extend its own capabilities. Simply ask the agent to create a skill: - -``` -User: Create a skill that helps me format JSON -Agent: [Creates ~/.super-multica/skills/json-formatter/SKILL.md] -``` - -Skills are automatically loaded via hot-reload. See the [skill-creator SKILL.md](./skills/skill-creator/SKILL.md) for the complete guide. - -## Agent Tools - -### exec - -Execute short-lived shell commands and return output. Commands running longer than the timeout are automatically backgrounded. - -``` -exec({ command: "ls -la", cwd: "/path/to/dir", timeoutMs: 30000 }) -``` - -### process - -Manage long-running background processes (servers, watchers, daemons). Output is buffered (up to 64KB) and terminated processes are automatically cleaned up after 1 hour. - -``` -# Start a background process (returns immediately with process ID) -process({ action: "start", command: "npm run dev" }) - -# Check process status -process({ action: "status", id: "" }) - -# Read process output -process({ action: "output", id: "" }) - -# Stop a process -process({ action: "stop", id: "" }) - -# Clean up terminated processes -process({ action: "cleanup" }) -``` - -### glob - -Pattern-based file discovery using fast-glob. - -``` -glob({ pattern: "**/*.ts", cwd: "/path/to/dir" }) -``` - -### web_fetch - -Fetch and extract content from URLs with intelligent content extraction. - -``` -# Basic fetch (returns markdown) -web_fetch({ url: "https://example.com" }) - -# With options -web_fetch({ - url: "https://example.com", - outputFormat: "markdown", # or "text" - extractor: "readability" # or "turndown" for full page -}) -``` - -Features: SSRF protection, response caching, max 50KB output. - -### web_search - -Search the web using Brave or Perplexity AI. - -``` -# Basic search -web_search({ query: "typescript best practices" }) - -# With provider options -web_search({ - query: "latest AI news", - provider: "brave", # or "perplexity" - count: 5, - freshness: "pw" # past week (Brave: pd/pw/pm/py) -}) -``` +See [Tools Documentation](./src/agent/tools/README.md) for details. ## Architecture -### Desktop App (Recommended) +``` +Desktop App (standalone, recommended) + └─ Hub (embedded) + └─ Agent Engine -The Electron desktop app runs a standalone Hub with embedded Agent Engine: +Web/Mobile Clients + → Gateway (WebSocket, :3000) + → Hub + → Agent Engine +``` -- **No Gateway required** for local development -- Direct IPC communication for optimal performance -- QR code pairing for mobile remote access -- Optional Gateway connection for web/remote clients - -### Gateway - -The WebSocket gateway enables remote client access: - -- Real-time message routing between clients and Hub -- Streaming support for long-running operations -- RPC-style request/response patterns -- Device verification and authentication - -### Hub - -The Hub manages agents and communication: - -- Agent lifecycle management -- Multi-subscriber event distribution -- Device whitelist and token-based verification +- **Desktop App**: Electron app with embedded Hub, no Gateway needed +- **Gateway**: WebSocket server for remote clients +- **Hub**: Agent lifecycle and event distribution ## Scripts -### Multica CLI Commands +```bash +pnpm dev # Desktop app (recommended) +pnpm dev:gateway # Gateway only +pnpm dev:web # Web app only +pnpm dev:all # Gateway + Web -- `multica` / `mu` - Unified CLI entry point -- `multica run ` - Run a single prompt -- `multica chat` - Interactive REPL mode -- `multica session ` - Session management -- `multica profile ` - Profile management -- `multica skills ` - Skills management -- `multica tools ` - Tool policy inspection -- `multica credentials ` - Credentials management -- `multica dev [service]` - Development servers -- `multica help` - Show help - -### Development (shortcuts) - -- `pnpm dev` - Run desktop app (default, recommended) -- `pnpm dev:desktop` - Run desktop app -- `pnpm dev:gateway` - Run gateway only (for remote clients) -- `pnpm dev:web` - Run web app only -- `pnpm dev:all` - Run gateway + web - -### Build & Test - -- `pnpm build` - Build for production -- `pnpm build:sdk` - Build SDK package -- `pnpm build:cli` - Build CLI binary -- `pnpm start` - Run production build -- `pnpm typecheck` - Type check without emitting +pnpm build # Production build +pnpm typecheck # Type check +pnpm test # Run tests +``` diff --git a/apps/desktop/electron/electron-env.d.ts b/apps/desktop/electron/electron-env.d.ts index d111bb44..fb78d243 100644 --- a/apps/desktop/electron/electron-env.d.ts +++ b/apps/desktop/electron/electron-env.d.ts @@ -91,7 +91,7 @@ interface LocalChatEvent { type?: 'error' content?: string event?: { - type: 'message_start' | 'message_update' | 'message_end' | 'tool_execution_start' | 'tool_execution_end' + type: 'message_start' | 'message_update' | 'message_end' | 'tool_execution_start' | 'tool_execution_end' | 'compaction_start' | 'compaction_end' id?: string message?: { role: string diff --git a/apps/desktop/electron/ipc/agent.ts b/apps/desktop/electron/ipc/agent.ts index 61622286..385e9391 100644 --- a/apps/desktop/electron/ipc/agent.ts +++ b/apps/desktop/electron/ipc/agent.ts @@ -12,7 +12,7 @@ const TOOL_GROUPS: Record = { 'group:fs': ['read', 'write', 'edit', 'glob'], 'group:runtime': ['exec', 'process'], 'group:web': ['web_search', 'web_fetch'], - 'group:memory': ['memory_get', 'memory_set', 'memory_delete', 'memory_list'], + 'group:memory': ['memory_search'], 'group:subagent': ['sessions_spawn'], } diff --git a/apps/desktop/electron/ipc/hub.ts b/apps/desktop/electron/ipc/hub.ts index fe06cf69..09ee0319 100644 --- a/apps/desktop/electron/ipc/hub.ts +++ b/apps/desktop/electron/ipc/hub.ts @@ -268,6 +268,19 @@ export function registerHubIpcHandlers(): void { return } + // Compaction events: forward with no stream tracking + const isCompactionEvent = + event.type === 'compaction_start' || event.type === 'compaction_end' + if (isCompactionEvent) { + safeLog(`[IPC] Sending compaction event to renderer: ${event.type}`) + mainWindowRef.webContents.send('localChat:event', { + agentId, + streamId: null, + event, + }) + return + } + // Filter events same as Hub.consumeAgent() const maybeMessage = (event as { message?: { role?: string } }).message const isAssistantMessage = maybeMessage?.role === 'assistant' diff --git a/apps/desktop/electron/preload.ts b/apps/desktop/electron/preload.ts index dcd84e97..c6081c7c 100644 --- a/apps/desktop/electron/preload.ts +++ b/apps/desktop/electron/preload.ts @@ -72,7 +72,7 @@ export interface LocalChatEvent { type?: 'error' content?: string event?: { - type: 'message_start' | 'message_update' | 'message_end' | 'tool_execution_start' | 'tool_execution_end' + type: 'message_start' | 'message_update' | 'message_end' | 'tool_execution_start' | 'tool_execution_end' | 'compaction_start' | 'compaction_end' id?: string message?: { role: string diff --git a/apps/web/app/favicon.ico b/apps/web/app/favicon.ico new file mode 100644 index 00000000..47e9db6f Binary files /dev/null and b/apps/web/app/favicon.ico differ diff --git a/apps/web/app/icon.png b/apps/web/app/icon.png index ba5cedcb..e845388e 100644 Binary files a/apps/web/app/icon.png and b/apps/web/app/icon.png differ diff --git a/apps/web/app/layout.tsx b/apps/web/app/layout.tsx index 60c1655a..020bbdc6 100644 --- a/apps/web/app/layout.tsx +++ b/apps/web/app/layout.tsx @@ -32,7 +32,7 @@ export const metadata: Metadata = { title: "Multica", }, icons: { - apple: "/logo-192x192.png", + apple: "/apple-touch-icon.png", }, }; diff --git a/apps/web/app/manifest.ts b/apps/web/app/manifest.ts index 010f565b..94e93721 100644 --- a/apps/web/app/manifest.ts +++ b/apps/web/app/manifest.ts @@ -17,6 +17,7 @@ export default function manifest(): MetadataRoute.Manifest { src: "/logo-192x192.png", sizes: "192x192", type: "image/png", + purpose: "any", }, { src: "/logo-512x512.png", diff --git a/apps/web/public/apple-touch-icon.png b/apps/web/public/apple-touch-icon.png new file mode 100644 index 00000000..c01f6a61 Binary files /dev/null and b/apps/web/public/apple-touch-icon.png differ diff --git a/apps/web/public/logo-192x192.png b/apps/web/public/logo-192x192.png index ba718e6b..e845388e 100644 Binary files a/apps/web/public/logo-192x192.png and b/apps/web/public/logo-192x192.png differ diff --git a/apps/web/public/logo-512x512.png b/apps/web/public/logo-512x512.png index d21be718..2e8900e9 100644 Binary files a/apps/web/public/logo-512x512.png and b/apps/web/public/logo-512x512.png differ diff --git a/docs/client-streaming-protocol.md b/docs/client-streaming-protocol.md new file mode 100644 index 00000000..9fe1ce08 --- /dev/null +++ b/docs/client-streaming-protocol.md @@ -0,0 +1,338 @@ +# Client Streaming Protocol + +How clients receive real-time agent events via WebSocket (Gateway mode) or IPC (Desktop mode), and what data structures to use for rendering. + +## Transport Overview + +``` +Gateway mode (Web App): + Client ←──WebSocket──→ Gateway ←──→ Hub ←──→ Agent + +Desktop mode (Electron): + Renderer ←──IPC──→ Main Process (Hub + Agent) +``` + +Both transports deliver the same logical events. The client receives a `StreamPayload` envelope containing an event, and routes it to the store for rendering. + +## StreamPayload Envelope + +Every real-time event arrives wrapped in a `StreamPayload`: + +```ts +interface StreamPayload { + streamId: string; // groups events belonging to the same assistant turn + agentId: string; // which agent produced this event + event: AgentEvent | CompactionEvent; +} +``` + +In Gateway mode, these arrive as Socket.io messages with `action = "stream"`. In Desktop IPC mode, they arrive as `localChat:event` messages with the same structure. + +## Event Types + +### 1. Message Lifecycle Events (AgentEvent) + +These events represent an LLM response being generated in real time. + +#### `message_start` + +A new assistant message has begun streaming. + +```json +{ + "streamId": "019abc12-...", + "agentId": "019def34-...", + "event": { + "type": "message_start", + "message": { + "role": "assistant", + "content": [] + } + } +} +``` + +**Client action:** Create a new empty assistant message bubble. Use `streamId` as the message ID for subsequent updates. + +#### `message_update` + +Partial content has arrived for the current message. + +```json +{ + "streamId": "019abc12-...", + "agentId": "019def34-...", + "event": { + "type": "message_update", + "message": { + "role": "assistant", + "content": [ + { "type": "text", "text": "Here is the partial response so far..." }, + { "type": "thinking", "thinking": "Let me consider..." } + ] + } + } +} +``` + +**Client action:** Replace the message's `content` array with the new snapshot. Each update contains the full accumulated content, not a delta. + +#### `message_end` + +The assistant message is complete. + +```json +{ + "streamId": "019abc12-...", + "agentId": "019def34-...", + "event": { + "type": "message_end", + "message": { + "role": "assistant", + "content": [ + { "type": "text", "text": "Final complete response." } + ], + "stopReason": "end_turn" + } + } +} +``` + +**Client action:** Finalize the message. Mark streaming as complete. Extract `stopReason` if needed. + +### 2. Tool Execution Events (AgentEvent) + +These events track tool calls made by the assistant during a turn. + +#### `tool_execution_start` + +The agent has begun executing a tool. + +```json +{ + "streamId": "019abc12-...", + "agentId": "019def34-...", + "event": { + "type": "tool_execution_start", + "toolCallId": "toolu_01ABC...", + "toolName": "Bash", + "args": { "command": "ls -la" } + } +} +``` + +**Client action:** Create a tool result message with `toolStatus: "running"`. Display a spinner or loading indicator. + +#### `tool_execution_end` + +The tool has finished executing. + +```json +{ + "streamId": "019abc12-...", + "agentId": "019def34-...", + "event": { + "type": "tool_execution_end", + "toolCallId": "toolu_01ABC...", + "result": "file1.txt\nfile2.txt\n", + "isError": false + } +} +``` + +**Client action:** Update the matching tool result message. Set `toolStatus` to `"success"` or `"error"` based on `isError`. Render `result` as the tool output. + +### 3. Compaction Events (CompactionEvent) + +These events notify the client when context window compaction occurs. They use a synthetic `streamId` of `compaction:{agentId}` and do not belong to any message stream. + +#### `compaction_start` + +Context compaction has begun. The agent is removing old messages to free up context window space. + +```json +{ + "streamId": "compaction:019def34-...", + "agentId": "019def34-...", + "event": { + "type": "compaction_start" + } +} +``` + +**Client action:** Show a compaction indicator (e.g., "Compacting context..."). + +#### `compaction_end` + +Compaction is complete. Includes statistics about what was removed. + +```json +{ + "streamId": "compaction:019def34-...", + "agentId": "019def34-...", + "event": { + "type": "compaction_end", + "removed": 24, + "kept": 8, + "tokensRemoved": 45000, + "tokensKept": 12000, + "reason": "tokens" + } +} +``` + +| Field | Type | Description | +|-------|------|-------------| +| `removed` | `number` | Number of messages removed | +| `kept` | `number` | Number of messages retained | +| `tokensRemoved` | `number?` | Estimated tokens freed (absent in count mode) | +| `tokensKept` | `number?` | Estimated tokens remaining (absent in count mode) | +| `reason` | `string` | What triggered compaction: `"tokens"`, `"count"`, or `"summary"` | + +**Client action:** Hide the compaction indicator. Optionally display a toast or inline notice with the stats. + +## Content Block Types + +Message content is an array of `ContentBlock`, which is a union of: + +```ts +// Plain text +interface TextContent { + type: "text"; + text: string; +} + +// LLM reasoning (extended thinking) +interface ThinkingContent { + type: "thinking"; + thinking: string; +} + +// Tool invocation (appears in assistant messages) +interface ToolCall { + type: "toolCall"; + id: string; + name: string; + arguments: Record; +} + +// Image content (appears in user messages) +interface ImageContent { + type: "image"; + source: { type: "base64"; media_type: string; data: string }; +} +``` + +## Client-Side Store Structure + +The recommended Zustand store shape for rendering: + +```ts +interface Message { + id: string; + role: "user" | "assistant" | "toolResult"; + content: ContentBlock[]; + agentId: string; + stopReason?: string; + // Tool result fields (role === "toolResult" only) + toolCallId?: string; + toolName?: string; + toolArgs?: Record; + toolStatus?: "running" | "success" | "error" | "interrupted"; + isError?: boolean; +} + +interface CompactionStats { + removed: number; + kept: number; + tokensRemoved?: number; + tokensKept?: number; + reason: string; +} + +interface MessagesState { + messages: Message[]; + streamingIds: Set; // IDs of messages currently streaming + compacting: boolean; // true while compaction is in progress + lastCompaction: CompactionStats | null; // stats from most recent compaction +} +``` + +## Event Routing Pseudocode + +```ts +function handleStreamEvent(payload: StreamPayload) { + const { streamId, agentId, event } = payload; + + switch (event.type) { + case "message_start": + store.startStream(streamId, agentId); + break; + case "message_update": + store.appendStream(streamId, event.message.content); + break; + case "message_end": + store.endStream(streamId, event.message.content, event.message.stopReason); + break; + case "tool_execution_start": + store.startToolExecution(agentId, event.toolCallId, event.toolName, event.args); + break; + case "tool_execution_end": + store.endToolExecution(event.toolCallId, event.result, event.isError); + break; + case "compaction_start": + store.startCompaction(); + break; + case "compaction_end": + store.endCompaction({ + removed: event.removed, + kept: event.kept, + tokensRemoved: event.tokensRemoved, + tokensKept: event.tokensKept, + reason: event.reason, + }); + break; + } +} +``` + +## Message History via RPC + +Clients can also fetch historical messages using the `getAgentMessages` RPC method. See [rpc.md](./rpc.md) for details. + +The response returns `AgentMessage[]` which must be normalized into the `Message` format above. Key differences from streaming: + +- Historical messages don't have `toolStatus` — infer it from `isError` (`"error"` or `"success"`). +- Historical messages may have `content` as a plain `string` instead of `ContentBlock[]` — normalize by wrapping in `[{ type: "text", text: content }]`. +- Tool arguments are not stored on `toolResult` messages — build a lookup map from assistant `ToolCall` blocks by `toolCallId` to reconstruct `toolArgs`. + +## SDK Imports + +All types are available from `@multica/sdk`: + +```ts +import { + StreamAction, + type StreamPayload, + type AgentEvent, + type CompactionEvent, + type CompactionStartEvent, + type CompactionEndEvent, + type ContentBlock, + type TextContent, + type ThinkingContent, + type ToolCall, + type ImageContent, +} from "@multica/sdk"; +``` + +Store types are available from `@multica/store`: + +```ts +import { + useMessagesStore, + type Message, + type CompactionStats, + type ToolStatus, +} from "@multica/store"; +``` diff --git a/packages/sdk/src/actions/index.ts b/packages/sdk/src/actions/index.ts index 1ceffb73..a1fb3e9f 100644 --- a/packages/sdk/src/actions/index.ts +++ b/packages/sdk/src/actions/index.ts @@ -34,6 +34,9 @@ export { StreamAction, type StreamPayload, type AgentEvent, + type CompactionEvent, + type CompactionStartEvent, + type CompactionEndEvent, type ContentBlock, type TextContent, type ThinkingContent, diff --git a/packages/sdk/src/actions/stream.ts b/packages/sdk/src/actions/stream.ts index 810f7355..c0249353 100644 --- a/packages/sdk/src/actions/stream.ts +++ b/packages/sdk/src/actions/stream.ts @@ -25,16 +25,36 @@ export type { AgentEvent }; */ export type ContentBlock = TextContent | ThinkingContent | ToolCall | ImageContent; +// --- Compaction event types (Multica-specific, not from pi-agent-core) --- + +/** Emitted when context compaction begins */ +export type CompactionStartEvent = { + type: "compaction_start"; +}; + +/** Emitted when context compaction completes */ +export type CompactionEndEvent = { + type: "compaction_end"; + removed: number; + kept: number; + tokensRemoved?: number; + tokensKept?: number; + reason: string; +}; + +/** Union of all compaction events */ +export type CompactionEvent = CompactionStartEvent | CompactionEndEvent; + // --- Stream event types --- /** - * Hub forwards AgentEvent from pi-agent-core as-is. - * StreamPayload wraps it with routing metadata. + * Hub forwards AgentEvent from pi-agent-core and CompactionEvent as-is. + * StreamPayload wraps them with routing metadata. */ export interface StreamPayload { streamId: string; agentId: string; - event: AgentEvent; + event: AgentEvent | CompactionEvent; } /** Extract thinking/reasoning content from an AgentEvent that carries a message */ diff --git a/packages/store/src/connection-store.ts b/packages/store/src/connection-store.ts index 17b41a51..bd6893f1 100644 --- a/packages/store/src/connection-store.ts +++ b/packages/store/src/connection-store.ts @@ -22,6 +22,7 @@ import { type ConnectionState, type StreamPayload, type AgentEvent, + type CompactionEndEvent, type GetAgentMessagesResult, type ContentBlock, } from "@multica/sdk" @@ -143,6 +144,21 @@ function createClient( case "tool_execution_update": // Partial results — not rendered yet, ignored for now break + case "compaction_start": { + store.startCompaction() + break + } + case "compaction_end": { + const evt = event as CompactionEndEvent + store.endCompaction({ + removed: evt.removed, + kept: evt.kept, + tokensRemoved: evt.tokensRemoved, + tokensKept: evt.tokensKept, + reason: evt.reason, + }) + break + } } return } diff --git a/packages/store/src/index.ts b/packages/store/src/index.ts index f2ee2670..73c86225 100644 --- a/packages/store/src/index.ts +++ b/packages/store/src/index.ts @@ -2,6 +2,6 @@ export { useConnectionStore } from "./connection-store" export type { ConnectionStore } from "./connection-store" export { useAutoConnect } from "./use-auto-connect" export { useMessagesStore } from "./messages" -export type { Message, MessagesStore, SendContext, ToolStatus } from "./messages" +export type { Message, MessagesStore, SendContext, ToolStatus, CompactionStats } from "./messages" export { parseConnectionCode, saveConnection, loadConnection, clearConnection } from "./connection" export type { ConnectionInfo } from "./connection" diff --git a/packages/store/src/messages.ts b/packages/store/src/messages.ts index 45555dd6..c887e45f 100644 --- a/packages/store/src/messages.ts +++ b/packages/store/src/messages.ts @@ -15,6 +15,14 @@ import type { ContentBlock } from "@multica/sdk" export type ToolStatus = "running" | "success" | "error" | "interrupted" +export interface CompactionStats { + removed: number + kept: number + tokensRemoved?: number + tokensKept?: number + reason: string +} + export interface Message { id: string role: "user" | "assistant" | "toolResult" @@ -40,6 +48,8 @@ export interface SendContext { interface MessagesState { messages: Message[] streamingIds: Set + compacting: boolean + lastCompaction: CompactionStats | null } interface MessagesActions { @@ -56,6 +66,9 @@ interface MessagesActions { // Tool execution lifecycle startToolExecution: (agentId: string, toolCallId: string, toolName: string, args?: unknown) => void endToolExecution: (toolCallId: string, result?: unknown, isError?: boolean) => void + // Compaction lifecycle + startCompaction: () => void + endCompaction: (stats: CompactionStats) => void } export type MessagesStore = MessagesState & MessagesActions @@ -63,6 +76,8 @@ export type MessagesStore = MessagesState & MessagesActions export const useMessagesStore = create()((set, get) => ({ messages: [], streamingIds: new Set(), + compacting: false, + lastCompaction: null, sendMessage: (text, ctx) => { get().addUserMessage(text, ctx.agentId) @@ -102,7 +117,7 @@ export const useMessagesStore = create()((set, get) => ({ }, clearMessages: () => { - set({ messages: [], streamingIds: new Set() }) + set({ messages: [], streamingIds: new Set(), compacting: false, lastCompaction: null }) }, // --- Streaming: build assistant message incrementally --- @@ -180,4 +195,14 @@ export const useMessagesStore = create()((set, get) => ({ ), })) }, + + // --- Compaction lifecycle --- + + startCompaction: () => { + set({ compacting: true }) + }, + + endCompaction: (stats) => { + set({ compacting: false, lastCompaction: stats }) + }, })) diff --git a/src/agent/async-agent.ts b/src/agent/async-agent.ts index c0195083..db22b843 100644 --- a/src/agent/async-agent.ts +++ b/src/agent/async-agent.ts @@ -3,11 +3,12 @@ import type { AgentEvent, AgentMessage } from "@mariozechner/pi-agent-core"; import { Agent } from "./runner.js"; import { Channel } from "./channel.js"; import type { AgentOptions, Message } from "./types.js"; +import type { MulticaEvent } from "./events.js"; const devNull = { write: () => true } as unknown as NodeJS.WritableStream; -/** Discriminated union of legacy Message (error fallback) and raw AgentEvent */ -export type ChannelItem = Message | AgentEvent; +/** Discriminated union of legacy Message, raw AgentEvent, and MulticaEvent */ +export type ChannelItem = Message | AgentEvent | MulticaEvent; export class AsyncAgent { private readonly agent: Agent; @@ -24,8 +25,8 @@ export class AsyncAgent { }); this.sessionId = this.agent.sessionId; - // Forward raw AgentEvent into the channel - this.agent.subscribe((event: AgentEvent) => { + // Forward raw AgentEvent and MulticaEvent into the channel + this.agent.subscribeAll((event: AgentEvent | MulticaEvent) => { this.channel.send(event); }); } @@ -42,6 +43,9 @@ export class AsyncAgent { .then(async () => { if (this._closed) return; const result = await this.agent.run(content); + // Flush pending session writes so waitForIdle() callers + // can safely read session data from disk. + await this.agent.flushSession(); // Normal text is delivered via message_end event; only handle errors here if (result.error) { this.channel.send({ id: uuidv7(), content: `[error] ${result.error}` }); @@ -61,10 +65,11 @@ export class AsyncAgent { /** * Subscribe to agent events directly (supports multiple subscribers). * Unlike read(), this allows multiple consumers to receive the same events. + * Receives both pi-agent-core AgentEvent and MulticaEvent (e.g. compaction). */ - subscribe(callback: (event: AgentEvent) => void): () => void { + subscribe(callback: (event: AgentEvent | MulticaEvent) => void): () => void { console.log(`[AsyncAgent] Adding subscriber for agent: ${this.sessionId}`); - const unsubscribe = this.agent.subscribe((event) => { + const unsubscribe = this.agent.subscribeAll((event) => { console.log(`[AsyncAgent] Event received: ${event.type}`); callback(event); }); diff --git a/src/agent/context-window/index.ts b/src/agent/context-window/index.ts index ec57f696..440d6b28 100644 --- a/src/agent/context-window/index.ts +++ b/src/agent/context-window/index.ts @@ -44,3 +44,13 @@ export { compactMessagesWithSummary, compactMessagesWithChunkedSummary, } from "./summarization.js"; + +// Tool result pruning +export type { + ToolResultPruningSettings, + ToolResultPruningResult, +} from "./tool-result-pruning.js"; +export { + DEFAULT_TOOL_RESULT_PRUNING_SETTINGS, + pruneToolResults, +} from "./tool-result-pruning.js"; diff --git a/src/agent/context-window/tool-result-pruning.test.ts b/src/agent/context-window/tool-result-pruning.test.ts new file mode 100644 index 00000000..d2677c2a --- /dev/null +++ b/src/agent/context-window/tool-result-pruning.test.ts @@ -0,0 +1,285 @@ +import { describe, it, expect } from "vitest"; +import type { AgentMessage } from "@mariozechner/pi-agent-core"; +import { pruneToolResults, DEFAULT_TOOL_RESULT_PRUNING_SETTINGS } from "./tool-result-pruning.js"; + +// Helper to create a user message with tool result +function createToolResultMessage( + toolName: string, + content: string, + toolUseId: string = "tool-123", +): AgentMessage { + return { + role: "user", + content: [ + { + type: "tool_result", + tool_use_id: toolUseId, + name: toolName, + content: [{ type: "text", text: content }], + }, + ], + } as unknown as AgentMessage; +} + +// Helper to create an assistant message +function createAssistantMessage(text: string): AgentMessage { + return { + role: "assistant", + content: [{ type: "text", text }], + } as unknown as AgentMessage; +} + +// Helper to create a user message +function createUserMessage(text: string): AgentMessage { + return { + role: "user", + content: text, + } as unknown as AgentMessage; +} + +describe("pruneToolResults", () => { + it("returns unchanged if utilization is below softTrimRatio", () => { + const messages = [ + createUserMessage("Hello"), + createAssistantMessage("Hi there!"), + createToolResultMessage("read", "Short content"), + ]; + + const result = pruneToolResults({ + messages, + contextWindowTokens: 200_000, // Very large window + }); + + expect(result.changed).toBe(false); + expect(result.messages).toBe(messages); + expect(result.softTrimmed).toBe(0); + expect(result.hardCleared).toBe(0); + }); + + it("soft trims large tool results", () => { + // Create a message with a large tool result (5000 chars) + const largeContent = "A".repeat(5000); + const messages = [ + createUserMessage("Hello"), + createAssistantMessage("Processing..."), + createToolResultMessage("read", largeContent), + createAssistantMessage("Done!"), + createAssistantMessage("Follow up"), + createAssistantMessage("Another one"), + createAssistantMessage("Protected message"), // This is protected (keepLastAssistants=3) + ]; + + const result = pruneToolResults({ + messages, + contextWindowTokens: 10_000, // Small window to trigger pruning + settings: { + softTrimRatio: 0.1, // Low threshold to ensure pruning + }, + }); + + expect(result.changed).toBe(true); + expect(result.softTrimmed).toBe(1); + + // Check that the trimmed message contains head + tail + const trimmedMsg = result.messages[2] as any; + const trimmedText = trimmedMsg.content[0].content[0].text; + expect(trimmedText).toContain("A".repeat(100)); // Should have some head content + expect(trimmedText).toContain("..."); // Truncation marker + expect(trimmedText).toContain("[Tool result trimmed:"); + }); + + it("hard clears when utilization exceeds hardClearRatio", () => { + // Create multiple messages with large tool results + const largeContent = "X".repeat(10000); + const messages = [ + createUserMessage("Start"), + createAssistantMessage("Processing 1"), + createToolResultMessage("read", largeContent, "tool-1"), + createAssistantMessage("Processing 2"), + createToolResultMessage("exec", largeContent, "tool-2"), + createAssistantMessage("Processing 3"), + createToolResultMessage("glob", largeContent, "tool-3"), + createAssistantMessage("Done 1"), // Protected + createAssistantMessage("Done 2"), // Protected + createAssistantMessage("Done 3"), // Protected + ]; + + const result = pruneToolResults({ + messages, + contextWindowTokens: 5_000, // Very small window + settings: { + softTrimRatio: 0.1, + hardClearRatio: 0.2, + minPrunableToolChars: 1000, // Lower threshold for test + hardClear: { + enabled: true, + placeholder: "[Cleared]", + }, + }, + }); + + expect(result.changed).toBe(true); + // Should have cleared at least some tool results + expect(result.hardCleared).toBeGreaterThan(0); + expect(result.charsSaved).toBeGreaterThan(0); + }); + + it("protects last N assistant messages", () => { + const messages = [ + createUserMessage("Hello"), + createAssistantMessage("First"), + createToolResultMessage("read", "A".repeat(5000), "tool-1"), // Should be prunable + createAssistantMessage("Second"), // Protected (keepLastAssistants=3) + createToolResultMessage("read", "B".repeat(5000), "tool-2"), // In protected zone, should NOT be pruned + createAssistantMessage("Third"), // Protected + createAssistantMessage("Fourth"), // Protected + ]; + + const result = pruneToolResults({ + messages, + contextWindowTokens: 5_000, + settings: { + softTrimRatio: 0.1, + keepLastAssistants: 3, + }, + }); + + // The first tool result (before protected zone) may be pruned + // But the second one (after "Second" assistant which is in protected zone) should not be + if (result.changed) { + // Check that tool-2 result is NOT modified (it's in protected zone) + const tool2Msg = result.messages[4] as any; + const tool2Content = tool2Msg.content[0].content[0].text; + expect(tool2Content).toBe("B".repeat(5000)); // Unchanged + } + }); + + it("never prunes before first user message", () => { + const messages = [ + createAssistantMessage("Bootstrap read"), // Before first user message + createToolResultMessage("read", "A".repeat(5000), "tool-1"), // Should NOT be pruned + createUserMessage("Hello"), // First user message + createAssistantMessage("Response"), + createToolResultMessage("read", "B".repeat(5000), "tool-2"), // Can be pruned + createAssistantMessage("Done 1"), + createAssistantMessage("Done 2"), + createAssistantMessage("Done 3"), + ]; + + const result = pruneToolResults({ + messages, + contextWindowTokens: 5_000, + settings: { + softTrimRatio: 0.1, + }, + }); + + // The first tool result (before first user message) should NOT be modified + const tool1Msg = result.messages[1] as any; + const tool1Content = tool1Msg.content[0].content[0].text; + expect(tool1Content).toBe("A".repeat(5000)); // Unchanged - bootstrap protection + }); + + it("respects tool deny list", () => { + const messages = [ + createUserMessage("Hello"), + createAssistantMessage("Processing"), + createToolResultMessage("read", "A".repeat(5000), "tool-1"), + createAssistantMessage("Done 1"), + createAssistantMessage("Done 2"), + createAssistantMessage("Done 3"), + ]; + + const result = pruneToolResults({ + messages, + contextWindowTokens: 5_000, + settings: { + softTrimRatio: 0.1, + tools: { + deny: ["read"], // Don't prune read tool results + }, + }, + }); + + // read tool should not be pruned + expect(result.changed).toBe(false); + }); + + it("respects tool allow list", () => { + const messages = [ + createUserMessage("Hello"), + createAssistantMessage("Processing"), + createToolResultMessage("read", "A".repeat(5000), "tool-1"), + createToolResultMessage("exec", "B".repeat(5000), "tool-2"), + createAssistantMessage("Done 1"), + createAssistantMessage("Done 2"), + createAssistantMessage("Done 3"), + ]; + + const result = pruneToolResults({ + messages, + contextWindowTokens: 5_000, + settings: { + softTrimRatio: 0.1, + tools: { + allow: ["exec"], // Only prune exec tool results + }, + }, + }); + + if (result.changed) { + // read tool should not be pruned + const tool1Msg = result.messages[2] as any; + const tool1Content = tool1Msg.content[0].content[0].text; + expect(tool1Content).toBe("A".repeat(5000)); // Unchanged + } + }); + + it("skips tool results with images", () => { + const messages = [ + createUserMessage("Hello"), + createAssistantMessage("Processing"), + { + role: "user", + content: [ + { + type: "tool_result", + tool_use_id: "tool-1", + name: "screenshot", + content: [ + { type: "image", source: { type: "base64", data: "abc123" } }, + { type: "text", text: "A".repeat(5000) }, + ], + }, + ], + } as unknown as AgentMessage, + createAssistantMessage("Done 1"), + createAssistantMessage("Done 2"), + createAssistantMessage("Done 3"), + ]; + + const result = pruneToolResults({ + messages, + contextWindowTokens: 5_000, + settings: { + softTrimRatio: 0.1, + }, + }); + + // Image-containing tool result should not be pruned + expect(result.softTrimmed).toBe(0); + expect(result.hardCleared).toBe(0); + }); +}); + +describe("DEFAULT_TOOL_RESULT_PRUNING_SETTINGS", () => { + it("has expected default values", () => { + expect(DEFAULT_TOOL_RESULT_PRUNING_SETTINGS.softTrimRatio).toBe(0.3); + expect(DEFAULT_TOOL_RESULT_PRUNING_SETTINGS.hardClearRatio).toBe(0.5); + expect(DEFAULT_TOOL_RESULT_PRUNING_SETTINGS.keepLastAssistants).toBe(3); + expect(DEFAULT_TOOL_RESULT_PRUNING_SETTINGS.softTrim.maxChars).toBe(4000); + expect(DEFAULT_TOOL_RESULT_PRUNING_SETTINGS.softTrim.headChars).toBe(1500); + expect(DEFAULT_TOOL_RESULT_PRUNING_SETTINGS.softTrim.tailChars).toBe(1500); + expect(DEFAULT_TOOL_RESULT_PRUNING_SETTINGS.hardClear.enabled).toBe(true); + }); +}); diff --git a/src/agent/context-window/tool-result-pruning.ts b/src/agent/context-window/tool-result-pruning.ts new file mode 100644 index 00000000..ef7ac5f1 --- /dev/null +++ b/src/agent/context-window/tool-result-pruning.ts @@ -0,0 +1,510 @@ +/** + * Tool Result Pruning + * + * Smart pruning of tool results to reduce context window usage while preserving + * useful information. Implements two-phase pruning: + * + * 1. Soft Trim: Keep head + tail of large tool results + * 2. Hard Clear: Replace old tool results with placeholder + * + * Based on OpenClaw's microcompact-style context pruning. + */ + +import type { AgentMessage } from "@mariozechner/pi-agent-core"; + +// ─── Types ─────────────────────────────────────────────────────────────────── + +export type ToolResultPruningSettings = { + /** Utilization ratio to start soft trimming (default: 0.3) */ + softTrimRatio: number; + /** Utilization ratio to start hard clearing (default: 0.5) */ + hardClearRatio: number; + /** Minimum prunable tool result chars to consider hard clear (default: 50000) */ + minPrunableToolChars: number; + /** Number of recent assistant messages to protect from pruning (default: 3) */ + keepLastAssistants: number; + /** Soft trim settings */ + softTrim: { + /** Max chars before triggering soft trim (default: 4000) */ + maxChars: number; + /** Chars to keep from start (default: 1500) */ + headChars: number; + /** Chars to keep from end (default: 1500) */ + tailChars: number; + }; + /** Hard clear settings */ + hardClear: { + /** Whether hard clear is enabled (default: true) */ + enabled: boolean; + /** Placeholder text for cleared results */ + placeholder: string; + }; + /** Tool names to allow/deny pruning */ + tools?: { + allow?: string[]; + deny?: string[]; + }; +}; + +export const DEFAULT_TOOL_RESULT_PRUNING_SETTINGS: ToolResultPruningSettings = { + softTrimRatio: 0.3, + hardClearRatio: 0.5, + minPrunableToolChars: 50_000, + keepLastAssistants: 3, + softTrim: { + maxChars: 4_000, + headChars: 1_500, + tailChars: 1_500, + }, + hardClear: { + enabled: true, + placeholder: "[Tool result cleared to save context space]", + }, +}; + +export type ToolResultPruningResult = { + /** Pruned messages */ + messages: AgentMessage[]; + /** Whether any changes were made */ + changed: boolean; + /** Number of soft-trimmed results */ + softTrimmed: number; + /** Number of hard-cleared results */ + hardCleared: number; + /** Estimated chars saved */ + charsSaved: number; +}; + +// ─── Constants ─────────────────────────────────────────────────────────────── + +const CHARS_PER_TOKEN_ESTIMATE = 4; +const IMAGE_CHAR_ESTIMATE = 8_000; + +// ─── Helper Functions ──────────────────────────────────────────────────────── + +/** + * Extract text content from a tool result content block. + */ +function extractToolResultText(content: unknown): string { + if (typeof content === "string") return content; + if (Array.isArray(content)) { + const parts: string[] = []; + for (const block of content) { + if (typeof block === "string") { + parts.push(block); + } else if (block && typeof block === "object") { + if ("text" in block && typeof block.text === "string") { + parts.push(block.text); + } + } + } + return parts.join("\n"); + } + return ""; +} + +/** + * Check if content contains images. + */ +function hasImageContent(content: unknown): boolean { + if (!Array.isArray(content)) return false; + for (const block of content) { + if (block && typeof block === "object" && "type" in block) { + if (block.type === "image") return true; + } + } + return false; +} + +/** + * Estimate character count for a message. + */ +function estimateMessageChars(message: AgentMessage): number { + const msgAny = message as any; + + if (message.role === "user") { + const content = msgAny.content; + if (typeof content === "string") return content.length; + if (!Array.isArray(content)) return 0; + + let chars = 0; + for (const block of content) { + if (typeof block === "string") { + chars += block.length; + } else if (block && typeof block === "object") { + if (block.type === "text" && typeof block.text === "string") { + chars += block.text.length; + } else if (block.type === "tool_result") { + chars += extractToolResultText(block.content).length; + } else if (block.type === "image") { + chars += IMAGE_CHAR_ESTIMATE; + } + } + } + return chars; + } + + if (message.role === "assistant") { + const content = msgAny.content; + if (typeof content === "string") return content.length; + if (!Array.isArray(content)) return 0; + + let chars = 0; + for (const block of content) { + if (typeof block === "string") { + chars += block.length; + } else if (block && typeof block === "object") { + if (block.type === "text" && typeof block.text === "string") { + chars += block.text.length; + } else if (block.type === "thinking" && typeof block.thinking === "string") { + chars += block.thinking.length; + } else if (block.type === "toolCall" || block.type === "tool_use") { + try { + chars += JSON.stringify(block.arguments ?? block.input ?? {}).length; + } catch { + chars += 128; + } + } + } + } + return chars; + } + + return 256; +} + +/** + * Estimate total character count for messages. + */ +function estimateContextChars(messages: AgentMessage[]): number { + return messages.reduce((sum, m) => sum + estimateMessageChars(m), 0); +} + +/** + * Find the index where we should stop protecting assistant messages. + * Returns null if not enough assistant messages exist. + */ +function findAssistantCutoffIndex( + messages: AgentMessage[], + keepLastAssistants: number, +): number | null { + if (keepLastAssistants <= 0) return messages.length; + + let remaining = keepLastAssistants; + for (let i = messages.length - 1; i >= 0; i--) { + if (messages[i]?.role !== "assistant") continue; + remaining--; + if (remaining === 0) return i; + } + + return null; +} + +/** + * Check if a user message is a "real" user message (not just tool results). + * Tool results are sent as user messages but they're not real user input. + */ +function isRealUserMessage(message: AgentMessage): boolean { + if (message.role !== "user") return false; + + const msgAny = message as any; + const content = msgAny.content; + + // String content is a real user message + if (typeof content === "string") return true; + + // Array content - check if it has any non-tool-result blocks + if (Array.isArray(content)) { + for (const block of content) { + if (typeof block === "string") return true; + if (block && typeof block === "object") { + // Any type other than tool_result is real user content + if (block.type !== "tool_result") return true; + } + } + // Only tool_result blocks - not a real user message + return false; + } + + return true; +} + +/** + * Find the index of the first real user message (not tool results). + * This is used for bootstrap protection - we never prune before the first real user input. + */ +function findFirstUserIndex(messages: AgentMessage[]): number | null { + for (let i = 0; i < messages.length; i++) { + const msg = messages[i]; + if (msg && isRealUserMessage(msg)) return i; + } + return null; +} + +/** + * Check if a tool should be pruned based on settings. + */ +function isToolPrunable(toolName: string, settings: ToolResultPruningSettings): boolean { + const { tools } = settings; + if (!tools) return true; + + // If deny list exists and tool is in it, don't prune + if (tools.deny?.includes(toolName)) return false; + + // If allow list exists, only prune if tool is in it + if (tools.allow && tools.allow.length > 0) { + return tools.allow.includes(toolName); + } + + return true; +} + +/** + * Take first N characters from text. + */ +function takeHead(text: string, maxChars: number): string { + if (maxChars <= 0) return ""; + if (text.length <= maxChars) return text; + return text.slice(0, maxChars); +} + +/** + * Take last N characters from text. + */ +function takeTail(text: string, maxChars: number): string { + if (maxChars <= 0) return ""; + if (text.length <= maxChars) return text; + return text.slice(text.length - maxChars); +} + +/** + * Soft trim a tool result text. + */ +function softTrimText( + text: string, + settings: ToolResultPruningSettings, +): { trimmed: string; saved: number } | null { + const { maxChars, headChars, tailChars } = settings.softTrim; + + if (text.length <= maxChars) return null; + if (headChars + tailChars >= text.length) return null; + + const head = takeHead(text, headChars); + const tail = takeTail(text, tailChars); + const note = `\n\n[Tool result trimmed: kept first ${headChars} chars and last ${tailChars} chars of ${text.length} chars.]`; + const trimmed = `${head}\n...\n${tail}${note}`; + + return { + trimmed, + saved: text.length - trimmed.length, + }; +} + +/** + * Process a user message containing tool results. + * Returns modified message if any tool results were trimmed/cleared. + */ +function processUserMessageToolResults( + message: AgentMessage, + settings: ToolResultPruningSettings, + mode: "soft" | "hard", +): { message: AgentMessage; changed: boolean; charsSaved: number } { + const msgAny = message as any; + const content = msgAny.content; + + if (!Array.isArray(content)) { + return { message, changed: false, charsSaved: 0 }; + } + + let changed = false; + let charsSaved = 0; + const newContent: any[] = []; + + for (const block of content) { + if (!block || typeof block !== "object" || block.type !== "tool_result") { + newContent.push(block); + continue; + } + + const toolName = block.name ?? "unknown"; + + // Skip non-prunable tools + if (!isToolPrunable(toolName, settings)) { + newContent.push(block); + continue; + } + + // Skip image-containing tool results + if (hasImageContent(block.content)) { + newContent.push(block); + continue; + } + + const originalText = extractToolResultText(block.content); + + if (mode === "soft") { + const result = softTrimText(originalText, settings); + if (result) { + newContent.push({ + ...block, + content: [{ type: "text", text: result.trimmed }], + }); + changed = true; + charsSaved += result.saved; + } else { + newContent.push(block); + } + } else { + // Hard clear + newContent.push({ + ...block, + content: [{ type: "text", text: settings.hardClear.placeholder }], + }); + changed = true; + charsSaved += originalText.length - settings.hardClear.placeholder.length; + } + } + + if (!changed) { + return { message, changed: false, charsSaved: 0 }; + } + + return { + message: { ...message, content: newContent } as AgentMessage, + changed: true, + charsSaved, + }; +} + +// ─── Main Functions ────────────────────────────────────────────────────────── + +/** + * Prune tool results in messages to reduce context window usage. + * + * Two-phase approach: + * 1. Soft Trim (at softTrimRatio): Keep head + tail of large tool results + * 2. Hard Clear (at hardClearRatio): Replace old tool results with placeholder + * + * Protections: + * - Never prunes before first user message (protects bootstrap/identity reads) + * - Protects last N assistant messages and their corresponding tool results + * - Skips image-containing tool results + * - Respects tool allow/deny lists + */ +export function pruneToolResults(params: { + messages: AgentMessage[]; + contextWindowTokens: number; + settings?: Partial; +}): ToolResultPruningResult { + const { messages, contextWindowTokens } = params; + const settings: ToolResultPruningSettings = { + ...DEFAULT_TOOL_RESULT_PRUNING_SETTINGS, + ...params.settings, + softTrim: { + ...DEFAULT_TOOL_RESULT_PRUNING_SETTINGS.softTrim, + ...params.settings?.softTrim, + }, + hardClear: { + ...DEFAULT_TOOL_RESULT_PRUNING_SETTINGS.hardClear, + ...params.settings?.hardClear, + }, + }; + + const charWindow = contextWindowTokens * CHARS_PER_TOKEN_ESTIMATE; + if (charWindow <= 0) { + return { messages, changed: false, softTrimmed: 0, hardCleared: 0, charsSaved: 0 }; + } + + // Find cutoff index for protected assistant messages + const cutoffIndex = findAssistantCutoffIndex(messages, settings.keepLastAssistants); + if (cutoffIndex === null) { + return { messages, changed: false, softTrimmed: 0, hardCleared: 0, charsSaved: 0 }; + } + + // Never prune before first user message (bootstrap protection) + const firstUserIndex = findFirstUserIndex(messages); + const pruneStartIndex = firstUserIndex === null ? messages.length : firstUserIndex; + + // Calculate current utilization + let totalChars = estimateContextChars(messages); + let ratio = totalChars / charWindow; + + // No pruning needed + if (ratio < settings.softTrimRatio) { + return { messages, changed: false, softTrimmed: 0, hardCleared: 0, charsSaved: 0 }; + } + + let result = messages.slice(); + let changed = false; + let softTrimmed = 0; + let hardCleared = 0; + let charsSaved = 0; + + // Track which messages have prunable tool results + const prunableIndexes: number[] = []; + + // Phase 1: Soft Trim + for (let i = pruneStartIndex; i < cutoffIndex; i++) { + const msg = result[i]; + if (!msg || msg.role !== "user") continue; + + const msgAny = msg as any; + if (!Array.isArray(msgAny.content)) continue; + + // Check if this message has tool results + const hasToolResult = msgAny.content.some( + (b: any) => b && typeof b === "object" && b.type === "tool_result", + ); + if (!hasToolResult) continue; + + prunableIndexes.push(i); + + const processed = processUserMessageToolResults(msg, settings, "soft"); + if (processed.changed) { + result[i] = processed.message; + changed = true; + softTrimmed++; + charsSaved += processed.charsSaved; + totalChars -= processed.charsSaved; + } + } + + // Recalculate ratio after soft trim + ratio = totalChars / charWindow; + + // Phase 2: Hard Clear (if needed) + if (ratio >= settings.hardClearRatio && settings.hardClear.enabled) { + // Check if we have enough prunable content to make hard clear worthwhile + let prunableChars = 0; + for (const i of prunableIndexes) { + prunableChars += estimateMessageChars(result[i]!); + } + + if (prunableChars >= settings.minPrunableToolChars) { + for (const i of prunableIndexes) { + if (ratio < settings.hardClearRatio) break; + + const msg = result[i]!; + const beforeChars = estimateMessageChars(msg); + + const processed = processUserMessageToolResults(msg, settings, "hard"); + if (processed.changed) { + result[i] = processed.message; + changed = true; + hardCleared++; + charsSaved += processed.charsSaved; + totalChars -= processed.charsSaved; + ratio = totalChars / charWindow; + } + } + } + } + + return { + messages: result, + changed, + softTrimmed, + hardCleared, + charsSaved, + }; +} diff --git a/src/agent/events.ts b/src/agent/events.ts new file mode 100644 index 00000000..8eb8b422 --- /dev/null +++ b/src/agent/events.ts @@ -0,0 +1,30 @@ +/** + * Super Multica custom events (parallel to pi-agent-core's AgentEvent) + * + * These events extend the agent's event system with Multica-specific + * lifecycle events that pi-agent-core does not provide. + */ + +/** Emitted when context compaction begins */ +export type CompactionStartEvent = { + type: "compaction_start"; +}; + +/** + * Emitted when context compaction completes. + * + * Note: `reason` uses a narrow union here for type safety within the agent. + * The SDK's `CompactionEndEvent` uses `string` to allow future extensions + * without requiring SDK version bumps. + */ +export type CompactionEndEvent = { + type: "compaction_end"; + removed: number; + kept: number; + tokensRemoved?: number; + tokensKept?: number; + reason: "count" | "tokens" | "summary" | "pruning"; +}; + +/** Union of all Multica-specific events */ +export type MulticaEvent = CompactionStartEvent | CompactionEndEvent; diff --git a/src/agent/index.ts b/src/agent/index.ts index 16dac2f9..720eb5b6 100644 --- a/src/agent/index.ts +++ b/src/agent/index.ts @@ -1,5 +1,6 @@ export * from "./runner.js"; export * from "./types.js"; +export * from "./events.js"; export * from "./profile/index.js"; export * from "./context-window/index.js"; export * from "./skills/index.js"; diff --git a/src/agent/profile/index.ts b/src/agent/profile/index.ts index 54e0b0f7..cf7423e0 100644 --- a/src/agent/profile/index.ts +++ b/src/agent/profile/index.ts @@ -292,10 +292,10 @@ export class ProfileManager { updateStyle(style: string): void { const profile = this.getOrCreateProfile(false); const currentConfig = profile.config ?? {}; - const newConfig: ProfileConfig = { - ...currentConfig, + // Use Object.assign to avoid exactOptionalPropertyTypes issues with spread + const newConfig: ProfileConfig = Object.assign({}, currentConfig, { style: style as ProfileConfig["style"], - }; + }); profile.config = newConfig; this.profile = profile; writeProfileConfig(this.profileId, newConfig, { baseDir: this.baseDir }); diff --git a/src/agent/runner.ts b/src/agent/runner.ts index 310dabfb..65fd528b 100644 --- a/src/agent/runner.ts +++ b/src/agent/runner.ts @@ -1,8 +1,9 @@ import { Agent as PiAgentCore, type AgentEvent, type AgentMessage } from "@mariozechner/pi-agent-core"; import { v7 as uuidv7 } from "uuid"; import type { AgentOptions, AgentRunResult, ReasoningMode } from "./types.js"; +import type { MulticaEvent } from "./events.js"; import { createAgentOutput } from "./cli/output.js"; -import { resolveModel, resolveTools } from "./tools.js"; +import { resolveModel, resolveTools, type ResolveToolsOptions } from "./tools.js"; import { resolveApiKey, resolveApiKeyForProfile, @@ -78,11 +79,15 @@ export class Agent { private readonly contextWindowGuard: ContextWindowGuardResult; private readonly debug: boolean; private reasoningMode: ReasoningMode; - private toolsOptions: AgentOptions; + private toolsOptions: ResolveToolsOptions; private readonly originalToolsConfig?: ToolsConfig; private readonly stderr: NodeJS.WritableStream; private initialized = false; + // MulticaEvent subscribers (parallel to PiAgentCore's subscriber list) + // Typed as AgentEvent | MulticaEvent to match subscribeAll() callback signature + private multicaListeners: Array<(event: AgentEvent | MulticaEvent) => void> = []; + // Auth profile rotation state private resolvedProvider: string; private currentApiKey: string | undefined; @@ -280,7 +285,10 @@ export class Agent { // Merge Profile tools config with options.tools (options takes precedence) const profileToolsConfig = this.profile?.getToolsConfig(); const mergedToolsConfig = mergeToolsConfig(profileToolsConfig, options.tools); - this.toolsOptions = mergedToolsConfig ? { ...options, tools: mergedToolsConfig } : options; + const profileDir = this.profile?.getProfileDir(); + this.toolsOptions = mergedToolsConfig + ? { ...options, tools: mergedToolsConfig, profileDir } + : { ...options, profileDir }; const tools = resolveTools(this.toolsOptions); if (this.debug) { @@ -323,6 +331,27 @@ export class Agent { return this.agent.subscribe(fn); } + /** Subscribe to both AgentEvent and MulticaEvent streams */ + subscribeAll(fn: (event: AgentEvent | MulticaEvent) => void): () => void { + const unsubCore = this.agent.subscribe(fn); + this.multicaListeners.push(fn); + return () => { + unsubCore(); + const idx = this.multicaListeners.indexOf(fn); + if (idx >= 0) this.multicaListeners.splice(idx, 1); + }; + } + + private emitMulticaEvent(event: MulticaEvent): void { + for (const fn of this.multicaListeners) { + try { + fn(event); + } catch { + // Don't let listener errors break the agent loop + } + } + } + async run(prompt: string): Promise { await this.ensureInitialized(); this.output.state.lastAssistantText = ""; @@ -422,12 +451,36 @@ export class Agent { private async maybeCompact() { const messages = this.agent.state.messages.slice(); - const result = await this.session.maybeCompact(messages); - if (result?.kept) { - this.agent.replaceMessages(result.kept); + if (!this.session.needsCompaction(messages)) return; + + try { + const result = await this.session.maybeCompact(messages); + if (!result) return; + + this.emitMulticaEvent({ type: "compaction_start" }); + if (result?.kept) { + this.agent.replaceMessages(result.kept); + } + this.emitMulticaEvent({ + type: "compaction_end", + removed: result?.removedCount ?? 0, + kept: result?.kept.length ?? messages.length, + tokensRemoved: result?.tokensRemoved, + tokensKept: result?.tokensKept, + reason: result?.reason ?? "tokens", + }); + } catch (err) { + throw err; } } + /** + * Wait for all pending session storage writes to complete. + */ + async flushSession(): Promise { + await this.session.flush(); + } + /** * Reload tools from profile config. * Call this after updating tool status to apply changes diff --git a/src/agent/session/compaction.ts b/src/agent/session/compaction.ts index 1ce60ae1..18aadb3a 100644 --- a/src/agent/session/compaction.ts +++ b/src/agent/session/compaction.ts @@ -19,7 +19,8 @@ export type CompactionResult = { tokensKept?: number | undefined; /** Summary generated in summary mode */ summary?: string | undefined; - reason: "count" | "tokens" | "summary"; + /** Reason for compaction: count, tokens, summary, or pruning (tool result trimming only) */ + reason: "count" | "tokens" | "summary" | "pruning"; }; /** diff --git a/src/agent/session/session-manager.ts b/src/agent/session/session-manager.ts index a820072a..eece48b1 100644 --- a/src/agent/session/session-manager.ts +++ b/src/agent/session/session-manager.ts @@ -2,10 +2,15 @@ import type { AgentMessage } from "@mariozechner/pi-agent-core"; import { getModel, type Model } from "@mariozechner/pi-ai"; import type { SessionEntry, SessionMeta } from "./types.js"; import { appendEntry, readEntries, resolveSessionPath, writeEntries } from "./storage.js"; -import { compactMessages, compactMessagesAsync } from "./compaction.js"; +import { compactMessages, compactMessagesAsync, type CompactionResult } from "./compaction.js"; +import { estimateTokenUsage, shouldCompact as shouldCompactTokens } from "../context-window/index.js"; import { credentialManager } from "../credentials.js"; import { repairSessionFileIfNeeded, type RepairReport } from "./session-file-repair.js"; import { sanitizeToolCallInputs, sanitizeToolUseResultPairing } from "./session-transcript-repair.js"; +import { + pruneToolResults, + type ToolResultPruningSettings, +} from "../context-window/tool-result-pruning.js"; /** Get Kimi model for summarization (use a cheaper model than k2-thinking) */ function getSummaryModel(): Model { @@ -53,6 +58,12 @@ export type SessionManagerOptions = { apiKey?: string | undefined; /** Custom summary instructions */ customInstructions?: string | undefined; + + // Tool result pruning + /** Whether to enable tool result pruning before compaction (default: true in tokens/summary mode) */ + enableToolResultPruning?: boolean | undefined; + /** Tool result pruning settings */ + toolResultPruning?: Partial | undefined; }; export class SessionManager { @@ -73,6 +84,9 @@ export class SessionManager { private apiKey: string | undefined; private readonly customInstructions: string | undefined; private previousSummary: string | undefined; + // Tool result pruning + private readonly enableToolResultPruning: boolean; + private readonly toolResultPruning: Partial | undefined; private queue: Promise = Promise.resolve(); private meta: SessionMeta | undefined; @@ -100,6 +114,12 @@ export class SessionManager { this.apiKey = options.apiKey; this.customInstructions = options.customInstructions; + // Tool result pruning (enabled by default in tokens/summary mode) + this.enableToolResultPruning = + options.enableToolResultPruning ?? + (this.compactionMode === "tokens" || this.compactionMode === "summary"); + this.toolResultPruning = options.toolResultPruning; + this.meta = this.loadMeta(); } @@ -193,7 +213,48 @@ export class SessionManager { ); } - async maybeCompact(messages: AgentMessage[]) { + /** Check whether compaction would trigger for the given messages (without executing it) */ + needsCompaction(messages: AgentMessage[]): boolean { + if (this.compactionMode === "count") { + return messages.length > this.maxMessages; + } + // Token and summary modes use the same token-based threshold + const estimation = estimateTokenUsage({ + messages, + systemPrompt: this.systemPrompt, + contextWindowTokens: this.contextWindowTokens, + reserveTokens: this.reserveTokens, + }); + return shouldCompactTokens(estimation); + } + + async maybeCompact(messages: AgentMessage[]): Promise { + let workingMessages = messages; + let toolResultPruningApplied = false; + + // Phase 1: Tool result pruning (soft trim / hard clear) + // This reduces token usage without removing messages + if (this.enableToolResultPruning) { + const pruneResult = pruneToolResults({ + messages: workingMessages, + contextWindowTokens: this.contextWindowTokens, + settings: this.toolResultPruning, + }); + + if (pruneResult.changed) { + workingMessages = pruneResult.messages; + toolResultPruningApplied = true; + // Log pruning stats + if (pruneResult.softTrimmed > 0 || pruneResult.hardCleared > 0) { + console.error( + `[SessionManager] Tool result pruning: ${pruneResult.softTrimmed} soft-trimmed, ` + + `${pruneResult.hardCleared} hard-cleared, ~${Math.round(pruneResult.charsSaved / 1000)}k chars saved`, + ); + } + } + } + + // Phase 2: Message compaction (remove old messages if still needed) let result; if (this.compactionMode === "summary") { @@ -203,7 +264,7 @@ export class SessionManager { if (!apiKey) { // No API key available, downgrade to tokens mode - result = compactMessages(messages, { + result = compactMessages(workingMessages, { mode: "tokens", contextWindowTokens: this.contextWindowTokens, systemPrompt: this.systemPrompt, @@ -212,7 +273,7 @@ export class SessionManager { minKeepMessages: this.minKeepMessages, }); } else { - result = await compactMessagesAsync(messages, { + result = await compactMessagesAsync(workingMessages, { mode: "summary", model, apiKey, @@ -231,7 +292,7 @@ export class SessionManager { } } } else { - result = compactMessages(messages, { + result = compactMessages(workingMessages, { mode: this.compactionMode, // Count mode parameters maxMessages: this.maxMessages, @@ -245,7 +306,14 @@ export class SessionManager { }); } - if (!result) return null; + // If no message compaction needed but tool result pruning was applied, + // still return the pruned messages + if (!result) { + if (toolResultPruningApplied) { + return { kept: workingMessages, removedCount: 0, reason: "pruning" as const }; + } + return null; + } const entries: SessionEntry[] = []; if (this.meta) { @@ -272,8 +340,19 @@ export class SessionManager { return result; } + /** + * Wait for all pending storage writes to complete. + */ + async flush(): Promise { + await this.queue; + } + private enqueue(task: () => Promise) { - this.queue = this.queue.then(task, task); + this.queue = this.queue.then(task, task).catch((err) => { + // Log for debuggability, but preserve failure for awaiters. + console.error("[SessionManager] storage write failed:", err); + throw err; + }); return this.queue; } } diff --git a/src/agent/session/session-write-lock.ts b/src/agent/session/session-write-lock.ts index c6364e32..307b99cd 100644 --- a/src/agent/session/session-write-lock.ts +++ b/src/agent/session/session-write-lock.ts @@ -130,7 +130,19 @@ export async function acquireSessionWriteLock(params: { const staleMs = params.staleMs ?? 30 * 60 * 1000; const sessionFile = path.resolve(params.sessionFile); const sessionDir = path.dirname(sessionFile); - await fs.mkdir(sessionDir, { recursive: true }); + // Retry mkdir to handle transient ENOENT on macOS APFS race conditions + for (let attempt = 0; ; attempt++) { + try { + await fs.mkdir(sessionDir, { recursive: true }); + break; + } catch (err) { + if ((err as NodeJS.ErrnoException).code === "ENOENT" && attempt < 2) { + await new Promise((r) => setTimeout(r, 50)); + continue; + } + throw err; + } + } let normalizedDir = sessionDir; try { normalizedDir = await fs.realpath(sessionDir); diff --git a/src/agent/session/storage.ts b/src/agent/session/storage.ts index 63915783..9b104a24 100644 --- a/src/agent/session/storage.ts +++ b/src/agent/session/storage.ts @@ -23,8 +23,17 @@ export function resolveSessionPath(sessionId: string, options?: SessionStorageOp export function ensureSessionDir(sessionId: string, options?: SessionStorageOptions) { const dir = resolveSessionDir(sessionId, options); - if (!existsSync(dir)) { + // mkdirSync with recursive is idempotent (no-op if dir exists), + // so skip the existsSync check to avoid a TOCTOU race. + try { mkdirSync(dir, { recursive: true }); + } catch (err) { + // Retry once on transient ENOENT (macOS APFS race condition) + if ((err as NodeJS.ErrnoException).code === "ENOENT") { + mkdirSync(dir, { recursive: true }); + } else { + throw err; + } } } diff --git a/src/agent/session/types.ts b/src/agent/session/types.ts index 72c810ab..2d311902 100644 --- a/src/agent/session/types.ts +++ b/src/agent/session/types.ts @@ -23,5 +23,5 @@ export type SessionEntry = tokensKept?: number | undefined; /** 摘要模式生成的摘要 */ summary?: string | undefined; - reason?: "count" | "tokens" | "summary" | undefined; + reason?: "count" | "tokens" | "summary" | "pruning" | undefined; }; diff --git a/src/agent/system-prompt/builder.test.ts b/src/agent/system-prompt/builder.test.ts index 09a29467..426e857a 100644 --- a/src/agent/system-prompt/builder.test.ts +++ b/src/agent/system-prompt/builder.test.ts @@ -10,7 +10,7 @@ const PROFILE = { config: { name: "TestAgent" }, }; -const TOOLS = ["read", "write", "edit", "glob", "exec", "memory_get", "memory_set", "sessions_spawn", "web_search"]; +const TOOLS = ["read", "write", "edit", "glob", "exec", "sessions_spawn", "web_search"]; describe("buildSystemPrompt", () => { // ── Full mode ───────────────────────────────────────────────────────── @@ -42,12 +42,6 @@ describe("buildSystemPrompt", () => { expect(result).toContain("## Tool Call Style"); }); - it("full mode includes memory section when memory tools present", () => { - const result = buildSystemPrompt({ mode: "full", tools: ["memory_get", "memory_set"] }); - expect(result).toContain("## Memory"); - expect(result).toContain("search memory first"); - }); - it("full mode includes sub-agents section when sessions_spawn present", () => { const result = buildSystemPrompt({ mode: "full", tools: ["sessions_spawn"] }); expect(result).toContain("## Sub-Agents"); diff --git a/src/agent/system-prompt/sections.test.ts b/src/agent/system-prompt/sections.test.ts index cfd25f6f..7d09b0ce 100644 --- a/src/agent/system-prompt/sections.test.ts +++ b/src/agent/system-prompt/sections.test.ts @@ -163,11 +163,6 @@ describe("buildToolCallStyleSection", () => { }); describe("buildConditionalToolSections", () => { - it("includes memory section when memory tools present", () => { - const result = buildConditionalToolSections(["memory_get", "read"], "full"); - expect(result.join("\n")).toContain("## Memory"); - }); - it("includes sub-agents section when sessions_spawn present in full mode", () => { const result = buildConditionalToolSections(["sessions_spawn"], "full"); expect(result.join("\n")).toContain("## Sub-Agents"); @@ -189,7 +184,7 @@ describe("buildConditionalToolSections", () => { }); it("returns empty in none mode", () => { - expect(buildConditionalToolSections(["memory_get"], "none")).toEqual([]); + expect(buildConditionalToolSections(["read"], "none")).toEqual([]); }); }); diff --git a/src/agent/system-prompt/sections.ts b/src/agent/system-prompt/sections.ts index a0126d56..6d4033dd 100644 --- a/src/agent/system-prompt/sections.ts +++ b/src/agent/system-prompt/sections.ts @@ -6,7 +6,12 @@ import { SAFETY_CONSTITUTION } from "./constitution.js"; import { formatRuntimeLine } from "./runtime-info.js"; -import type { ProfileContent, RuntimeInfo, SubagentContext, SystemPromptMode } from "./types.js"; +import type { + ProfileContent, + RuntimeInfo, + SubagentContext, + SystemPromptMode, +} from "./types.js"; // ─── Core tool summaries ──────────────────────────────────────────────────── @@ -20,10 +25,7 @@ const CORE_TOOL_SUMMARIES: Record = { process: "Manage background exec sessions", web_search: "Search the web", web_fetch: "Fetch and extract readable content from a URL", - memory_get: "Read from agent memory", - memory_set: "Write to agent memory", - memory_list: "List memory entries", - memory_delete: "Delete memory entries", + memory_search: "Search memory files by keyword", sessions_spawn: "Spawn a sub-agent session", }; @@ -37,10 +39,7 @@ const TOOL_ORDER = [ "process", "web_search", "web_fetch", - "memory_get", - "memory_set", - "memory_list", - "memory_delete", + "memory_search", "sessions_spawn", ]; @@ -98,6 +97,7 @@ export function buildWorkspaceSection( "## Profile", "", `Your profile directory: \`${profileDir}\``, + "Use this as the base path for profile files (soul.md, user.md, memory.md, memory/*.md).", "", "Profile files:", "- `soul.md` — Your identity and values", @@ -170,7 +170,9 @@ export function buildToolingSummary( seen.add(tool); const displayName = resolveToolName(tool); const summary = CORE_TOOL_SUMMARIES[tool]; - toolLines.push(summary ? `- ${displayName}: ${summary}` : `- ${displayName}`); + toolLines.push( + summary ? `- ${displayName}: ${summary}` : `- ${displayName}`, + ); } } @@ -217,16 +219,16 @@ export function buildConditionalToolSections( const lines: string[] = []; // Memory tools - const hasMemory = - toolSet.has("memory_get") || - toolSet.has("memory_set") || - toolSet.has("memory_list") || - toolSet.has("memory_delete"); - if (hasMemory) { + if (toolSet.has("memory_search")) { lines.push( - "## Memory", - "Before answering anything about prior work, decisions, dates, people, preferences, or todos: search memory first, then pull only the needed entries.", - "Update memory when the user shares important information, decisions, or preferences.", + "## Memory Recall", + "Before answering anything about prior work, decisions, dates, people, preferences, or todos:", + "1. Run `memory_search` to find relevant entries in memory files", + "2. Use `read` to pull needed context", + "", + "To update memory, use `edit` on the appropriate file:", + "- `memory.md` — Long-term knowledge (decisions, preferences, important context)", + "- `memory/YYYY-MM-DD.md` — Daily logs and session notes", "", ); } @@ -345,6 +347,7 @@ export function buildExtraPromptSection( const trimmed = extraSystemPrompt?.trim(); if (!trimmed) return []; - const header = mode === "minimal" ? "## Subagent Context" : "## Additional Context"; + const header = + mode === "minimal" ? "## Subagent Context" : "## Additional Context"; return [header, trimmed]; } diff --git a/src/agent/tools.ts b/src/agent/tools.ts index 4019a34d..0a33e039 100644 --- a/src/agent/tools.ts +++ b/src/agent/tools.ts @@ -6,8 +6,8 @@ import { createExecTool } from "./tools/exec.js"; import { createProcessTool } from "./tools/process.js"; import { createGlobTool } from "./tools/glob.js"; import { createWebFetchTool, createWebSearchTool } from "./tools/web/index.js"; -import { createMemoryTools } from "./tools/memory/index.js"; import { createSessionsSpawnTool } from "./tools/sessions-spawn.js"; +import { createMemorySearchTool } from "./tools/memory-search.js"; import { filterTools } from "./tools/policy.js"; import { isMulticaError, isRetryableError } from "../shared/errors.js"; import type { ExecApprovalCallback } from "./tools/exec-approval-types.js"; @@ -18,10 +18,8 @@ export { resolveModel } from "./providers/index.js"; /** Options for creating tools */ export interface CreateToolsOptions { cwd: string; - /** Profile ID for memory tools (optional) */ - profileId?: string | undefined; - /** Base directory for profiles (optional) */ - profileBaseDir?: string | undefined; + /** Profile directory for memory_search tool (optional) */ + profileDir?: string | undefined; /** Whether this agent is a subagent (passed to sessions_spawn tool) */ isSubagent?: boolean | undefined; /** Session ID of the agent (passed to sessions_spawn tool) */ @@ -97,7 +95,7 @@ function wrapTool( export function createAllTools(options: CreateToolsOptions | string): AgentTool[] { // Support legacy string argument for backwards compatibility const opts: CreateToolsOptions = typeof options === "string" ? { cwd: options } : options; - const { cwd, profileId, profileBaseDir, isSubagent, sessionId } = opts; + const { cwd, profileDir, isSubagent, sessionId } = opts; const baseTools = createCodingTools(cwd).filter( (tool) => tool.name !== "bash", @@ -118,13 +116,10 @@ export function createAllTools(options: CreateToolsOptions | string): AgentTool< webSearchTool as AgentTool, ]; - // Add memory tools if profileId is provided - if (profileId) { - const memoryTools = createMemoryTools({ - profileId, - baseDir: profileBaseDir, - }); - tools.push(...memoryTools); + // Add memory_search tool if profileDir is provided + if (profileDir) { + const memorySearchTool = createMemorySearchTool(profileDir); + tools.push(memorySearchTool as AgentTool); } // Add sessions_spawn tool (will be filtered by policy for subagents) @@ -137,6 +132,12 @@ export function createAllTools(options: CreateToolsOptions | string): AgentTool< return tools; } +/** Extended options for resolveTools that includes profileDir */ +export interface ResolveToolsOptions extends AgentOptions { + /** Profile directory for memory_search tool (computed from profileId if not provided) */ + profileDir?: string | undefined; +} + /** * Resolve tools for an agent with policy filtering. * @@ -146,14 +147,13 @@ export function createAllTools(options: CreateToolsOptions | string): AgentTool< * 3. Provider-specific rules * 4. Subagent restrictions */ -export function resolveTools(options: AgentOptions): AgentTool[] { +export function resolveTools(options: ResolveToolsOptions): AgentTool[] { const cwd = options.cwd ?? process.cwd(); - // Create all tools (including memory tools if profileId is provided) + // Create all tools const allTools = createAllTools({ cwd, - profileId: options.profileId, - profileBaseDir: options.profileBaseDir, + profileDir: options.profileDir, isSubagent: options.isSubagent, sessionId: options.sessionId, onExecApprovalNeeded: options.onExecApprovalNeeded, @@ -171,20 +171,8 @@ export function resolveTools(options: AgentOptions): AgentTool[] { /** * Get all available tool names (for debugging/listing). - * Note: Memory tools require profileId, so they are not included by default. */ export function getAllToolNames(cwd?: string): string[] { const tools = createAllTools({ cwd: cwd ?? process.cwd() }); return tools.map((t) => t.name); } - -/** - * Get all available tool names including memory tools (for debugging/listing). - */ -export function getAllToolNamesWithMemory(cwd?: string, profileId?: string): string[] { - const tools = createAllTools({ - cwd: cwd ?? process.cwd(), - profileId: profileId ?? "test-profile", - }); - return tools.map((t) => t.name); -} diff --git a/src/agent/tools/README.md b/src/agent/tools/README.md index db5b5266..ddbd4c08 100644 --- a/src/agent/tools/README.md +++ b/src/agent/tools/README.md @@ -49,34 +49,31 @@ The tools system provides LLM agents with capabilities to interact with the exte ## Available Tools -| Tool | Name | Description | -| ------------- | --------------- | --------------------------------------- | -| Read | `read` | Read file contents | -| Write | `write` | Write content to files | -| Edit | `edit` | Edit existing files | -| Glob | `glob` | Find files by pattern | -| Exec | `exec` | Execute shell commands | -| Process | `process` | Manage long-running processes | -| Web Fetch | `web_fetch` | Fetch and extract content from URLs | -| Web Search | `web_search` | Search the web (requires API key) | -| Memory Get | `memory_get` | Retrieve a value from persistent memory | -| Memory Set | `memory_set` | Store a value in persistent memory | -| Memory Delete | `memory_delete` | Delete a value from persistent memory | -| Memory List | `memory_list` | List all keys in persistent memory | +| Tool | Name | Description | +| -------------- | ---------------- | ----------------------------------- | +| Read | `read` | Read file contents | +| Write | `write` | Write content to files | +| Edit | `edit` | Edit existing files | +| Glob | `glob` | Find files by pattern | +| Exec | `exec` | Execute shell commands | +| Process | `process` | Manage long-running processes | +| Web Fetch | `web_fetch` | Fetch and extract content from URLs | +| Web Search | `web_search` | Search the web (requires API key) | +| Sessions Spawn | `sessions_spawn` | Spawn a sub-agent session | -> **Note**: Memory tools require a `profileId` to be specified. They store data in the profile's memory directory. +> **Note**: Agents use file-based memory (`memory.md`, `memory/*.md`) via `read` and `edit` tools instead of dedicated memory tools. ## Tool Groups Groups provide shortcuts for allowing/denying multiple tools at once: -| Group | Tools | -| --------------- | -------------------------------------------------- | -| `group:fs` | read, write, edit, glob | -| `group:runtime` | exec, process | -| `group:web` | web_search, web_fetch | -| `group:memory` | memory_get, memory_set, memory_delete, memory_list | -| `group:core` | All of the above (excluding memory) | +| Group | Tools | +| ---------------- | ------------------------------------ | +| `group:fs` | read, write, edit, glob | +| `group:runtime` | exec, process | +| `group:web` | web_search, web_fetch | +| `group:subagent` | sessions_spawn | +| `group:core` | All fs, runtime, and web tools | ## Usage diff --git a/src/agent/tools/README.zh-CN.md b/src/agent/tools/README.zh-CN.md index 80d84815..675fc6e2 100644 --- a/src/agent/tools/README.zh-CN.md +++ b/src/agent/tools/README.zh-CN.md @@ -49,34 +49,33 @@ ## 可用工具 -| 工具 | 名称 | 描述 | -| ------------- | --------------- | ------------------------ | -| Read | `read` | 读取文件内容 | -| Write | `write` | 写入文件内容 | -| Edit | `edit` | 编辑现有文件 | -| Glob | `glob` | 按模式查找文件 | -| Exec | `exec` | 执行 Shell 命令 | -| Process | `process` | 管理长时间运行的进程 | -| Web Fetch | `web_fetch` | 从 URL 获取并提取内容 | -| Web Search | `web_search` | 搜索网络(需要 API Key) | -| Memory Get | `memory_get` | 从持久化内存中获取值 | -| Memory Set | `memory_set` | 向持久化内存中存储值 | -| Memory Delete | `memory_delete` | 从持久化内存中删除值 | -| Memory List | `memory_list` | 列出持久化内存中的所有键 | +| 工具 | 名称 | 描述 | +| -------------- | ---------------- | ------------------------------ | +| Read | `read` | 读取文件内容 | +| Write | `write` | 写入文件内容 | +| Edit | `edit` | 编辑现有文件 | +| Glob | `glob` | 按模式查找文件 | +| Exec | `exec` | 执行 Shell 命令 | +| Process | `process` | 管理长时间运行的进程 | +| Web Fetch | `web_fetch` | 从 URL 获取并提取内容 | +| Web Search | `web_search` | 搜索网络(需要 API Key) | +| Memory Search | `memory_search` | 搜索 memory 文件(需要 Profile)| +| Sessions Spawn | `sessions_spawn` | 创建子 Agent 会话 | -> **注意**: Memory 工具需要指定 `profileId`。数据存储在 Profile 的 memory 目录中。 +> **注意**: `memory_search` 工具通过关键词搜索 `memory.md` 和 `memory/*.md` 文件。Agent 通过 `read` 和 `edit` 工具操作 memory 文件内容。 ## 工具组 工具组提供了一次性允许/禁止多个工具的快捷方式: -| 组 | 工具 | -| --------------- | -------------------------------------------------- | -| `group:fs` | read, write, edit, glob | -| `group:runtime` | exec, process | -| `group:web` | web_search, web_fetch | -| `group:memory` | memory_get, memory_set, memory_delete, memory_list | -| `group:core` | 以上所有(不包括 memory) | +| 组 | 工具 | +| ---------------- | ------------------------------ | +| `group:fs` | read, write, edit, glob | +| `group:runtime` | exec, process | +| `group:web` | web_search, web_fetch | +| `group:memory` | memory_search | +| `group:subagent` | sessions_spawn | +| `group:core` | 所有 fs、runtime 和 web 工具 | ## 使用方法 diff --git a/src/agent/tools/groups.ts b/src/agent/tools/groups.ts index dde6b5a0..821cee91 100644 --- a/src/agent/tools/groups.ts +++ b/src/agent/tools/groups.ts @@ -30,8 +30,8 @@ export const TOOL_GROUPS: Record = { // Web tools "group:web": ["web_search", "web_fetch"], - // Memory tools (requires profileId) - "group:memory": ["memory_get", "memory_set", "memory_delete", "memory_list"], + // Memory tools (requires profile) + "group:memory": ["memory_search"], // Subagent tools "group:subagent": ["sessions_spawn"], diff --git a/src/agent/tools/memory-search.test.ts b/src/agent/tools/memory-search.test.ts new file mode 100644 index 00000000..db955bb6 --- /dev/null +++ b/src/agent/tools/memory-search.test.ts @@ -0,0 +1,154 @@ +import { describe, it, expect, beforeEach, afterEach } from "vitest"; +import { mkdirSync, writeFileSync, rmSync } from "fs"; +import { join } from "path"; +import { tmpdir } from "os"; +import { createMemorySearchTool } from "./memory-search.js"; + +describe("memory_search tool", () => { + let testDir: string; + + beforeEach(() => { + testDir = join(tmpdir(), `memory-search-test-${Date.now()}`); + mkdirSync(testDir, { recursive: true }); + }); + + afterEach(() => { + rmSync(testDir, { recursive: true, force: true }); + }); + + it("creates tool with correct name and description", () => { + const tool = createMemorySearchTool(testDir); + expect(tool.name).toBe("memory_search"); + expect(tool.label).toBe("Memory Search"); + expect(tool.description).toContain("memory files"); + }); + + it("returns no matches when no memory files exist", async () => { + const tool = createMemorySearchTool(testDir); + const result = await tool.execute("test-call", { query: "test" }, undefined); + expect(result.details?.matches).toHaveLength(0); + expect(result.details?.filesSearched).toBe(0); + }); + + it("searches memory.md file", async () => { + // Create memory.md with test content + writeFileSync( + join(testDir, "memory.md"), + "# Memory\n\nUser prefers TypeScript over JavaScript.\n\nDecision: Use ESLint for linting.\n", + ); + + const tool = createMemorySearchTool(testDir); + const result = await tool.execute("test-call", { query: "TypeScript" }, undefined); + + expect(result.details?.matches).toHaveLength(1); + expect(result.details?.matches[0]?.file).toBe("memory.md"); + expect(result.details?.matches[0]?.content).toContain("TypeScript"); + }); + + it("searches memory/*.md files", async () => { + // Create memory directory with daily logs + const memoryDir = join(testDir, "memory"); + mkdirSync(memoryDir); + writeFileSync( + join(memoryDir, "2024-01-15.md"), + "# 2024-01-15\n\nDiscussed API design with team.\n", + ); + writeFileSync( + join(memoryDir, "2024-01-16.md"), + "# 2024-01-16\n\nImplemented user authentication.\n", + ); + + const tool = createMemorySearchTool(testDir); + const result = await tool.execute("test-call", { query: "API" }, undefined); + + expect(result.details?.matches).toHaveLength(1); + expect(result.details?.matches[0]?.file).toBe("memory/2024-01-15.md"); + }); + + it("searches both memory.md and memory/*.md", async () => { + // Create memory.md + writeFileSync(join(testDir, "memory.md"), "Important: Always test code.\n"); + + // Create memory directory + const memoryDir = join(testDir, "memory"); + mkdirSync(memoryDir); + writeFileSync(join(memoryDir, "2024-01-15.md"), "Remember to test before deploy.\n"); + + const tool = createMemorySearchTool(testDir); + const result = await tool.execute("test-call", { query: "test" }, undefined); + + expect(result.details?.matches).toHaveLength(2); + expect(result.details?.filesSearched).toBe(2); + }); + + it("is case-insensitive by default", async () => { + writeFileSync(join(testDir, "memory.md"), "User prefers TYPESCRIPT.\n"); + + const tool = createMemorySearchTool(testDir); + const result = await tool.execute("test-call", { query: "typescript" }, undefined); + + expect(result.details?.matches).toHaveLength(1); + }); + + it("supports case-sensitive search", async () => { + writeFileSync(join(testDir, "memory.md"), "User prefers TYPESCRIPT.\n"); + + const tool = createMemorySearchTool(testDir); + + // Case-sensitive search should not match + const result1 = await tool.execute( + "test-call", + { query: "typescript", caseSensitive: true }, + undefined, + ); + expect(result1.details?.matches).toHaveLength(0); + + // Case-sensitive search should match + const result2 = await tool.execute( + "test-call", + { query: "TYPESCRIPT", caseSensitive: true }, + undefined, + ); + expect(result2.details?.matches).toHaveLength(1); + }); + + it("includes context lines in results", async () => { + writeFileSync( + join(testDir, "memory.md"), + "Line 1\nLine 2\nMatch here\nLine 4\nLine 5\n", + ); + + const tool = createMemorySearchTool(testDir); + const result = await tool.execute("test-call", { query: "Match" }, undefined); + + expect(result.details?.matches).toHaveLength(1); + expect(result.details?.matches[0]?.context.before).toContain("Line 2"); + expect(result.details?.matches[0]?.context.after).toContain("Line 4"); + }); + + it("respects maxResults limit", async () => { + // Create file with multiple matches + writeFileSync( + join(testDir, "memory.md"), + "test line 1\ntest line 2\ntest line 3\ntest line 4\ntest line 5\n", + ); + + const tool = createMemorySearchTool(testDir); + const result = await tool.execute( + "test-call", + { query: "test", maxResults: 2 }, + undefined, + ); + + expect(result.details?.matches).toHaveLength(2); + expect(result.details?.totalMatches).toBe(5); + expect(result.details?.truncated).toBe(true); + }); + + it("throws error for empty query", async () => { + const tool = createMemorySearchTool(testDir); + await expect(tool.execute("test-call", { query: "" }, undefined)).rejects.toThrow( + "Query must not be empty", + ); + }); +}); diff --git a/src/agent/tools/memory-search.ts b/src/agent/tools/memory-search.ts new file mode 100644 index 00000000..666f0426 --- /dev/null +++ b/src/agent/tools/memory-search.ts @@ -0,0 +1,276 @@ +import { Type } from "@sinclair/typebox"; +import type { AgentTool } from "@mariozechner/pi-agent-core"; +import * as fs from "fs/promises"; +import * as path from "path"; +import fg from "fast-glob"; + +const MemorySearchSchema = Type.Object({ + query: Type.String({ + description: "Search query - keywords or phrases to find in memory files.", + }), + maxResults: Type.Optional( + Type.Number({ + description: "Maximum number of results to return. Defaults to 10.", + minimum: 1, + maximum: 50, + }), + ), + caseSensitive: Type.Optional( + Type.Boolean({ + description: "Whether the search is case-sensitive. Defaults to false.", + }), + ), +}); + +type MemorySearchArgs = { + query: string; + maxResults?: number; + caseSensitive?: boolean; +}; + +export type MemorySearchMatch = { + file: string; + line: number; + content: string; + context: { + before: string[]; + after: string[]; + }; +}; + +export type MemorySearchResult = { + matches: MemorySearchMatch[]; + totalMatches: number; + filesSearched: number; + truncated: boolean; +}; + +const DEFAULT_MAX_RESULTS = 10; +const CONTEXT_LINES = 2; + +/** + * Create a memory_search tool for searching memory files. + * + * @param profileDir - Profile directory containing memory.md and memory/ folder + */ +export function createMemorySearchTool( + profileDir: string, +): AgentTool { + return { + name: "memory_search", + label: "Memory Search", + description: + "Search through memory files (memory.md and memory/*.md) for keywords or phrases. " + + "Use this before answering questions about prior work, decisions, dates, people, preferences, or todos. " + + "Returns matching lines with context.", + parameters: MemorySearchSchema, + execute: async (_toolCallId, args, _signal) => { + const { query, maxResults, caseSensitive } = args as MemorySearchArgs; + + if (!query || query.trim() === "") { + throw new Error("Query must not be empty"); + } + + const limit = Math.min(maxResults || DEFAULT_MAX_RESULTS, 50); + const searchQuery = caseSensitive ? query : query.toLowerCase(); + + // Find all memory files + const memoryFiles = await findMemoryFiles(profileDir); + + if (memoryFiles.length === 0) { + return { + content: [{ type: "text", text: "No memory files found." }], + details: { + matches: [], + totalMatches: 0, + filesSearched: 0, + truncated: false, + }, + }; + } + + // Search each file + const allMatches: MemorySearchMatch[] = []; + + for (const file of memoryFiles) { + const matches = await searchFile(file, searchQuery, caseSensitive ?? false, profileDir); + allMatches.push(...matches); + } + + // Sort by relevance (files with more matches first, then by line number) + allMatches.sort((a, b) => { + if (a.file !== b.file) { + // Count matches per file + const aCount = allMatches.filter((m) => m.file === a.file).length; + const bCount = allMatches.filter((m) => m.file === b.file).length; + return bCount - aCount; + } + return a.line - b.line; + }); + + const totalMatches = allMatches.length; + const truncated = allMatches.length > limit; + const limitedMatches = allMatches.slice(0, limit); + + // Format output + const output = formatSearchResults(limitedMatches, totalMatches, truncated, memoryFiles.length); + + return { + content: [{ type: "text", text: output }], + details: { + matches: limitedMatches, + totalMatches, + filesSearched: memoryFiles.length, + truncated, + }, + }; + }, + }; +} + +/** + * Find all memory files in the profile directory. + */ +async function findMemoryFiles(profileDir: string): Promise { + const files: string[] = []; + + // Check for memory.md in profile root + const memoryMd = path.join(profileDir, "memory.md"); + try { + await fs.access(memoryMd); + files.push(memoryMd); + } catch { + // File doesn't exist + } + + // Check for memory/*.md files + const memoryDir = path.join(profileDir, "memory"); + try { + await fs.access(memoryDir); + const mdFiles = await fg("*.md", { + cwd: memoryDir, + onlyFiles: true, + absolute: true, + }); + files.push(...mdFiles); + } catch { + // Directory doesn't exist + } + + return files; +} + +/** + * Search a single file for the query. + */ +async function searchFile( + filePath: string, + query: string, + caseSensitive: boolean, + profileDir: string, +): Promise { + const matches: MemorySearchMatch[] = []; + + try { + const content = await fs.readFile(filePath, "utf-8"); + const lines = content.split("\n"); + + for (let i = 0; i < lines.length; i++) { + const line = lines[i]!; + const searchLine = caseSensitive ? line : line.toLowerCase(); + + if (searchLine.includes(query)) { + // Get context lines + const beforeLines: string[] = []; + const afterLines: string[] = []; + + for (let j = Math.max(0, i - CONTEXT_LINES); j < i; j++) { + beforeLines.push(lines[j]!); + } + + for (let j = i + 1; j <= Math.min(lines.length - 1, i + CONTEXT_LINES); j++) { + afterLines.push(lines[j]!); + } + + // Get relative path for display + const relativePath = path.relative(profileDir, filePath); + + matches.push({ + file: relativePath, + line: i + 1, // 1-indexed + content: line, + context: { + before: beforeLines, + after: afterLines, + }, + }); + } + } + } catch (err) { + // Skip files that can't be read + console.error(`Failed to read ${filePath}:`, err); + } + + return matches; +} + +/** + * Format search results for display. + */ +function formatSearchResults( + matches: MemorySearchMatch[], + totalMatches: number, + truncated: boolean, + filesSearched: number, +): string { + if (matches.length === 0) { + return `No matches found in ${filesSearched} memory file(s).`; + } + + const lines: string[] = []; + lines.push(`Found ${totalMatches} match(es) in ${filesSearched} file(s):`); + + if (truncated) { + lines.push(`(Showing first ${matches.length} results)`); + } + + lines.push(""); + + // Group by file + const byFile = new Map(); + for (const match of matches) { + const existing = byFile.get(match.file) || []; + existing.push(match); + byFile.set(match.file, existing); + } + + for (const [file, fileMatches] of byFile) { + lines.push(`## ${file}`); + lines.push(""); + + for (const match of fileMatches) { + lines.push(`**Line ${match.line}:**`); + + // Show context before + if (match.context.before.length > 0) { + for (const ctx of match.context.before) { + lines.push(` ${ctx}`); + } + } + + // Show matching line (highlighted) + lines.push(`> ${match.content}`); + + // Show context after + if (match.context.after.length > 0) { + for (const ctx of match.context.after) { + lines.push(` ${ctx}`); + } + } + + lines.push(""); + } + } + + return lines.join("\n"); +} diff --git a/src/agent/tools/memory/index.ts b/src/agent/tools/memory/index.ts deleted file mode 100644 index f2d14e6e..00000000 --- a/src/agent/tools/memory/index.ts +++ /dev/null @@ -1,6 +0,0 @@ -/** - * Memory Tools Module - */ - -export { createMemoryTools } from "./memory-tools.js"; -export type { MemoryEntry, MemoryStorageOptions, MemoryListResult } from "./types.js"; diff --git a/src/agent/tools/memory/memory-tools.ts b/src/agent/tools/memory/memory-tools.ts deleted file mode 100644 index 19fe5804..00000000 --- a/src/agent/tools/memory/memory-tools.ts +++ /dev/null @@ -1,175 +0,0 @@ -/** - * Memory Tools - * - * Provides persistent key-value storage for agents. - */ - -import { Type } from "@sinclair/typebox"; -import type { AgentTool } from "@mariozechner/pi-agent-core"; -import { memoryDelete, memoryGet, memoryList, memorySet, validateKey } from "./storage.js"; -import type { MemoryStorageOptions } from "./types.js"; - -// ============================================================================ -// Schemas -// ============================================================================ - -const MemoryGetSchema = Type.Object({ - key: Type.String({ description: "The key to retrieve" }), -}); - -const MemorySetSchema = Type.Object({ - key: Type.String({ description: "The key to set (alphanumeric, underscore, dot, hyphen)" }), - value: Type.Unknown({ description: "The value to store (will be JSON serialized)" }), - description: Type.Optional( - Type.String({ description: "Optional description of this memory entry" }), - ), -}); - -const MemoryDeleteSchema = Type.Object({ - key: Type.String({ description: "The key to delete" }), -}); - -const MemoryListSchema = Type.Object({ - prefix: Type.Optional(Type.String({ description: "Filter keys by prefix" })), - limit: Type.Optional(Type.Number({ description: "Maximum number of keys to return (default 100)" })), -}); - -// ============================================================================ -// Helper -// ============================================================================ - -function jsonResult(data: T): { - content: Array<{ type: "text"; text: string }>; - details: T; -} { - return { - content: [{ type: "text", text: JSON.stringify(data, null, 2) }], - details: data, - }; -} - -// ============================================================================ -// Tools -// ============================================================================ - -export function createMemoryGetTool( - options: MemoryStorageOptions, -): AgentTool { - return { - name: "memory_get", - label: "Memory Get", - description: "Retrieve a value from persistent memory by key.", - parameters: MemoryGetSchema, - execute: async (_toolCallId, params) => { - const key = typeof params.key === "string" ? params.key.trim() : ""; - - const validation = validateKey(key); - if (!validation.valid) { - return jsonResult({ found: false, error: validation.error }); - } - - const result = memoryGet(key, options); - if (!result.found) { - return jsonResult({ found: false, key }); - } - - return jsonResult({ - found: true, - key, - value: result.entry.value, - description: result.entry.description, - updatedAt: result.entry.updatedAt, - }); - }, - }; -} - -export function createMemorySetTool( - options: MemoryStorageOptions, -): AgentTool { - return { - name: "memory_set", - label: "Memory Set", - description: - "Store a value in persistent memory. The value will be JSON serialized. " + - "Keys can contain letters, numbers, underscores, dots, and hyphens.", - parameters: MemorySetSchema, - execute: async (_toolCallId, params) => { - const key = typeof params.key === "string" ? params.key.trim() : ""; - const value = params.value; - const description = typeof params.description === "string" ? params.description : undefined; - - const result = memorySet(key, value, description, options); - if (!result.success) { - return jsonResult({ success: false, error: result.error }); - } - - return jsonResult({ success: true, key }); - }, - }; -} - -export function createMemoryDeleteTool( - options: MemoryStorageOptions, -): AgentTool { - return { - name: "memory_delete", - label: "Memory Delete", - description: "Delete a value from persistent memory by key.", - parameters: MemoryDeleteSchema, - execute: async (_toolCallId, params) => { - const key = typeof params.key === "string" ? params.key.trim() : ""; - - const validation = validateKey(key); - if (!validation.valid) { - return jsonResult({ success: false, error: validation.error }); - } - - const result = memoryDelete(key, options); - if (!result.success) { - return jsonResult({ success: false, error: result.error }); - } - - return jsonResult({ success: true, key, existed: result.existed }); - }, - }; -} - -export function createMemoryListTool( - options: MemoryStorageOptions, -): AgentTool { - return { - name: "memory_list", - label: "Memory List", - description: - "List all keys in persistent memory, sorted by most recently updated. " + - "Optionally filter by prefix.", - parameters: MemoryListSchema, - execute: async (_toolCallId, params) => { - const prefix = typeof params.prefix === "string" ? params.prefix : undefined; - const limit = typeof params.limit === "number" ? params.limit : undefined; - - const result = memoryList(prefix, limit, options); - - return jsonResult({ - keys: result.keys, - total: result.total, - truncated: result.truncated, - }); - }, - }; -} - -/** - * Create all memory tools for a profile - */ -export function createMemoryTools( - options: MemoryStorageOptions, -): Array> { - return [ - createMemoryGetTool(options), - createMemorySetTool(options), - createMemoryDeleteTool(options), - createMemoryListTool(options), - ]; -} diff --git a/src/agent/tools/memory/storage.test.ts b/src/agent/tools/memory/storage.test.ts deleted file mode 100644 index c67592aa..00000000 --- a/src/agent/tools/memory/storage.test.ts +++ /dev/null @@ -1,224 +0,0 @@ -import { describe, it, expect, beforeEach, afterEach } from "vitest"; -import { existsSync, mkdirSync, rmSync } from "node:fs"; -import { join } from "node:path"; -import { tmpdir } from "node:os"; -import { - validateKey, - memoryGet, - memorySet, - memoryDelete, - memoryList, - getMemoryDir, -} from "./storage.js"; -import type { MemoryStorageOptions } from "./types.js"; - -describe("memory storage", () => { - const testBaseDir = join(tmpdir(), `multica-memory-test-${Date.now()}`); - const profileId = "test-profile"; - - const options: MemoryStorageOptions = { - profileId, - baseDir: testBaseDir, - }; - - beforeEach(() => { - if (existsSync(testBaseDir)) { - rmSync(testBaseDir, { recursive: true }); - } - mkdirSync(testBaseDir, { recursive: true }); - }); - - afterEach(() => { - if (existsSync(testBaseDir)) { - rmSync(testBaseDir, { recursive: true }); - } - }); - - describe("validateKey", () => { - it("should accept valid alphanumeric keys", () => { - expect(validateKey("mykey")).toEqual({ valid: true }); - expect(validateKey("my_key")).toEqual({ valid: true }); - expect(validateKey("my-key")).toEqual({ valid: true }); - expect(validateKey("my.key")).toEqual({ valid: true }); - expect(validateKey("MyKey123")).toEqual({ valid: true }); - }); - - it("should reject empty keys", () => { - expect(validateKey("")).toMatchObject({ valid: false, error: "Key is required" }); - expect(validateKey(" ")).toMatchObject({ valid: false, error: "Key cannot be empty" }); - }); - - it("should reject keys with invalid characters", () => { - const result = validateKey("my key"); - expect(result.valid).toBe(false); - if (!result.valid) { - expect(result.error).toContain("can only contain"); - } - }); - - it("should reject keys that are too long", () => { - const longKey = "a".repeat(129); - const result = validateKey(longKey); - expect(result.valid).toBe(false); - if (!result.valid) { - expect(result.error).toContain("exceeds maximum length"); - } - }); - }); - - describe("memorySet and memoryGet", () => { - it("should set and get a string value", () => { - const result = memorySet("test-key", "test-value", undefined, options); - expect(result).toEqual({ success: true }); - - const getResult = memoryGet("test-key", options); - expect(getResult.found).toBe(true); - if (getResult.found) { - expect(getResult.entry.value).toBe("test-value"); - } - }); - - it("should set and get a complex object", () => { - const value = { name: "test", count: 42, nested: { a: 1 } }; - memorySet("complex-key", value, "A complex object", options); - - const getResult = memoryGet("complex-key", options); - expect(getResult.found).toBe(true); - if (getResult.found) { - expect(getResult.entry.value).toEqual(value); - expect(getResult.entry.description).toBe("A complex object"); - } - }); - - it("should update existing key and preserve createdAt", async () => { - memorySet("update-key", "initial", undefined, options); - const firstGet = memoryGet("update-key", options); - expect(firstGet.found).toBe(true); - - // Wait a bit to ensure different timestamp - await new Promise((resolve) => setTimeout(resolve, 10)); - - memorySet("update-key", "updated", undefined, options); - const secondGet = memoryGet("update-key", options); - - expect(secondGet.found).toBe(true); - if (firstGet.found && secondGet.found) { - expect(secondGet.entry.value).toBe("updated"); - expect(secondGet.entry.createdAt).toBe(firstGet.entry.createdAt); - expect(secondGet.entry.updatedAt).toBeGreaterThan(firstGet.entry.createdAt); - } - }); - - it("should return not found for non-existent key", () => { - const result = memoryGet("non-existent", options); - expect(result.found).toBe(false); - }); - - it("should handle keys with dots", () => { - memorySet("user.settings.theme", "dark", undefined, options); - - const result = memoryGet("user.settings.theme", options); - expect(result.found).toBe(true); - if (result.found) { - expect(result.entry.value).toBe("dark"); - } - }); - - it("should reject value that is too large", () => { - const largeValue = "x".repeat(1024 * 1024 + 1); - const result = memorySet("large-key", largeValue, undefined, options); - expect(result).toMatchObject({ success: false }); - if (!result.success) { - expect(result.error).toContain("exceeds maximum size"); - } - }); - }); - - describe("memoryDelete", () => { - it("should delete existing key", () => { - memorySet("delete-me", "value", undefined, options); - expect(memoryGet("delete-me", options).found).toBe(true); - - const result = memoryDelete("delete-me", options); - expect(result).toEqual({ success: true, existed: true }); - - expect(memoryGet("delete-me", options).found).toBe(false); - }); - - it("should handle deleting non-existent key", () => { - const result = memoryDelete("non-existent", options); - expect(result).toEqual({ success: true, existed: false }); - }); - - it("should reject invalid key", () => { - const result = memoryDelete("invalid key", options); - expect(result.success).toBe(false); - }); - }); - - describe("memoryList", () => { - beforeEach(() => { - // Create some test keys - memorySet("project.config", { name: "test" }, "Project config", options); - memorySet("project.settings", { theme: "dark" }, "Settings", options); - memorySet("user.name", "Alice", "User name", options); - }); - - it("should list all keys", () => { - const result = memoryList(undefined, undefined, options); - - expect(result.total).toBe(3); - expect(result.truncated).toBe(false); - expect(result.keys.map((k) => k.key)).toContain("project.config"); - expect(result.keys.map((k) => k.key)).toContain("project.settings"); - expect(result.keys.map((k) => k.key)).toContain("user.name"); - }); - - it("should filter by prefix", () => { - const result = memoryList("project", undefined, options); - - expect(result.total).toBe(2); - expect(result.keys.map((k) => k.key)).toContain("project.config"); - expect(result.keys.map((k) => k.key)).toContain("project.settings"); - expect(result.keys.map((k) => k.key)).not.toContain("user.name"); - }); - - it("should respect limit", () => { - const result = memoryList(undefined, 2, options); - - expect(result.keys.length).toBe(2); - expect(result.total).toBe(3); - expect(result.truncated).toBe(true); - }); - - it("should sort by updatedAt descending", async () => { - // Wait and update one key - await new Promise((resolve) => setTimeout(resolve, 10)); - memorySet("project.config", { name: "updated" }, "Updated config", options); - - const result = memoryList(undefined, undefined, options); - - // project.config should be first as it was updated most recently - expect(result.keys[0]?.key).toBe("project.config"); - }); - - it("should return empty array for non-existent directory", () => { - const emptyOptions: MemoryStorageOptions = { - profileId: "non-existent-profile", - baseDir: testBaseDir, - }; - - const result = memoryList(undefined, undefined, emptyOptions); - expect(result.keys).toEqual([]); - expect(result.total).toBe(0); - }); - }); - - describe("getMemoryDir", () => { - it("should return correct memory directory path", () => { - const dir = getMemoryDir(options); - expect(dir).toContain(profileId); - expect(dir).toContain("memory"); - }); - }); -}); diff --git a/src/agent/tools/memory/storage.ts b/src/agent/tools/memory/storage.ts deleted file mode 100644 index e7c29758..00000000 --- a/src/agent/tools/memory/storage.ts +++ /dev/null @@ -1,240 +0,0 @@ -/** - * Memory Storage Layer - * - * Handles file-based storage for agent memory in the profile directory. - */ - -import { existsSync, mkdirSync, readFileSync, readdirSync, rmSync, writeFileSync } from "node:fs"; -import { join } from "node:path"; -import { getProfileDir } from "../../profile/storage.js"; -import { - DEFAULT_LIST_LIMIT, - KEY_PATTERN, - MAX_KEY_LENGTH, - MAX_LIST_LIMIT, - MAX_VALUE_SIZE, - type MemoryEntry, - type MemoryListResult, - type MemoryStorageOptions, -} from "./types.js"; - -/** - * Validate a memory key - */ -export function validateKey(key: string): { valid: true } | { valid: false; error: string } { - if (!key || typeof key !== "string") { - return { valid: false, error: "Key is required" }; - } - - const trimmed = key.trim(); - if (trimmed.length === 0) { - return { valid: false, error: "Key cannot be empty" }; - } - - if (trimmed.length > MAX_KEY_LENGTH) { - return { valid: false, error: `Key exceeds maximum length of ${MAX_KEY_LENGTH}` }; - } - - if (!KEY_PATTERN.test(trimmed)) { - return { - valid: false, - error: "Key can only contain letters, numbers, underscores, dots, and hyphens", - }; - } - - return { valid: true }; -} - -/** - * Get the memory directory for a profile - */ -export function getMemoryDir(options: MemoryStorageOptions): string { - const profileDir = getProfileDir(options.profileId, { baseDir: options.baseDir }); - return join(profileDir, "memory"); -} - -/** - * Ensure the memory directory exists - */ -export function ensureMemoryDir(options: MemoryStorageOptions): string { - const memoryDir = getMemoryDir(options); - if (!existsSync(memoryDir)) { - mkdirSync(memoryDir, { recursive: true }); - } - return memoryDir; -} - -/** - * Get the file path for a memory key - */ -function getKeyFilePath(key: string, options: MemoryStorageOptions): string { - const memoryDir = getMemoryDir(options); - // Sanitize key for filename (replace dots with double underscore to avoid extension issues) - const safeKey = key.replace(/\./g, "__DOT__"); - return join(memoryDir, `${safeKey}.json`); -} - -/** - * Decode a sanitized filename back to the original key - */ -function decodeKeyFromFilename(filename: string): string { - // Remove .json extension and decode - const base = filename.replace(/\.json$/, ""); - return base.replace(/__DOT__/g, "."); -} - -/** - * Get a memory value by key - */ -export function memoryGet( - key: string, - options: MemoryStorageOptions, -): { found: true; entry: MemoryEntry } | { found: false } { - const validation = validateKey(key); - if (!validation.valid) { - return { found: false }; - } - - const filePath = getKeyFilePath(key.trim(), options); - if (!existsSync(filePath)) { - return { found: false }; - } - - try { - const content = readFileSync(filePath, "utf-8"); - const entry = JSON.parse(content) as MemoryEntry; - return { found: true, entry }; - } catch { - return { found: false }; - } -} - -/** - * Set a memory value - */ -export function memorySet( - key: string, - value: unknown, - description: string | undefined, - options: MemoryStorageOptions, -): { success: true } | { success: false; error: string } { - const validation = validateKey(key); - if (validation.valid === false) { - return { success: false, error: validation.error }; - } - - // Check value size - const serialized = JSON.stringify(value); - if (serialized.length > MAX_VALUE_SIZE) { - return { success: false, error: `Value exceeds maximum size of ${MAX_VALUE_SIZE} bytes` }; - } - - const trimmedKey = key.trim(); - ensureMemoryDir(options); - - const now = Date.now(); - const existing = memoryGet(trimmedKey, options); - - const trimmedDescription = description?.trim(); - const entry: MemoryEntry = { - value, - ...(trimmedDescription ? { description: trimmedDescription } : {}), - createdAt: existing.found ? existing.entry.createdAt : now, - updatedAt: now, - }; - - const filePath = getKeyFilePath(trimmedKey, options); - - try { - writeFileSync(filePath, JSON.stringify(entry, null, 2), "utf-8"); - return { success: true }; - } catch (err) { - const message = err instanceof Error ? err.message : String(err); - return { success: false, error: `Failed to write memory: ${message}` }; - } -} - -/** - * Delete a memory key - */ -export function memoryDelete( - key: string, - options: MemoryStorageOptions, -): { success: true; existed: boolean } | { success: false; error: string } { - const validation = validateKey(key); - if (validation.valid === false) { - return { success: false, error: validation.error }; - } - - const filePath = getKeyFilePath(key.trim(), options); - const existed = existsSync(filePath); - - if (existed) { - try { - rmSync(filePath); - } catch (err) { - const message = err instanceof Error ? err.message : String(err); - return { success: false, error: `Failed to delete memory: ${message}` }; - } - } - - return { success: true, existed }; -} - -/** - * List memory keys - */ -export function memoryList( - prefix: string | undefined, - limit: number | undefined, - options: MemoryStorageOptions, -): MemoryListResult { - const memoryDir = getMemoryDir(options); - - if (!existsSync(memoryDir)) { - return { keys: [], total: 0, truncated: false }; - } - - const effectiveLimit = Math.min( - Math.max(1, limit ?? DEFAULT_LIST_LIMIT), - MAX_LIST_LIMIT, - ); - - try { - const files = readdirSync(memoryDir).filter((f) => f.endsWith(".json")); - const entries: Array<{ key: string; description?: string; updatedAt: number }> = []; - - for (const file of files) { - const key = decodeKeyFromFilename(file); - - // Apply prefix filter - if (prefix && !key.startsWith(prefix)) { - continue; - } - - const filePath = join(memoryDir, file); - try { - const content = readFileSync(filePath, "utf-8"); - const entry = JSON.parse(content) as MemoryEntry; - entries.push({ - key, - ...(entry.description ? { description: entry.description } : {}), - updatedAt: entry.updatedAt, - }); - } catch { - // Skip invalid files - } - } - - // Sort by updatedAt descending (most recent first) - entries.sort((a, b) => b.updatedAt - a.updatedAt); - - const total = entries.length; - const truncated = total > effectiveLimit; - const keys = entries.slice(0, effectiveLimit); - - return { keys, total, truncated }; - } catch { - return { keys: [], total: 0, truncated: false }; - } -} diff --git a/src/agent/tools/memory/types.ts b/src/agent/tools/memory/types.ts deleted file mode 100644 index 58bc15bf..00000000 --- a/src/agent/tools/memory/types.ts +++ /dev/null @@ -1,67 +0,0 @@ -/** - * Memory Tool Type Definitions - */ - -/** Memory entry stored in JSON file */ -export interface MemoryEntry { - /** The stored value */ - value: unknown; - /** Optional description of this memory entry */ - description?: string; - /** Timestamp when created */ - createdAt: number; - /** Timestamp when last updated */ - updatedAt: number; -} - -/** Memory index structure */ -export interface MemoryIndex { - /** Version for future migrations */ - version: 1; - /** Map of key to metadata */ - keys: Record; -} - -/** Metadata for each key in the index */ -export interface MemoryKeyMeta { - /** Optional description */ - description?: string; - /** Created timestamp */ - createdAt: number; - /** Updated timestamp */ - updatedAt: number; -} - -/** Options for memory storage */ -export interface MemoryStorageOptions { - /** Profile ID (required for storage path) */ - profileId: string; - /** Base directory for profiles */ - baseDir?: string | undefined; -} - -/** Result from memory_list */ -export interface MemoryListResult { - keys: Array<{ - key: string; - description?: string; - updatedAt: number; - }>; - total: number; - truncated: boolean; -} - -/** Valid key pattern: alphanumeric, underscore, dot, hyphen */ -export const KEY_PATTERN = /^[a-zA-Z0-9_.-]+$/; - -/** Maximum key length */ -export const MAX_KEY_LENGTH = 128; - -/** Maximum value size in bytes (1MB) */ -export const MAX_VALUE_SIZE = 1024 * 1024; - -/** Default list limit */ -export const DEFAULT_LIST_LIMIT = 100; - -/** Maximum list limit */ -export const MAX_LIST_LIMIT = 1000; diff --git a/src/hub/hub.ts b/src/hub/hub.ts index ca6b42d7..8ba68d64 100644 --- a/src/hub/hub.ts +++ b/src/hub/hub.ts @@ -313,6 +313,18 @@ export class Hub { content: item.content, }); } else { + // Compaction events: forward with synthetic streamId (no stream tracking) + const isCompactionEvent = + item.type === "compaction_start" || item.type === "compaction_end"; + if (isCompactionEvent) { + this.client.send(targetDeviceId, StreamAction, { + streamId: `compaction:${agent.sessionId}`, + agentId: agent.sessionId, + event: item, + }); + continue; + } + // Filter: only forward events useful for frontend rendering const maybeMessage = (item as { message?: { role?: string } }).message; const isAssistantMessage = maybeMessage?.role === "assistant"; diff --git a/turbo.json b/turbo.json index d91a416f..22c06f3b 100644 --- a/turbo.json +++ b/turbo.json @@ -1,5 +1,6 @@ { "$schema": "https://turbo.build/schema.json", + "globalDependencies": ["src/**"], "tasks": { "build": { "dependsOn": ["^build"],