Remote Daemon Spec (SSH + Proxy)
Last updated: February 21, 2026
Tracking issue: https://github.com/manaflow-ai/cmux/issues/151
1. Goals
- Make `cmux ssh <target>` reusable and reliable for repeated connections.
- Reuse a single SSH transport for identical normalized host configs.
- Run a remote Go daemon (`cmuxd-remote`) for session control and proxying.
- Treat web proxying (HTTP CONNECT + SOCKS5 + websocket traffic) as core behavior.
- Keep plain shell usage (`ssh <target>`) unchanged.
2. Non-Goals (v1)
- Full remote filesystem sync.
- TLS interception/MITM.
- Cross-user multi-tenant daemon sharing.
3. Architecture
3.1 Components
- `cmux` CLI and local app runtime.
- Local SSH connection pool manager.
- Remote daemon: `cmuxd-remote` (Go, cross-compiled).
- Local proxy listener(s) for browser and tool traffic.
3.2 Reuse Model
- One active SSH transport per `ConnectionKey`.
- One SSH transport can host multiple logical remote sessions/workspaces.
- Reuse decision is based on normalized SSH config, not raw alias text.
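The reuse model can be pictured as a reference-counted map keyed by the connection key hash. The following is a minimal sketch under that assumption; `Pool`, `Transport`, `Acquire`, and `Release` are hypothetical names for illustration, not the cmux API, and no real SSH process is started.

```go
package main

import (
	"fmt"
	"sync"
)

// Transport stands in for one live SSH transport (hypothetical type).
type Transport struct {
	ID       string
	refcount int
}

// Pool holds at most one active transport per connection key hash.
type Pool struct {
	mu         sync.Mutex
	transports map[string]*Transport
	nextID     int
}

func NewPool() *Pool {
	return &Pool{transports: make(map[string]*Transport)}
}

// Acquire returns the existing transport for keyHash, or registers a new one,
// and counts the caller as one more logical session on it.
func (p *Pool) Acquire(keyHash string) *Transport {
	p.mu.Lock()
	defer p.mu.Unlock()
	t, ok := p.transports[keyHash]
	if !ok {
		p.nextID++
		t = &Transport{ID: fmt.Sprintf("t-%d", p.nextID)}
		p.transports[keyHash] = t
	}
	t.refcount++
	return t
}

// Release drops one reference. It reports true when the last reference is
// gone and the transport has been removed from the pool.
func (p *Pool) Release(keyHash string) bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	t, ok := p.transports[keyHash]
	if !ok {
		return false
	}
	t.refcount--
	if t.refcount > 0 {
		return false
	}
	delete(p.transports, keyHash)
	return true
}

// Refcount reports how many sessions currently share the transport.
func (p *Pool) Refcount(keyHash string) int {
	p.mu.Lock()
	defer p.mu.Unlock()
	if t, ok := p.transports[keyHash]; ok {
		return t.refcount
	}
	return 0
}

func main() {
	p := NewPool()
	a := p.Acquire("key-K")
	b := p.Acquire("key-K")
	fmt.Println(a.ID == b.ID, p.Refcount("key-K")) // true 2
	p.Release("key-K")
	fmt.Println(p.Release("key-K")) // true: last reference released
}
```

Keying the map on the hash rather than the alias text means two aliases that resolve to the same normalized config would share one entry.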
3.3 ConnectionKey Normalization
Source: ssh -G <target> output plus explicit CLI flags.
Required key fields:
- resolved `hostname`
- resolved `user`
- resolved `port`
- ordered `identityfile` list + `identitiesonly`
- `proxyjump`
- `proxycommand`
- host key trust policy knobs (`stricthostkeychecking`, user known hosts path, global known hosts path)
- auth-impacting extra options passed by `cmux ssh --ssh-option`
Rules:
- All key names lowercased.
- Whitespace trimmed.
- Multi-value fields normalized to deterministic order where OpenSSH order is not semantic.
- Hash with stable format to form `connection_key_hash`.
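A minimal sketch of these rules, assuming the input is the text printed by `ssh -G` (one `name value` pair per line) and assuming SHA-256 over sorted `name=value` lines as the "stable format"; the spec fixes neither choice. `ConnectionKeyHash` and `keyFields` are hypothetical names, and options passed via `--ssh-option` are left out for brevity.

```go
package main

import (
	"bufio"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// keyFields lists the ssh -G option names that feed the key in this sketch.
var keyFields = map[string]bool{
	"hostname": true, "user": true, "port": true,
	"identityfile": true, "identitiesonly": true,
	"proxyjump": true, "proxycommand": true,
	"stricthostkeychecking": true,
	"userknownhostsfile":    true,
	"globalknownhostsfile":  true,
}

// ConnectionKeyHash lowercases option names, trims whitespace, drops options
// that are not key fields, and hashes the result. Option names are sorted so
// line order does not matter; repeated values of one option (identityfile)
// keep their original order because that order is semantic.
func ConnectionKeyHash(sshG string) string {
	fields := map[string][]string{}
	sc := bufio.NewScanner(strings.NewReader(sshG))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		name, value, _ := strings.Cut(line, " ")
		name = strings.ToLower(name)
		if !keyFields[name] {
			continue
		}
		fields[name] = append(fields[name], strings.TrimSpace(value))
	}
	names := make([]string, 0, len(fields))
	for n := range fields {
		names = append(names, n)
	}
	sort.Strings(names)

	h := sha256.New()
	for _, n := range names {
		for _, v := range fields[n] {
			fmt.Fprintf(h, "%s=%s\n", n, v)
		}
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	a := "hostname example.com\nuser dev\nport 22\nidentityfile ~/.ssh/a\n"
	b := "Port 22\nUSER dev\nhostname example.com\nidentityfile ~/.ssh/a\nloglevel INFO\n"
	fmt.Println(ConnectionKeyHash(a) == ConnectionKeyHash(b)) // true
}
```

Two aliases that resolve to the same host, user, port, and identity produce the same hash here, while swapping the order of two `identityfile` entries produces a different one.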
3.4 Remote Daemon Bootstrap
Remote install path:
- `~/.cmux/bin/cmuxd-remote/<version>/<os>-<arch>/cmuxd-remote`
- metadata: `~/.cmux/bin/cmuxd-remote/<version>/manifest.json`
Bootstrap flow:
- resolve target + connection key
- open SSH transport (or reuse existing)
- check remote daemon binary + checksum
- upload if missing/mismatched
- exec `cmuxd-remote serve --stdio`
- perform version/capability handshake
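The path layout and the "upload if missing/mismatched" step can be sketched as two small helpers. This assumes SHA-256 as the checksum algorithm (the spec only says "checksum") and an empty string as the marker for a missing remote binary; `RemoteBinaryPath`, `Checksum`, and `NeedsUpload` are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path"
)

// RemoteBinaryPath builds the versioned install path from the layout above.
// The leading "~" is left for the remote shell to expand.
func RemoteBinaryPath(version, goos, goarch string) string {
	return path.Join("~/.cmux/bin/cmuxd-remote", version, goos+"-"+goarch, "cmuxd-remote")
}

// Checksum returns the hex-encoded SHA-256 of a binary (assumed algorithm).
func Checksum(binary []byte) string {
	sum := sha256.Sum256(binary)
	return hex.EncodeToString(sum[:])
}

// NeedsUpload reports whether the daemon must be uploaded before exec.
// remoteSum is the checksum reported for the remote file, or "" if the
// file does not exist.
func NeedsUpload(localBinary []byte, remoteSum string) bool {
	return remoteSum != Checksum(localBinary)
}

func main() {
	bin := []byte("fake daemon bytes")
	fmt.Println(RemoteBinaryPath("1.2.3", "linux", "amd64"))
	fmt.Println(NeedsUpload(bin, ""))            // true: binary missing
	fmt.Println(NeedsUpload(bin, Checksum(bin))) // false: checksums match
}
```

Comparing against the locally computed checksum covers both the missing and the mismatched case with one test, so a stale or corrupted remote binary is replaced before it is executed.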
3.5 Local/Remote Protocol
Transport:
- framed multiplexed protocol over SSH stdio
- one control channel + N data channels
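The spec does not define a wire format, so the following is only an illustrative sketch of one way to frame multiplexed channels over a byte stream: an assumed 8-byte header (4-byte channel ID, 4-byte payload length, both big-endian) followed by the payload, with channel 0 reserved for control. `Frame`, `WriteFrame`, `ReadFrame`, and the 1 MiB payload cap are all assumptions.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

const (
	ControlChannel uint32 = 0       // assumed ID of the single control channel
	maxPayload            = 1 << 20 // assumed per-frame cap (1 MiB)
)

// Frame is one unit on the stream: a channel ID plus opaque payload bytes.
type Frame struct {
	Channel uint32
	Payload []byte
}

// WriteFrame writes the 8-byte header followed by the payload.
func WriteFrame(w io.Writer, f Frame) error {
	if len(f.Payload) > maxPayload {
		return fmt.Errorf("payload too large: %d bytes", len(f.Payload))
	}
	var hdr [8]byte
	binary.BigEndian.PutUint32(hdr[0:4], f.Channel)
	binary.BigEndian.PutUint32(hdr[4:8], uint32(len(f.Payload)))
	if _, err := w.Write(hdr[:]); err != nil {
		return err
	}
	_, err := w.Write(f.Payload)
	return err
}

// ReadFrame reads one frame. It returns io.EOF on a clean end of stream.
func ReadFrame(r io.Reader) (Frame, error) {
	var hdr [8]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return Frame{}, err
	}
	n := binary.BigEndian.Uint32(hdr[4:8])
	if n > maxPayload {
		return Frame{}, fmt.Errorf("frame too large: %d bytes", n)
	}
	payload := make([]byte, n)
	if _, err := io.ReadFull(r, payload); err != nil {
		return Frame{}, err
	}
	return Frame{Channel: binary.BigEndian.Uint32(hdr[0:4]), Payload: payload}, nil
}

// RoundTrip encodes frames into a buffer and decodes them again.
func RoundTrip(frames []Frame) ([]Frame, error) {
	var buf bytes.Buffer
	for _, f := range frames {
		if err := WriteFrame(&buf, f); err != nil {
			return nil, err
		}
	}
	var out []Frame
	for {
		f, err := ReadFrame(&buf)
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		out = append(out, f)
	}
}

func main() {
	out, err := RoundTrip([]Frame{
		{Channel: ControlChannel, Payload: []byte(`{"method":"hello"}`)},
		{Channel: 7, Payload: []byte("pty bytes")},
	})
	fmt.Println(len(out), err)
}
```

Length-prefixed frames let the reader demultiplex interleaved control and data traffic without scanning payloads for delimiters.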
Required control RPCs:
- `hello`
- `session.create`
- `session.attach`
- `session.detach`
- `session.close`
- `session.resize`
- `session.signal`
- `service.watch`
- `proxy.open`
- `proxy.close`
- `heartbeat`
Required observability fields in status APIs:
- `connection_key_hash`
- `transport_id`
- `transport_refcount`
- `last_heartbeat_at`
- `reconnect_attempts`
- `proxy_channels_active`
3.6 Proxying Model
Proxy roles:
- local HTTP CONNECT endpoint bound to loopback
- local SOCKS5 endpoint bound to loopback
- optional explicit local forward binds for known remote ports
Behavior:
- CONNECT/SOCKS requests are tunneled to remote daemon, which dials remote destinations.
- Daemon may enforce allow/deny policy (default allow loopback targets + discovered listening services).
- Websocket traffic must pass transparently through both proxy modes.
- Local bind conflicts are surfaced as structured errors and trigger next-port fallback where configured.
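The default allow policy described above (loopback targets plus discovered listening services) could be checked on the daemon side roughly as follows. This is a sketch under assumptions: `Policy` and its `Discovered` set of `host:port` strings are hypothetical, and hostnames other than `localhost` are not resolved.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// Policy carries the set of services the daemon has discovered listening,
// keyed by "host:port" (hypothetical representation).
type Policy struct {
	Discovered map[string]bool
}

// Allow reports whether a CONNECT/SOCKS target may be dialed under the
// default policy: loopback targets are allowed, as are discovered services.
func (p Policy) Allow(target string) bool {
	host, portStr, err := net.SplitHostPort(target)
	if err != nil {
		return false
	}
	port, err := strconv.Atoi(portStr)
	if err != nil || port < 1 || port > 65535 {
		return false
	}
	if host == "localhost" {
		return true
	}
	if ip := net.ParseIP(host); ip != nil && ip.IsLoopback() {
		return true
	}
	return p.Discovered[net.JoinHostPort(host, portStr)]
}

func main() {
	p := Policy{Discovered: map[string]bool{"10.0.0.5:8080": true}}
	for _, t := range []string{"127.0.0.1:3000", "[::1]:8080", "10.0.0.5:8080", "example.com:443"} {
		fmt.Println(t, p.Allow(t))
	}
}
```

Malformed targets are rejected rather than passed to the dialer, which keeps the policy decision independent of how the dial would fail.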
3.7 Reconnect Semantics
States:
- `connected`
- `degraded`
- `reconnecting`
- `disconnected`
- `fatal`
Rules:
- Transport loss moves all attached logical sessions to `reconnecting`.
- If reattach succeeds, restore `connected` without creating duplicate sessions.
- Persistent sessions survive local app restart and reconnect.
- Ephemeral sessions may be GC'd by daemon after TTL if no client reattaches.
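One way to make these states testable is an explicit transition table. The five states come from the list above; the event names and the exact set of allowed transitions are assumptions for illustration (the spec names the states but does not enumerate events), loosely following the reconnect test cases R-001 through R-008.

```go
package main

import "fmt"

type State string

const (
	Connected    State = "connected"
	Degraded     State = "degraded"
	Reconnecting State = "reconnecting"
	Disconnected State = "disconnected"
	Fatal        State = "fatal"
)

// Event names are hypothetical.
type Event string

const (
	HeartbeatMissed  Event = "heartbeat_missed"
	HeartbeatOK      Event = "heartbeat_ok"
	TransportLost    Event = "transport_lost"
	ReattachOK       Event = "reattach_ok"
	RetriesExhausted Event = "retries_exhausted"
	RetryRequested   Event = "retry_requested"
	Unrecoverable    Event = "unrecoverable"
)

// transitions lists the allowed moves; anything absent is rejected.
var transitions = map[State]map[Event]State{
	Connected: {
		HeartbeatMissed: Degraded,
		TransportLost:   Reconnecting,
	},
	Degraded: {
		HeartbeatOK:   Connected,
		TransportLost: Reconnecting,
	},
	Reconnecting: {
		ReattachOK:       Connected,
		RetriesExhausted: Disconnected,
		Unrecoverable:    Fatal,
	},
	Disconnected: {
		RetryRequested: Reconnecting,
	},
	// Fatal has no outgoing transitions in this sketch.
}

// Next returns the state after applying e. If the transition is not allowed,
// it returns the unchanged state and false.
func Next(s State, e Event) (State, bool) {
	next, ok := transitions[s][e]
	if !ok {
		return s, false
	}
	return next, true
}

func main() {
	s := Connected
	for _, e := range []Event{TransportLost, ReattachOK} {
		s, _ = Next(s, e)
		fmt.Println(e, "->", s)
	}
}
```

A table like this keeps the unit tests for state machine transitions deterministic: each case is a `(state, event)` pair with one expected result.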
4. Security Requirements
- SSH remains the auth boundary.
- Remote binary integrity must be checksum-verified before exec.
- Daemon listens only on stdio/unix socket/loopback (never public interfaces by default).
- No plaintext persistence of SSH secrets outside normal SSH tooling.
5. Test Strategy
Three layers:
- unit tests: normalization, key hashing, state machine transitions
- integration tests: dockerized ssh targets + proxy fixtures
- end-to-end tests: cmux CLI + UI socket methods + process-level assertions
Required test fixtures:
- existing SSH fixture (`tests/fixtures/ssh-remote/`)
- HTTP CONNECT target fixture (HTTP service behind daemon)
- websocket fixture (echo server behind daemon)
- fault fixture (transport kill, delayed network, remote daemon restart)
6. Test Matrix
Pass criteria convention:
- every case defines deterministic assertions
- all `MUST` assertions pass on CI
- flaky cases are not allowed for merge gates
6.1 Terminal Session Cases
| ID | Scenario | Setup | Steps | MUST Assertions |
|---|---|---|---|---|
| T-001 | Single connect baseline | fresh app, no pooled transport | `cmux ssh cmux-vm` | one transport created; one remote session attached; workspace shows remote state `connected` |
| T-002 | Reuse identical host | existing connected transport for key K | run `cmux ssh cmux-vm` twice | both workspaces map to same `transport_id`; `transport_refcount == 2`; only one SSH transport process for key K |
| T-003 | Do not reuse changed identity | key file A then key file B | run `cmux ssh host --identity A`, then B | two distinct `connection_key_hash` values; two transport processes |
| T-004 | Do not reuse changed proxyjump | host via jump1 then jump2 | run with different jump options | no reuse across different normalized proxy settings |
| T-005 | Optional name behavior | none | run `cmux ssh host` (no `--name`) | workspace is created; title non-empty; no CLI error |
| T-006 | Scoped ssh niceties | none | run `cmux ssh host --json` and inspect emitted command metadata | emitted `ssh_command` includes scoped `GHOSTTY_SHELL_FEATURES` ... `ssh-env,ssh-terminfo`; plain shell default features remain unchanged |
| T-007 | Session detach/attach | persistent session enabled | create session, detach local workspace, reattach | same remote session_id; shell state/history retained |
| T-008 | Explicit close | active session + transport refcount 1 | close workspace | remote session closes; transport released when refcount reaches 0 |
6.2 Web Proxy Traffic Cases
| ID | Scenario | Setup | Steps | MUST Assertions |
|---|---|---|---|---|
| W-001 | HTTP CONNECT basic | remote HTTP service on loopback | open local CONNECT proxy; fetch remote URL through proxy | 200 response body matches fixture payload |
| W-002 | SOCKS5 basic | same as W-001 | fetch remote URL through SOCKS5 endpoint | response matches direct remote response |
| W-003 | Websocket through CONNECT | remote websocket echo service | connect websocket via CONNECT proxy and exchange messages | echo payload integrity; no unexpected close frames |
| W-004 | Websocket through SOCKS5 | same as W-003 | connect via SOCKS5 | echo payload integrity |
| W-005 | Concurrent browser + terminal traffic | active terminal workload + browser requests | run high-volume stdout in session while proxying requests | no stalled PTY stream; proxy p95 latency below threshold |
| W-006 | Service discovery to local exposure | remote daemon detects listening app port | start remote web app, observe status payload | detected port listed; local forwarded/proxy route becomes reachable |
| W-007 | Local port conflict handling | reserve desired local bind port beforehand | request proxy/forward for conflicting port | conflict is reported structurally; allocator picks fallback if enabled |
| W-008 | Large response streaming | remote serves large payload | fetch 100MB file through proxy | byte count matches; no truncation/corruption |
6.3 Reconnect + Failure Cases
| ID | Scenario | Setup | Steps | MUST Assertions |
|---|---|---|---|---|
| R-001 | Transport process killed | active shared transport with 2 sessions | kill local SSH process | both sessions enter reconnecting; auto-reconnect starts |
| R-002 | Reconnect success reattach | continue R-001 with healthy remote | wait for reconnect | both sessions return connected; same remote session_ids; no duplicate shells |
| R-003 | Reconnect failure exhaustion | block network to host during reconnect | wait past retry budget | state becomes disconnected with actionable error; no busy-loop retries |
| R-004 | Remote daemon restart | kill `cmuxd-remote` but keep SSH transport | observe client recovery | daemon restarts or re-exec path runs; sessions reattached per policy |
| R-005 | Persistent session across app restart | persistent session active | quit/relaunch cmux and reattach | session state preserved; command history/output continuity verified |
| R-006 | Ephemeral session GC | ephemeral session detached | wait TTL expiration | session removed remotely; subsequent attach gets not-found and creates fresh session |
| R-007 | Proxy channels during reconnect | active websocket + HTTP requests | induce transport flap | in-flight streams fail cleanly; new streams succeed after reconnect |
| R-008 | Heartbeat timeout | drop packets without killing process | observe heartbeat | timeout transitions to degraded/reconnecting; recovery after network restore |
7. CI Gate Proposal
Gate suites:
- `remote-terminal-core` = T-001..T-006
- `remote-proxy-core` = W-001..W-004, W-006
- `remote-reconnect-core` = R-001..R-004
Nightly suites:
- high-load and large payload tests (W-005, W-008)
- long-running durability and GC tests (R-005..R-008)
8. Open Design Decisions
- Whether proxy endpoint is per transport (`connection_key_hash`) or per workspace by default.
- Default session policy (`ephemeral` vs `persistent`) for `cmux ssh`.
- Exact retry/backoff budgets for reconnect on laptop sleep/wake.
- Whether daemon upgrades are eager (on connect) or lazy (on capability miss).