* Add E2E test workflow with video recording and issue posting
New workflow_dispatch workflow (test-e2e.yml) that runs XCUITests on
GitHub-hosted macos-15 runners, records the virtual display, uploads the
video as an artifact, and posts results as an issue on cmux-dev-artifacts.
Includes scripts/run-e2e.sh for convenient triggering from the terminal.
* Print issue URL in workflow annotation and run-e2e.sh output
- Capture gh issue create output URL, print as ::notice annotation
- Search issues by run ID instead of grabbing most recent
Build immediately on merge instead of waiting for the hourly cron.
Concurrency group cancels in-progress builds when new commits land.
Depot macos runner replaced with GitHub macos-15 (similar perf, simpler).
* Split CI: GitHub runners for tests, Depot for perf regression
Unit/UI tests move to macos-15 (no queue wait, fast enough for test
suites). Typing-lag regression stays on Depot in a new tests-depot job
(needs stronger hardware). No duplicated test work between the two.
* Fix Xcode selection: add pipefail guard, use sort|tail for consistency
Address review comments:
- tests job: add || true to ls pipeline so fallback works under pipefail
- tests-depot job: use sort | tail -n 1 instead of head -n 1
* Move XCUITests from GitHub runner to Depot
tests (macos-15) now runs unit tests only. tests-depot (Depot) runs
UI tests and the typing-lag regression, reusing the same build.
* Add workspace-churn typing lag regression and fix
* Fix CI build for debug stress split calls
* Stabilize lag regression gate for low baseline latency
* Add macOS compatibility CI: unit tests + smoke test on macos-14/15
New workflow runs on GitHub-hosted macos-14 and macos-15 runners
(matrix strategy). Each run: unit tests via cmux-unit scheme, then
a smoke test that builds the app, launches it, sends a command via
the socket, and verifies it stays alive for 15 seconds.
* Select latest Xcode on runner (fix macos-14 Swift tools version)
macos-14 runners default to Xcode 15.4, but sentry-cocoa needs
Swift tools version 6.0 (Xcode 16+). Pick the latest Xcode_*.app
instead of the default symlink.
* Launch app binary directly in smoke test for better CI compatibility
Using `open` can fail silently on CI runners. Launch the binary
directly with env vars set, capture stdout/stderr, and add process
health checks with diagnostic output (debug log tail, crash reports)
on failure.
* Migrate all workflows from self-hosted Mac Mini to Depot runners
Move CI, nightly, and release workflows to depot-macos-latest. Replace
zig GhosttyKit builds with pre-built xcframework downloads. Add virtual
display for CI UI tests. Remove concurrency groups (ephemeral VMs don't
need them).
* Add per-test timeout to CI UI tests to prevent hangs on Depot
SidebarResizeUITests hangs on headless Depot runners due to mouse drag
simulation issues. Adding -maximum-test-execution-time-allowance 120
(matching test-depot.yml) ensures individual tests timeout after 2 min
instead of blocking the entire run.
* Skip SidebarResizeUITests in CI on Depot runners
Mouse drag simulation hangs on headless Depot runners even with a
virtual display. The per-test timeout doesn't prevent the hang either.
Skip this test class in CI; it still runs fine on local machines.
* Handle XCTExpectFailure in CI UI tests (exit 65 with 0 unexpected)
xcodebuild exits 65 even when all failures use XCTExpectFailure. Add
the same expected-failure handling from the unit test step so browser
focus tests (which are expected to fail on headless runners) don't
break CI.
* Add virtual display for headless Depot runners
Depot macOS runners have no physical display, causing XCUITests to fail
with "Failed to activate application (current state: Running Background)".
This adds a small ObjC tool that creates a virtual display using the
private CGVirtualDisplay API before tests run.
* Split self-hosted concurrency groups per workflow
CI, nightly, and release all shared `self-hosted-build`, so the hourly
nightly cancelled in-progress CI runs. Now each workflow has its own
group (self-hosted-ci, self-hosted-nightly, self-hosted-release).
CI also gets cancel-in-progress: true so rapid pushes cancel stale runs.
Depot macOS runners have no physical display, causing XCUITests to fail
with "Failed to activate application (current state: Running Background)".
This adds a small ObjC tool that creates a virtual display using the
private CGVirtualDisplay API before tests run.
- Display diagnostics step to debug headless display issues
- test_filter input to run specific XCUITest classes
- test_timeout input (default 120s) to prevent test hangs
* Run all XCUITests in CI instead of just UpdatePillUITests
Removes the -only-testing:cmuxUITests/UpdatePillUITests filter so
new test classes are picked up automatically. No more workflow edits
needed when adding tests.
* Add skip_unit_tests input to test-depot workflow
After rm -rf of the SPM cache dir, recreate it as an empty directory
so binary target downloads (e.g. Sentry.xcframework.zip) don't hit
"already exists in file system" errors from stale artifacts on the
self-hosted runner.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
build-ghosttykit shared the self-hosted-build concurrency group with
CI tests, causing one to get cancelled when both trigger on the same
push. Most runs are no-ops (xcframework already exists), so Depot is
a good fit. Eliminates the red X on pushes to main.
* Switch CI and Build GhosttyKit to Depot macOS runners
Moves the tests job (unit + UI) and build-ghosttykit job from
self-hosted to depot-macos-latest so CI doesn't block the local
Mac Mini. CI tests now download the pre-built xcframework from
the ghostty releases instead of building with zig. Nightly and
release workflows stay on self-hosted for signing/notarization.
* Add on-demand Depot test workflow, revert CI changes
Adds test-depot.yml (workflow_dispatch) for running tests on Depot
macOS runners during local dev without tying up the Mac Mini.
Reverts ci.yml and build-ghosttykit.yml back to self-hosted.
* Add `cmux <path>` to open directories and Homebrew binary stanza
CLI: `cmux .` or `cmux /path/to/dir` opens a new workspace at the
given directory. If the app isn't running, it launches first and waits
for the socket. Also adds `--cwd` flag to `new-workspace`.
Server: `workspace.create` now accepts an optional `cwd` parameter,
passed through to `TabManager.addWorkspace(workingDirectory:)`.
Homebrew: adds `binary` stanza to the cask so `cmux` CLI is globally
available after `brew install --cask cmux`. Updated both the cask file,
the CI workflow template, and the manual release script so automated
version bumps preserve the stanza.
* Address review: validate cwd type, fix socket detection, propagate errors
- looksLikePath now also matches paths containing `/` (e.g. `foo/bar`)
- openPath uses socket connection attempt instead of fileExists to detect
whether the app is running (Unix sockets may not appear on filesystem)
- launchApp/activateApp now throw instead of swallowing errors with try?
- Server validates that cwd param is a string, returns invalid_params error
if wrong type is passed
* Set cmux TestAction to Debug for UI tests
* Broaden XCTest detection for debug launch gate
* Fix AutomationSocketUITests launch hang in CI
* Stabilize CI Swift package resolution for test jobs
* Stabilize Xcode Cloud UI test focus and socket handling
* Add Xcode Cloud pre-xcodebuild submodule bootstrap
* Harden Xcode Cloud bonsplit bootstrap fallback
* Set up full test suite in CI and Xcode Cloud
Add build-ghosttykit.yml workflow to pre-build and publish
GhosttyKit.xcframework as a GitHub release on manaflow-ai/ghostty,
keyed by submodule SHA. Add ci_scripts/ci_post_clone.sh for Xcode
Cloud to download the pre-built xcframework with retry logic. Create
cmux-ci scheme that runs both cmuxTests and cmuxUITests. Switch the
CI tests job from running a single UI test class to the full suite.
* Run unit tests + single UI test class on self-hosted runner
The self-hosted runner can't launch the full app for UI tests (no GUI
session), so run all unit tests via cmux-unit scheme and keep the
original UpdatePillUITests as a smoke test. Full UI test suite runs
on Xcode Cloud which has proper macOS GUI support.
* Handle expected test failures in unit tests step
xcodebuild returns exit code 65 even for expected failures
(XCTExpectFailure). Parse the summary line to only fail the CI job
when there are unexpected failures.
XCUITest launches the app as a separate process that doesn't inherit
XCTest env vars (XCTestConfigurationFilePath, etc.), so
isRunningUnderXCTest() returns false. The app then hits
shouldBlockUntaggedDebugLaunch() and exits with code 64, causing the
test runner to hang waiting for the app to launch.
Fix: detect CMUX_UI_TEST_* env vars set via XCUIApplication.launchEnvironment
and skip the launch guard. Also revert the failed CMUX_TAG ci.yml workaround.
Debug builds refuse to launch without CMUX_TAG to prevent accidental
untagged launches. XCUITest launches the app as a separate process
without XCTest env vars, so the app's isRunningUnderXCTest() check
fails and the app calls exit(64). The test runner then waits forever
for the app to report back. Set CMUX_TAG=ci in the CI env.
The Mac Mini runner has no dev certificates, so xcodebuild produces
an ad-hoc signed app. Gatekeeper rejects it, causing XCUITest to hang
forever at app.launch(). Split build and test phases, strip the ad-hoc
signature between them so the app can launch.
Org policy now requires actions pinned to immutable SHAs instead of
mutable version tags. Pin actions/checkout, actions/github-script,
softprops/action-gh-release, and oven-sh/setup-bun across all workflows.
* Upgrade Sentry: tracing, breadcrumbs, dSYM upload
- Enhanced Sentry SDK init with performance tracing (10% sample),
explicit app hang timeout, stack trace attachment, and HTTP
failure capture
- Added breadcrumbs for key user actions: workspace switch/create/close,
split creation, command palette open/close, app focus — these give
context to hang/crash reports
- Added dSYM upload step to nightly and release CI workflows so hang
stacks are fully symbolicated (requires SENTRY_AUTH_TOKEN secret)
- Created SentryHelper.swift with lightweight breadcrumb helper
Closes https://github.com/manaflow-ai/cmux/issues/365
* Remove command palette breadcrumbs
Not useful for hang diagnosis — keep only workspace/tab/split/focus
breadcrumbs that correlate with heavy operations.
The decide job already skips when main HEAD matches the nightly tag,
so this only builds when there are actual changes. Hourly means users
get nightly updates within an hour of merging to main.
Follows the same pattern as AppIcon-Debug (orange DEV banner) but with
a purple banner and "NIGHTLY" text. The nightly CI workflow now passes
ASSETCATALOG_COMPILER_APPICON_NAME=AppIcon-Nightly to xcodebuild so
the nightly app gets its own distinct icon.
Includes scripts/generate_nightly_icon.py for regenerating the icons
from the production AppIcon source files.
The nightly build is now a distinct app called "cmux NIGHTLY" with
bundle ID com.cmuxterm.app.nightly, allowing side-by-side installation
with the stable release. The nightly appcast URL is baked into the
app's Info.plist by CI, so no in-app channel switching is needed.
- Nightly workflow: rename app to "cmux NIGHTLY", set bundle ID to
com.cmuxterm.app.nightly, hardcode nightly Sparkle feed URL, publish
DMG as cmux-nightly-macos.dmg
- Remove "Receive Nightly Builds" toggle from settings
- Remove UpdateChannelSettings enum and simplify feed URL resolution
to just use SUFeedURL from Info.plist
- Remove UpdateChannelSettingsTests (no longer applicable)
Root cause: update-homebrew.yml triggered on release:published, which fires
before softprops/action-gh-release finishes uploading assets. The workflow
downloaded a 404 page instead of the DMG and committed its SHA.
Fix:
- Change trigger from release:published to workflow_run (fires after the
release workflow completes, guaranteeing assets are uploaded)
- Add download validation with retries and file size checks
- Add SHA verification step before committing to the cask
- Add homebrew cask update to build-sign-upload.sh for local releases
- Add regression test (tests/test_homebrew_sha.sh)
- Update /release and /release-local skills with homebrew verification steps
Fixes#110
Remove -Dxcframework-target=native from CI and release workflows,
defaulting to universal (matching upstream ghostty). The native target
produces a macos-arm64 xcframework slice that causes Xcode to link the
final binary differently (~70KB more __text), resulting in menubar and
right-click lag on M1 Max. The arm64 static libraries are byte-for-byte
identical between native and universal builds - the difference is purely
in how Xcode resolves the xcframework slice.
The -Dxcframework-target flag controls platform slices (native=macOS
only, universal=macOS+iOS), not CPU microarchitecture. Removing it
caused CI to attempt iOS builds which fail due to missing Metal
iOS toolchain on the runner.
* Fix zsh git branch refresh race after cwd change
* Clarify intentional duplicate cwd check in git refresh path
* Add Metal Toolchain download step to CI and release workflows
Fixes build failure when compiling Metal shaders for iOS xcframework
targets — the self-hosted runner needs `xcodebuild -downloadComponent
MetalToolchain` installed before `xcrun metal` can run.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The self-hosted runner (M4 Mac Mini) was building GhosttyKit with
-Dxcframework-target=native, producing M4-tuned binaries. This caused
menubar and right-click lag on M1 machines. Dropping the flag defaults
to universal, which works well across all Apple Silicon chips.
Hardened runtime's library validation was verifying every dylib on load,
causing noticeable UI lag. Add entitlements file with
disable-library-validation to fix while keeping notarization support.
* Fix blank terminal after split operations and add visual tests
## Blank Terminal Fix
- Add `needsRefreshAfterWindowChange` flag in GhosttyTerminalView
- Force terminal refresh when view is added to window, even if size unchanged
- Add `ghostty_surface_refresh()` call in attachToView for same-view reattachment
- Add debug logging for surface attachment lifecycle (DEBUG builds only)
## Bonsplit Migration
- Add bonsplit as local Swift package (vendor/bonsplit submodule)
- Replace custom SplitTree with BonsplitController
- Add Panel protocol with TerminalPanel and BrowserPanel implementations
- Add SidebarTab as main tab container with BonsplitController
- Remove old Splits/ directory (SplitTree, SplitView, TerminalSplitTreeView)
## Visual Screenshot Tests
- Add test_visual_screenshots.py for automated visual regression testing
- Uses in-app screenshot API (CGWindowListCreateImage) - no screen recording needed
- Generates HTML report with before/after comparisons
- Tests: splits, browser panels, focus switching, close operations, rapid cycles
- Includes annotation fields for easy feedback
## Browser Shortcut (⌘⇧B)
- Add keyboard shortcut to open browser panel in current pane
- Add openBrowser() method to TabManager
- Add shortcut configuration in KeyboardShortcutSettings
## Screenshot Command
- Add 'screenshot' command to TerminalController for in-app window capture
- Returns OK with screenshot ID and path
## Other
- Add tests/visual_output/ and tests/visual_report.html to .gitignore
* Add browser title subscription and set tab height to 30px
- Subscribe to BrowserPanel.$pageTitle changes to update bonsplit tabs
- Update tab titles in real-time as page navigation occurs
- Clean up subscriptions when panels are removed
- Set bonsplit tab bar and tab height to 30px (in submodule)
* Fix socket API regressions in list_surfaces, list_bonsplit_tabs, focus_pane
- list_surfaces: Remove [terminal]/[browser] suffix to keep UUID-only format
that clients and tests expect for parsing
- list_bonsplit_tabs --pane: Properly look up pane by UUID instead of
creating a new PaneID (requires bonsplit PaneID.id to be public)
- focus_pane: Accept both UUID strings and integer indices as documented
* Fix browser panel stability and keyboard shortcuts
- Prevent WKWebView focus lifecycle crashes during split/view reshuffles
- Match bracket shortcuts via keyCode (Cmd+Shift+[ / ], Cmd+Ctrl+[ / ])
- Support Ghostty config goto_split:* keybinds when WebView is focused
- Add focus_webview/is_webview_focused socket commands and regression tests
- Rename SidebarTab to Workspace and update docs
* Make ctrl+enter keybind test skippable
Skip when the Ghostty keybind isn't configured or when osascript can't send keystrokes (no Accessibility permission), so VM runs stay green.
* Auto-focus browser omnibar when blank
When a browser surface is focused but no URL is loaded yet, focus the address bar instead of the WKWebView.
* Stabilize socket surface indexing
* Focus browser omnibar escape; add webview keybind UI tests
- Escape in omnibar now returns focus to WKWebView\n- Add UI tests for Cmd+Ctrl+H pane navigation with WebKit focused (including Ghostty config)\n- Avoid flaky element screenshots in UpdatePillUITests on the UTM VM
* Fix browser drag-to-split blanks and socket parsing
* Fix webview-focused shortcuts and stabilize browser splits
- Match ctrl/shift shortcuts by keyCode where needed (Ctrl+H, bracket keys)
- Load Ghostty goto_split triggers reliably and refresh on config load
- Add debug socket helpers: set_shortcut + simulate_shortcut for tests
- Convert browser goto_split/keybind tests to socket-based injection (no osascript)
- Bump bonsplit for drag-to-split fixes
* Fix split layout collapse and harden socket pane APIs
* Stabilize OSC 99 notification test timing
* Fix terminal focus routing after split reparent
* Support simulate_shortcut enter for focus routing test
* Stabilize terminal focus routing test
* Fix frozen new terminal tabs after many splits
* Fix frozen new terminal tabs after splits
* Fix terminal freeze on launch/new tabs
* Update ghostty submodule
* Fix terminal focus/render stalls after split churn
* Fix nested split collapsing existing pane
* Fix nested split collapse + stabilize new-surface focus
* Update bonsplit submodule
* Fix SIGINT test flake
* Remove bonsplit tab-switch crossfade
* Remove PROJECTS.md
* Remove bonsplit tab selection animation
* Ignore generated test reports
* Middle click closes tab
* Revert unintended .gitignore change
* Fix build after main merge
* Revert "Fix build after main merge"
This reverts commit 16bf9816d0856b5385d52f886aa5eb50f3c9d9a4.
* Revert "Merge remote-tracking branch 'origin/main' into fix/blank-terminal-and-visual-tests"
This reverts commit 7c20fb53fd71fea7a19a3673f2dd73e5f0c783c4, reversing
changes made to 0aff107d787bc9d8bbc28220090b4ca7af72e040.
* Remove tab close fade animation
* Use terminal.fill icon
* Make terminal tab icon smaller
* Match browser globe tab icon size
* Bonsplit: tab min width 48 and tighter close button
* Bonsplit: smaller tab title font
* Show unread notification badge in bonsplit tabs and improve UI polish
Sync unread notification state to bonsplit tab badges (blue dot).
Improve EmptyPanelView with Terminal/Browser buttons and shortcut hints.
Add tooltips to close tab button and search overlay buttons.
* Fix reload.sh single-instance safety check on macOS
Replace GNU-only `ps -o etimes=` with portable `ps -o etime=` and
parse the dd-hh:mm:ss format manually for macOS compatibility.
* Centralize keyboard shortcut definitions into Action enum
Replace per-shortcut boilerplate with a single Action enum that holds
the label, defaults key, and default binding for each shortcut. All
call sites now use shortcut(for:). Settings UI is data-driven via
ForEach(Action.allCases). Titlebar tooltips update dynamically when
shortcuts are changed. Remove duplicate .keyboardShortcut() modifiers
from menu items that are already handled by the event monitor.
* Fix WKWebView consuming app menu shortcuts and close panel confirmation
Add CmuxWebView subclass that routes key equivalents through the main
menu before WebKit, so Cmd+N/Cmd+W/tab switching work when a browser
pane is focused. Fix Cmd+W close-panel path: bypass Bonsplit delegate
gating after the user confirms the running-process dialog by tracking
forceCloseTabIds. Add unit tests (CmuxWebViewKeyEquivalentTests) and
UI test scaffolding (MenuKeyEquivalentRoutingUITests) with a new
cmux-unit Xcode scheme.
* Update CLAUDE.md and PROJECTS.md with recent changes
CLAUDE.md: enforce --tag for reload commands, add cleanup safety rules.
PROJECTS.md: log notification badge, reload.sh fix, Cmd+W fix, WebView
key equiv fix, and centralized shortcuts work.
* Keep selection index stable on close
* Add concepts page documenting terminology hierarchy
New docs page explaining Window > Workspace > Pane > Surface > Panel
hierarchy with aligned ASCII diagram. Updated tabs.mdx and splits.mdx
to use consistent terminology (workspace instead of tab, surface
instead of panel) and corrected outdated CLI command references.
* Update bonsplit submodule
* WIP: improve split close stability and UI regressions
* Close terminal panel on child exit; hide terminal dirty dot
* Fix split close/focus regressions and stabilize UI tests
* Add unread Dock/Cmd+Tab badge with settings toggle
* Fix browser-surface shortcuts and Cmd+L browser opening
* Snapshot current workspace state before regression fixes
* Update bonsplit submodule snapshot
* Stabilize split-close regression capture and sidebar resize assertions
* Change default Show Notifications shortcut from Cmd+Shift+I to Cmd+I
* Fix update check readiness race, enable release update logging, and improve checking spinner
* Restore terminal file drop, fix browser omnibar click focus, and add panel workspace ID mutation for surface moves
* Add Cmd+digit workspace hints, titlebar shortcut pills, sidebar drag-reorder, and workspace placement settings
* Add v2 browser automation API, surface move/reorder commands, and short-handle ref system to TerminalController
* Add CLI browser command surface, --id-format flag, and move/reorder commands
* Extend test clients with move/reorder APIs, ref-handle support, and increased timeouts
* Harden test runner scripts with deterministic builds, retry logic, and robust socket readiness
* Stabilize existing test suites with focus-wait helpers, increased timeouts, and API shape updates
* Add terminal file drop e2e regression test
* Add v2 browser API, CLI ref resolution, and surface move/reorder test suites
* Add unit tests for shortcut hints, workspace reorder, drop planner, and update UI test stabilization
* Add cmux-debug-windows skill with snapshot script and agent config
* Update project docs: mark browser parity and move/reorder phases complete, add parallel agent workflow guidelines
* Update bonsplit submodule: re-entrant setPosition guard, tab shortcut hints, and moveTab/reorderTab API
* Add browser agent UX improvements: snapshot refs, placement reuse, diagnostics, and skill docs
- Upgrade browser.snapshot to emit accessibility tree text with element refs (eN)
- Add right-sibling pane reuse policy for browser.open_split placement
- Add rich not_found diagnostics with retry logic for selector actions
- Support --snapshot-after for post-action verification on mutating commands
- Allow browser fill with empty text for clearing inputs
- Default CLI --id-format to refs-first (UUIDs opt-in via --id-format uuids|both)
- Format legacy new-pane/new-surface output with short surface refs
- Add skills/cmuxterm-browser/ and skills/cmuxterm/ end-user skill docs
- Add regression tests for placement policy, snapshot refs, diagnostics, and ID defaults
* Update bonsplit submodule: keep raster favicons in color when inactive