cmux/tests_v2/test_browser_api_extended_families.py
Lawrence Chen 50f0dd334d
Fix frozen terminals after split churn (#12)
* Fix blank terminal after split operations and add visual tests

## Blank Terminal Fix
- Add `needsRefreshAfterWindowChange` flag in GhosttyTerminalView
- Force terminal refresh when view is added to window, even if size unchanged
- Add `ghostty_surface_refresh()` call in attachToView for same-view reattachment
- Add debug logging for surface attachment lifecycle (DEBUG builds only)

## Bonsplit Migration
- Add bonsplit as local Swift package (vendor/bonsplit submodule)
- Replace custom SplitTree with BonsplitController
- Add Panel protocol with TerminalPanel and BrowserPanel implementations
- Add SidebarTab as main tab container with BonsplitController
- Remove old Splits/ directory (SplitTree, SplitView, TerminalSplitTreeView)

## Visual Screenshot Tests
- Add test_visual_screenshots.py for automated visual regression testing
- Uses in-app screenshot API (CGWindowListCreateImage) - no screen recording needed
- Generates HTML report with before/after comparisons
- Tests: splits, browser panels, focus switching, close operations, rapid cycles
- Includes annotation fields for easy feedback

## Browser Shortcut (⌘⇧B)
- Add keyboard shortcut to open browser panel in current pane
- Add openBrowser() method to TabManager
- Add shortcut configuration in KeyboardShortcutSettings

## Screenshot Command
- Add 'screenshot' command to TerminalController for in-app window capture
- Returns OK with screenshot ID and path

## Other
- Add tests/visual_output/ and tests/visual_report.html to .gitignore

* Add browser title subscription and set tab height to 30px

- Subscribe to BrowserPanel.$pageTitle changes to update bonsplit tabs
- Update tab titles in real-time as page navigation occurs
- Clean up subscriptions when panels are removed
- Set bonsplit tab bar and tab height to 30px (in submodule)

* Fix socket API regressions in list_surfaces, list_bonsplit_tabs, focus_pane

- list_surfaces: Remove [terminal]/[browser] suffix to keep UUID-only format
  that clients and tests expect for parsing
- list_bonsplit_tabs --pane: Properly look up pane by UUID instead of
  creating a new PaneID (requires bonsplit PaneID.id to be public)
- focus_pane: Accept both UUID strings and integer indices as documented
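
A minimal Python sketch of accepting either form of pane target, as the focus_pane bullet describes. The helper name `parse_pane_target` and its tagged-tuple return shape are hypothetical, and this is not the Swift code in TerminalController:

```python
import uuid
from typing import Tuple, Union


def parse_pane_target(raw: str) -> Tuple[str, Union[str, int]]:
    """Classify a pane argument as a UUID string or a non-negative index."""
    text = raw.strip()
    try:
        # uuid.UUID raises ValueError for anything that is not a valid UUID.
        return ("uuid", str(uuid.UUID(text)))
    except ValueError:
        pass
    if text.isascii() and text.isdigit():
        return ("index", int(text))
    raise ValueError(f"pane target must be a UUID or an integer index: {raw!r}")
```

Trying the UUID parse first keeps the integer branch limited to short numeric arguments such as `0` or `2`.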

* Fix browser panel stability and keyboard shortcuts

- Prevent WKWebView focus lifecycle crashes during split/view reshuffles
- Match bracket shortcuts via keyCode (Cmd+Shift+[ / ], Cmd+Ctrl+[ / ])
- Support Ghostty config goto_split:* keybinds when WebView is focused
- Add focus_webview/is_webview_focused socket commands and regression tests
- Rename SidebarTab to Workspace and update docs

* Make ctrl+enter keybind test skippable

Skip when the Ghostty keybind isn't configured or when osascript can't send keystrokes (no Accessibility permission), so VM runs stay green.

* Auto-focus browser omnibar when blank

When a browser surface is focused but no URL is loaded yet, focus the address bar instead of the WKWebView.

* Stabilize socket surface indexing

* Focus browser omnibar escape; add webview keybind UI tests

- Escape in omnibar now returns focus to WKWebView
- Add UI tests for Cmd+Ctrl+H pane navigation with WebKit focused (including Ghostty config)
- Avoid flaky element screenshots in UpdatePillUITests on the UTM VM

* Fix browser drag-to-split blanks and socket parsing

* Fix webview-focused shortcuts and stabilize browser splits

- Match ctrl/shift shortcuts by keyCode where needed (Ctrl+H, bracket keys)
- Load Ghostty goto_split triggers reliably and refresh on config load
- Add debug socket helpers: set_shortcut + simulate_shortcut for tests
- Convert browser goto_split/keybind tests to socket-based injection (no osascript)
- Bump bonsplit for drag-to-split fixes

* Fix split layout collapse and harden socket pane APIs

* Stabilize OSC 99 notification test timing

* Fix terminal focus routing after split reparent

* Support simulate_shortcut enter for focus routing test

* Stabilize terminal focus routing test

* Fix frozen new terminal tabs after many splits

* Fix frozen new terminal tabs after splits

* Fix terminal freeze on launch/new tabs

* Update ghostty submodule

* Fix terminal focus/render stalls after split churn

* Fix nested split collapsing existing pane

* Fix nested split collapse + stabilize new-surface focus

* Update bonsplit submodule

* Fix SIGINT test flake

* Remove bonsplit tab-switch crossfade

* Remove PROJECTS.md

* Remove bonsplit tab selection animation

* Ignore generated test reports

* Middle click closes tab

* Revert unintended .gitignore change

* Fix build after main merge

* Revert "Fix build after main merge"

This reverts commit 16bf9816d0856b5385d52f886aa5eb50f3c9d9a4.

* Revert "Merge remote-tracking branch 'origin/main' into fix/blank-terminal-and-visual-tests"

This reverts commit 7c20fb53fd71fea7a19a3673f2dd73e5f0c783c4, reversing
changes made to 0aff107d787bc9d8bbc28220090b4ca7af72e040.

* Remove tab close fade animation

* Use terminal.fill icon

* Make terminal tab icon smaller

* Match browser globe tab icon size

* Bonsplit: tab min width 48 and tighter close button

* Bonsplit: smaller tab title font

* Show unread notification badge in bonsplit tabs and improve UI polish

Sync unread notification state to bonsplit tab badges (blue dot).
Improve EmptyPanelView with Terminal/Browser buttons and shortcut hints.
Add tooltips to close tab button and search overlay buttons.

* Fix reload.sh single-instance safety check on macOS

Replace GNU-only `ps -o etimes=` with portable `ps -o etime=` and
parse the dd-hh:mm:ss format manually for macOS compatibility.
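
The manual parse can be sketched as follows. This is a Python illustration rather than the shell code in reload.sh, and the function name `etime_to_seconds` is hypothetical. It assumes the POSIX `etime` layout `[[dd-]hh:]mm:ss`, where the day and hour fields are optional:

```python
def etime_to_seconds(etime: str) -> int:
    """Convert `ps -o etime=` output ([[dd-]hh:]mm:ss) to elapsed seconds."""
    text = etime.strip()
    days = 0
    if "-" in text:
        # The optional day count is separated from the clock part by a dash.
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in text.split(":")]
    # Left-pad with zeros so the list is always [hours, minutes, seconds].
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds
```

For example, `etime_to_seconds("1-02:03:04")` gives 93784 and `etime_to_seconds("01:02")` gives 62.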

* Centralize keyboard shortcut definitions into Action enum

Replace per-shortcut boilerplate with a single Action enum that holds
the label, defaults key, and default binding for each shortcut. All
call sites now use shortcut(for:). Settings UI is data-driven via
ForEach(Action.allCases). Titlebar tooltips update dynamically when
shortcuts are changed. Remove duplicate .keyboardShortcut() modifiers
from menu items that are already handled by the event monitor.
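
The shape of this refactor can be illustrated with a Python analogue. The real code is Swift, so everything here is an assumption made for illustration: the member names, the defaults keys, and the `Binding` type are hypothetical, and the two bindings are borrowed from other entries in this log.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


@dataclass(frozen=True)
class Binding:
    key: str
    modifiers: frozenset


class Action(Enum):
    # Each member carries (label, defaults key, default binding).
    OPEN_BROWSER = ("Open Browser", "shortcut.openBrowser",
                    Binding("b", frozenset({"cmd", "shift"})))
    SHOW_NOTIFICATIONS = ("Show Notifications", "shortcut.showNotifications",
                          Binding("i", frozenset({"cmd"})))

    def __init__(self, label: str, defaults_key: str, default_binding: Binding) -> None:
        self.label = label
        self.defaults_key = defaults_key
        self.default_binding = default_binding


def shortcut_for(action: Action, stored: Dict[str, Binding]) -> Binding:
    """Return the user's stored binding, falling back to the action's default."""
    return stored.get(action.defaults_key, action.default_binding)
```

Iterating over `Action` plays the role of `Action.allCases`: a settings screen can render one row per member without per-shortcut boilerplate.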

* Fix WKWebView consuming app menu shortcuts and close panel confirmation

Add CmuxWebView subclass that routes key equivalents through the main
menu before WebKit, so Cmd+N/Cmd+W/tab switching work when a browser
pane is focused. Fix Cmd+W close-panel path: bypass Bonsplit delegate
gating after the user confirms the running-process dialog by tracking
forceCloseTabIds. Add unit tests (CmuxWebViewKeyEquivalentTests) and
UI test scaffolding (MenuKeyEquivalentRoutingUITests) with a new
cmux-unit Xcode scheme.

* Update CLAUDE.md and PROJECTS.md with recent changes

CLAUDE.md: enforce --tag for reload commands, add cleanup safety rules.
PROJECTS.md: log notification badge, reload.sh fix, Cmd+W fix, WebView
key equiv fix, and centralized shortcuts work.

* Keep selection index stable on close

* Add concepts page documenting terminology hierarchy

New docs page explaining Window > Workspace > Pane > Surface > Panel
hierarchy with aligned ASCII diagram. Updated tabs.mdx and splits.mdx
to use consistent terminology (workspace instead of tab, surface
instead of panel) and corrected outdated CLI command references.

* Update bonsplit submodule

* WIP: improve split close stability and UI regressions

* Close terminal panel on child exit; hide terminal dirty dot

* Fix split close/focus regressions and stabilize UI tests

* Add unread Dock/Cmd+Tab badge with settings toggle

* Fix browser-surface shortcuts and Cmd+L browser opening

* Snapshot current workspace state before regression fixes

* Update bonsplit submodule snapshot

* Stabilize split-close regression capture and sidebar resize assertions

* Change default Show Notifications shortcut from Cmd+Shift+I to Cmd+I

* Fix update check readiness race, enable release update logging, and improve checking spinner

* Restore terminal file drop, fix browser omnibar click focus, and add panel workspace ID mutation for surface moves

* Add Cmd+digit workspace hints, titlebar shortcut pills, sidebar drag-reorder, and workspace placement settings

* Add v2 browser automation API, surface move/reorder commands, and short-handle ref system to TerminalController

* Add CLI browser command surface, --id-format flag, and move/reorder commands

* Extend test clients with move/reorder APIs, ref-handle support, and increased timeouts

* Harden test runner scripts with deterministic builds, retry logic, and robust socket readiness

* Stabilize existing test suites with focus-wait helpers, increased timeouts, and API shape updates

* Add terminal file drop e2e regression test

* Add v2 browser API, CLI ref resolution, and surface move/reorder test suites

* Add unit tests for shortcut hints, workspace reorder, drop planner, and update UI test stabilization

* Add cmux-debug-windows skill with snapshot script and agent config

* Update project docs: mark browser parity and move/reorder phases complete, add parallel agent workflow guidelines

* Update bonsplit submodule: re-entrant setPosition guard, tab shortcut hints, and moveTab/reorderTab API

* Add browser agent UX improvements: snapshot refs, placement reuse, diagnostics, and skill docs

- Upgrade browser.snapshot to emit accessibility tree text with element refs (eN)
- Add right-sibling pane reuse policy for browser.open_split placement
- Add rich not_found diagnostics with retry logic for selector actions
- Support --snapshot-after for post-action verification on mutating commands
- Allow browser fill with empty text for clearing inputs
- Default CLI --id-format to refs-first (UUIDs opt-in via --id-format uuids|both)
- Format legacy new-pane/new-surface output with short surface refs
- Add skills/cmuxterm-browser/ and skills/cmuxterm/ end-user skill docs
- Add regression tests for placement policy, snapshot refs, diagnostics, and ID defaults

* Update bonsplit submodule: keep raster favicons in color when inactive
2026-02-13 16:45:31 -08:00

333 lines
15 KiB
Python

#!/usr/bin/env python3
"""Extended browser.* coverage for newly added agent-browser parity families."""

import base64
import http.server
import os
import socketserver
import sys
import tempfile
import threading
import time
from contextlib import contextmanager
from pathlib import Path

sys.path.insert(0, str(Path(__file__).parent))
from cmux import cmux, cmuxError

SOCKET_PATH = os.environ.get("CMUX_SOCKET", "/tmp/cmux-debug.sock")


def _must(cond: bool, msg: str) -> None:
    if not cond:
        raise cmuxError(msg)


def _expect_error_contains(label: str, fn, needle: str) -> None:
    try:
        fn()
    except cmuxError as exc:
        text = str(exc)
        if needle in text:
            return
        raise cmuxError(f"{label}: expected error containing {needle!r}, got: {text}")
    raise cmuxError(f"{label}: expected error containing {needle!r}, but call succeeded")


def _wait_selector(c: cmux, surface_id: str, selector: str, timeout_s: float = 6.0) -> None:
    timeout_ms = max(1, int(timeout_s * 1000.0))
    try:
        c._call("browser.wait", {"surface_id": surface_id, "selector": selector, "timeout_ms": timeout_ms})
        return
    except cmuxError as exc:
        if "timeout" not in str(exc):
            raise
    deadline = time.time() + timeout_s
    script = f"document.querySelector({selector!r}) !== null"
    while time.time() < deadline:
        probe = c._call("browser.eval", {"surface_id": surface_id, "script": script}) or {}
        if bool(probe.get("value")):
            return
        time.sleep(0.05)
    raise cmuxError(f"Timed out waiting for selector {selector}")


def _wait_function(c: cmux, surface_id: str, expression: str, timeout_s: float = 6.0) -> None:
    timeout_ms = max(1, int(timeout_s * 1000.0))
    try:
        c._call("browser.wait", {"surface_id": surface_id, "function": expression, "timeout_ms": timeout_ms})
        return
    except cmuxError as exc:
        if "timeout" not in str(exc):
            raise
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        probe = c._call("browser.eval", {"surface_id": surface_id, "script": expression}) or {}
        if bool(probe.get("value")):
            return
        time.sleep(0.05)
    raise cmuxError(f"Timed out waiting for function: {expression}")


@contextmanager
def _local_test_server() -> str:
    with tempfile.TemporaryDirectory(prefix="cmux-browser-ext-") as root:
        root_path = Path(root)
        pixel = base64.b64decode("R0lGODlhAQABAIAAAAAAAP///ywAAAAAAQABAAACAUwAOw==")
        (root_path / "tiny.gif").write_bytes(pixel)
        (root_path / "frame.html").write_text(
            """<!doctype html>
<html>
<body>
<button id="frame-btn" onclick="window.top.frameClicks = (window.top.frameClicks || 0) + 1">Frame Button</button>
<div id="frame-text">frame-ready</div>
</body>
</html>
""".strip(),
            encoding="utf-8",
        )
        (root_path / "second.html").write_text(
            """<!doctype html>
<html>
<head>
<title>cmux-browser-extended-second</title>
</head>
<body>
<div id="second">second-page</div>
<div id="style-target">style-target-second</div>
</body>
</html>
""".strip(),
            encoding="utf-8",
        )
        (root_path / "index.html").write_text(
            """<!doctype html>
<html>
<head>
<title>cmux-browser-extended</title>
<style>
#style-target { color: rgb(255, 0, 0); }
</style>
</head>
<body>
<label for="name">Agent Name</label>
<input id="name" placeholder="Type name" title="name-title" data-testid="name-field" />
<img id="hero" alt="hero image" src="/tiny.gif" />
<button id="action-btn" role="button" onclick="window.actionCount = (window.actionCount || 0) + 1; document.querySelector('#status').textContent = 'clicked';">Submit Action</button>
<div id="status">ready</div>
<ul id="rows">
<li class="row">row-1</li>
<li class="row">row-2</li>
<li class="row">row-3</li>
</ul>
<iframe id="frame-a" src="/frame.html"></iframe>
<div id="style-target">style target</div>
<script>
window.actionCount = 0;
window.frameClicks = 0;
window.triggerDialogs = function () {
confirm('confirm-message');
prompt('prompt-message', 'prompt-default');
alert('alert-message');
return true;
};
window.emitConsoleAndError = function () {
console.log('cmux-console-entry');
setTimeout(function () {
throw new Error('cmux-boom');
}, 0);
return true;
};
</script>
</body>
</html>
""".strip(),
            encoding="utf-8",
        )

        class Handler(http.server.SimpleHTTPRequestHandler):
            def __init__(self, *args, **kwargs):
                super().__init__(*args, directory=root, **kwargs)

            def log_message(self, format: str, *args) -> None:  # noqa: A003
                return

        class ThreadedTCPServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
            allow_reuse_address = True
            daemon_threads = True

        server = ThreadedTCPServer(("127.0.0.1", 0), Handler)
        thread = threading.Thread(target=server.serve_forever, daemon=True)
        thread.start()
        try:
            yield f"http://127.0.0.1:{server.server_address[1]}"
        finally:
            server.shutdown()
            server.server_close()
            thread.join(timeout=1.0)


def main() -> int:
    with _local_test_server() as base_url:
        index_url = f"{base_url}/index.html"
        second_url = f"{base_url}/second.html"
        with cmux(SOCKET_PATH) as c:
            # Open a browser surface and load the test page.
            opened = c._call("browser.open_split", {"url": "about:blank"}) or {}
            sid = str(opened.get("surface_id") or "")
            _must(bool(sid), f"browser.open_split returned no surface_id: {opened}")
            c._call("browser.navigate", {"surface_id": sid, "url": index_url})
            _wait_selector(c, sid, "#action-btn", timeout_s=7.0)

            # find.role returns an element ref that can be used as a selector.
            find_role = c._call("browser.find.role", {"surface_id": sid, "role": "button", "name": "submit"}) or {}
            role_ref = str(find_role.get("element_ref") or "")
            _must(role_ref.startswith("@e"), f"Expected element_ref from find.role: {find_role}")
            c._call("browser.click", {"surface_id": sid, "selector": role_ref})
            status = c._call("browser.get.text", {"surface_id": sid, "selector": "#status"}) or {}
            _must(str(status.get("value") or "") == "clicked", f"Expected clicked status via element ref: {status}")

            # Remaining find.* locators: each must return an element ref.
            find_cases = [
                ("browser.find.text", {"text": "row-2"}),
                ("browser.find.label", {"label": "Agent Name"}),
                ("browser.find.placeholder", {"placeholder": "Type name"}),
                ("browser.find.alt", {"alt": "hero image"}),
                ("browser.find.title", {"title": "name-title"}),
                ("browser.find.testid", {"testid": "name-field"}),
                ("browser.find.first", {"selector": "li.row"}),
                ("browser.find.last", {"selector": "li.row"}),
                ("browser.find.nth", {"selector": "li.row", "index": 1}),
            ]
            for method, extra in find_cases:
                params = {"surface_id": sid}
                params.update(extra)
                payload = c._call(method, params) or {}
                ref = str(payload.get("element_ref") or "")
                _must(ref.startswith("@e"), f"Expected element_ref from {method}: {payload}")

            # Frame selection: act inside the iframe, then return to the main frame.
            c._call("browser.frame.select", {"surface_id": sid, "selector": "#frame-a"})
            _wait_function(c, sid, "document.querySelector('#frame-text') !== null", timeout_s=7.0)
            frame_text = c._call("browser.get.text", {"surface_id": sid, "selector": "#frame-text"}) or {}
            _must(str(frame_text.get("value") or "") == "frame-ready", f"Expected frame text: {frame_text}")
            c._call("browser.click", {"surface_id": sid, "selector": "#frame-btn"})
            c._call("browser.frame.main", {"surface_id": sid})
            frame_clicks = c._call("browser.eval", {"surface_id": sid, "script": "window.frameClicks || 0"}) or {}
            _must(int(frame_clicks.get("value") or 0) >= 1, f"Expected frame click count >= 1: {frame_clicks}")

            # Dialogs: the page raises confirm, prompt, and alert in that order.
            c._call("browser.console.list", {"surface_id": sid})
            c._call("browser.addscript", {"surface_id": sid, "script": "window.triggerDialogs(); true;"})
            d1 = c._call("browser.dialog.accept", {"surface_id": sid, "text": "agent-text"}) or {}
            d2 = c._call("browser.dialog.dismiss", {"surface_id": sid}) or {}
            d3 = c._call("browser.dialog.accept", {"surface_id": sid}) or {}
            _must(bool(d1.get("accepted")) is True, f"Expected first dialog accepted: {d1}")
            _must(bool(d2.get("accepted")) is False, f"Expected second dialog dismissed: {d2}")
            _must(bool(d3.get("accepted")) is True, f"Expected third dialog accepted: {d3}")
            _expect_error_contains(
                "dialog queue empty",
                lambda: c._call("browser.dialog.dismiss", {"surface_id": sid}),
                "not_found",
            )

            # download.wait: a background thread creates the file after a short delay.
            download_path = tempfile.NamedTemporaryFile(delete=False, prefix="cmux-download-", suffix=".txt").name
            os.unlink(download_path)

            def _write_download() -> None:
                time.sleep(0.2)
                Path(download_path).write_text("downloaded", encoding="utf-8")

            t = threading.Thread(target=_write_download, daemon=True)
            t.start()
            dl = c._call("browser.download.wait", {"surface_id": sid, "path": download_path, "timeout_ms": 5000}) or {}
            _must(bool(dl.get("downloaded")) is True, f"Expected download wait success: {dl}")

            # Cookies: set, get, clear.
            c._call(
                "browser.cookies.set",
                {
                    "surface_id": sid,
                    "name": "cmux_cookie",
                    "value": "cookie_value",
                    "url": index_url,
                },
            )
            got_cookie = c._call("browser.cookies.get", {"surface_id": sid, "name": "cmux_cookie"}) or {}
            cookies = got_cookie.get("cookies") or []
            _must(any(str(row.get("name")) == "cmux_cookie" for row in cookies), f"Expected cmux_cookie in cookies.get: {got_cookie}")
            c._call("browser.cookies.clear", {"surface_id": sid, "name": "cmux_cookie"})
            got_after_clear = c._call("browser.cookies.get", {"surface_id": sid, "name": "cmux_cookie"}) or {}
            _must(len(got_after_clear.get("cookies") or []) == 0, f"Expected cookie cleared: {got_after_clear}")

            # Local and session storage: set, get, clear.
            c._call("browser.storage.set", {"surface_id": sid, "type": "local", "key": "alpha", "value": "one"})
            c._call("browser.storage.set", {"surface_id": sid, "type": "session", "key": "beta", "value": "two"})
            storage_local = c._call("browser.storage.get", {"surface_id": sid, "type": "local", "key": "alpha"}) or {}
            storage_session = c._call("browser.storage.get", {"surface_id": sid, "type": "session", "key": "beta"}) or {}
            _must(str(storage_local.get("value") or "") == "one", f"Expected local storage value: {storage_local}")
            _must(str(storage_session.get("value") or "") == "two", f"Expected session storage value: {storage_session}")
            c._call("browser.storage.clear", {"surface_id": sid, "type": "session"})
            storage_session_after = c._call("browser.storage.get", {"surface_id": sid, "type": "session", "key": "beta"}) or {}
            _must(storage_session_after.get("value") is None, f"Expected session key cleared: {storage_session_after}")

            # Tabs: new, list, switch, close.
            tabs_before = c._call("browser.tab.list", {"surface_id": sid}) or {}
            before_count = len(tabs_before.get("tabs") or [])
            tab_new = c._call("browser.tab.new", {"surface_id": sid, "url": second_url}) or {}
            sid2 = str(tab_new.get("surface_id") or "")
            _must(bool(sid2), f"Expected surface_id from browser.tab.new: {tab_new}")
            _wait_selector(c, sid2, "#second", timeout_s=7.0)
            tabs_after = c._call("browser.tab.list", {"surface_id": sid2}) or {}
            ids_after = {str(item.get("id") or "") for item in (tabs_after.get("tabs") or [])}
            _must(sid2 in ids_after and len(ids_after) >= before_count + 1, f"Expected new tab in list: {tabs_after}")
            c._call("browser.tab.switch", {"surface_id": sid2, "target_surface_id": sid})
            c._call("browser.tab.close", {"surface_id": sid, "target_surface_id": sid2})

            # Script and style injection, including an init script that survives navigation.
            addscript_payload = c._call("browser.addscript", {"surface_id": sid, "script": "1 + 2"}) or {}
            _must(int(addscript_payload.get("value") or 0) == 3, f"Expected addscript value=3: {addscript_payload}")
            c._call("browser.addstyle", {"surface_id": sid, "css": "#style-target { color: rgb(0, 128, 0); }"})
            style_color = c._call("browser.get.styles", {"surface_id": sid, "selector": "#style-target", "property": "color"}) or {}
            _must("0, 128, 0" in str(style_color.get("value") or ""), f"Expected updated style color: {style_color}")
            c._call("browser.addinitscript", {"surface_id": sid, "script": "window.__cmuxInitMarker = 'init-ok';"})
            c._call("browser.navigate", {"surface_id": sid, "url": second_url})
            _wait_selector(c, sid, "#second", timeout_s=7.0)
            init_value = c._call("browser.eval", {"surface_id": sid, "script": "window.__cmuxInitMarker || ''"}) or {}
            _must(str(init_value.get("value") or "") == "init-ok", f"Expected init script marker after navigation: {init_value}")

            # Console and error capture.
            c._call("browser.navigate", {"surface_id": sid, "url": index_url})
            _wait_selector(c, sid, "#action-btn", timeout_s=7.0)
            c._call("browser.console.list", {"surface_id": sid})
            c._call("browser.addscript", {"surface_id": sid, "script": "window.emitConsoleAndError();"})
            time.sleep(0.35)
            console_entries = c._call("browser.console.list", {"surface_id": sid}) or {}
            errors_entries = c._call("browser.errors.list", {"surface_id": sid}) or {}
            _must(int(console_entries.get("count") or 0) >= 1, f"Expected console entries: {console_entries}")
            _must(int(errors_entries.get("count") or 0) >= 1, f"Expected error entries: {errors_entries}")
            c._call("browser.console.clear", {"surface_id": sid})
            console_after = c._call("browser.console.list", {"surface_id": sid}) or {}
            _must(int(console_after.get("count") or 0) == 0, f"Expected cleared console entries: {console_after}")
            c._call("browser.highlight", {"surface_id": sid, "selector": "#action-btn"})

            # State save/load: a saved localStorage value must be restored over a later write.
            state_path = tempfile.NamedTemporaryFile(delete=False, prefix="cmux-state-", suffix=".json").name
            c._call("browser.storage.set", {"surface_id": sid, "type": "local", "key": "persist", "value": "yes"})
            c._call("browser.state.save", {"surface_id": sid, "path": state_path})
            c._call("browser.storage.set", {"surface_id": sid, "type": "local", "key": "persist", "value": "no"})
            c._call("browser.state.load", {"surface_id": sid, "path": state_path})
            persisted = c._call("browser.storage.get", {"surface_id": sid, "type": "local", "key": "persist"}) or {}
            _must(str(persisted.get("value") or "") == "yes", f"Expected state.load to restore storage key: {persisted}")
    print("PASS: extended browser parity families are green")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())