cmux/tests_v2/test_notifications.py
Lawrence Chen 50f0dd334d
Fix frozen terminals after split churn (#12)
* Fix blank terminal after split operations and add visual tests

## Blank Terminal Fix
- Add `needsRefreshAfterWindowChange` flag in GhosttyTerminalView
- Force terminal refresh when view is added to window, even if size unchanged
- Add `ghostty_surface_refresh()` call in attachToView for same-view reattachment
- Add debug logging for surface attachment lifecycle (DEBUG builds only)

## Bonsplit Migration
- Add bonsplit as local Swift package (vendor/bonsplit submodule)
- Replace custom SplitTree with BonsplitController
- Add Panel protocol with TerminalPanel and BrowserPanel implementations
- Add SidebarTab as main tab container with BonsplitController
- Remove old Splits/ directory (SplitTree, SplitView, TerminalSplitTreeView)

## Visual Screenshot Tests
- Add test_visual_screenshots.py for automated visual regression testing
- Uses in-app screenshot API (CGWindowListCreateImage) - no screen recording needed
- Generates HTML report with before/after comparisons
- Tests: splits, browser panels, focus switching, close operations, rapid cycles
- Includes annotation fields for easy feedback

## Browser Shortcut (⌘⇧B)
- Add keyboard shortcut to open browser panel in current pane
- Add openBrowser() method to TabManager
- Add shortcut configuration in KeyboardShortcutSettings

## Screenshot Command
- Add 'screenshot' command to TerminalController for in-app window capture
- Returns OK with screenshot ID and path

## Other
- Add tests/visual_output/ and tests/visual_report.html to .gitignore

* Add browser title subscription and set tab height to 30px

- Subscribe to BrowserPanel.$pageTitle changes to update bonsplit tabs
- Update tab titles in real-time as page navigation occurs
- Clean up subscriptions when panels are removed
- Set bonsplit tab bar and tab height to 30px (in submodule)

* Fix socket API regressions in list_surfaces, list_bonsplit_tabs, focus_pane

- list_surfaces: Remove [terminal]/[browser] suffix to keep UUID-only format
  that clients and tests expect for parsing
- list_bonsplit_tabs --pane: Properly look up pane by UUID instead of
  creating a new PaneID (requires bonsplit PaneID.id to be public)
- focus_pane: Accept both UUID strings and integer indices as documented
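The dual UUID/index argument handling described for `focus_pane` can be sketched in Python as follows. This is only an illustration of the parsing rule, not the app's Swift implementation; the helper name `parse_pane_ref` is hypothetical.

```python
import uuid
from typing import Union


def parse_pane_ref(arg: str) -> Union[uuid.UUID, int]:
    """Interpret a pane argument as a UUID if possible, else as an integer index.

    Hypothetical helper: the name and return shape are assumptions for illustration.
    """
    text = arg.strip()
    try:
        # uuid.UUID raises ValueError for anything that is not a well-formed UUID.
        return uuid.UUID(text)
    except ValueError:
        pass
    if text.isdecimal():
        # Plain non-negative integer: treat it as a positional pane index.
        return int(text)
    raise ValueError(f"Not a pane UUID or index: {arg!r}")
```

Trying the UUID form first keeps the index path limited to short decimal strings, so a caller can pass either form without a separate flag.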

* Fix browser panel stability and keyboard shortcuts

- Prevent WKWebView focus lifecycle crashes during split/view reshuffles
- Match bracket shortcuts via keyCode (Cmd+Shift+[ / ], Cmd+Ctrl+[ / ])
- Support Ghostty config goto_split:* keybinds when WebView is focused
- Add focus_webview/is_webview_focused socket commands and regression tests
- Rename SidebarTab to Workspace and update docs

* Make ctrl+enter keybind test skippable

Skip when the Ghostty keybind isn't configured or when osascript can't send keystrokes (no Accessibility permission), so VM runs stay green.

* Auto-focus browser omnibar when blank

When a browser surface is focused but no URL is loaded yet, focus the address bar instead of the WKWebView.

* Stabilize socket surface indexing

* Focus browser omnibar escape; add webview keybind UI tests

- Escape in omnibar now returns focus to WKWebView
- Add UI tests for Cmd+Ctrl+H pane navigation with WebKit focused (including Ghostty config)
- Avoid flaky element screenshots in UpdatePillUITests on the UTM VM

* Fix browser drag-to-split blanks and socket parsing

* Fix webview-focused shortcuts and stabilize browser splits

- Match ctrl/shift shortcuts by keyCode where needed (Ctrl+H, bracket keys)
- Load Ghostty goto_split triggers reliably and refresh on config load
- Add debug socket helpers: set_shortcut + simulate_shortcut for tests
- Convert browser goto_split/keybind tests to socket-based injection (no osascript)
- Bump bonsplit for drag-to-split fixes

* Fix split layout collapse and harden socket pane APIs

* Stabilize OSC 99 notification test timing

* Fix terminal focus routing after split reparent

* Support simulate_shortcut enter for focus routing test

* Stabilize terminal focus routing test

* Fix frozen new terminal tabs after many splits

* Fix frozen new terminal tabs after splits

* Fix terminal freeze on launch/new tabs

* Update ghostty submodule

* Fix terminal focus/render stalls after split churn

* Fix nested split collapsing existing pane

* Fix nested split collapse + stabilize new-surface focus

* Update bonsplit submodule

* Fix SIGINT test flake

* Remove bonsplit tab-switch crossfade

* Remove PROJECTS.md

* Remove bonsplit tab selection animation

* Ignore generated test reports

* Middle click closes tab

* Revert unintended .gitignore change

* Fix build after main merge

* Revert "Fix build after main merge"

This reverts commit 16bf9816d0856b5385d52f886aa5eb50f3c9d9a4.

* Revert "Merge remote-tracking branch 'origin/main' into fix/blank-terminal-and-visual-tests"

This reverts commit 7c20fb53fd71fea7a19a3673f2dd73e5f0c783c4, reversing
changes made to 0aff107d787bc9d8bbc28220090b4ca7af72e040.

* Remove tab close fade animation

* Use terminal.fill icon

* Make terminal tab icon smaller

* Match browser globe tab icon size

* Bonsplit: tab min width 48 and tighter close button

* Bonsplit: smaller tab title font

* Show unread notification badge in bonsplit tabs and improve UI polish

Sync unread notification state to bonsplit tab badges (blue dot).
Improve EmptyPanelView with Terminal/Browser buttons and shortcut hints.
Add tooltips to close tab button and search overlay buttons.

* Fix reload.sh single-instance safety check on macOS

Replace GNU-only `ps -o etimes=` with portable `ps -o etime=` and
parse the dd-hh:mm:ss format manually for macOS compatibility.
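For illustration, parsing the `[[dd-]hh:]mm:ss` elapsed-time format printed by `ps -o etime=` might look like the sketch below. It is written in Python to match the test file and is not the shell code in reload.sh; the function name `etime_to_seconds` is hypothetical.

```python
def etime_to_seconds(etime: str) -> int:
    """Convert a ps etime value ([[dd-]hh:]mm:ss) to whole seconds.

    Hypothetical helper, shown only to illustrate the parsing rule.
    """
    text = etime.strip()
    days = 0
    if "-" in text:
        # An optional day count precedes the clock fields, separated by "-".
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in text.split(":")]
    # Hours are omitted for short-lived processes; pad so we always have h, m, s.
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds
```

Padding the missing leading fields with zeros lets one code path handle `mm:ss`, `hh:mm:ss`, and `dd-hh:mm:ss`.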

* Centralize keyboard shortcut definitions into Action enum

Replace per-shortcut boilerplate with a single Action enum that holds
the label, defaults key, and default binding for each shortcut. All
call sites now use shortcut(for:). Settings UI is data-driven via
ForEach(Action.allCases). Titlebar tooltips update dynamically when
shortcuts are changed. Remove duplicate .keyboardShortcut() modifiers
from menu items that are already handled by the event monitor.

* Fix WKWebView consuming app menu shortcuts and close panel confirmation

Add CmuxWebView subclass that routes key equivalents through the main
menu before WebKit, so Cmd+N/Cmd+W/tab switching work when a browser
pane is focused. Fix Cmd+W close-panel path: bypass Bonsplit delegate
gating after the user confirms the running-process dialog by tracking
forceCloseTabIds. Add unit tests (CmuxWebViewKeyEquivalentTests) and
UI test scaffolding (MenuKeyEquivalentRoutingUITests) with a new
cmux-unit Xcode scheme.

* Update CLAUDE.md and PROJECTS.md with recent changes

CLAUDE.md: enforce --tag for reload commands, add cleanup safety rules.
PROJECTS.md: log notification badge, reload.sh fix, Cmd+W fix, WebView
key equiv fix, and centralized shortcuts work.

* Keep selection index stable on close

* Add concepts page documenting terminology hierarchy

New docs page explaining Window > Workspace > Pane > Surface > Panel
hierarchy with aligned ASCII diagram. Updated tabs.mdx and splits.mdx
to use consistent terminology (workspace instead of tab, surface
instead of panel) and corrected outdated CLI command references.

* Update bonsplit submodule

* WIP: improve split close stability and UI regressions

* Close terminal panel on child exit; hide terminal dirty dot

* Fix split close/focus regressions and stabilize UI tests

* Add unread Dock/Cmd+Tab badge with settings toggle

* Fix browser-surface shortcuts and Cmd+L browser opening

* Snapshot current workspace state before regression fixes

* Update bonsplit submodule snapshot

* Stabilize split-close regression capture and sidebar resize assertions

* Change default Show Notifications shortcut from Cmd+Shift+I to Cmd+I

* Fix update check readiness race, enable release update logging, and improve checking spinner

* Restore terminal file drop, fix browser omnibar click focus, and add panel workspace ID mutation for surface moves

* Add Cmd+digit workspace hints, titlebar shortcut pills, sidebar drag-reorder, and workspace placement settings

* Add v2 browser automation API, surface move/reorder commands, and short-handle ref system to TerminalController

* Add CLI browser command surface, --id-format flag, and move/reorder commands

* Extend test clients with move/reorder APIs, ref-handle support, and increased timeouts

* Harden test runner scripts with deterministic builds, retry logic, and robust socket readiness

* Stabilize existing test suites with focus-wait helpers, increased timeouts, and API shape updates

* Add terminal file drop e2e regression test

* Add v2 browser API, CLI ref resolution, and surface move/reorder test suites

* Add unit tests for shortcut hints, workspace reorder, drop planner, and update UI test stabilization

* Add cmux-debug-windows skill with snapshot script and agent config

* Update project docs: mark browser parity and move/reorder phases complete, add parallel agent workflow guidelines

* Update bonsplit submodule: re-entrant setPosition guard, tab shortcut hints, and moveTab/reorderTab API

* Add browser agent UX improvements: snapshot refs, placement reuse, diagnostics, and skill docs

- Upgrade browser.snapshot to emit accessibility tree text with element refs (eN)
- Add right-sibling pane reuse policy for browser.open_split placement
- Add rich not_found diagnostics with retry logic for selector actions
- Support --snapshot-after for post-action verification on mutating commands
- Allow browser fill with empty text for clearing inputs
- Default CLI --id-format to refs-first (UUIDs opt-in via --id-format uuids|both)
- Format legacy new-pane/new-surface output with short surface refs
- Add skills/cmuxterm-browser/ and skills/cmuxterm/ end-user skill docs
- Add regression tests for placement policy, snapshot refs, diagnostics, and ID defaults

* Update bonsplit submodule: keep raster favicons in color when inactive
2026-02-13 16:45:31 -08:00


#!/usr/bin/env python3
"""
Automated tests for notification focus/suppression behavior.

Usage:
    python3 test_notifications.py

Requirements:
    - cmux must be running with the socket controller enabled
"""

import os
import sys
import time
from typing import Optional

sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
from cmux import cmux, cmuxError


class TestResult:
    def __init__(self, name: str):
        self.name = name
        self.passed = False
        self.message = ""

    def success(self, msg: str = ""):
        self.passed = True
        self.message = msg

    def failure(self, msg: str):
        self.passed = False
        self.message = msg


def wait_for_notifications(client: cmux, expected: int, timeout: float = 2.0) -> list[dict]:
    start = time.time()
    while time.time() - start < timeout:
        items = client.list_notifications()
        if len(items) == expected:
            return items
        time.sleep(0.05)
    return client.list_notifications()


def wait_for_flash_count(client: cmux, surface: str, minimum: int = 1, timeout: float = 2.0) -> int:
    """Poll flash_count until it reaches `minimum` or timeout. Returns final count."""
    start = time.time()
    last = 0
    while time.time() - start < timeout:
        try:
            last = client.flash_count(surface)
        except Exception:
            last = 0
        if last >= minimum:
            return last
        time.sleep(0.05)
    return last


def ensure_two_surfaces(client: cmux) -> list[tuple[int, str, bool]]:
    surfaces = client.list_surfaces()
    if len(surfaces) < 2:
        client.new_split("right")
        time.sleep(0.1)
        surfaces = client.list_surfaces()
    return surfaces


def focused_surface_index(client: cmux) -> int:
    surfaces = client.list_surfaces()
    focused = next((s for s in surfaces if s[2]), None)
    if focused is None:
        raise RuntimeError("No focused surface")
    return focused[0]


def send_osc(client: cmux, sequence: str, surface: Optional[int] = None) -> None:
    """Send an OSC sequence by printing it in the shell."""
    command = f"printf '{sequence}'\\n"
    if surface is None:
        client.send(command)
    else:
        client.send_surface(surface, command)


def test_clear_prior_notifications(client: cmux) -> TestResult:
    result = TestResult("Clear Prior Panel Notifications")
    try:
        client.clear_notifications()
        client.set_app_focus(False)
        client.notify("first")
        time.sleep(0.1)
        client.notify("second")
        items = wait_for_notifications(client, 1)
        if len(items) != 1:
            result.failure(f"Expected 1 notification, got {len(items)}")
        elif items[0]["title"] != "second":
            result.failure(f"Expected latest title 'second', got '{items[0]['title']}'")
        else:
            result.success("Prior panel notifications cleared")
    except Exception as e:
        result.failure(f"Exception: {e}")
    return result


def test_suppress_when_focused(client: cmux) -> TestResult:
    result = TestResult("Suppress When App+Panel Focused")
    try:
        client.clear_notifications()
        client.set_app_focus(True)
        client.notify("focused")
        items = wait_for_notifications(client, 0)
        if len(items) == 0:
            result.success("Suppressed notification when focused")
        else:
            result.failure(f"Expected 0 notifications, got {len(items)}")
    except Exception as e:
        result.failure(f"Exception: {e}")
    return result


def test_not_suppressed_when_inactive(client: cmux) -> TestResult:
    result = TestResult("Allow When App Inactive")
    try:
        client.clear_notifications()
        client.set_app_focus(False)
        client.notify("inactive")
        items = wait_for_notifications(client, 1)
        if len(items) != 1:
            result.failure(f"Expected 1 notification, got {len(items)}")
        elif items[0]["is_read"]:
            result.failure("Expected notification to be unread")
        else:
            result.success("Notification stored when app inactive")
    except Exception as e:
        result.failure(f"Exception: {e}")
    return result


def test_kitty_notification_simple(client: cmux) -> TestResult:
    result = TestResult("Kitty OSC 99 Simple")
    try:
        client.clear_notifications()
        client.set_app_focus(False)
        # Avoid Ghostty's 1s desktop notification rate limit. This test can run
        # immediately after app launch in CI/VM environments.
        time.sleep(1.1)
        surface = focused_surface_index(client)
        send_osc(client, "\\x1b]99;;Kitty Simple\\x1b\\\\", surface)
        items = wait_for_notifications(client, 1)
        if len(items) != 1:
            result.failure(f"Expected 1 notification, got {len(items)}")
        elif items[0]["title"] != "Kitty Simple":
            result.failure(f"Expected title 'Kitty Simple', got '{items[0]['title']}'")
        else:
            result.success("OSC 99 simple notification received")
    except Exception as e:
        result.failure(f"Exception: {e}")
    return result


def test_kitty_notification_chunked(client: cmux) -> TestResult:
    result = TestResult("Kitty OSC 99 Chunked Title/Body")
    try:
        client.clear_notifications()
        client.set_app_focus(False)
        # Avoid Ghostty's 1s desktop notification rate limit.
        time.sleep(1.1)
        surface = focused_surface_index(client)
        send_osc(client, "\\x1b]99;i=kitty:d=0:p=title;Kitty Title\\x1b\\\\", surface)
        time.sleep(0.1)
        items = client.list_notifications()
        if items:
            result.failure("Expected no notification before final chunk")
            return result
        send_osc(client, "\\x1b]99;i=kitty:p=body;Kitty Body\\x1b\\\\", surface)
        items = wait_for_notifications(client, 1)
        if len(items) != 1:
            result.failure(f"Expected 1 notification, got {len(items)}")
        elif items[0]["title"] != "Kitty Title" or items[0]["body"] != "Kitty Body":
            result.failure(
                f"Expected title/body 'Kitty Title'/'Kitty Body', got "
                f"'{items[0]['title']}'/'{items[0]['body']}'"
            )
        else:
            result.success("OSC 99 chunked notification received")
    except Exception as e:
        result.failure(f"Exception: {e}")
    return result


def test_rxvt_notification_osc777(client: cmux) -> TestResult:
    result = TestResult("RXVT OSC 777 Notification")
    try:
        client.clear_notifications()
        client.set_app_focus(False)
        # Avoid Ghostty's 1s desktop notification rate limit.
        time.sleep(1.1)
        surface = focused_surface_index(client)
        command = "printf '\\x1b]777;notify;OSC777 Title;OSC777 Body\\x07'"
        client.send_surface(surface, command + "\\n")
        items = wait_for_notifications(client, 1)
        if len(items) != 1:
            result.failure(f"Expected 1 notification, got {len(items)}")
        elif items[0]["title"] != "OSC777 Title" or items[0]["body"] != "OSC777 Body":
            result.failure(
                f"Expected title/body 'OSC777 Title'/'OSC777 Body', got "
                f"'{items[0]['title']}'/'{items[0]['body']}'"
            )
        else:
            result.success("OSC 777 notification received")
    except Exception as e:
        result.failure(f"Exception: {e}")
    return result


def test_mark_read_on_focus_change(client: cmux) -> TestResult:
    result = TestResult("Mark Read On Panel Focus")
    try:
        client.clear_notifications()
        client.reset_flash_counts()
        surfaces = ensure_two_surfaces(client)
        focused = next((s for s in surfaces if s[2]), None)
        other = next((s for s in surfaces if not s[2]), None)
        if focused is None or other is None:
            result.failure("Unable to identify focused and unfocused surfaces")
            return result
        client.set_app_focus(False)
        client.notify_surface(other[0], "focusread")
        time.sleep(0.1)
        client.set_app_focus(True)
        client.focus_surface(other[0])
        time.sleep(0.1)
        items = client.list_notifications()
        target = next((n for n in items if n["surface_id"] == other[1]), None)
        if target is None:
            result.failure("Expected notification for target surface")
        elif not target["is_read"]:
            result.failure("Expected notification to be marked read on focus")
        else:
            count = wait_for_flash_count(client, other[1], minimum=1, timeout=2.0)
            if count < 1:
                result.failure("Expected flash on panel focus dismissal")
            else:
                result.success("Notification marked read on focus")
    except Exception as e:
        result.failure(f"Exception: {e}")
    return result


def test_mark_read_on_app_active(client: cmux) -> TestResult:
    result = TestResult("Mark Read On App Active")
    try:
        client.clear_notifications()
        client.set_app_focus(False)
        client.notify("activate")
        time.sleep(0.1)
        items = client.list_notifications()
        if not items or items[0]["is_read"]:
            result.failure("Expected unread notification before activation")
            return result
        client.simulate_app_active()
        time.sleep(0.1)
        items = client.list_notifications()
        if not items:
            result.failure("Expected notification to remain after activation")
        elif not items[0]["is_read"]:
            result.failure("Expected notification to be marked read on app active")
        else:
            result.success("Notification marked read on app active")
    except Exception as e:
        result.failure(f"Exception: {e}")
    return result


def test_mark_read_on_tab_switch(client: cmux) -> TestResult:
    result = TestResult("Mark Read On Tab Switch")
    try:
        client.clear_notifications()
        client.set_app_focus(False)
        tab1 = client.current_workspace()
        client.notify("tabswitch")
        time.sleep(0.1)
        tab2 = client.new_workspace()
        time.sleep(0.1)
        client.set_app_focus(True)
        client.select_workspace(tab1)
        time.sleep(0.1)
        items = client.list_notifications()
        target = next((n for n in items if n["workspace_id"] == tab1), None)
        if target is None:
            result.failure("Expected notification for original tab")
        elif not target["is_read"]:
            result.failure("Expected notification to be marked read on tab switch")
        else:
            result.success("Notification marked read on tab switch")
    except Exception as e:
        result.failure(f"Exception: {e}")
    return result


def test_flash_on_tab_switch(client: cmux) -> TestResult:
    result = TestResult("Flash On Tab Switch")
    try:
        client.clear_notifications()
        client.reset_flash_counts()
        tab1 = client.current_workspace()
        surfaces = client.list_surfaces()
        focused = next((s for s in surfaces if s[2]), None)
        if focused is None:
            result.failure("Unable to identify focused surface")
            return result
        client.set_app_focus(False)
        client.notify("tabswitchflash")
        time.sleep(0.1)
        client.new_workspace()
        time.sleep(0.1)
        client.set_app_focus(True)
        client.select_workspace(tab1)
        time.sleep(0.2)
        count = wait_for_flash_count(client, focused[1], minimum=1, timeout=2.0)
        if count < 1:
            result.failure(f"Expected flash count >= 1, got {count}")
        else:
            result.success("Flash triggered on tab switch dismissal")
    except Exception as e:
        result.failure(f"Exception: {e}")
    return result


def test_focus_on_notification_click(client: cmux) -> TestResult:
    result = TestResult("Focus On Notification Click")
    try:
        client.clear_notifications()
        client.reset_flash_counts()
        surfaces = ensure_two_surfaces(client)
        focused = next((s for s in surfaces if s[2]), None)
        other = next((s for s in surfaces if not s[2]), None)
        if focused is None or other is None:
            result.failure("Unable to identify focused and unfocused surfaces")
            return result
        client.set_app_focus(False)
        client.notify_surface(other[0], "notifyfocus")
        time.sleep(0.1)
        client.set_app_focus(True)
        workspace_id = client.current_workspace()
        client.focus_notification(workspace_id, other[0])
        time.sleep(0.2)
        surfaces = client.list_surfaces()
        target = next((s for s in surfaces if s[1] == other[1]), None)
        if target is None or not target[2]:
            result.failure("Expected notification surface to be focused")
            return result
        count = wait_for_flash_count(client, other[1], minimum=1, timeout=2.0)
        if count < 1:
            result.failure(f"Expected flash count >= 1, got {count}")
        else:
            result.success("Notification click focuses and flashes panel")
    except Exception as e:
        result.failure(f"Exception: {e}")
    return result


def test_restore_focus_on_tab_switch(client: cmux) -> TestResult:
    result = TestResult("Restore Focus On Tab Switch")
    try:
        client.clear_notifications()
        client.set_app_focus(True)
        surfaces = ensure_two_surfaces(client)
        focused = next((s for s in surfaces if s[2]), None)
        other = next((s for s in surfaces if not s[2]), None)
        if focused is None or other is None:
            result.failure("Unable to identify focused and unfocused surfaces")
            return result
        client.focus_surface(other[0])
        time.sleep(0.1)
        tab1 = client.current_workspace()
        client.new_workspace()
        time.sleep(0.1)
        client.select_workspace(tab1)
        time.sleep(0.2)
        surfaces = client.list_surfaces()
        target = next((s for s in surfaces if s[1] == other[1]), None)
        if target is None:
            result.failure("Unable to find previously focused surface")
        elif not target[2]:
            result.failure("Expected previously focused surface to be focused after tab switch")
        else:
            result.success("Restored last focused surface after tab switch")
    except Exception as e:
        result.failure(f"Exception: {e}")
    return result


def test_clear_on_tab_close(client: cmux) -> TestResult:
    result = TestResult("Clear On Tab Close")
    try:
        client.clear_notifications()
        client.set_app_focus(False)
        tab1 = client.current_workspace()
        client.notify("closetab")
        time.sleep(0.1)
        items = wait_for_notifications(client, 1)
        if len(items) != 1:
            result.failure(f"Expected 1 notification, got {len(items)}")
            return result
        client.new_workspace()
        time.sleep(0.1)
        client.close_workspace(tab1)
        time.sleep(0.2)
        items = client.list_notifications()
        if items:
            result.failure(f"Expected 0 notifications after tab close, got {len(items)}")
        else:
            result.success("Notifications cleared when tab closed")
    except Exception as e:
        result.failure(f"Exception: {e}")
    return result


def run_tests() -> int:
    results = []
    with cmux() as client:
        results.append(test_clear_prior_notifications(client))
        results.append(test_suppress_when_focused(client))
        results.append(test_not_suppressed_when_inactive(client))
        results.append(test_kitty_notification_simple(client))
        results.append(test_kitty_notification_chunked(client))
        results.append(test_rxvt_notification_osc777(client))
        results.append(test_mark_read_on_focus_change(client))
        results.append(test_mark_read_on_app_active(client))
        results.append(test_mark_read_on_tab_switch(client))
        results.append(test_flash_on_tab_switch(client))
        results.append(test_focus_on_notification_click(client))
        results.append(test_restore_focus_on_tab_switch(client))
        results.append(test_clear_on_tab_close(client))
        client.set_app_focus(None)
        client.clear_notifications()

    print("\nNotification Tests:")
    for r in results:
        status = "PASS" if r.passed else "FAIL"
        msg = f" - {r.message}" if r.message else ""
        print(f"{status}: {r.name}{msg}")

    passed = sum(1 for r in results if r.passed)
    total = len(results)
    if passed == total:
        print("\n🎉 All notification tests passed!")
        return 0
    print(f"\n⚠️ {total - passed} test(s) failed")
    return 1


if __name__ == "__main__":
    sys.exit(run_tests())