cmux/tests_v2/test_cpu_usage.py
Lawrence Chen 50f0dd334d
Fix frozen terminals after split churn (#12)
* Fix blank terminal after split operations and add visual tests

## Blank Terminal Fix
- Add `needsRefreshAfterWindowChange` flag in GhosttyTerminalView
- Force terminal refresh when view is added to window, even if size unchanged
- Add `ghostty_surface_refresh()` call in attachToView for same-view reattachment
- Add debug logging for surface attachment lifecycle (DEBUG builds only)

## Bonsplit Migration
- Add bonsplit as local Swift package (vendor/bonsplit submodule)
- Replace custom SplitTree with BonsplitController
- Add Panel protocol with TerminalPanel and BrowserPanel implementations
- Add SidebarTab as main tab container with BonsplitController
- Remove old Splits/ directory (SplitTree, SplitView, TerminalSplitTreeView)

## Visual Screenshot Tests
- Add test_visual_screenshots.py for automated visual regression testing
- Uses in-app screenshot API (CGWindowListCreateImage) - no screen recording needed
- Generates HTML report with before/after comparisons
- Tests: splits, browser panels, focus switching, close operations, rapid cycles
- Includes annotation fields for easy feedback

## Browser Shortcut (⌘⇧B)
- Add keyboard shortcut to open browser panel in current pane
- Add openBrowser() method to TabManager
- Add shortcut configuration in KeyboardShortcutSettings

## Screenshot Command
- Add 'screenshot' command to TerminalController for in-app window capture
- Returns OK with screenshot ID and path

## Other
- Add tests/visual_output/ and tests/visual_report.html to .gitignore

* Add browser title subscription and set tab height to 30px

- Subscribe to BrowserPanel.$pageTitle changes to update bonsplit tabs
- Update tab titles in real-time as page navigation occurs
- Clean up subscriptions when panels are removed
- Set bonsplit tab bar and tab height to 30px (in submodule)

* Fix socket API regressions in list_surfaces, list_bonsplit_tabs, focus_pane

- list_surfaces: Remove [terminal]/[browser] suffix to keep UUID-only format
  that clients and tests expect for parsing
- list_bonsplit_tabs --pane: Properly look up pane by UUID instead of
  creating a new PaneID (requires bonsplit PaneID.id to be public)
- focus_pane: Accept both UUID strings and integer indices as documented
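The dual-form argument accepted by focus_pane can be illustrated with a small Python sketch. This is not the app's Swift implementation: `resolve_pane` is a hypothetical helper, and treating the integer form as a 0-based index into the current pane list is an assumption made only for illustration.

```python
import uuid
from typing import List, Optional


def resolve_pane(arg: str, pane_ids: List[uuid.UUID]) -> Optional[uuid.UUID]:
    """Resolve a pane argument given either as a UUID string or as an index.

    Hypothetical helper: the index is assumed to be 0-based.
    """
    text = arg.strip()

    # First try the UUID form; uuid.UUID raises ValueError on malformed input.
    try:
        wanted = uuid.UUID(text)
    except ValueError:
        wanted = None
    if wanted is not None:
        # Only accept UUIDs that name an existing pane.
        return wanted if wanted in pane_ids else None

    # Otherwise fall back to the integer-index form.
    try:
        index = int(text)
    except ValueError:
        return None
    if 0 <= index < len(pane_ids):
        return pane_ids[index]
    return None
```

Looking up an existing pane rather than constructing a new identifier from the string mirrors the list_bonsplit_tabs fix described above: an unknown UUID yields no match instead of a fresh ID.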

* Fix browser panel stability and keyboard shortcuts

- Prevent WKWebView focus lifecycle crashes during split/view reshuffles
- Match bracket shortcuts via keyCode (Cmd+Shift+[ / ], Cmd+Ctrl+[ / ])
- Support Ghostty config goto_split:* keybinds when WebView is focused
- Add focus_webview/is_webview_focused socket commands and regression tests
- Rename SidebarTab to Workspace and update docs

* Make ctrl+enter keybind test skippable

Skip when the Ghostty keybind isn't configured or when osascript can't send keystrokes (no Accessibility permission), so VM runs stay green.

* Auto-focus browser omnibar when blank

When a browser surface is focused but no URL is loaded yet, focus the address bar instead of the WKWebView.

* Stabilize socket surface indexing

* Focus browser omnibar escape; add webview keybind UI tests

- Escape in omnibar now returns focus to WKWebView
- Add UI tests for Cmd+Ctrl+H pane navigation with WebKit focused (including Ghostty config)
- Avoid flaky element screenshots in UpdatePillUITests on the UTM VM

* Fix browser drag-to-split blanks and socket parsing

* Fix webview-focused shortcuts and stabilize browser splits

- Match ctrl/shift shortcuts by keyCode where needed (Ctrl+H, bracket keys)
- Load Ghostty goto_split triggers reliably and refresh on config load
- Add debug socket helpers: set_shortcut + simulate_shortcut for tests
- Convert browser goto_split/keybind tests to socket-based injection (no osascript)
- Bump bonsplit for drag-to-split fixes

* Fix split layout collapse and harden socket pane APIs

* Stabilize OSC 99 notification test timing

* Fix terminal focus routing after split reparent

* Support simulate_shortcut enter for focus routing test

* Stabilize terminal focus routing test

* Fix frozen new terminal tabs after many splits

* Fix frozen new terminal tabs after splits

* Fix terminal freeze on launch/new tabs

* Update ghostty submodule

* Fix terminal focus/render stalls after split churn

* Fix nested split collapsing existing pane

* Fix nested split collapse + stabilize new-surface focus

* Update bonsplit submodule

* Fix SIGINT test flake

* Remove bonsplit tab-switch crossfade

* Remove PROJECTS.md

* Remove bonsplit tab selection animation

* Ignore generated test reports

* Middle click closes tab

* Revert unintended .gitignore change

* Fix build after main merge

* Revert "Fix build after main merge"

This reverts commit 16bf9816d0856b5385d52f886aa5eb50f3c9d9a4.

* Revert "Merge remote-tracking branch 'origin/main' into fix/blank-terminal-and-visual-tests"

This reverts commit 7c20fb53fd71fea7a19a3673f2dd73e5f0c783c4, reversing
changes made to 0aff107d787bc9d8bbc28220090b4ca7af72e040.

* Remove tab close fade animation

* Use terminal.fill icon

* Make terminal tab icon smaller

* Match browser globe tab icon size

* Bonsplit: tab min width 48 and tighter close button

* Bonsplit: smaller tab title font

* Show unread notification badge in bonsplit tabs and improve UI polish

Sync unread notification state to bonsplit tab badges (blue dot).
Improve EmptyPanelView with Terminal/Browser buttons and shortcut hints.
Add tooltips to close tab button and search overlay buttons.

* Fix reload.sh single-instance safety check on macOS

Replace GNU-only `ps -o etimes=` with portable `ps -o etime=` and
parse the dd-hh:mm:ss format manually for macOS compatibility.
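As a rough illustration of the manual parsing, here is a Python sketch that converts the `ps -o etime=` format (`[[dd-]hh:]mm:ss`) to seconds. The actual fix lives in reload.sh as shell code; `etime_to_seconds` is a hypothetical name used only here.

```python
def etime_to_seconds(etime: str) -> int:
    """Convert a ps etime value ([[dd-]hh:]mm:ss) to elapsed seconds."""
    text = etime.strip()  # ps right-aligns the column, so padding is common

    # Optional day count comes before a dash.
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)

    # Remaining fields are mm:ss or hh:mm:ss.
    parts = [int(p) for p in text.split(":")]
    if not 2 <= len(parts) <= 3:
        raise ValueError(f"unexpected etime value: {etime!r}")
    if len(parts) == 2:
        parts.insert(0, 0)  # no hours field present

    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds
```

For example, `etime_to_seconds("2-03:04:05")` gives 183845, which matches what `etimes` would report on a system that supports it.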

* Centralize keyboard shortcut definitions into Action enum

Replace per-shortcut boilerplate with a single Action enum that holds
the label, defaults key, and default binding for each shortcut. All
call sites now use shortcut(for:). Settings UI is data-driven via
ForEach(Action.allCases). Titlebar tooltips update dynamically when
shortcuts are changed. Remove duplicate .keyboardShortcut() modifiers
from menu items that are already handled by the event monitor.

* Fix WKWebView consuming app menu shortcuts and close panel confirmation

Add CmuxWebView subclass that routes key equivalents through the main
menu before WebKit, so Cmd+N/Cmd+W/tab switching work when a browser
pane is focused. Fix Cmd+W close-panel path: bypass Bonsplit delegate
gating after the user confirms the running-process dialog by tracking
forceCloseTabIds. Add unit tests (CmuxWebViewKeyEquivalentTests) and
UI test scaffolding (MenuKeyEquivalentRoutingUITests) with a new
cmux-unit Xcode scheme.

* Update CLAUDE.md and PROJECTS.md with recent changes

CLAUDE.md: enforce --tag for reload commands, add cleanup safety rules.
PROJECTS.md: log notification badge, reload.sh fix, Cmd+W fix, WebView
key equiv fix, and centralized shortcuts work.

* Keep selection index stable on close

* Add concepts page documenting terminology hierarchy

New docs page explaining Window > Workspace > Pane > Surface > Panel
hierarchy with aligned ASCII diagram. Updated tabs.mdx and splits.mdx
to use consistent terminology (workspace instead of tab, surface
instead of panel) and corrected outdated CLI command references.

* Update bonsplit submodule

* WIP: improve split close stability and UI regressions

* Close terminal panel on child exit; hide terminal dirty dot

* Fix split close/focus regressions and stabilize UI tests

* Add unread Dock/Cmd+Tab badge with settings toggle

* Fix browser-surface shortcuts and Cmd+L browser opening

* Snapshot current workspace state before regression fixes

* Update bonsplit submodule snapshot

* Stabilize split-close regression capture and sidebar resize assertions

* Change default Show Notifications shortcut from Cmd+Shift+I to Cmd+I

* Fix update check readiness race, enable release update logging, and improve checking spinner

* Restore terminal file drop, fix browser omnibar click focus, and add panel workspace ID mutation for surface moves

* Add Cmd+digit workspace hints, titlebar shortcut pills, sidebar drag-reorder, and workspace placement settings

* Add v2 browser automation API, surface move/reorder commands, and short-handle ref system to TerminalController

* Add CLI browser command surface, --id-format flag, and move/reorder commands

* Extend test clients with move/reorder APIs, ref-handle support, and increased timeouts

* Harden test runner scripts with deterministic builds, retry logic, and robust socket readiness

* Stabilize existing test suites with focus-wait helpers, increased timeouts, and API shape updates

* Add terminal file drop e2e regression test

* Add v2 browser API, CLI ref resolution, and surface move/reorder test suites

* Add unit tests for shortcut hints, workspace reorder, drop planner, and update UI test stabilization

* Add cmux-debug-windows skill with snapshot script and agent config

* Update project docs: mark browser parity and move/reorder phases complete, add parallel agent workflow guidelines

* Update bonsplit submodule: re-entrant setPosition guard, tab shortcut hints, and moveTab/reorderTab API

* Add browser agent UX improvements: snapshot refs, placement reuse, diagnostics, and skill docs

- Upgrade browser.snapshot to emit accessibility tree text with element refs (eN)
- Add right-sibling pane reuse policy for browser.open_split placement
- Add rich not_found diagnostics with retry logic for selector actions
- Support --snapshot-after for post-action verification on mutating commands
- Allow browser fill with empty text for clearing inputs
- Default CLI --id-format to refs-first (UUIDs opt-in via --id-format uuids|both)
- Format legacy new-pane/new-surface output with short surface refs
- Add skills/cmuxterm-browser/ and skills/cmuxterm/ end-user skill docs
- Add regression tests for placement policy, snapshot refs, diagnostics, and ID defaults

* Update bonsplit submodule: keep raster favicons in color when inactive
2026-02-13 16:45:31 -08:00

220 lines
6.8 KiB
Python

#!/usr/bin/env python3
"""
CPU usage test for cmux.

This test monitors cmux's CPU usage during idle periods to catch
performance regressions like runaway animations or continuous view updates.

Run this test after launching cmux:
    python3 tests/test_cpu_usage.py

The test will fail if idle CPU is *sustained* above threshold.
"""

from __future__ import annotations

import subprocess
import sys
import time
import re
import statistics
from pathlib import Path
from typing import List, Optional

# Maximum acceptable CPU usage during idle (percentage)
MAX_IDLE_CPU_PERCENT = 15.0

# How long to wait for app to settle before measuring (seconds)
SETTLE_TIME = 2.0

# Optional pre-check: wait for CPU to calm down before taking the idle sample.
# This reduces startup/transient flakiness while still preserving regression signal.
IDLE_PRECHECK_MAX_WAIT = 20.0
IDLE_PRECHECK_THRESHOLD = 20.0
IDLE_PRECHECK_CONSECUTIVE = 4

# Duration to monitor CPU usage (seconds)
MONITOR_DURATION = 5.0

# Sampling interval for CPU checks (seconds)
SAMPLE_INTERVAL = 0.5

# Patterns that indicate performance issues in sample output
SUSPICIOUS_PATTERNS = [
    r"body\.getter.*\d{3,}",  # View body getter called 100+ times
    r"repeatForever",  # Runaway animations
    r"TimelineView.*animation.*\d{3,}",  # Unpaused timeline views
]


def get_cmux_pid() -> Optional[int]:
    """Get the PID of the running cmux process."""
    result = subprocess.run(
        ["pgrep", "-f", r"cmux\.app/Contents/MacOS/cmux$"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Try DEV build
        result = subprocess.run(
            ["pgrep", "-f", r"cmux DEV\.app/Contents/MacOS/cmux"],
            capture_output=True,
            text=True,
        )
        if result.returncode != 0:
            return None
    pids = result.stdout.strip().split("\n")
    return int(pids[0]) if pids and pids[0] else None


def get_cpu_usage(pid: int) -> float:
    """Get current CPU usage percentage for a process."""
    result = subprocess.run(
        ["ps", "-p", str(pid), "-o", "%cpu="],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return 0.0
    try:
        return float(result.stdout.strip())
    except ValueError:
        return 0.0


def sample_process(pid: int, duration: int = 2) -> str:
    """Sample a process and return the output."""
    result = subprocess.run(
        ["sample", str(pid), str(duration)],
        capture_output=True,
        text=True,
    )
    return result.stdout + result.stderr


def check_sample_for_issues(sample_output: str) -> List[str]:
    """Check sample output for suspicious patterns."""
    issues = []
    for pattern in SUSPICIOUS_PATTERNS:
        if re.search(pattern, sample_output):
            issues.append(f"Found suspicious pattern: {pattern}")
    return issues


def monitor_cpu_usage(pid: int, duration: float, interval: float) -> List[float]:
    """Monitor CPU usage over a period and return all readings."""
    readings = []
    start = time.time()
    while time.time() - start < duration:
        cpu = get_cpu_usage(pid)
        readings.append(cpu)
        time.sleep(interval)
    return readings


def wait_for_idle_precheck(pid: int) -> bool:
    """Wait for a short streak of lower CPU readings before formal measurement."""
    deadline = time.time() + IDLE_PRECHECK_MAX_WAIT
    streak = 0
    while time.time() < deadline:
        cpu = get_cpu_usage(pid)
        if cpu <= IDLE_PRECHECK_THRESHOLD:
            streak += 1
            if streak >= IDLE_PRECHECK_CONSECUTIVE:
                return True
        else:
            streak = 0
        time.sleep(SAMPLE_INTERVAL)
    return False


def main():
    print("=" * 60)
    print("cmux CPU Usage Test")
    print("=" * 60)

    # Find cmux process
    pid = get_cmux_pid()
    if pid is None:
        print("\n❌ SKIP: cmux is not running")
        print("Start cmux and run this test again.")
        return 0  # Not a failure, just skip

    print(f"\nFound cmux process: PID {pid}")

    # Wait for app to settle
    print(f"Waiting {SETTLE_TIME}s for app to settle...")
    time.sleep(SETTLE_TIME)

    print(
        f"Waiting for idle precheck (<= {IDLE_PRECHECK_THRESHOLD:.1f}% "
        f"for {IDLE_PRECHECK_CONSECUTIVE} samples, timeout {IDLE_PRECHECK_MAX_WAIT:.1f}s)..."
    )
    if not wait_for_idle_precheck(pid):
        print(" ⚠️ Precheck timeout; continuing with measurement anyway")
    else:
        print(" ✓ Idle precheck passed")

    # Monitor CPU usage
    print(f"Monitoring CPU usage for {MONITOR_DURATION}s...")
    readings = monitor_cpu_usage(pid, MONITOR_DURATION, SAMPLE_INTERVAL)

    avg_cpu = sum(readings) / len(readings) if readings else 0.0
    max_cpu = max(readings) if readings else 0.0
    min_cpu = min(readings) if readings else 0.0
    median_cpu = statistics.median(readings) if readings else 0.0
    over_threshold = sum(1 for r in readings if r > MAX_IDLE_CPU_PERCENT)

    print("\nCPU Usage Results:")
    print(f" Average: {avg_cpu:.1f}%")
    print(f" Median: {median_cpu:.1f}%")
    print(f" Max: {max_cpu:.1f}%")
    print(f" Min: {min_cpu:.1f}%")
    print(f" Samples: {len(readings)}")
    print(f" >{MAX_IDLE_CPU_PERCENT:.1f}%: {over_threshold}/{len(readings)}")

    # Treat failures as sustained-idle regressions, not single transient spikes.
    sustained_high = over_threshold >= ((len(readings) + 1) // 2)
    if median_cpu > MAX_IDLE_CPU_PERCENT or sustained_high:
        reason = []
        if median_cpu > MAX_IDLE_CPU_PERCENT:
            reason.append(f"median {median_cpu:.1f}% > {MAX_IDLE_CPU_PERCENT:.1f}%")
        if sustained_high:
            reason.append(f"{over_threshold}/{len(readings)} samples above threshold")
        print(f"\n❌ FAIL: Sustained high idle CPU detected ({'; '.join(reason)})")

        # Take a sample to diagnose
        print("\nTaking process sample for diagnosis...")
        sample_output = sample_process(pid, 2)

        # Check for known issues
        issues = check_sample_for_issues(sample_output)
        if issues:
            print("\nDiagnostic findings:")
            for issue in issues:
                print(f" - {issue}")

        # Save sample for debugging
        sample_file = Path("/tmp/cmux_cpu_test_sample.txt")
        sample_file.write_text(sample_output)
        print(f"\nFull sample saved to: {sample_file}")

        # Show top functions from sample
        print("\nTop functions in sample (look for .body.getter or Animation):")
        lines = sample_output.split("\n")
        relevant_lines = [
            l for l in lines
            if "cmux" in l and ("body" in l or "Animation" in l or "Timer" in l)
        ][:10]
        for line in relevant_lines:
            print(f" {line.strip()[:100]}")

        return 1

    print("\n✅ PASS: CPU usage is within acceptable range")
    return 0


if __name__ == "__main__":
    sys.exit(main())
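
The pass/fail rule in `main()` can be exercised without a running cmux process by isolating it as a pure function. The sketch below is not part of the test file: `is_sustained_high_idle` is a hypothetical name, and treating an empty reading list as "not failing" is a choice made for this sketch rather than something the script states.

```python
import statistics
from typing import List


def is_sustained_high_idle(readings: List[float], threshold: float) -> bool:
    """Return True when the readings look like sustained high idle CPU.

    Mirrors the rule in main(): fail if the median exceeds the threshold,
    or if at least half of the samples (rounded up) exceed it.
    """
    if not readings:
        # Assumption of this sketch: no data means no failure verdict.
        return False
    over = sum(1 for r in readings if r > threshold)
    sustained_high = over >= (len(readings) + 1) // 2
    return statistics.median(readings) > threshold or sustained_high
```

With a threshold of 15.0, a single spike such as `[1.0, 1.0, 1.0, 90.0, 1.0]` does not trip the rule, while `[16.0, 16.0, 16.0, 1.0, 1.0]` does because three of five samples are above the threshold.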