patches: fix O(position) performance via UAZoomEnabled caching
Root cause (per Opus analysis): UAZoomEnabled() is a synchronous Mach IPC roundtrip to the macOS Accessibility server, called three times per redisplay cycle. At 60 fps that is 180 IPC roundtrips per second blocking the main thread. Combined with Emacs's inherent O(position) redisplay cost, this made cursor movement progressively choppier toward the end of large files.

Fix 1: ns_zoom_enabled_p() caches the result of UAZoomEnabled() for 1 second.
Fix 2: ns_zoom_track_completion() is rate-limited to 2 Hz (at most one scan per 500 ms).

Also includes the BUF_CHARS_MODIFF fix (patch 0009) for the VoiceOver cache.
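The caching idea in Fix 1 can be sketched in portable C. This is only an illustration under stated assumptions, not the code from the patch: `struct ttl_cache`, `ttl_cache_get`, and `fake_query` are hypothetical names, the expensive call and the clock are passed in so the sketch builds without macOS frameworks, and the `valid` flag and the backward-clock check are additions that the patch itself does not have (Apple documents that CFAbsoluteTimeGetCurrent() is not guaranteed to be monotonic).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the expensive call.  In the patch this is
   UAZoomEnabled(); here it is injected so the sketch is portable.  */
typedef bool (*query_fn) (void);

struct ttl_cache
{
  query_fn query;   /* the expensive call being cached */
  double ttl;       /* seconds a cached value stays valid */
  double stamp;     /* time of the last refresh */
  bool value;       /* last value returned by query */
  bool valid;       /* false until the first refresh */
};

/* Return the cached value, refreshing it first if it is missing,
   older than the TTL, or stamped in the future (clock stepped back).  */
static bool
ttl_cache_get (struct ttl_cache *c, double now)
{
  if (!c->valid || now - c->stamp > c->ttl || now < c->stamp)
    {
      c->value = c->query ();
      c->stamp = now;
      c->valid = true;
    }
  return c->value;
}

/* Example query used for illustration only: counts its invocations
   and reports whatever fake_zoom_state currently holds.  */
static int fake_query_calls;
static bool fake_zoom_state;

static bool
fake_query (void)
{
  fake_query_calls++;
  return fake_zoom_state;
}
```

With a 1-second TTL, 60 lookups per second reach the expensive call roughly once per second. The trade-off is that a state change can go unnoticed for up to one TTL.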
@@ -0,0 +1,117 @@
From 9e2b56b7aec560540ef9f49c9444c2c0fc903090 Mon Sep 17 00:00:00 2001
From: Daneel <daneel@sukany.cz>
Date: Sun, 1 Mar 2026 05:23:37 +0100
Subject: [PATCH 11/11] perf: cache UAZoomEnabled() and rate-limit completion
 tracking
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

UAZoomEnabled() performs a synchronous Mach IPC roundtrip to the macOS
Accessibility server (~50-200 µs per call). With three call sites in
the redisplay hot path (ns_draw_window_cursor, ns_update_end fallback,
ns_zoom_track_completion) this adds up to 150-600 µs of IPC overhead
per redisplay cycle. At 60 fps, that is 9-36 ms/s of pure IPC cost.
The overhead blocks the main thread, extending the inter-frame interval
and making cursor movement feel choppy, especially at the end of large
files where Emacs's own redisplay work is already at its maximum.

* src/nsterm.m (ns_zoom_cached_enabled, ns_zoom_cache_time): New
static variables.
(ns_zoom_enabled_p): New caching wrapper around UAZoomEnabled().
Caches the result for 1 second using CFAbsoluteTimeGetCurrent(), a
near-free commpage read. Zoom state changes only on explicit user action
in System Settings, so a 1-second TTL is indistinguishable from
querying on every frame. Replaces all three UAZoomEnabled() call
sites.
(ns_zoom_track_completion): Add 500 ms rate limit. Completion
candidate detection requires FOR_EACH_FRAME iteration and
Fget_char_property calls. Completion state changes at human
interaction speed (well below 2 Hz), so 500 ms polling is
imperceptible while eliminating per-redisplay Lisp overhead.
---
 src/nsterm.m | 43 ++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 40 insertions(+), 3 deletions(-)

diff --git a/src/nsterm.m b/src/nsterm.m
index e3f9466..590287a 100644
--- a/src/nsterm.m
+++ b/src/nsterm.m
@@ -1092,6 +1092,31 @@ static NSRect constrain_frame_rect(NSRect frameRect, bool isFullscreen)
    ivy-current-match, corfu-current that mark the selected candidate. */
 static bool ns_ax_face_is_selected (Lisp_Object face);

+/* Cached wrapper around UAZoomEnabled().
+   UAZoomEnabled() performs a synchronous Mach IPC roundtrip to the
+   macOS Accessibility server (~50-200 µs per call). With three call
+   sites in the redisplay hot path the overhead accumulates to
+   150-600 µs per frame. Zoom state only changes on explicit user
+   action in System Settings, so a 1-second TTL is both safe and
+   indistinguishable from querying on every frame.
+
+   Uses CFAbsoluteTimeGetCurrent() (a commpage read on modern macOS, ~5 ns)
+   to avoid a second IPC roundtrip for time. */
+static BOOL ns_zoom_cached_enabled;
+static CFAbsoluteTime ns_zoom_cache_time;
+
+static BOOL
+ns_zoom_enabled_p (void)
+{
+  CFAbsoluteTime now = CFAbsoluteTimeGetCurrent ();
+  if (now - ns_zoom_cache_time > 1.0)
+    {
+      ns_zoom_cached_enabled = UAZoomEnabled ();
+      ns_zoom_cache_time = now;
+    }
+  return ns_zoom_cached_enabled;
+}
+
 /* Scan overlay before-string / after-string properties in the
    selected window for a completion candidate with a "selected"
    face. Return the 0-based visual line index of the selected
@@ -1215,11 +1240,23 @@ If a completion candidate is selected (overlay or child frame),
 static void
 ns_zoom_track_completion (struct frame *f, EmacsView *view)
 {
-  if (!UAZoomEnabled ())
+  if (!ns_zoom_enabled_p ())
     return;
   if (!WINDOWP (f->selected_window))
     return;

+  /* Rate-limit completion scan to 2 Hz. Completion overlays and
+     child frames change at human interaction speed (<<10 Hz), so
+     checking every 500 ms is indistinguishable from every frame
+     while eliminating per-redisplay FOR_EACH_FRAME iteration. */
+  {
+    static CFAbsoluteTime last_completion_track;
+    CFAbsoluteTime now = CFAbsoluteTimeGetCurrent ();
+    if (now - last_completion_track < 0.5)
+      return;
+    last_completion_track = now;
+  }
+
   specpdl_ref count = SPECPDL_INDEX ();
   record_unwind_current_buffer ();

@@ -1323,7 +1360,7 @@ so the visual offset is (ov_line + 1) * line_h from
      (zoomCursorUpdated is NO). */
 #if defined (MAC_OS_X_VERSION_MIN_REQUIRED) \
     && MAC_OS_X_VERSION_MIN_REQUIRED >= 101000
-  if (view && !view->zoomCursorUpdated && UAZoomEnabled ()
+  if (view && !view->zoomCursorUpdated && ns_zoom_enabled_p ()
       && !NSIsEmptyRect (view->lastCursorRect))
     {
       NSRect r = view->lastCursorRect;
@@ -3500,7 +3537,7 @@ EmacsView pixels (AppKit, flipped, top-left origin)

 #if defined (MAC_OS_X_VERSION_MIN_REQUIRED) \
     && MAC_OS_X_VERSION_MIN_REQUIRED >= 101000
-  if (UAZoomEnabled ())
+  if (ns_zoom_enabled_p ())
     {
       NSRect windowRect = [view convertRect:r toView:nil];
       NSRect screenRect
--
2.43.0
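The 500 ms rate limit added to ns_zoom_track_completion can be sketched on its own in portable C. This is a minimal illustration rather than the patch's code: `struct rate_limiter` and `rate_limiter_allow` are hypothetical names, the timestamp is passed in by the caller instead of being read from CFAbsoluteTimeGetCurrent(), and the `ran_once` flag and backward-clock check are assumptions added here so the limiter cannot stall if the wall clock is set back.

```c
#include <stdbool.h>

struct rate_limiter
{
  double min_interval;  /* seconds between permitted runs (0.5 = 2 Hz) */
  double last_run;      /* time of the last permitted run */
  bool ran_once;        /* false until the first permitted run */
};

/* Return true if the caller may do the expensive work at time NOW and
   record NOW as the latest run.  Return false while NOW is still
   within MIN_INTERVAL of the previous run.  */
static bool
rate_limiter_allow (struct rate_limiter *r, double now)
{
  if (r->ran_once
      && now >= r->last_run
      && now - r->last_run < r->min_interval)
    return false;

  r->last_run = now;
  r->ran_once = true;
  return true;
}
```

A caller would test `rate_limiter_allow` at the top of the expensive function and return early on false, as the patch does with its inline check. The cost of this design is latency: a change that happens just after a permitted run is not seen until the next one, up to `min_interval` later.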