Fix dialog lifecycle race on GNOME 46+ (refocused follow-up to #9)#11
Fix dialog lifecycle race on GNOME 46+ (refocused follow-up to #9)#11tmetz1987 wants to merge 2 commits intonick-redwill:mainfrom
Conversation
Moves the player spawn, window interception, and screen shield dialog injection out of enable() and into a new _setupForLock() method that runs whenever Main.screenShield emits active-changed with active=true. A matching _teardownForUnlock() runs on unlock and on disable(). Reason: hooking Main.screenShield._dialog directly at enable() time is unreliable on GNOME 46 Wayland. The dialog object can be null (if the extension is enabled before the user has ever locked) or a dead reference (if GNOME recreates the dialog between enable() and the next lock cycle). On Raspberry Pi 5 / Ubuntu 24.04 this manifested as a black screen on Super+L with no visible video, because the injections were targeting a destroyed dialog instance. Listening to active-changed guarantees we inject into the current, live dialog on every lock. _injectIntoDialog() includes a short poll fallback because active-changed can fire a hair before _dialog is populated in some GNOME versions. All upstream behavior is preserved: connector-based window-actor mapping, _forceFullscreen debug path, multi-monitor position signals, and the GNOME 48 TapAction replacement.
The original `_injectIntoDialog()` retry path had two issues that
could cause the extension to silently fail on GNOME versions where
`_dialog` is recreated (and populated asynchronously) on each lock:
1. It only retried once after 100ms. If `_dialog` wasn't ready
within that single tick, the retry returned and no injection
ever happened.
2. The retry timer id was never saved, so `_teardownForUnlock()`
could not cancel a pending retry. If unlock happened during
the 100ms wait, the retry would fire into a half-torn-down
state on the next lock.
This matches the reported symptom in nick-redwill#9 ("works on first lock but
fails silently on subsequent attempts" on GNOME 49) — though I
cannot reproduce it directly, since my test environment is GNOME
46 / Raspberry Pi 5 / Ubuntu 24.04.
The fix:
- Cap the retry loop at ~2s (20 × 100ms) and log if we give up,
so the failure mode becomes debuggable instead of silent.
- Track the timer id in `this._injectRetryId` and cancel it in
`_teardownForUnlock()`, and reset `_injectAttempts` there too.
- Initialise both fields in `_resetLockState()` for symmetry.
No behaviour change on the happy path: if `_dialog` is present on
the first attempt (as it always was in my GNOME 46 testing), the
retry code is never reached.
|
Hi, thanks for the update! Unfortunately, the issue with videos not playing is still present. I did some digging and noticed that the At this point, the only reliable approach seems to be polling the state every N ms (e.g. every 100 ms), checking That said, I’m a bit curious about your experience. In my testing, I’ve never encountered a situation where windows were intercepted before the dialog was initialized, and no users have reported issues like that either. From your previous PR, it seems like you did run into this case. How often does this happen for you and under what conditions? |
Follow-up to #9 — refocused to only the part you indicated you'd consider reviewing again, with a fix attempting to address the GNOME 49 silent-fail you observed.
What changed vs. #9
The two commits you rejected on scope grounds — the screen shield lightbox override and the Mutter
PowerSaveModeDPMS wake — have been dropped from this PR entirely. I hear you on both counts:Unblank lock screenalready owns. I also confirmed the conflict you described — when both extensions are active, the lightbox override wins and your video never shows. That's clearly not acceptable upstream.Unblankisn't installed and the physical DPMS blank is visible because of the slow first-frame time. They won't come back in this PR.So this PR contains just the lifecycle refactor (
refactor: defer player setup until screen shield is active) plus one small follow-up commit that tries to fix the "works on first lock, silent fail on subsequent" regression you hit on GNOME 49.Commit 1 — lifecycle refactor (unchanged from #9)
Same rationale as before: hooking
Main.screenShield._dialogdirectly atenable()time is unreliable because the dialog can be null (never locked yet) or a dead reference (recreated betweenenable()and the next lock cycle). Moving the player spawn and injection into aMain.screenShield::active-changedhandler guarantees we inject into the live dialog every time.Validated on Raspberry Pi 5 / Ubuntu 24.04 / GNOME Shell 46.0 / GJS 1.80.2 / Wayland. All upstream behavior preserved: connector-based window mapping,
_forceFullscreendebug path, multi-monitor position signals, and the GNOME 48TapActionreplacement from #7.Commit 2 — bounded, cancelable retry (new, tries to address your GNOME 49 observation)
On your feedback re: GNOME 49:
I can't reproduce this directly — I don't have a GNOME 49 machine — so I went back and re-read the code for any path that could produce exactly that symptom. I think I found one:
Two latent issues here that I missed in #9:
_dialogisn't populated by the time that one retry fires, the whole injection silently gives up. Nothing logs, nothing throws._teardownForUnlock()can't cancel it. If unlock happens while the retry is pending, the retry still fires — potentially into partially-torn-down state.Hypothesis for the second-lock symptom on GNOME 49: on the first lock of a session,
_dialogis either already present or arrives within 100ms, so the happy path catches it. On unlock, GNOME 49 may tear the dialog down. On the second lock,_dialogis recreated asynchronously and is slower than 100ms, so the single retry fires before it's ready and we silently give up. From the user's perspective: first lock shows video, second lock shows the default GNOME lock screen with no error.This is still a hypothesis — I have no way to verify the exact timing on GNOME 49 from here. But the bounded-retry fix is defensible regardless of whether it turns out to be the root cause:
this._injectRetryIdand cancels it in_teardownForUnlock().console.warnif we exhaust retries, so if it ever fails again the mode is debuggable instead of silent._dialogis present on the first attempt (as it is on GNOME 46 in my testing), the retry code is never reached.If this doesn't fix it on your GNOME 49 setup, I'd really appreciate two things:
journalctl --user -b -o cat /usr/bin/gnome-shell | grep -i lockscreenafter a failing lock cycle — the newconsole.warnshould reveal whether we're giving up in the retry path or somewhere else entirely.Main.screenShield._dialog === <previous>) or a new one. That'd tell us if the dialog is being recreated at all on 49.I'm happy to iterate further. I'd rather resolve this for real than force it through.
Test plan
gnome-extensions disablewhile locked → cleanly revertsgnome-extensions enable→ works on next lockMain.screenShield.activeearly-run path inenable())