fix: ARM64 context switch, boot.S x17 clobber, and BWM rendering #218

Merged
ryanbreen merged 1 commit into main from fix/arm64-context-switch-and-rendering on Feb 15, 2026

@ryanbreen (Owner)

Summary

  • boot.S ERET path: Fixed x17 being clobbered by mrs x17, spsr_el1 after the frame restore, the root cause of DATA_ABORT crashes (FAR=0x5) during the 84-test suite. The SPSR check now uses x16 (a scratch register).
  • Context switch consolidation: Reduced 15-22 separate SCHEDULER lock acquisitions to a single lock hold per context switch, eliminating TOCTOU race windows between lock releases.
  • Render thread GPU flush: Fixed blank screen when BWM takes display ownership. DISPLAY_TAKEN was incorrectly skipping flush_framebuffer(), so dirty rects from sys_fbdraw Flush were never submitted to the GPU.
  • kthread lock ordering: Fixed a potential deadlock in kthread_exit() and current_kthread() by matching the without_interrupts pattern used by kthread_run.
  • Full system test: New run-aarch64-full-test.sh that waits for all 84 tests, verifies services, and runs a 15s stability soak.
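
The lock consolidation described in the second bullet can be illustrated with a minimal sketch. This is not the kernel's code: `Scheduler`, `switch_racy`, and `switch_single_hold` are hypothetical names, and `std::sync::Mutex` stands in for whatever lock guards SCHEDULER. It only shows why holding one guard across the whole pick-and-swap removes the gap in which another CPU could change the state.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for scheduler state (not the kernel's real type).
#[derive(Debug)]
struct Scheduler {
    current: usize,    // id of the running thread
    ready: Vec<usize>, // ids of runnable threads, front = next to run
}

/// Racy shape: the decision and the update use separate lock acquisitions.
#[allow(dead_code)]
fn switch_racy(sched: &Mutex<Scheduler>) -> Option<usize> {
    // First acquisition: pick the next thread; the guard drops at the `;`.
    let next = sched.lock().unwrap().ready.first().copied()?;
    // TOCTOU window: another CPU may lock here and change `ready` or
    // `current`, so `next` can be stale by the time it is used below.
    let mut s = sched.lock().unwrap();
    s.ready.retain(|&t| t != next);
    let prev = std::mem::replace(&mut s.current, next);
    s.ready.push(prev);
    Some(next)
}

/// Consolidated shape: one guard covers the pick, the swap, and the requeue.
#[allow(dead_code)]
fn switch_single_hold(sched: &Mutex<Scheduler>) -> Option<usize> {
    let mut s = sched.lock().unwrap();
    if s.ready.is_empty() {
        return None; // nothing else to run; state is left untouched
    }
    let next = s.ready.remove(0);
    let prev = std::mem::replace(&mut s.current, next);
    s.ready.push(prev);
    Some(next)
}
```

Both functions give the same result when only one thread calls them; the difference appears only under concurrent access, which is why this class of bug tends to surface in long test runs rather than in isolation.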

Test plan

  • ARM64 build clean (zero warnings)
  • x86_64 build clean (zero warnings)
  • 84/84 boot tests pass (3 consecutive runs)
  • Full system test passes (boot tests + services + 15s soak)
  • BWM displays correctly with terminal and bounce demo
  • Sustained operation under GPU load (60+ seconds, no crashes)

🤖 Generated with Claude Code

…rendering

Three critical fixes for ARM64 stability and rendering:

1. boot.S ERET path: Fixed x17 register clobbered by `mrs x17, spsr_el1`
   after restoring it from the exception frame. Changed both sync exception
   and IRQ handlers to use x16 (already a scratch register saved in
   eret_scratch) for the SPSR EL0/EL1 check. This was the root cause of
   DATA_ABORT crashes (FAR=0x5, DFSC=0x21) during the 84-test suite.

2. Context switch consolidation: Reduced 15-22 separate SCHEDULER lock
   acquisitions per context switch to a single lock hold, eliminating
   TOCTOU windows between lock releases. Added lock_for_context_switch(),
   inline save/restore helpers, and dispatch_thread_locked().

3. Render thread GPU flush: Fixed blank screen when BWM takes display
   ownership. sys_fbdraw Flush copies pixels and marks dirty rects but
   relies on the render thread to call gpu_mmio::flush_rect(). The
   DISPLAY_TAKEN flag was incorrectly skipping flush_framebuffer(),
   so GPU scanout never happened.
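
The render-thread behaviour in item 3 can be sketched as follows. This is a simplified model under stated assumptions, not the actual render loop: `Rect`, `render_tick`, and the `compose` and `flush_rect` callbacks are hypothetical, with `flush_rect` standing in for the GPU submission. It also assumes that the only work that should be skipped while the display is taken is the kernel's own drawing.

```rust
/// Hypothetical dirty rectangle; field names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Rect {
    x: u32,
    y: u32,
    w: u32,
    h: u32,
}

/// One iteration of a simplified render loop.
///
/// `display_taken` models the DISPLAY_TAKEN flag, `compose` models the
/// kernel's own drawing, and `flush_rect` models submitting one rect to
/// the GPU. Returns how many rects were flushed.
#[allow(dead_code)]
fn render_tick(
    display_taken: bool,
    dirty: &mut Vec<Rect>,
    mut compose: impl FnMut(),
    mut flush_rect: impl FnMut(Rect),
) -> usize {
    // Skip only the kernel's own drawing when another owner has the display.
    if !display_taken {
        compose();
    }
    // Flush unconditionally: rects marked dirty by the display owner still
    // have to reach the GPU, otherwise the screen stays blank.
    let flushed = dirty.len();
    for rect in dirty.drain(..) {
        flush_rect(rect);
    }
    flushed
}
```

In this model the buggy variant would return early when `display_taken` is true, leaving `dirty` full and never calling `flush_rect`.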

Also includes: kthread lock ordering fixes, full system test script
(run-aarch64-full-test.sh), and test harness improvements.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ryanbreen merged commit 9a847e2 into main on Feb 15, 2026
2 of 4 checks passed