Fix segmentation fault in libev prepare_callback during shutdown#708

Open
dkropachev wants to merge 2 commits into master from fix-libev-segfault

@dkropachev dkropachev commented Feb 21, 2026

Summary

Fixes a race condition that causes segmentation faults when the Python driver shuts down while libev callbacks are still executing.

Addresses: scylladb/scylla-cluster-tests#11713 and #524

Root Cause

During shutdown, metrics cleanup in the main thread races with libev callbacks that are still executing in the background thread. The callbacks then access Python objects that have already been freed, which produces the crash at address 0x2f998.

Changes

  • Add null checks in prepare_callback() to handle destroyed objects safely
  • Stop prepare watcher before cleanup to prevent race conditions
  • Increase thread join timeout to allow proper shutdown synchronization
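
The ordering these changes describe (stop the watcher, wait for the loop thread, then clean up) can be sketched in plain Python. This is a minimal illustration, not the driver's code: `LoopSketch`, its attributes, and the 5-second default timeout are all hypothetical names and values chosen for the example.

```python
import threading
import time


class LoopSketch:
    """Hypothetical stand-in for an event loop thread; not the driver's real class."""

    def __init__(self):
        self._stop = threading.Event()
        self._watcher_active = True          # stands in for the prepare watcher
        self._resources = {"metrics": object()}
        self.callback_runs = 0
        self.errors = []
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _prepare_callback(self):
        # Guard: do nothing once the watcher is stopped or state is released.
        if not self._watcher_active or self._resources is None:
            return
        self.callback_runs += 1

    def _run(self):
        while not self._stop.is_set():
            try:
                self._prepare_callback()
            except Exception as exc:  # record instead of crashing the thread
                self.errors.append(exc)
            time.sleep(0.001)

    def shutdown(self, join_timeout=5.0):
        # 1. Stop the watcher so callbacks stop doing work.
        self._watcher_active = False
        # 2. Ask the loop thread to exit and wait for it (assumed timeout).
        self._stop.set()
        self._thread.join(timeout=join_timeout)
        stopped = not self._thread.is_alive()
        # 3. Release shared state only after the thread has really stopped.
        if stopped:
            self._resources = None
        return stopped
```

The point of the sketch is the order inside `shutdown()`: cleanup runs last, and only if the join actually succeeded within the timeout.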

Test plan

  • Basic import test passes after changes
  • Run integration tests with libev event loop
  • Verify no regression in normal operation

This commit fixes a race condition that causes segmentation faults when
the Python driver shuts down while libev callbacks are still executing.

Changes:
1. Add null checks in prepare_callback() to handle destroyed objects
2. Stop prepare watcher before cleanup to prevent race conditions
3. Increase thread join timeout to allow proper shutdown synchronization

The issue occurred when metrics cleanup during shutdown raced with libev
callbacks executing in background threads, causing access to freed Python
objects.

Fixes crash at address 0x2f998 in prepare_callback accessing destroyed
libevwrapper_Prepare objects.
@dkropachev dkropachev marked this pull request as draft February 21, 2026 22:35
@dkropachev dkropachev marked this pull request as ready for review February 21, 2026 22:41
@dkropachev dkropachev self-assigned this Feb 21, 2026
@dkropachev dkropachev requested a review from fruch February 21, 2026 22:42
This extends the segmentation fault fix to cover all libev callbacks that
access Python objects during shutdown.

Changes:
1. Add null checks in io_callback() before accessing self->callback
2. Add null checks in timer_callback() before accessing self->callback

These prevent race conditions similar to the prepare_callback issue where
libev callbacks could execute after Python objects are destroyed during
driver shutdown.
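
The actual checks live in the C extension. The same guard logic, expressed in Python purely for illustration, might look like the sketch below. `WatcherSketch` and `dispatch` are hypothetical names, and clearing `callback` to `None` stands in for the object being destroyed.

```python
class WatcherSketch:
    """Hypothetical stand-in for an IO or timer watcher wrapper."""

    def __init__(self, callback):
        self.callback = callback

    def close(self):
        # Teardown clears the reference, the analogue of a destroyed object.
        self.callback = None


def dispatch(watcher):
    """Invoke the watcher's callback only if both still exist."""
    if watcher is None:
        return False
    callback = watcher.callback  # read once so the check and the call agree
    if callback is None:
        return False
    callback()
    return True
```

A short usage example: `dispatch(w)` returns `True` while the watcher is live, and `False` after `w.close()` or when passed `None`, without invoking anything.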

fruch commented Feb 22, 2026

@dkropachev does this replace the fix Copilot was trying to do in #680, on the C side of things? Was that one reverted?

dkropachev commented Feb 22, 2026

> @dkropachev does this replace the fix Copilot was trying to do in #680, on the C side of things? Was that one reverted?

No, it addresses the same problem from the other side. It complements it.

@fruch fruch left a comment

LGTM
