
Conversation


@youknowone youknowone commented Jan 10, 2026

Fix threading failures on #6410

Summary by CodeRabbit

  • New Features

    • Global trace/profile handlers and cross-thread settrace/setprofile APIs; per-thread current-frames reporting.
    • Per-VM async-generator hook storage and joinable thread handles for thread management.
  • Bug Fixes

    • Improved thread lifecycle and cleanup; interpreter finalization now waits for non-daemon threads.
    • fork() now warns when invoked in multi-threaded processes.
  • Chores

    • Increased CI timeouts and adjusted test settings.

✏️ Tip: You can customize this high-level summary in your review settings.


coderabbitai bot commented Jan 10, 2026

📝 Walkthrough

Walkthrough

Introduces VM-scoped tracing/profile globals, per-thread current-frame tracking, thread handle registries and lifecycle (start/join/shutdown), moves asyncgen hooks to VM fields, integrates fork safety checks, and waits for non-daemon threads during interpreter finalization.

Changes

  • Threading infrastructure & thread handles (crates/vm/src/stdlib/thread.rs): Adds public thread handle types and lifecycle (ThreadHandle, StartJoinableThreadArgs, join, _set_done), registries (THREAD_HANDLES, SHUTDOWN_HANDLES), per-thread Local data with cleanup guards, get_all_current_frames(), after_fork_child(), _shutdown(), and related lifecycle helpers.
  • sys module: frames & global trace/profile APIs (crates/vm/src/stdlib/sys.rs): Adds _current_frames() (threading-aware) plus _settraceallthreads() and _setprofileallthreads() to set global trace/profile functions; updates the asyncgen hook setters to use VM fields.
  • VM state & initialization (crates/vm/src/vm/mod.rs): Adds async_gen_firstiter/async_gen_finalizer to VirtualMachine, global_trace_func/global_profile_func to PyGlobalState, and threading registries (main_thread_ident, thread_frames, thread_handles, shutdown_handles) with initialization and frame push/pop updates.
  • Per-thread frame tracking & thread startup (crates/vm/src/vm/thread.rs): Introduces CurrentFrameSlot and helpers (init_frame_slot_if_needed, update_current_frame, cleanup_current_thread_frames, reinit_frame_slot_after_fork); makes new threads inherit global trace/profile settings and initialize asyncgen hooks to None.
  • Finalization & shutdown integration (crates/vm/src/vm/interpreter.rs): Adds wait_for_thread_shutdown() and calls threading._shutdown() during finalization, before atexit handlers.
  • Fork safety & warnings (crates/vm/src/stdlib/posix.rs): Calls after_fork_child() in the fork child path and adds warn_if_multi_threaded() to emit a DeprecationWarning when forking with multiple threads.
  • Async generator hooks (crates/vm/src/builtins/asyncgenerator.rs): Reads asyncgen hooks from VM fields (vm.async_gen_firstiter / vm.async_gen_finalizer) instead of thread-local storage.
  • CI adjustments (.github/workflows/ci.yaml): Removes a couple of polluting tests from the env vars and increases CI job timeouts (35→45 minutes).

Sequence Diagram(s)

sequenceDiagram
  participant Main as Main Thread
  participant VM as VirtualMachine
  participant Global as PyGlobalState
  participant Thread as New OS Thread
  participant Sys as _thread module

  Note over Main,VM: Thread start flow
  Main->>VM: start_joinable_thread(args)
  VM->>Global: read global_trace_func / global_profile_func
  VM->>Thread: spawn OS thread (inherit trace/profile)
  Thread->>Sys: init_frame_slot_if_needed(vm)
  Thread->>VM: register CurrentFrameSlot in thread_frames
  Thread->>VM: set async_gen_firstiter/finalizer = None
sequenceDiagram
  participant Interpreter as Finalizer
  participant SysMod as threading module
  participant ThreadRegistry as VM thread_handles/shutdown_handles

  Note over Interpreter,SysMod: Shutdown flow during finalize
  Interpreter->>SysMod: attempt to locate threading
  SysMod-->>ThreadRegistry: threading._shutdown()
  ThreadRegistry->>ThreadRegistry: wait for non-daemon threads to join
  ThreadRegistry-->>SysMod: return status/errors
  SysMod-->>Interpreter: report unraisable if error

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • ShaharNaveh

Poem

🐰 nose twitch, whiskers all a-quiver
Threads awake and frames deliver,
Traces hum through every call,
Forks now warned and shutdowns stall—
The VM hops forward, brave and clever. 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check: ✅ Passed (Check skipped: CodeRabbit's high-level summary is enabled.)
  • Title check: ✅ Passed (The title directly addresses the main changes: upgrading threading to 3.13.11, implementing sys.setprofile, and expanding threading functionality, which aligns with the changeset's core focus.)
  • Docstring Coverage: ✅ Passed (Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@youknowone youknowone marked this pull request as ready for review January 10, 2026 10:50

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @crates/vm/src/stdlib/thread.rs:
- Around line 709-739: The join method currently accepts a timeout
OptionalArg<f64> but ignores it; change the impl in join to validate the timeout
instead of unwrapping to -1.0: do not call timeout.unwrap_or(-1.0); if
timeout.is_present() return an Err via the VM (e.g. vm.new_not_implemented_error
or vm.new_runtime_error) with a clear message like "timeout is not supported for
thread.join", otherwise proceed with the existing join logic (use
self.inner.lock() and inner.join_handle as before).
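
A minimal sketch of that validation, assuming the module's existing imports and the signature from the quoted hunk (OptionalArg<f64>); the existing join logic would remain where the placeholder comment sits:

#[pymethod]
fn join(&self, timeout: OptionalArg<f64>, vm: &VirtualMachine) -> PyResult<()> {
    // Reject an explicit timeout rather than silently ignoring it.
    if let OptionalArg::Present(_) = timeout {
        return Err(vm.new_runtime_error(
            "timeout is not supported for thread.join".to_owned(),
        ));
    }
    // ... existing join logic: lock self.inner, take inner.join_handle, join it ...
    Ok(())
}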
🧹 Nitpick comments (1)
crates/vm/src/stdlib/sys.rs (1)

1110-1132: Consider simplifying None/Some conversion.

The logic is correct, but the conversion from PyObjectRef to Option<PyObjectRef> could be more concise:

♻️ Optional refactor to reduce duplication
 #[pyfunction]
 fn _settraceallthreads(tracefunc: PyObjectRef, vm: &VirtualMachine) {
-    let func = if vm.is_none(&tracefunc) {
-        None
-    } else {
-        Some(tracefunc.clone())
-    };
+    let func = (!vm.is_none(&tracefunc)).then(|| tracefunc.clone());
     *vm.state.global_trace_func.lock() = func;
     vm.trace_func.replace(tracefunc);
     update_use_tracing(vm);
 }

 #[pyfunction]
 fn _setprofileallthreads(profilefunc: PyObjectRef, vm: &VirtualMachine) {
-    let func = if vm.is_none(&profilefunc) {
-        None
-    } else {
-        Some(profilefunc.clone())
-    };
+    let func = (!vm.is_none(&profilefunc)).then(|| profilefunc.clone());
     *vm.state.global_profile_func.lock() = func;
     vm.profile_func.replace(profilefunc);
     update_use_tracing(vm);
 }
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6fe0598 and b89cca3.

⛔ Files ignored due to path filters (1)
  • Lib/threading.py is excluded by !Lib/**
📒 Files selected for processing (4)
  • crates/vm/src/stdlib/sys.rs
  • crates/vm/src/stdlib/thread.rs
  • crates/vm/src/vm/mod.rs
  • crates/vm/src/vm/thread.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/vm/mod.rs
  • crates/vm/src/stdlib/sys.rs
  • crates/vm/src/stdlib/thread.rs
  • crates/vm/src/vm/thread.rs
🧬 Code graph analysis (1)
crates/vm/src/vm/mod.rs (1)
crates/vm/src/stdlib/thread.rs (2)
  • thread (476-478)
  • init_main_thread_ident (441-444)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (windows-2025)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (windows-2025)
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Ensure compilation on various targets
🔇 Additional comments (8)
crates/vm/src/vm/thread.rs (1)

142-156: LGTM! Global tracing state properly propagated to new threads.

The implementation correctly reads the global trace/profile functions, computes the use_tracing flag, and initializes the new thread's trace/profile state. Thread safety is ensured via the mutex lock.

crates/vm/src/vm/mod.rs (3)

110-113: LGTM! Global trace/profile state properly declared.

The new fields are correctly typed with PyMutex<Option<PyObjectRef>> for thread-safe access and well-documented with their purpose.


198-199: LGTM! Proper initialization of global tracing state.

The fields are correctly initialized with default values (equivalent to PyMutex::new(None)).


308-310: LGTM! Main thread identity correctly initialized.

The main thread identifier is properly captured early in the initialization sequence, before any threading operations occur. The feature gate and explanatory comment make the intent clear.

crates/vm/src/stdlib/thread.rs (4)

3-3: LGTM! Properly exports main thread initialization function.

The init_main_thread_ident function is correctly exported for use by the VM initialization code.


177-179: Behavior change: Lock can now be instantiated from Python.

Previously, attempting to create a Lock instance would raise a type error. Now it successfully returns a new lock. This aligns with the allocate_lock() function behavior and appears intentional.


403-444: LGTM! Thread management functions properly implemented.

The new functions provide thread lifecycle management:

  • Stub implementations (_shutdown, daemon_threads_allowed) are appropriately documented
  • Main thread identification is correctly captured using AtomicCell for thread-safe access
  • _make_thread_handle properly initializes handle state

749-824: LGTM! Thread lifecycle properly managed with unused daemon parameter noted.

The start_joinable_thread function correctly:

  • Initializes thread state machine
  • Propagates VM context to new thread
  • Handles exceptions appropriately
  • Cleans up resources (sentinels, thread count)
  • Stores join handle for future joining

Note: The daemon parameter (line 756) is currently unused and marked with #[allow(dead_code)], suggesting it's reserved for future implementation.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
crates/vm/src/stdlib/thread.rs (1)

577-651: Local::l_dict() holds the global map lock while allocating a new dict; consider allocating outside the lock.
This reduces lock hold time and avoids surprising re-entrancy/deadlock hazards if vm.ctx.new_dict() ever starts interacting with user code / GC hooks.

Possible refactor (allocate before insert)
 fn l_dict(&self, vm: &VirtualMachine) -> PyDictRef {
     let thread_id = std::thread::current().id();
-    let mut data = self.inner.data.lock();
-
-    if let Some(dict) = data.get(&thread_id) {
-        return dict.clone();
-    }
-
-    // Create new dict for this thread
-    let dict = vm.ctx.new_dict();
-    data.insert(thread_id, dict.clone());
+    // Fast path: check under lock
+    if let Some(dict) = self.inner.data.lock().get(&thread_id) {
+        return dict.clone();
+    }
+
+    // Allocate outside lock
+    let dict = vm.ctx.new_dict();
+
+    // Insert (double-check to avoid races)
+    let mut data = self.inner.data.lock();
+    let dict = data.entry(thread_id).or_insert_with(|| dict.clone()).clone();
 
     // Register cleanup guard for this thread
     let guard = LocalGuard {
         local: std::sync::Arc::downgrade(&self.inner),
         thread_id,
     };
     LOCAL_GUARDS.with(|guards| {
         guards.borrow_mut().push(guard);
     });
 
     dict
 }

Also please verify PyDictRef is safe to store in a cross-thread HashMap (Send/Sync expectations) and safe to drop during TLS teardown.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b89cca3 and 800dcd4.

⛔ Files ignored due to path filters (1)
  • Lib/test/test_threading.py is excluded by !Lib/**
📒 Files selected for processing (1)
  • crates/vm/src/stdlib/thread.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/stdlib/thread.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (windows-2025)
  • GitHub Check: Run rust tests (windows-2025)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Check Rust code with clippy
🔇 Additional comments (3)
crates/vm/src/stdlib/thread.rs (3)

3-3: Re-export of init_main_thread_ident looks fine; please ensure it’s called exactly once before _get_main_thread_ident().
If init_main_thread_ident() can be missed in some build/feature path, _get_main_thread_ident() will return 0 silently. Consider a debug assertion or OnceLock-backed initialization.


176-178: Lock::py_new() fix is good (constructor now succeeds).


402-444: _shutdown() as a no-op requires verification of actual usage and shutdown semantics.

The functions _shutdown(), _make_thread_handle(), _get_main_thread_ident(), and _is_main_interpreter() are exposed to Python via #[pyfunction] in the _thread module. While these functions are not called internally within the Rust codebase, they are public Python APIs (despite the leading underscore convention for internal functions) and could be invoked by external code or the threading module during shutdown.

The concern about semantic correctness is valid: CPython's _thread._shutdown() has real shutdown semantics (joining non-daemon threads), and a no-op implementation could theoretically cause issues if Python code expects this behavior. However, without visibility into:

  • How/if the standard library's threading module calls these functions
  • The actual interpreter shutdown sequence in RustPython
  • Test coverage for shutdown behavior

We cannot definitively confirm whether this implementation gap causes practical problems or is intentionally acceptable for RustPython's threading model.

Comment on lines 802 to 1167
#[derive(FromArgs)]
struct StartJoinableThreadArgs {
    #[pyarg(positional)]
    function: ArgCallable,
    #[pyarg(any, optional)]
    handle: OptionalArg<PyRef<ThreadHandle>>,
    #[pyarg(any, default = true)]
    #[allow(dead_code)]
    daemon: bool,
}

#[pyfunction]
fn start_joinable_thread(
    args: StartJoinableThreadArgs,
    vm: &VirtualMachine,
) -> PyResult<PyRef<ThreadHandle>> {
    let handle = match args.handle {
        OptionalArg::Present(h) => h,
        OptionalArg::Missing => ThreadHandle::new().into_ref(&vm.ctx),
    };

    // Mark as starting
    handle.inner.lock().state = ThreadHandleState::Starting;

    let func = args.function;
    let handle_clone = handle.clone();
    let inner_clone = handle.inner.clone();

    let mut thread_builder = thread::Builder::new();
    let stacksize = vm.state.stacksize.load();
    if stacksize != 0 {
        thread_builder = thread_builder.stack_size(stacksize);
    }

    let join_handle = thread_builder
        .spawn(vm.new_thread().make_spawn_func(move |vm| {
            // Set ident and mark as running
            {
                let mut inner = inner_clone.lock();
                inner.ident = get_ident();
                inner.state = ThreadHandleState::Running;
            }

            // Run the function
            match func.invoke((), vm) {
                Ok(_) => {}
                Err(e) if e.fast_isinstance(vm.ctx.exceptions.system_exit) => {}
                Err(exc) => {
                    vm.run_unraisable(
                        exc,
                        Some("Exception ignored in thread started by".to_owned()),
                        func.into(),
                    );
                }
            }

            // Mark as done
            inner_clone.lock().state = ThreadHandleState::Done;

            // Handle sentinels
            for lock in SENTINELS.take() {
                if lock.mu.is_locked() {
                    unsafe { lock.mu.unlock() };
                }
            }
            vm.state.thread_count.fetch_sub(1);
        }))
        .map_err(|err| err.to_pyexception(vm))?;

    vm.state.thread_count.fetch_add(1);

    // Store the join handle
    handle.inner.lock().join_handle = Some(join_handle);

    Ok(handle_clone)
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

wc -l crates/vm/src/stdlib/thread.rs

Repository: RustPython/RustPython

Length of output: 99


🏁 Script executed:

cd crates/vm/src/stdlib && head -100 thread.rs && echo "---" && sed -n '750,900p' thread.rs

Repository: RustPython/RustPython

Length of output: 7841


🏁 Script executed:

# Get the full ThreadHandle struct definition and related code
cd crates/vm/src/stdlib && rg -A 30 "struct ThreadHandle" thread.rs

Repository: RustPython/RustPython

Length of output: 1661


🏁 Script executed:

# Check if daemon parameter is used anywhere in the file
cd crates/vm/src/stdlib && rg "daemon" thread.rs -n -B 2 -A 2

Repository: RustPython/RustPython

Length of output: 549


🏁 Script executed:

# Look for ThreadHandleState definition
cd crates/vm/src/stdlib && rg -B 5 -A 15 "enum ThreadHandleState" thread.rs

Repository: RustPython/RustPython

Length of output: 556


🏁 Script executed:

cd crates/vm/src/stdlib && rg -B 5 -A 5 "catch_unwind\|AssertUnwindSafe" thread.rs

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check for Clippy warnings
cd crates && cargo clippy --lib 2>&1 | grep -A 5 "thread.rs" | head -50

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check if there are tests for start_joinable_thread
cd crates && rg "start_joinable_thread" --type rust -B 2 -A 2

Repository: RustPython/RustPython

Length of output: 2591


🏁 Script executed:

# Verify spawning behavior - check if the closure is inside a panic-safe context
cd crates/vm/src/stdlib && sed -n '830,880p' thread.rs | cat -n

Repository: RustPython/RustPython

Length of output: 2096


🏁 Script executed:

cd crates && rg "start_joinable_thread" -B 3 -A 10 --type rust | head -100

Repository: RustPython/RustPython

Length of output: 4111


🏁 Script executed:

# Check if daemon parameter is documented anywhere
cd crates && rg "daemon.*thread" --type rust -B 2 -A 2 | grep -A 2 -B 2 "start_joinable"

Repository: RustPython/RustPython

Length of output: 2351


🏁 Script executed:

# Check the make_spawn_func to understand panic behavior
cd crates/vm && rg -B 5 -A 20 "make_spawn_func" src/ --type rust | head -80

Repository: RustPython/RustPython

Length of output: 4714


start_joinable_thread: missing panic-safe cleanup, handle reuse validation, and unused daemon parameter.

  • If the spawned thread panics before reaching the cleanup tail (lines 869–876), state won't transition to Done, sentinels may remain locked, and thread_count will leak. Wrap the closure body with std::panic::catch_unwind(AssertUnwindSafe(...)) and ensure state→Done, sentinel unlock, and thread_count.fetch_sub(1) always run via a finally-equivalent block (see the sketch after this list).
  • If a handle is passed (line 813), it's reused without validation. Reject reuse unless the handle is in NotStarted state with no existing join_handle and joined == false.
  • daemon is accepted with #[allow(dead_code)] but never stored or used. Per documentation, it's required for correct _shutdown() semantics.
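
A rough sketch of the panic-safe shape named in the first bullet, assuming the names from the quoted hunk (inner_clone, SENTINELS, vm) and a hypothetical run_python_body helper standing in for the existing func.invoke(...) match:

let result = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
    run_python_body(vm) // hypothetical stand-in for the func.invoke(...) match block
}));

// Cleanup tail that must run even if the body panicked.
inner_clone.lock().state = ThreadHandleState::Done;
for lock in SENTINELS.take() {
    if lock.mu.is_locked() {
        unsafe { lock.mu.unlock() };
    }
}
vm.state.thread_count.fetch_sub(1);

// Re-raise the panic after cleanup so it is not silently swallowed.
if let Err(payload) = result {
    std::panic::resume_unwind(payload);
}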


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Fix all issues with AI agents
In @crates/vm/src/stdlib/thread.rs:
- Around line 833-897: start_joinable_thread currently sets
handle.inner.lock().state = ThreadHandleState::Starting before calling
thread_builder.spawn, but if spawn() returns Err the handle remains in Starting;
modify start_joinable_thread to reset the handle state to a non-starting value
on spawn error (e.g., ThreadHandleState::Dead or the previous state) by
acquiring handle.inner lock and updating state inside the .map_err() error
branch or immediately after a failed spawn attempt, and ensure no join_handle is
stored when spawn fails so ThreadHandle::inner.join_handle remains None;
reference start_joinable_thread, ThreadHandle, handle.inner, join_handle and the
spawn() call when applying the change.
- Around line 436-443: init_main_thread_ident currently unconditionally
overwrites MAIN_THREAD_IDENT allowing multiple initializations; protect it by
using a std::sync::Once (e.g., a static ONCE) or by early-return when
MAIN_THREAD_IDENT.load() != 0 so the store only happens once—wrap the call to
get_ident() and MAIN_THREAD_IDENT.store(ident) inside ONCE.call_once(|| { ... })
or add a check like if MAIN_THREAD_IDENT.load() == 0 { let ident = get_ident();
MAIN_THREAD_IDENT.store(ident); } to ensure the main thread id is set only once.
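
For reference, a minimal OnceLock-backed variant of what this prompt describes (the PR itself stores the ident in an AtomicCell<u64>; get_ident is the module's existing helper):

use std::sync::OnceLock;

static MAIN_THREAD_IDENT: OnceLock<u64> = OnceLock::new();

pub fn init_main_thread_ident() {
    // Only the first call stores a value; later calls are no-ops.
    MAIN_THREAD_IDENT.get_or_init(get_ident);
}

fn _get_main_thread_ident() -> u64 {
    // Fall back to 0 if initialization was skipped, matching the AtomicCell default.
    MAIN_THREAD_IDENT.get().copied().unwrap_or(0)
}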
🧹 Nitpick comments (3)
crates/vm/src/stdlib/thread.rs (3)

408-412: Document the intended behavior of _shutdown.

The stub implementation with a no-op could cause issues if Python code relies on CPython's behavior of joining non-daemon threads. Consider:

  • Adding a TODO or FIXME comment with a tracking issue
  • Documenting any known limitations in module-level docs

Would you like me to open an issue to track the complete implementation of _shutdown?


414-423: Consider validating the ident parameter.

The function assumes ident is a valid, currently-running thread ID. Since this is an internal API (prefixed with _), document the preconditions or add a debug assertion:

Suggested validation
 #[pyfunction]
 fn _make_thread_handle(ident: u64, vm: &VirtualMachine) -> PyRef<ThreadHandle> {
+    debug_assert!(ident != 0, "thread ident must be non-zero");
     let handle = ThreadHandle::new();

577-596: Consider removing commented debug code.

Line 591 contains a commented eprintln! debug statement. Either:

  • Remove it if debugging is complete
  • Use conditional compilation with #[cfg(debug_assertions)] if it might be useful later
Optional cleanup
 impl Drop for LocalGuard {
     fn drop(&mut self) {
-        // eprintln!("[DEBUG] LocalGuard::drop called for thread {:?}", self.thread_id);
         if let Some(local_data) = self.local.upgrade() {

Otherwise, the cleanup mechanism with Weak references and thread-local guards looks solid.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 800dcd4 and e1126fc.

📒 Files selected for processing (2)
  • crates/vm/src/stdlib/sys.rs
  • crates/vm/src/stdlib/thread.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/stdlib/sys.rs
  • crates/vm/src/stdlib/thread.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run rust tests (windows-2025)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (windows-2025)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run tests under miri
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Ensure compilation on various targets
🔇 Additional comments (6)
crates/vm/src/stdlib/sys.rs (2)

1110-1116: LGTM: Global trace function implementation looks correct.

The function properly:

  • Converts the tracefunc to an Option based on None check
  • Uses locking for thread-safe global state updates
  • Updates both global state (for new threads) and current thread state
  • Triggers tracing infrastructure updates via update_use_tracing

1118-1124: LGTM: Global profile function mirrors trace implementation.

Consistent implementation with _settraceallthreads, ensuring uniform behavior for both tracing and profiling across all threads.

crates/vm/src/stdlib/thread.rs (4)

176-178: Good practice: Unused parameter marked with underscore prefix.

The _vm prefix correctly indicates the parameter is unused, following Rust conventions.


598-607: LGTM: LocalData structure is well-designed.

The use of parking_lot::Mutex with HashMap<ThreadId, PyDictRef> provides efficient thread-safe access to per-thread dictionaries.


618-653: Excellent concurrent design with proper lock management.

The implementation demonstrates several best practices:

  • Fast path optimization for existing entries
  • Allocation outside lock to minimize hold time
  • Double-check pattern with Entry API to handle races
  • Lock dropped before TLS access to avoid potential deadlocks
  • Guard registration only for new entries

656-664: LGTM: Constructor properly initializes shared LocalData.

Each Local instance gets its own Arc<LocalData>, enabling safe sharing across threads while maintaining per-thread dictionaries.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @crates/vm/src/stdlib/thread.rs:
- Around line 775-808: The join method currently ignores the timeout param and
only checks self-join when state == ThreadHandleState::Running; update
join(&self, timeout: OptionalArg<Option<f64>>, vm: &VirtualMachine) to return an
error if a non-None timeout is passed instead of silently ignoring it (check
timeout.into_option() or similar and call vm.new_value_error with a clear
message), and broaden the self-join guard to treat ThreadHandleState::Starting
the same as Running (e.g., change the condition from inner.state ==
ThreadHandleState::Running to matches!(inner.state, ThreadHandleState::Running |
ThreadHandleState::Starting)) so self-joins during startup are detected; keep
existing behavior for already-joined and join_handle.take(), and ensure you
still set inner.joined = true after successful join.
- Around line 829-893: The start_joinable_thread function sets
handle.inner.lock().state = ThreadHandleState::Starting before attempting
thread_builder.spawn, but if spawn fails (the map_err branch) a caller-provided
ThreadHandle stays in Starting; update the error path to reset the handle state
(e.g., set inner.lock().state = ThreadHandleState::NotStarted) before returning
the PyErr so the handle is left consistent—apply this to the spawn failure
return (the map_err result) for the ThreadHandle (args.handle / handle) used in
start_joinable_thread.
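
A sketch of the error-path change this prompt asks for, reusing names from the function as quoted earlier in the review (handle, thread_builder, to_pyexception); only the map_err branch differs from the current code:

let join_handle = thread_builder
    .spawn(vm.new_thread().make_spawn_func(move |vm| {
        // ... existing thread body ...
    }))
    .map_err(|err| {
        // Spawn failed: leave a caller-provided handle in a consistent, restartable state.
        handle.inner.lock().state = ThreadHandleState::NotStarted;
        err.to_pyexception(vm)
    })?;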
🧹 Nitpick comments (1)
crates/vm/src/stdlib/thread.rs (1)

853-883: Consider refactoring duplicated thread execution logic.

The thread execution logic (lines 862-883) duplicates the logic from run_thread (lines 350-368). Consider extracting this into a shared helper function to reduce duplication and improve maintainability.

Example refactor
fn execute_thread_function(
    func: ArgCallable,
    args: FuncArgs,
    vm: &VirtualMachine,
) {
    match func.invoke(args, vm) {
        Ok(_) => {}
        Err(e) if e.fast_isinstance(vm.ctx.exceptions.system_exit) => {}
        Err(exc) => {
            vm.run_unraisable(
                exc,
                Some("Exception ignored in thread started by".to_owned()),
                func.into(),
            );
        }
    }
    
    // Handle sentinels
    for lock in SENTINELS.take() {
        if lock.mu.is_locked() {
            unsafe { lock.mu.unlock() };
        }
    }
    vm.state.thread_count.fetch_sub(1);
}

Then call this from both run_thread and the spawned closure in start_joinable_thread.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e1126fc and 6af2678.

⛔ Files ignored due to path filters (1)
  • Lib/test/test_threading.py is excluded by !Lib/**
📒 Files selected for processing (1)
  • crates/vm/src/stdlib/thread.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/stdlib/thread.rs
🧬 Code graph analysis (1)
crates/vm/src/stdlib/thread.rs (1)
crates/vm/src/vm/context.rs (1)
  • new_dict (470-472)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (windows-2025)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run rust tests (windows-2025)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Check Rust code with clippy
🔇 Additional comments (10)
crates/vm/src/stdlib/thread.rs (10)

112-112: LGTM: Module attributes and Lock constructor updated correctly.

The module attribute changes to "_thread" align with Python's standard library naming, and the Lock constructor correctly initializes with RawMutex::INIT. The parameter rename to _vm follows Rust conventions for unused parameters.

Also applies to: 176-177, 190-190


402-434: LGTM: Thread management functions implemented correctly.

The new functions provide essential threading APIs:

  • daemon_threads_allowed and _is_main_interpreter return appropriate constants for RustPython's single-interpreter model
  • _shutdown is correctly stubbed as cleanup happens via the Thread class
  • _make_thread_handle properly creates a handle for an existing thread
  • _get_main_thread_ident provides access to the main thread identifier

436-443: LGTM: Main thread identifier storage implemented correctly.

The use of AtomicCell<u64> ensures thread-safe access to the main thread identifier. The initialization logic correctly captures the current thread's ID.


577-596: LGTM: Thread-local cleanup mechanism implemented correctly.

The LocalGuard with Weak reference ensures proper cleanup of thread-local data when threads terminate, while avoiding circular references that would prevent cleanup. The Drop implementation correctly checks if the LocalData still exists before attempting removal.


598-607: LGTM: LocalData structure is well-designed.

The structure uses parking_lot::Mutex to protect a HashMap mapping thread IDs to per-thread dictionaries, providing an efficient thread-local storage implementation.


610-665: Excellent implementation of thread-local storage with proper synchronization.

The Local::l_dict implementation uses an efficient double-checked locking pattern:

  • Fast path checks for existing dict under lock
  • Slow path allocates outside lock to reduce contention
  • Uses Entry API to handle races correctly
  • Explicitly drops lock before TLS access to avoid potential deadlocks
  • Guards registered only for new entries

This is a well-thought-out design that minimizes lock contention while ensuring correctness.


710-736: LGTM: Thread handle state machine and internal structure are well-designed.

The ThreadHandleState enum defines a clear lifecycle (NotStarted → Starting → Running → Done), and ThreadHandleInner tracks all necessary state including the join handle and a joined flag for idempotent join operations.


738-772: LGTM: ThreadHandle accessors and constructors are correctly implemented.

The new(), ident(), is_done(), _set_done(), and slot_new() methods properly manage the thread handle state with appropriate locking.

Also applies to: 810-816


818-827: LGTM: StartJoinableThreadArgs structure is well-defined.

The arguments structure correctly captures the function to execute, an optional pre-created handle, and a daemon flag (currently unused but allowed for future use).


3-3: The init_main_thread_ident function is properly called during VM initialization at crates/vm/src/vm/mod.rs:310, before other stdlib modules are initialized. The initialization is correctly guarded by the threading feature flag and properly captures the main thread identifier.

Comment on lines 775 to 1019
fn join(&self, timeout: OptionalArg<Option<f64>>, vm: &VirtualMachine) -> PyResult<()> {
    // Note: timeout is accepted but ignored - std::thread::JoinHandle doesn't support
    // timed joins. We always wait indefinitely. This matches the behavior when timeout
    // support is unavailable.
    let _ = timeout;

    // Check for self-join and if already joined
    let join_handle = {
        let mut inner = self.inner.lock();
        let current_ident = get_ident();

        if inner.ident == current_ident && inner.state == ThreadHandleState::Running {
            return Err(vm.new_runtime_error("cannot join current thread".to_owned()));
        }

        // If already joined, return immediately (idempotent)
        if inner.joined {
            return Ok(());
        }

        // Take the join handle if available
        inner.join_handle.take()
    };

    if let Some(handle) = join_handle {
        // Note: timeout is not supported with std::thread::JoinHandle
        let _ = handle.join();
    }

    // Mark as joined
    self.inner.lock().joined = true;

    Ok(())
}

⚠️ Potential issue | 🟠 Major

Join timeout is not implemented and self-join check is incomplete.

Two issues with the join implementation:

  1. Major limitation: The timeout parameter is accepted but completely ignored. The comment acknowledges this, but it's a significant deviation from standard Python threading behavior that could cause unexpected hangs.

  2. Minor gap: The self-join check at Line 786 only checks state == Running, missing the Starting state. A thread calling join on itself while still starting might not be caught.

Suggested improvements

For the self-join check, consider:

-if inner.ident == current_ident && inner.state == ThreadHandleState::Running {
+if inner.ident == current_ident && 
+   (inner.state == ThreadHandleState::Running || inner.state == ThreadHandleState::Starting) {
     return Err(vm.new_runtime_error("cannot join current thread".to_owned()));
 }

For the timeout limitation, document this clearly in the module-level documentation or consider returning an error when a timeout is provided, rather than silently ignoring it.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
fn join(&self, timeout: OptionalArg<Option<f64>>, vm: &VirtualMachine) -> PyResult<()> {
    // Note: timeout is accepted but ignored - std::thread::JoinHandle doesn't support
    // timed joins. We always wait indefinitely. This matches the behavior when timeout
    // support is unavailable.
    let _ = timeout;
    // Check for self-join and if already joined
    let join_handle = {
        let mut inner = self.inner.lock();
        let current_ident = get_ident();
        if inner.ident == current_ident && inner.state == ThreadHandleState::Running {
            return Err(vm.new_runtime_error("cannot join current thread".to_owned()));
        }
        // If already joined, return immediately (idempotent)
        if inner.joined {
            return Ok(());
        }
        // Take the join handle if available
        inner.join_handle.take()
    };
    if let Some(handle) = join_handle {
        // Note: timeout is not supported with std::thread::JoinHandle
        let _ = handle.join();
    }
    // Mark as joined
    self.inner.lock().joined = true;
    Ok(())
}

fn join(&self, timeout: OptionalArg<Option<f64>>, vm: &VirtualMachine) -> PyResult<()> {
    // Note: timeout is accepted but ignored - std::thread::JoinHandle doesn't support
    // timed joins. We always wait indefinitely. This matches the behavior when timeout
    // support is unavailable.
    let _ = timeout;
    // Check for self-join and if already joined
    let join_handle = {
        let mut inner = self.inner.lock();
        let current_ident = get_ident();
        if inner.ident == current_ident &&
            (inner.state == ThreadHandleState::Running || inner.state == ThreadHandleState::Starting) {
            return Err(vm.new_runtime_error("cannot join current thread".to_owned()));
        }
        // If already joined, return immediately (idempotent)
        if inner.joined {
            return Ok(());
        }
        // Take the join handle if available
        inner.join_handle.take()
    };
    if let Some(handle) = join_handle {
        // Note: timeout is not supported with std::thread::JoinHandle
        let _ = handle.join();
    }
    // Mark as joined
    self.inner.lock().joined = true;
    Ok(())
}
🤖 Prompt for AI Agents
In @crates/vm/src/stdlib/thread.rs around lines 775 - 808, The join method
currently ignores the timeout param and only checks self-join when state ==
ThreadHandleState::Running; update join(&self, timeout:
OptionalArg<Option<f64>>, vm: &VirtualMachine) to return an error if a non-None
timeout is passed instead of silently ignoring it (check timeout.into_option()
or similar and call vm.new_value_error with a clear message), and broaden the
self-join guard to treat ThreadHandleState::Starting the same as Running (e.g.,
change the condition from inner.state == ThreadHandleState::Running to
matches!(inner.state, ThreadHandleState::Running | ThreadHandleState::Starting))
so self-joins during startup are detected; keep existing behavior for
already-joined and join_handle.take(), and ensure you still set inner.joined =
true after successful join.

@youknowone youknowone marked this pull request as draft January 10, 2026 13:38
@youknowone youknowone force-pushed the threading branch 2 times, most recently from cf7fe45 to 233e7de on January 11, 2026 02:21
@youknowone youknowone marked this pull request as ready for review January 11, 2026 03:04

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vm/src/stdlib/thread.rs (1)

242-256: Review _at_fork_reinit implementation for potential memory safety issue.

Lines 252-253 attempt to swap the RawRMutex through an AtomicCell, but the types don't match:

  • Line 252 creates AtomicCell<&RawRMutex> (reference to mutex)
  • Line 253 tries to swap with &RawRMutex (also a reference)

This differs from the Lock::_at_fork_reinit pattern (lines 153-157) which uses AtomicCell<RawMutex> (by value). The current code may not actually reinitialize the mutex as intended.

🔧 Suggested fix to match Lock's pattern
         self.count.store(0, core::sync::atomic::Ordering::Relaxed);
         let new_mut = RawRMutex::INIT;
-
-        let old_mutex: AtomicCell<&RawRMutex> = AtomicCell::new(&self.mu);
-        old_mutex.swap(&new_mut);
-
+        unsafe {
+            let old_mutex: &AtomicCell<RawRMutex> = core::mem::transmute(&self.mu);
+            old_mutex.swap(new_mut);
+        }
         Ok(())
🤖 Fix all issues with AI agents
In @crates/vm/src/stdlib/thread.rs:
- Around line 1086-1104: The defer block in start_joinable_thread is missing a
call to cleanup_thread_local_data(), so thread-local Local objects won't run
their __del__ on thread exit; update the scopeguard::defer inside
start_joinable_thread to invoke cleanup_thread_local_data() (similar to
run_thread) before handling SENTINELS and before decrementing
vm_state.thread_count, ensuring thread-local cleanup runs even if the thread
panics or returns.
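
A sketch of the requested defer block, assuming the scopeguard crate already referenced for start_joinable_thread and the cleanup_thread_local_data helper named above:

scopeguard::defer! {
    // Run thread-local __del__ hooks even on panic or early return.
    cleanup_thread_local_data(vm);
    // Release any sentinel locks this thread still holds.
    for lock in SENTINELS.take() {
        if lock.mu.is_locked() {
            unsafe { lock.mu.unlock() };
        }
    }
    vm.state.thread_count.fetch_sub(1);
}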
🧹 Nitpick comments (2)
crates/vm/src/stdlib/posix.rs (1)

678-725: Consider using RuntimeWarning instead of DeprecationWarning.

The warning mechanism is well-implemented with proper error handling and context. However, using DeprecationWarning for a runtime safety concern (potential deadlocks in multi-threaded forking) may not be semantically correct—RuntimeWarning would better communicate this is a runtime safety issue rather than a deprecated API.

⚠️ Proposed change to use RuntimeWarning
             let args = crate::function::FuncArgs::new(
                 vec![
                     vm.ctx.new_str(msg).into(),
-                    vm.ctx.exceptions.deprecation_warning.as_object().to_owned(),
+                    vm.ctx.exceptions.runtime_warning.as_object().to_owned(),
                 ],
                 crate::function::KwArgs::new(
                     [("stacklevel".to_owned(), vm.ctx.new_int(2).into())]
crates/vm/src/stdlib/thread.rs (1)

991-1001: Consider using a condvar instead of spin-waiting for concurrent joins.

Lines 993-1000 use yield_now() in a spin-loop when another thread is already joining. While correct, this could cause unnecessary CPU usage. Consider adding a condvar to signal when joined transitions to true, similar to how done_event works.

♻️ Alternative using condvar for join completion

Add a joined_event condvar to ThreadHandle (similar to done_event) and wait on it instead of spinning. This would make concurrent joins more efficient.
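
A minimal sketch of such a joined_event using parking_lot primitives already in use in this file; the names here are illustrative, not from the PR:

use parking_lot::{Condvar, Mutex};

struct JoinedEvent {
    joined: Mutex<bool>,
    cond: Condvar,
}

impl JoinedEvent {
    // Block until another thread signals that the join has completed.
    fn wait(&self) {
        let mut joined = self.joined.lock();
        while !*joined {
            self.cond.wait(&mut joined);
        }
    }

    // Mark the join as done and wake all concurrent waiters.
    fn notify(&self) {
        *self.joined.lock() = true;
        self.cond.notify_all();
    }
}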

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6af2678 and 9f83dc4.

⛔ Files ignored due to path filters (9)
  • Lib/_dummy_thread.py is excluded by !Lib/**
  • Lib/dummy_threading.py is excluded by !Lib/**
  • Lib/test/lock_tests.py is excluded by !Lib/**
  • Lib/test/test_concurrent_futures/test_process_pool.py is excluded by !Lib/**
  • Lib/test/test_concurrent_futures/test_shutdown.py is excluded by !Lib/**
  • Lib/test/test_concurrent_futures/test_thread_pool.py is excluded by !Lib/**
  • Lib/test/test_importlib/test_locks.py is excluded by !Lib/**
  • Lib/test/test_threading.py is excluded by !Lib/**
  • Lib/threading.py is excluded by !Lib/**
📒 Files selected for processing (7)
  • .github/workflows/ci.yaml
  • crates/vm/src/stdlib/posix.rs
  • crates/vm/src/stdlib/sys.rs
  • crates/vm/src/stdlib/thread.rs
  • crates/vm/src/vm/interpreter.rs
  • crates/vm/src/vm/mod.rs
  • crates/vm/src/vm/thread.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/vm/src/vm/thread.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/stdlib/sys.rs
  • crates/vm/src/stdlib/posix.rs
  • crates/vm/src/vm/interpreter.rs
  • crates/vm/src/vm/mod.rs
  • crates/vm/src/stdlib/thread.rs
🧠 Learnings (2)
📚 Learning: 2025-07-11T09:35:43.571Z
Learnt from: ShaharNaveh
Repo: RustPython/RustPython PR: 5947
File: .github/workflows/ci.yaml:116-116
Timestamp: 2025-07-11T09:35:43.571Z
Learning: GitHub Actions timeout-minutes field supports expressions that evaluate to integers, such as `${{ contains(matrix.os, 'windows') && 40 || 30 }}`, which will conditionally set different timeout values based on the runner OS.

Applied to files:

  • .github/workflows/ci.yaml
📚 Learning: 2025-12-27T14:03:49.034Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.034Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust

Applied to files:

  • crates/vm/src/stdlib/sys.rs
🧬 Code graph analysis (2)
crates/vm/src/stdlib/posix.rs (1)
crates/vm/src/stdlib/thread.rs (2)
  • thread (574-576)
  • after_fork_child (820-858)
crates/vm/src/vm/mod.rs (1)
crates/vm/src/stdlib/thread.rs (2)
  • thread (574-576)
  • init_main_thread_ident (539-542)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run rust tests (windows-2025)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests (windows-2025)
🔇 Additional comments (19)
.github/workflows/ci.yaml (1)

283-299: LGTM! Timeout increases align with threading enhancements.

The increased timeouts (35→45 minutes) and the addition of a 45-minute timeout for Windows tests are appropriate given the expanded threading test coverage introduced in this PR. The threading-related changes (tracing across threads, fork-awareness, thread shutdown integration) naturally extend test execution time.

crates/vm/src/stdlib/posix.rs (2)

664-667: LGTM! Clean integration of thread state cleanup after fork.

The call to crate::stdlib::thread::after_fork_child() correctly marks non-current threads as done in the child process, preventing deadlocks. The placement before running Python callbacks and the feature guard are both correct.


729-729: LGTM! Warning placement is correct.

Calling warn_if_multi_threaded before the actual fork operation ensures users are notified of potential issues at the right time.

crates/vm/src/stdlib/sys.rs (1)

1110-1124: LGTM! Dual-state update pattern correctly enables all-threads tracing.

The functions properly update both:

  1. Global state (vm.state.global_trace_func/global_profile_func) for new threads to inherit
  2. Current thread state (vm.trace_func/profile_func) for immediate effect

This design ensures existing threads see the change immediately while new threads created afterward inherit the global setting.

crates/vm/src/vm/interpreter.rs (2)

113-118: LGTM! Documentation accurately reflects the new finalization step.

The update correctly documents the addition of thread shutdown waiting as step 3 in the finalization sequence.


132-173: LGTM! Clean implementation of threading shutdown integration.

The placement is correct—waiting for threads after computing the exit code but before running atexit functions ensures proper cleanup order. The implementation:

  • Gracefully handles the case where threading was never imported
  • Uses run_unraisable appropriately for non-fatal errors during shutdown
  • Mirrors CPython's behavior in waiting for non-daemon threads
crates/vm/src/vm/mod.rs (3)

110-113: LGTM! Global tracing state properly added to PyGlobalState.

The new fields are:

  • Correctly typed as PyMutex<Option<PyObjectRef>> for thread-safe access
  • Well-documented with references to the sys functions that set them
  • Public, allowing access from thread initialization code

198-199: LGTM! Fields properly initialized.

Using PyMutex::default() correctly initializes both fields to None, ensuring new VMs start without global tracing enabled.


308-310: LGTM! Main thread identity initialization correctly ordered.

Calling stdlib::thread::init_main_thread_ident() before any threading operations ensures the main thread's identity is established, which is necessary for proper thread tracking and fork handling. The feature guard and explanatory comment are appropriate.

crates/vm/src/stdlib/thread.rs (10)

3-3: LGTM: New public exports for fork and thread initialization.

The addition of after_fork_child and init_main_thread_ident to the public API appropriately exposes fork handling and main thread initialization required by the VM initialization path.


110-179: LGTM: Lock class correctly updated for _thread module.

The constructor now properly returns an initialized, unlocked mutex matching Python's threading.Lock() semantics. The module attribute correctly reflects _thread.


378-409: LGTM: Thread cleanup sequence properly handles thread-local storage.

The cleanup sequence is correct:

  1. Sentinel locks are unlocked (lines 390-394)
  2. Thread-local data cleanup triggers __del__ methods (line 397)
  3. Thread count is decremented (line 398)

This ordering ensures __del__ methods execute while the VM context is still valid. The guard-based cleanup mechanism correctly removes per-thread dictionaries when threads exit.


457-499: LGTM: Shutdown logic correctly waits for non-daemon threads.

The implementation properly:

  • Excludes the current thread from waiting (line 470, 479)
  • Cleans up dead weak references (line 467)
  • Uses condvar for efficient waiting (lines 490-494)
  • Protects against indefinite hangs via the scopeguard in start_joinable_thread (lines 1086-1104) which ensures state is set to Done even on panic

513-542: LGTM: Thread identity helpers correctly support fork semantics.

The main thread identity tracking properly initializes at startup and updates after fork (see line 824 in after_fork_child). The _make_thread_handle function correctly creates thread handles in the Running state for existing threads.


717-752: Thread-local storage implementation uses sound double-checked locking pattern.

The l_dict method correctly:

  • Uses fast path for existing entries (lines 721-723)
  • Allocates outside the lock to minimize lock hold time (line 726)
  • Uses Entry API for race-safe insertion (lines 731-737)
  • Releases lock before TLS access to avoid lock ordering issues (line 738)
  • Registers cleanup guard only for new entries (lines 741-748)

The brief window between lock release (line 738) and guard registration (line 748) is acceptable since the thread cannot exit during its own method call under normal circumstances.
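
The same pattern in a generic, self-contained form (illustrative only; the real l_dict stores PyDictRef values and also registers a LocalGuard):

use std::collections::HashMap;
use std::thread::{self, ThreadId};

use parking_lot::Mutex;

fn get_or_insert<V: Clone>(map: &Mutex<HashMap<ThreadId, V>>, make: impl FnOnce() -> V) -> V {
    let id = thread::current().id();
    // Fast path: return an existing entry, holding the lock only briefly.
    if let Some(v) = map.lock().get(&id) {
        return v.clone();
    }
    // Allocate outside the lock to keep hold time short.
    let fresh = make();
    // Re-acquire and insert via the entry API; if another thread raced us, its value wins.
    map.lock().entry(id).or_insert(fresh).clone()
}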


820-858: LGTM: Fork handling correctly prevents orphaned thread joins in child process.

The after_fork_child implementation properly:

  • Updates main thread identity to the current (only) thread in child (line 824)
  • Preserves the current thread and not-yet-started threads (lines 840-845)
  • Marks all other threads as Done and clears join handles (lines 848-850)
  • Notifies waiters to unblock any pending joins (lines 852-854)

This prevents the child process from hanging on join() calls for threads that were inherited but don't actually exist in the child.


954-961: LGTM: Self-join detection correctly prevents deadlock.

The check properly identifies when a thread attempts to join itself and returns an appropriate error, preventing deadlock.


1073-1127: Thread spawning correctly integrates with VM thread creation.

The implementation properly:

  • Creates a new VM context via vm.new_thread() (line 1074)
  • Sets thread identity in the spawned thread (lines 1076-1080)
  • Registers non-daemon threads for shutdown tracking (lines 1058-1060)
  • Handles exceptions with run_unraisable (lines 1110-1116)
  • Stores the join handle for later joining (line 1124)

285-339: Thread identity implementation correctly handles fork compatibility on Unix.

The use of pthread_self() via libc on Unix ensures thread IDs remain stable after fork, which is critical for the fork handling logic. The non-Unix fallback using ThreadId hash is appropriate. The libc crate is properly declared as a dependency, and pthread_self() is correctly invoked with the cast to u64.
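
A sketch of the cfg-gated scheme described above (not the PR's exact code); it assumes the libc crate on Unix and falls back to hashing the Rust ThreadId elsewhere:

#[cfg(unix)]
pub fn get_ident() -> u64 {
    // pthread_self() remains stable across fork in the child, unlike Rust's ThreadId.
    unsafe { libc::pthread_self() as u64 }
}

#[cfg(not(unix))]
pub fn get_ident() -> u64 {
    use std::hash::{Hash, Hasher};
    let mut hasher = std::collections::hash_map::DefaultHasher::new();
    std::thread::current().id().hash(&mut hasher);
    hasher.finish()
}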


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
crates/vm/src/stdlib/thread.rs (3)

142-160: UB in Lock::_at_fork_reinit: transmute to AtomicCell<RawMutex> is not sound.
This is relying on layout/aliasing guarantees that Rust does not provide, and it’s very easy to “work until it doesn’t”.

Proposed direction: store the raw mutex in interior-mutability and reset with `ptr::write`
 use parking_lot::{
     RawMutex, RawThreadId,
     lock_api::{RawMutex as RawMutexT, RawMutexTimed, RawReentrantMutex},
 };
+use core::cell::UnsafeCell;

 struct Lock {
-    mu: RawMutex,
+    mu: UnsafeCell<RawMutex>,
 }

@@
     fn acquire(&self, args: AcquireArgs, vm: &VirtualMachine) -> PyResult<bool> {
-        acquire_lock_impl!(&self.mu, args, vm)
+        let mu = unsafe { &*self.mu.get() };
+        acquire_lock_impl!(mu, args, vm)
     }

@@
     fn release(&self, vm: &VirtualMachine) -> PyResult<()> {
-        if !self.mu.is_locked() {
+        let mu = unsafe { &*self.mu.get() };
+        if !mu.is_locked() {
             return Err(vm.new_runtime_error("release unlocked lock"));
         }
-        unsafe { self.mu.unlock() };
+        unsafe { mu.unlock() };
         Ok(())
     }

@@
     fn _at_fork_reinit(&self, _vm: &VirtualMachine) -> PyResult<()> {
-        if self.mu.is_locked() {
-            unsafe {
-                self.mu.unlock();
-            };
-        }
-        // Casting to AtomicCell is as unsafe as CPython code.
-        // Using AtomicCell will prevent compiler optimizer move it to somewhere later unsafe place.
-        // It will be not under the cell anymore after init call.
-
-        let new_mut = RawMutex::INIT;
-        unsafe {
-            let old_mutex: &AtomicCell<RawMutex> = core::mem::transmute(&self.mu);
-            old_mutex.swap(new_mut);
-        }
+        let mu = unsafe { &mut *self.mu.get() };
+        if mu.is_locked() {
+            unsafe { mu.unlock() };
+        }
+        // After fork we must re-init the lock in-place.
+        unsafe { core::ptr::write(mu, RawMutex::INIT) };
         Ok(())
     }

@@
     fn py_new(_cls: &Py<PyType>, _args: Self::Args, _vm: &VirtualMachine) -> PyResult<Self> {
-        Ok(Self { mu: RawMutex::INIT })
+        Ok(Self { mu: UnsafeCell::new(RawMutex::INIT) })
     }

242-256: RLock::_at_fork_reinit currently doesn’t reinitialize self.mu (and the swap is a no-op / invalid).
AtomicCell<&RawRMutex> swaps a pointer inside a local AtomicCell; it never mutates self.mu, and &new_mut is a reference to a stack local.

Minimal fix idea (aligned with the Lock fix approach): put `mu` behind `UnsafeCell` and `ptr::write` it
+use core::cell::UnsafeCell;

 struct RLock {
-    mu: RawRMutex,
+    mu: UnsafeCell<RawRMutex>,
     count: core::sync::atomic::AtomicUsize,
 }

@@
     fn slot_new(cls: PyTypeRef, _args: FuncArgs, vm: &VirtualMachine) -> PyResult {
         Self {
-            mu: RawRMutex::INIT,
+            mu: UnsafeCell::new(RawRMutex::INIT),
             count: core::sync::atomic::AtomicUsize::new(0),
         }

@@
     fn acquire(&self, args: AcquireArgs, vm: &VirtualMachine) -> PyResult<bool> {
-        let result = acquire_lock_impl!(&self.mu, args, vm)?;
+        let mu = unsafe { &*self.mu.get() };
+        let result = acquire_lock_impl!(mu, args, vm)?;
         ...
     }

@@
     fn release(&self, vm: &VirtualMachine) -> PyResult<()> {
-        if !self.mu.is_locked() {
+        let mu = unsafe { &*self.mu.get() };
+        if !mu.is_locked() {
             return Err(vm.new_runtime_error("release unlocked lock"));
         }
         ...
-        unsafe { self.mu.unlock() };
+        unsafe { mu.unlock() };
         Ok(())
     }

@@
     fn _at_fork_reinit(&self, _vm: &VirtualMachine) -> PyResult<()> {
-        if self.mu.is_locked() {
-            unsafe {
-                self.mu.unlock();
-            };
-        }
+        let mu = unsafe { &mut *self.mu.get() };
+        if mu.is_locked() {
+            unsafe { mu.unlock() };
+        }
         self.count.store(0, core::sync::atomic::Ordering::Relaxed);
-        let new_mut = RawRMutex::INIT;
-
-        let old_mutex: AtomicCell<&RawRMutex> = AtomicCell::new(&self.mu);
-        old_mutex.swap(&new_mut);
+        unsafe { core::ptr::write(mu, RawRMutex::INIT) };
         Ok(())
     }

@@
     fn _is_owned(&self) -> bool {
-        self.mu.is_owned_by_current_thread()
+        unsafe { &*self.mu.get() }.is_owned_by_current_thread()
     }

347-376: thread_count increments after spawn can underflow if the child exits quickly.
The spawned thread calls fetch_sub(1) at the end; if that runs before the parent’s fetch_add(1), the counter can wrap.

Proposed fix: increment before spawn; roll back on spawn error
     fn start_new_thread(...) -> PyResult<u64> {
@@
-        thread_builder
-            .spawn(
+        vm.state.thread_count.fetch_add(1);
+        thread_builder
+            .spawn(
                 vm.new_thread()
                     .make_spawn_func(move |vm| run_thread(func, args, vm)),
             )
             .map(|handle| {
-                vm.state.thread_count.fetch_add(1);
                 thread_to_id(&handle)
             })
-            .map_err(|err| vm.new_runtime_error(format!("can't start new thread: {err}")))
+            .map_err(|err| {
+                vm.state.thread_count.fetch_sub(1);
+                vm.new_runtime_error(format!("can't start new thread: {err}"))
+            })
     }
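
As a self-contained illustration of the same ordering, here is a sketch with a plain AtomicUsize and std::thread standing in for vm.state.thread_count and the VM spawn path (the names are assumptions, not the PR's code):

use std::io;
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

fn spawn_counted(
    count: Arc<AtomicUsize>,
    f: impl FnOnce() + Send + 'static,
) -> io::Result<thread::JoinHandle<()>> {
    // Increment before spawning so the child's decrement can never run first.
    count.fetch_add(1, Ordering::SeqCst);
    let child_count = count.clone();
    thread::Builder::new()
        .spawn(move || {
            f();
            child_count.fetch_sub(1, Ordering::SeqCst);
        })
        .map_err(|err| {
            // Roll back on spawn failure so the counter stays balanced.
            count.fetch_sub(1, Ordering::SeqCst);
            err
        })
}
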
🤖 Fix all issues with AI agents
In @crates/vm/src/stdlib/sys.rs:
- Around line 1110-1125: The docstring and implementation disagree:
_settraceallthreads and _setprofileallthreads only set vm.state.global_* and the
current vm.trace_func/vm.profile_func, but do not propagate to threads already
running; either update the documentation to say only the current thread and
future threads are affected, or implement propagation by iterating the VM's
active thread registry/handles and updating each thread's trace/profile field
(mirror what you do for the current thread), then call update_use_tracing() once
after updates; change the docstring if you choose the documentation route or add
the iteration logic (accessing the VM thread list / per-thread trace/profile
slots) if you choose to broadcast to existing threads.

In @crates/vm/src/stdlib/thread.rs:
- Around line 395-409: The joinable-thread exit path in start_joinable_thread
currently omits the TLS cleanup done in run_thread, risking PyObjectRef drops
outside an active VM; modify the worker closure in start_joinable_thread to
mirror run_thread’s shutdown sequence by invoking cleanup_thread_local_data()
while the VM context is still active and then decrementing vm.state.thread_count
(matching the vm.state.thread_count.fetch_sub(1) behavior), ensuring any
LocalGuard drops occur inside the VM context before the thread exits.
- Around line 890-1033: ThreadHandle::join currently can block forever for
NotStarted/Starting states and busy-spins while waiting for another joiner;
update join to first validate ThreadHandleInner.state (return an Err for
NotStarted or handle a Starting-to-Running transition instead of proceeding to
wait indefinitely), and replace the spin-wait on inner.joining with a proper
wait on the existing done_event Condvar (or add a short-wait condvar on
ThreadHandleInner) so that when another thread is joining we wait on a condition
variable rather than loop with std::thread::yield_now(); specifically change the
joining-path where you poll inner.joined to instead wait on done_event (the
Arc<(parking_lot::Mutex<bool>, parking_lot::Condvar)>) or notify a new condvar
when joining finishes, ensure inner.joining/inner.joined are set and cleared
under the inner lock, and keep the actual join_handle.join() outside the lock
as-is.
- Around line 1044-1127: In start_joinable_thread: avoid thread_count underflow
by incrementing vm.state.thread_count (fetch_add(1)) before spawning the OS
thread and if spawn fails immediately decrement it (fetch_sub(1)) so counts stay
balanced; if spawn fails and the caller supplied args.handle, set
handle.inner.lock().state = ThreadHandleState::Done, set the done_event
(notify_all) and remove it from the shutdown registry if you previously called
add_to_shutdown_handles so callers don't observe a permanently Starting handle;
ensure on normal thread exit the scopeguard defer also calls
cleanup_thread_local_data() before decrementing thread_count; relevant symbols:
start_joinable_thread, add_to_shutdown_handles, vm.state.thread_count,
ThreadHandleState::Starting/Done, handle.inner.lock().join_handle, done_event,
cleanup_thread_local_data, SENTINELS.
- Around line 443-512: _shutdown() can deadlock due to inconsistent lock
ordering (holding done_event lock and then checking inner.lock()); change it so
the waiting loop only inspects the done boolean under the done_event lock (do
not read inner.lock().state while holding done_event), and after waking drop the
done_event lock before acquiring inner.lock() if you need to re-check state. In
ThreadHandle::join(), handle spurious wakeups by turning the infinite wait into
a loop that re-checks the done flag (while !*done { cvar.wait(&mut done); }),
and for the timeout path ensure you verify the done flag after wait_timeout and
return a timeout only if the flag is still false. Reference functions:
_shutdown(), add_to_shutdown_handles, and ThreadHandle::join().
🧹 Nitpick comments (1)
crates/vm/src/vm/interpreter.rs (1)

111-143: Consider setting finalizing=true before waiting for thread shutdown.
Otherwise, other threads may still spawn new non-daemon threads while threading._shutdown() is trying to quiesce, risking hangs / non-deterministic shutdown.

Also applies to: 146-173
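
A minimal, self-contained sketch of the suggested ordering, with a plain AtomicBool standing in for vm.state.finalizing (the field and function names here are assumptions, not the PR's code):

use std::sync::atomic::{AtomicBool, Ordering};

static FINALIZING: AtomicBool = AtomicBool::new(false);

// Thread-creation paths would consult this gate and refuse to register
// new non-daemon threads once finalization has begun.
fn spawn_allowed() -> bool {
    !FINALIZING.load(Ordering::Acquire)
}

fn finalize_sketch(wait_for_thread_shutdown: impl FnOnce(), run_atexit: impl FnOnce()) {
    FINALIZING.store(true, Ordering::Release); // 1. close the gate first
    wait_for_thread_shutdown();                // 2. then quiesce non-daemon threads
    run_atexit();                              // 3. then run atexit handlers
}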

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9f83dc4 and d8d36c6.

⛔ Files ignored due to path filters (9)
  • Lib/_dummy_thread.py is excluded by !Lib/**
  • Lib/dummy_threading.py is excluded by !Lib/**
  • Lib/test/lock_tests.py is excluded by !Lib/**
  • Lib/test/test_concurrent_futures/test_process_pool.py is excluded by !Lib/**
  • Lib/test/test_concurrent_futures/test_shutdown.py is excluded by !Lib/**
  • Lib/test/test_concurrent_futures/test_thread_pool.py is excluded by !Lib/**
  • Lib/test/test_importlib/test_locks.py is excluded by !Lib/**
  • Lib/test/test_threading.py is excluded by !Lib/**
  • Lib/threading.py is excluded by !Lib/**
📒 Files selected for processing (7)
  • .github/workflows/ci.yaml
  • crates/vm/src/stdlib/posix.rs
  • crates/vm/src/stdlib/sys.rs
  • crates/vm/src/stdlib/thread.rs
  • crates/vm/src/vm/interpreter.rs
  • crates/vm/src/vm/mod.rs
  • crates/vm/src/vm/thread.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/ci.yaml
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/stdlib/sys.rs
  • crates/vm/src/vm/mod.rs
  • crates/vm/src/vm/thread.rs
  • crates/vm/src/stdlib/thread.rs
  • crates/vm/src/vm/interpreter.rs
  • crates/vm/src/stdlib/posix.rs
🧠 Learnings (1)
📚 Learning: 2025-12-27T14:03:49.034Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.034Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust

Applied to files:

  • crates/vm/src/stdlib/sys.rs
🧬 Code graph analysis (2)
crates/vm/src/vm/thread.rs (1)
crates/vm/src/vm/mod.rs (1)
  • new (125-230)
crates/vm/src/stdlib/posix.rs (1)
crates/vm/src/stdlib/thread.rs (2)
  • thread (574-576)
  • after_fork_child (820-858)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run rust tests (windows-2025)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Run snippets and cpython tests (windows-2025)
  • GitHub Check: Ensure compilation on various targets
🔇 Additional comments (7)
crates/vm/src/stdlib/thread.rs (3)

3-3: Export surface change looks consistent with PR goals.


285-339: Verify pthread_t casting safety and equality semantics across target platforms.

The code casts pthread_self() (line 296) and as_pthread_t() (line 333) directly to u64 without compile-time validation. While this works on tested platforms (x86_64, aarch64), pthread_t representation varies across Unix systems—some use structs, and 32-bit platforms may truncate when casting to u64. Since thread IDs are compared for equality in critical paths (fork cleanup, self-join detection, multiprocessing ownership), information loss would break correctness. Add compile-time assertions to verify safe casting, or document which architectures are unsupported and why.
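
One way to express such a compile-time check, as a sketch assuming the libc crate on unix targets:

// Fails the build on any unix target where pthread_t cannot be widened to u64
// without loss; non-integer/pointer representations still need an explicit port.
#[cfg(unix)]
const _: () = assert!(
    core::mem::size_of::<libc::pthread_t>() <= core::mem::size_of::<u64>(),
    "pthread_t does not fit in u64 on this target"
);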


110-113: Naming is correct and matches CPython 3.13's _thread module.

The current implementation is correct: #[pyclass(name = "lock")] sets the actual class name to _thread.lock (matching CPython's tp_name), and #[pyattr(name = "LockType")] creates the type alias _thread.LockType as expected by stdlib. RLock uses #[pyclass(name = "RLock")] without the LockType alias, which is also correct per CPython behavior.

crates/vm/src/stdlib/posix.rs (1)

663-668: Calling after_fork_child() before Python callbacks is the right ordering.

crates/vm/src/vm/mod.rs (2)

110-114: Global trace/profile state wiring looks straightforward.

Also applies to: 198-200


308-311: Code formatting appears clean; however, verification requires running cargo tools.

The code at lines 308-311 shows proper indentation, no trailing whitespace, and follows Rust conventions. However, the review's requirement to verify with cargo fmt --check and cargo clippy -- -D warnings cannot be completed in this environment due to sandbox limitations. The code visually conforms to Rust style standards and the feature-gated threading initialization is correctly placed, but formal tool-based verification is needed to confirm no linting warnings are introduced.

crates/vm/src/vm/thread.rs (1)

141-166: Thread snapshot semantics are correct: globals are captured at creation time.

The new_thread() method clones global_trace_func and global_profile_func once at thread creation and stores them in thread-local RefCell values. When _settraceallthreads() or _setprofileallthreads() are called, they update the shared global state and the calling thread's copy, but do not propagate changes to threads that were already created. This one-time snapshot design appears intentional and matches the thread isolation pattern, though it differs from potentially updating all existing threads. Confirm the intended semantics with tests if Python compatibility requires broader propagation.
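
A small, self-contained illustration of this snapshot-at-creation behavior, using an Arc<Mutex<Option<String>>> as a stand-in for the global trace slot (not the PR's types):

use std::sync::{Arc, Mutex};

fn main() {
    let global = Arc::new(Mutex::new(Some("tracer-v1".to_owned())));

    // Thread creation clones the current global into a per-thread copy.
    let snapshot = global.lock().unwrap().clone();

    // A later _settraceallthreads-style update replaces the global...
    *global.lock().unwrap() = Some("tracer-v2".to_owned());

    // ...but the earlier snapshot still holds the value captured at creation time.
    assert_eq!(snapshot.as_deref(), Some("tracer-v1"));
    assert_eq!(global.lock().unwrap().as_deref(), Some("tracer-v2"));
}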

Comment on lines +1110 to +1151
#[pyfunction]
fn _settraceallthreads(tracefunc: PyObjectRef, vm: &VirtualMachine) {
    let func = (!vm.is_none(&tracefunc)).then(|| tracefunc.clone());
    *vm.state.global_trace_func.lock() = func;
    vm.trace_func.replace(tracefunc);
    update_use_tracing(vm);
}

#[pyfunction]
fn _setprofileallthreads(profilefunc: PyObjectRef, vm: &VirtualMachine) {
    let func = (!vm.is_none(&profilefunc)).then(|| profilefunc.clone());
    *vm.state.global_profile_func.lock() = func;
    vm.profile_func.replace(profilefunc);
    update_use_tracing(vm);
}

Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's understand the context - look at the full file around the lines mentioned
sed -n '1090,1150p' crates/vm/src/stdlib/sys.rs

Repository: RustPython/RustPython

Length of output: 2253


🏁 Script executed:

# Search for how trace_func and profile_func are used
rg 'trace_func|profile_func' crates/vm/src/ -B 2 -A 2

Repository: RustPython/RustPython

Length of output: 7623


🏁 Script executed:

# Look for where new threads are created to see how they inherit trace/profile functions
rg 'spawn|thread|Thread' crates/vm/src/ -t rust | head -50

Repository: RustPython/RustPython

Length of output: 4440


🏁 Script executed:

# Search for any documentation or comments about settrace/setprofile
rg 'settrace|setprofile' crates/vm/src/stdlib/sys.rs -B 5 -A 5

Repository: RustPython/RustPython

Length of output: 1168


🏁 Script executed:

# Look for tests related to trace or profile
fd -e rs | xargs grep -l 'test.*trace\|test.*profile' | head -10

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look for tests related to _settraceallthreads
rg '_settraceallthreads|_setprofileallthreads' crates/ tests/ --type rust

Repository: RustPython/RustPython

Length of output: 1059


🏁 Script executed:

# Check if there are any comments or documentation about the intent
rg 'settraceallthreads|setprofileallthreads' . --type rust -B 10 | head -100

Repository: RustPython/RustPython

Length of output: 4965


🏁 Script executed:

# Look for thread-related trace/profile tests
fd -e py -e rs | xargs grep -l 'settrace.*thread\|trace.*thread' 2>/dev/null | head -10

Repository: RustPython/RustPython

Length of output: 177


Documentation-implementation mismatch: _settraceallthreads doesn't update already-running threads as documented.

The docstring claims "Set the profiling function in all running threads," but the implementation only updates the current thread and the global state (for newly created threads). Existing threads spawned before this call won't receive the new trace/profile function. Either implement broadcasting to existing threads or clarify the documentation to reflect that only the current thread and future threads are affected.

🤖 Prompt for AI Agents
In @crates/vm/src/stdlib/sys.rs around lines 1110 - 1125, The docstring and
implementation disagree: _settraceallthreads and _setprofileallthreads only set
vm.state.global_* and the current vm.trace_func/vm.profile_func, but do not
propagate to threads already running; either update the documentation to say
only the current thread and future threads are affected, or implement
propagation by iterating the VM's active thread registry/handles and updating
each thread's trace/profile field (mirror what you do for the current thread),
then call update_use_tracing() once after updates; change the docstring if you
choose the documentation route or add the iteration logic (accessing the VM
thread list / per-thread trace/profile slots) if you choose to broadcast to
existing threads.

Comment on lines 1044 to 1167
#[pyfunction]
fn start_joinable_thread(
    args: StartJoinableThreadArgs,
    vm: &VirtualMachine,
) -> PyResult<PyRef<ThreadHandle>> {
    let handle = match args.handle {
        OptionalArg::Present(h) => h,
        OptionalArg::Missing => ThreadHandle::new().into_ref(&vm.ctx),
    };

    // Mark as starting
    handle.inner.lock().state = ThreadHandleState::Starting;

    // Add non-daemon threads to shutdown registry so _shutdown() will wait for them
    if !args.daemon {
        add_to_shutdown_handles(&handle.inner, &handle.done_event);
    }

    let func = args.function;
    let handle_clone = handle.clone();
    let inner_clone = handle.inner.clone();
    let done_event_clone = handle.done_event.clone();

    let mut thread_builder = thread::Builder::new();
    let stacksize = vm.state.stacksize.load();
    if stacksize != 0 {
        thread_builder = thread_builder.stack_size(stacksize);
    }

    let join_handle = thread_builder
        .spawn(vm.new_thread().make_spawn_func(move |vm| {
            // Set ident and mark as running
            {
                let mut inner = inner_clone.lock();
                inner.ident = get_ident();
                inner.state = ThreadHandleState::Running;
            }

            // Ensure cleanup happens even if the function panics
            let inner_for_cleanup = inner_clone.clone();
            let done_event_for_cleanup = done_event_clone.clone();
            let vm_state = vm.state.clone();
            scopeguard::defer! {
                // Mark as done
                inner_for_cleanup.lock().state = ThreadHandleState::Done;

                // Signal waiting threads that this thread is done
                {
                    let (lock, cvar) = &*done_event_for_cleanup;
                    *lock.lock() = true;
                    cvar.notify_all();
                }

                // Handle sentinels
                for lock in SENTINELS.take() {
                    if lock.mu.is_locked() {
                        unsafe { lock.mu.unlock() };
                    }
                }
                vm_state.thread_count.fetch_sub(1);
            }

            // Run the function
            match func.invoke((), vm) {
                Ok(_) => {}
                Err(e) if e.fast_isinstance(vm.ctx.exceptions.system_exit) => {}
                Err(exc) => {
                    vm.run_unraisable(
                        exc,
                        Some("Exception ignored in thread started by".to_owned()),
                        func.into(),
                    );
                }
            }
        }))
        .map_err(|err| vm.new_runtime_error(format!("can't start new thread: {err}")))?;

    vm.state.thread_count.fetch_add(1);

    // Store the join handle
    handle.inner.lock().join_handle = Some(join_handle);

    Ok(handle_clone)
}
Contributor

⚠️ Potential issue | 🔴 Critical

start_joinable_thread: fix (1) thread_count race, (2) spawn-error behavior with caller-supplied handle, (3) missing TLS cleanup.

  • Same thread_count underflow risk (child can fetch_sub before parent fetch_add).
  • If args.handle is provided and spawn fails, the caller keeps a handle in Starting with done=false.
  • Exit defer! should call cleanup_thread_local_data() too.
Concrete patch sketch
     fn start_joinable_thread(...) -> PyResult<PyRef<ThreadHandle>> {
@@
         // Mark as starting
         handle.inner.lock().state = ThreadHandleState::Starting;
@@
-        let join_handle = thread_builder
+        // Ensure thread_count can't underflow if the thread exits quickly.
+        vm.state.thread_count.fetch_add(1);
+
+        let join_handle = thread_builder
             .spawn(vm.new_thread().make_spawn_func(move |vm| {
@@
                 scopeguard::defer! {
                     // Mark as done
                     inner_for_cleanup.lock().state = ThreadHandleState::Done;
@@
                     // Handle sentinels
                     for lock in SENTINELS.take() {
                         if lock.mu.is_locked() {
                             unsafe { lock.mu.unlock() };
                         }
                     }
+                    // Clean up thread-local storage while VM context is still active
+                    cleanup_thread_local_data();
                     vm_state.thread_count.fetch_sub(1);
                 }
@@
             }))
-            .map_err(|err| vm.new_runtime_error(format!("can't start new thread: {err}")))?;
-
-        vm.state.thread_count.fetch_add(1);
+            .map_err(|err| {
+                // Roll back thread_count and unblock joins on caller-supplied handles.
+                vm.state.thread_count.fetch_sub(1);
+                handle.inner.lock().state = ThreadHandleState::Done;
+                let (lock, cvar) = &*handle.done_event;
+                *lock.lock() = true;
+                cvar.notify_all();
+                vm.new_runtime_error(format!("can't start new thread: {err}"))
+            })?;
🤖 Prompt for AI Agents
In @crates/vm/src/stdlib/thread.rs around lines 1044 - 1127, In
start_joinable_thread: avoid thread_count underflow by incrementing
vm.state.thread_count (fetch_add(1)) before spawning the OS thread and if spawn
fails immediately decrement it (fetch_sub(1)) so counts stay balanced; if spawn
fails and the caller supplied args.handle, set handle.inner.lock().state =
ThreadHandleState::Done, set the done_event (notify_all) and remove it from the
shutdown registry if you previously called add_to_shutdown_handles so callers
don't observe a permanently Starting handle; ensure on normal thread exit the
scopeguard defer also calls cleanup_thread_local_data() before decrementing
thread_count; relevant symbols: start_joinable_thread, add_to_shutdown_handles,
vm.state.thread_count, ThreadHandleState::Starting/Done,
handle.inner.lock().join_handle, done_event, cleanup_thread_local_data,
SENTINELS.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vm/src/stdlib/thread.rs (1)

242-256: RLock::_at_fork_reinit doesn’t actually reinitialize the mutex (and risks subtle UB expectations)
AtomicCell<&RawRMutex>::swap(&new_mut) only swaps a pointer in a local AtomicCell; it does not mutate self.mu, so the lock state isn’t reset post-fork.

Proposed fix (write new INIT into the field via raw pointer)
 fn _at_fork_reinit(&self, _vm: &VirtualMachine) -> PyResult<()> {
     if self.mu.is_locked() {
         unsafe {
             self.mu.unlock();
         };
     }
     self.count.store(0, core::sync::atomic::Ordering::Relaxed);
-    let new_mut = RawRMutex::INIT;
-
-    let old_mutex: AtomicCell<&RawRMutex> = AtomicCell::new(&self.mu);
-    old_mutex.swap(&new_mut);
+    // Re-init the underlying raw mutex in-place (post-fork “single thread” world).
+    // SAFETY: fork-reset path; we intentionally overwrite without running Drop.
+    unsafe {
+        let mu_ptr = &self.mu as *const RawRMutex as *mut RawRMutex;
+        core::ptr::write(mu_ptr, RawRMutex::INIT);
+    }

     Ok(())
 }
🤖 Fix all issues with AI agents
In @crates/vm/src/stdlib/thread.rs:
- Around line 395-409: The joinable-thread path omits calling
cleanup_thread_local_data(), so LocalGuard can drop after the VM context ends
and panic; modify the joinable-thread scopeguard::defer! (the same defer used
when spawning joinable threads in run_thread()) to call
cleanup_thread_local_data() before tearing down the VM context, i.e., add a call
to the existing fn cleanup_thread_local_data() inside that defer so thread-local
dicts are cleared prior to vm.state/thread_count changes; apply the same
addition to the other joinable-thread defer location around lines 1082-1104 to
ensure both places perform TLS cleanup.
- Around line 676-695: LocalGuard::drop currently removes the entry from
LocalData.data while holding the MutexGuard, which can deadlock if dropping the
removed PyDictRef runs Python finalizers that access thread-local data; change
Drop for LocalGuard to lock LocalData.data, take the removed dict out (e.g.,
with remove returning Option<_>), then drop the mutex (end the lock scope) and
only then let the removed dict be dropped outside the lock; apply the same
pattern to the other removal site that similarly removes while holding
LocalData.data.
- Around line 941-1024: ThreadHandle::join currently does a single cvar.wait_for
and can proceed after a spurious wake to call handle.join() and block past the
timeout; change join() to loop on the predicate (while !*done { wait_for or wait
returns; break only when predicate true or timed_out) so spurious wakeups
re-check and honor the timeout. Also update start_joinable_thread so that if
thread::spawn fails it sets inner.state off Starting (e.g. to Stopped/Failed),
sets *done = true and signals the done_event Condvar (cvar.notify_all()) and
clears/marks join_handle appropriately, so subsequent join()/shutdown() won't
hang waiting for a never-started thread; reference the done_event Condvar,
inner.state/Starting, start_joinable_thread, and ThreadHandle::join to locate
changes.
- Around line 457-499: _shutdown holds the done_event mutex and then locks
inner, which can deadlock against _set_done() that locks inner then done_event;
change the waiting loop so you never lock inner while holding the done_event
mutex (e.g., check *done under the done_event lock, drop that lock before
locking inner to inspect inner.lock().state, and only re-acquire the done_event
lock when waiting on the cvar), updating the loop in _shutdown (and the
analogous loop in the other shutdown location) to avoid locking done_event and
inner at the same time; reference symbols: _shutdown, _set_done,
SHUTDOWN_HANDLES, done_event, inner, ThreadHandleState.
🧹 Nitpick comments (1)
crates/vm/src/vm/interpreter.rs (1)

113-143: Finalize ordering: consider setting finalizing=true before waiting for thread shutdown
Right now, wait_for_thread_shutdown(vm) runs while vm.state.finalizing is still false. If any code gates “no new threads during shutdown” on that flag, you can get a moving target while waiting.

Also applies to: 146-173

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d8d36c6 and 9f9fda4.

⛔ Files ignored due to path filters (8)
  • Lib/_dummy_thread.py is excluded by !Lib/**
  • Lib/dummy_threading.py is excluded by !Lib/**
  • Lib/test/test_concurrent_futures/test_process_pool.py is excluded by !Lib/**
  • Lib/test/test_concurrent_futures/test_shutdown.py is excluded by !Lib/**
  • Lib/test/test_concurrent_futures/test_thread_pool.py is excluded by !Lib/**
  • Lib/test/test_importlib/test_locks.py is excluded by !Lib/**
  • Lib/test/test_sys.py is excluded by !Lib/**
  • Lib/test/test_threading.py is excluded by !Lib/**
📒 Files selected for processing (7)
  • .github/workflows/ci.yaml
  • crates/vm/src/stdlib/posix.rs
  • crates/vm/src/stdlib/sys.rs
  • crates/vm/src/stdlib/thread.rs
  • crates/vm/src/vm/interpreter.rs
  • crates/vm/src/vm/mod.rs
  • crates/vm/src/vm/thread.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/ci.yaml
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/vm/mod.rs
  • crates/vm/src/vm/thread.rs
  • crates/vm/src/stdlib/posix.rs
  • crates/vm/src/stdlib/sys.rs
  • crates/vm/src/vm/interpreter.rs
  • crates/vm/src/stdlib/thread.rs
🧠 Learnings (4)
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.

Applied to files:

  • crates/vm/src/stdlib/posix.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration rather than providing fallback values for other platforms.

Applied to files:

  • crates/vm/src/stdlib/posix.rs
📚 Learning: 2025-12-27T14:03:49.034Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.034Z
Learning: Applies to Lib/** : In the `Lib/` directory, when modifications are necessary, add a `# TODO: RUSTPYTHON` comment or use `unittest.skip('TODO: RustPython <reason>')` or `unittest.expectedFailure` with `# TODO: RUSTPYTHON <reason>` comment

Applied to files:

  • crates/vm/src/stdlib/posix.rs
📚 Learning: 2025-12-27T14:03:49.034Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.034Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust

Applied to files:

  • crates/vm/src/stdlib/sys.rs
🧬 Code graph analysis (3)
crates/vm/src/vm/mod.rs (1)
crates/vm/src/stdlib/thread.rs (2)
  • thread (574-576)
  • init_main_thread_ident (539-542)
crates/vm/src/vm/thread.rs (1)
crates/vm/src/vm/mod.rs (1)
  • new (125-230)
crates/vm/src/stdlib/posix.rs (1)
crates/vm/src/stdlib/thread.rs (2)
  • thread (574-576)
  • after_fork_child (820-858)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run rust tests (windows-2025)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (windows-2025)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Check Rust code with clippy
🔇 Additional comments (7)
crates/vm/src/stdlib/thread.rs (2)

3-3: Exported after_fork_child / init_main_thread_ident: ensure multi-interpreter semantics are intentional
Both are process-global (static MAIN_THREAD_IDENT), but RustPython can construct multiple VirtualMachines, so the main-thread ident ends up being whichever interpreter initialized it last (last writer wins). If that's expected (single-interpreter model), OK; otherwise you may need interpreter-scoped storage.

Also applies to: 535-543


285-339: Thread ident source changed: validate stability/correctness across targets
On unix you now cast pthread_self() / JoinHandleExt::as_pthread_t() to u64. Please confirm this matches the CPython 3.13 “ident” expectations on all supported unix targets (notably those where pthread_t is not an integer type).

Also applies to: 347-376

crates/vm/src/vm/thread.rs (1)

142-157: Per-thread VM inherits global trace/profile: looks consistent
Capturing global_trace_func/global_profile_func at thread creation and deriving use_tracing from presence matches the intended “inherit at creation” behavior.

crates/vm/src/stdlib/posix.rs (1)

663-668: Calling after_fork_child() before Python callbacks is the right sequencing
Doing the thread-state fixup prior to user register_at_fork(..., after_in_child=...) callbacks matches the intent of preventing post-fork join hangs.

crates/vm/src/stdlib/sys.rs (1)

1110-1125: Verify “all threads” semantics: current code sets global defaults + current thread only
As written, these APIs update PyGlobalState and the calling thread’s VM. If CPython-compat requires updating already-running threads too, you’ll need an iteration mechanism over active VMs/threads (or rename/adjust expectations).

crates/vm/src/vm/mod.rs (2)

110-114: Global trace/profile hooks in PyGlobalState: looks OK, but confirm lifetime expectations
Storing PyObjectRef in global state is fine for “inherit at thread creation”; just ensure shutdown/finalize doesn’t require explicitly clearing these to avoid keeping user callables alive longer than expected.

Also applies to: 198-200


308-311: init_main_thread_ident() is process-global; verify behavior with multiple VMs / non-main-thread interpreters
If an interpreter is created on a non-main OS thread, MAIN_THREAD_IDENT will point there. If RustPython’s model is “single main interpreter”, OK; otherwise you may need per-interpreter tracking.

@github-actions
Contributor

Code has been automatically formatted

The code in this PR has been formatted using cargo fmt --all.
Please pull the latest changes before pushing again:

git pull origin threading

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vm/src/stdlib/thread.rs (1)

242-256: Potential bug: _at_fork_reinit does not reinitialize the mutex.

The implementation creates an AtomicCell wrapping a reference to the mutex and swaps references, not the actual mutex data. Compare with Lock::_at_fork_reinit (lines 153-157) which uses transmute to swap the actual mutex value.

After this code executes, self.mu remains unchanged because only the references are swapped, not the underlying mutex data.

🐛 Suggested fix to match Lock::_at_fork_reinit pattern
     #[pymethod]
     fn _at_fork_reinit(&self, _vm: &VirtualMachine) -> PyResult<()> {
         if self.mu.is_locked() {
             unsafe {
                 self.mu.unlock();
             };
         }
         self.count.store(0, core::sync::atomic::Ordering::Relaxed);
         let new_mut = RawRMutex::INIT;

-        let old_mutex: AtomicCell<&RawRMutex> = AtomicCell::new(&self.mu);
-        old_mutex.swap(&new_mut);
+        unsafe {
+            let old_mutex: &AtomicCell<RawRMutex> = core::mem::transmute(&self.mu);
+            old_mutex.swap(new_mut);
+        }

         Ok(())
     }
🤖 Fix all issues with AI agents
In @crates/vm/src/stdlib/thread.rs:
- Around line 490-494: The loop currently holds done_event's lock and then calls
inner.lock(), causing inconsistent lock ordering with start_joinable_thread and
risking deadlock; remove the inner.lock() check and rely solely on the done flag
(it is set to true in start_joinable_thread cleanup when the thread finishes).
Change the loop in the waiting code to only wait on the condition variable using
the done_event lock (e.g., while !*done { cvar.wait(&mut done); }) and remove
any inner.lock() access there so you do not acquire both done_event and inner
locks; keep references to done_event, inner, ThreadHandleState::Done, and
start_joinable_thread in the change to ensure the intent and correctness are
clear.
🧹 Nitpick comments (2)
crates/vm/src/stdlib/thread.rs (2)

991-1001: Consider using a condvar instead of spin-wait for concurrent join.

When multiple threads try to join the same handle, this code spin-waits with yield_now(). While the yield prevents a tight spin, a condvar would be more efficient for this synchronization.
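
A minimal sketch of the condvar alternative, assuming parking_lot and a shared Arc<(Mutex<bool>, Condvar)> flag like done_event (a sketch, not the PR's code):

use std::sync::Arc;
use parking_lot::{Condvar, Mutex};

fn wait_until_done(done_event: &Arc<(Mutex<bool>, Condvar)>) {
    let (lock, cvar) = &**done_event;
    let mut done = lock.lock();
    // The loop guards against spurious wakeups: return only once the flag
    // has actually been set by the exiting (or joining) thread.
    while !*done {
        cvar.wait(&mut done);
    }
}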


902-920: Consider lazy registration of thread handles.

Every ThreadHandle is registered in THREAD_HANDLES at creation time (lines 914-917), even if never started. While weak references are cheap and cleaned up during iteration, registering only when the thread actually starts would reduce registry bloat.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9f9fda4 and 157343f.

⛔ Files ignored due to path filters (10)
  • Lib/_dummy_thread.py is excluded by !Lib/**
  • Lib/dummy_threading.py is excluded by !Lib/**
  • Lib/test/lock_tests.py is excluded by !Lib/**
  • Lib/test/test_concurrent_futures/test_process_pool.py is excluded by !Lib/**
  • Lib/test/test_concurrent_futures/test_shutdown.py is excluded by !Lib/**
  • Lib/test/test_concurrent_futures/test_thread_pool.py is excluded by !Lib/**
  • Lib/test/test_importlib/test_locks.py is excluded by !Lib/**
  • Lib/test/test_sys.py is excluded by !Lib/**
  • Lib/test/test_threading.py is excluded by !Lib/**
  • Lib/threading.py is excluded by !Lib/**
📒 Files selected for processing (8)
  • .github/workflows/ci.yaml
  • crates/vm/src/stdlib/posix.rs
  • crates/vm/src/stdlib/sys.rs
  • crates/vm/src/stdlib/thread.rs
  • crates/vm/src/vm/interpreter.rs
  • crates/vm/src/vm/mod.rs
  • crates/vm/src/vm/thread.rs
  • scripts/fix_test.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • crates/vm/src/stdlib/sys.rs
  • crates/vm/src/stdlib/posix.rs
  • crates/vm/src/vm/thread.rs
🧰 Additional context used
📓 Path-based instructions (3)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/vm/interpreter.rs
  • crates/vm/src/vm/mod.rs
  • crates/vm/src/stdlib/thread.rs
**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.py: In most cases, Python code should not be edited; bug fixes should be made through Rust code modifications only
Follow PEP 8 style for custom Python code
Use ruff for linting Python code

Files:

  • scripts/fix_test.py
**/*test*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*test*.py: NEVER comment out or delete any test code lines except for removing @unittest.expectedFailure decorators and upper TODO comments
NEVER modify test assertions, test logic, or test data in test files
When a test cannot pass due to missing language features, keep it as expectedFailure and document the reason; do not comment it out
The only acceptable modifications to test files are: (1) Removing @unittest.expectedFailure decorators and the upper TODO comments when tests actually pass, and (2) Adding @unittest.expectedFailure decorators when tests cannot be fixed

Files:

  • scripts/fix_test.py
🧠 Learnings (3)
📚 Learning: 2025-08-30T14:40:05.858Z
Learnt from: ShaharNaveh
Repo: RustPython/RustPython PR: 6089
File: scripts/lib_updater.py:292-297
Timestamp: 2025-08-30T14:40:05.858Z
Learning: In scripts/lib_updater.py, the --inplace flag intentionally writes to orig_file (not remote_file) even though patches are applied to remote_file content. This workflow allows updating the original RustPython test file with patches applied to new upstream CPython content.

Applied to files:

  • scripts/fix_test.py
📚 Learning: 2025-09-07T05:38:31.690Z
Learnt from: ShaharNaveh
Repo: RustPython/RustPython PR: 6089
File: scripts/lib_updater.py:198-202
Timestamp: 2025-09-07T05:38:31.690Z
Learning: In scripts/lib_updater.py, the iter_patches function intentionally does not handle SyntaxError from ast.parse(contents). The author confirmed this behavior is fine and intended - the tool should fail fast on unparseable files rather than silently skip processing.

Applied to files:

  • scripts/fix_test.py
📚 Learning: 2025-07-11T09:35:43.571Z
Learnt from: ShaharNaveh
Repo: RustPython/RustPython PR: 5947
File: .github/workflows/ci.yaml:116-116
Timestamp: 2025-07-11T09:35:43.571Z
Learning: GitHub Actions timeout-minutes field supports expressions that evaluate to integers, such as `${{ contains(matrix.os, 'windows') && 40 || 30 }}`, which will conditionally set different timeout values based on the runner OS.

Applied to files:

  • .github/workflows/ci.yaml
🧬 Code graph analysis (2)
crates/vm/src/vm/mod.rs (1)
crates/vm/src/stdlib/thread.rs (2)
  • thread (574-576)
  • init_main_thread_ident (539-542)
scripts/fix_test.py (1)
scripts/lib_updater.py (3)
  • PatchSpec (81-122)
  • UtMethod (59-78)
  • apply_patches (273-283)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (windows-2025)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (windows-2025)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Check the WASM package and demo
🔇 Additional comments (10)
scripts/fix_test.py (1)

30-30: LGTM!

Import reordering to alphabetical order is a minor style improvement. All imports (PatchSpec, UtMethod, apply_patches) are correctly used in the file.

crates/vm/src/stdlib/thread.rs (3)

716-752: LGTM!

The Local thread-local storage implementation is well-designed:

  • Fast path avoids allocation when dict exists
  • Double-check pattern correctly handles races during insertion
  • Lock is released before TLS access to reduce contention
  • Guard-based cleanup ensures proper __del__ invocation on thread exit

818-858: LGTM!

The after_fork_child implementation correctly handles the fork scenario:

  • Updates main thread identity to the surviving thread
  • Marks other threads as Done (they don't exist in the child process)
  • Clears join handles (can't join OS threads from child)
  • Notifies condvar waiters to unblock any pending join() calls

291-302: LGTM!

Using pthread_self() on Unix for thread identity is the correct approach for fork compatibility, matching CPython's behavior. The fallback to Rust's thread ID on non-Unix platforms is appropriate.

crates/vm/src/vm/interpreter.rs (2)

146-173: LGTM!

The wait_for_thread_shutdown implementation correctly:

  • Checks if threading module was imported (via sys.modules.get("threading"))
  • Silently skips if threading wasn't used
  • Reports shutdown errors via run_unraisable to avoid crashing during finalization
  • Matches CPython's wait_for_thread_shutdown behavior

113-143: LGTM!

The finalization sequence correctly adds thread shutdown waiting before atexit handlers, matching CPython's Py_Finalize behavior. Documentation is updated accurately.

crates/vm/src/vm/mod.rs (2)

110-114: LGTM!

Adding global_trace_func and global_profile_func to PyGlobalState enables cross-thread tracing/profiling via sys._settraceallthreads/sys._setprofileallthreads. Using PyMutex<Option<PyObjectRef>> provides appropriate thread-safe access.


308-311: LGTM!

Initializing main thread identity early in initialize() (before builtins/sys module setup) ensures the main thread ident is available before any threading operations. The #[cfg(feature = "threading")] guard correctly conditionalizes this.

.github/workflows/ci.yaml (2)

299-299: Monitor whether the timeout increase is sufficient.

The timeout increase from 35 to 45 minutes (28% increase) brings Linux and macOS test durations in line with Windows, which was already at 45 minutes. This is likely necessary because test_threading is no longer skipped as an environment polluter and now runs inline with other tests.

Please monitor CI run times to ensure:

  1. Tests complete comfortably within the 45-minute limit
  2. The timeout isn't excessively generous (which would delay failure detection)

Based on learnings, GitHub Actions supports conditional timeout expressions. If future CI runs show platform-specific timing differences, consider using conditional timeouts like ${{ contains(matrix.os, 'windows') && 45 || 35 }} to optimize CI resource usage.

Also applies to: 306-306, 313-313


108-109: Verify that test_threading no longer pollutes the environment.

Removing test_threading from ENV_POLLUTING_TESTS_COMMON is a positive outcome of the threading fixes in this PR. Please confirm that test_threading consistently passes the environment pollution checks (lines 322-404) across all platforms.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vm/src/stdlib/thread.rs (1)

243-256: Fix incorrect mutex reinitialization in RLock's _at_fork_reinit.

Lines 252-253 create an AtomicCell<&RawRMutex> with a reference to the old mutex and swap it with a reference to a stack-allocated new_mut. This doesn't actually reinitialize the mutex field; it's swapping references in a temporary AtomicCell that is immediately dropped. Compare with Lock::_at_fork_reinit (lines 153-157), which correctly transmutes to &AtomicCell<RawMutex> and swaps the value, not references.

🐛 Proposed fix
         self.count.store(0, core::sync::atomic::Ordering::Relaxed);
         let new_mut = RawRMutex::INIT;
-
-        let old_mutex: AtomicCell<&RawRMutex> = AtomicCell::new(&self.mu);
-        old_mutex.swap(&new_mut);
+        unsafe {
+            let old_mutex: &AtomicCell<RawRMutex> = core::mem::transmute(&self.mu);
+            old_mutex.swap(new_mut);
+        }
 
         Ok(())
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 157343f and f8cfed7.

📒 Files selected for processing (3)
  • crates/vm/src/stdlib/posix.rs
  • crates/vm/src/stdlib/thread.rs
  • scripts/fix_test.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • crates/vm/src/stdlib/posix.rs
  • scripts/fix_test.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/stdlib/thread.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Run rust tests (windows-2025)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (windows-2025)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run tests under miri
🔇 Additional comments (1)
crates/vm/src/stdlib/thread.rs (1)

134-139: The implementation correctly follows CPython's _thread.lock semantics.

CPython's basic lock intentionally allows any thread to call release(), regardless of which thread acquired it. This differs from RLock (which does enforce ownership checking via the _is_owned() method). The code is correct as written.

Likely an incorrect or invalid review comment.

@youknowone youknowone marked this pull request as draft January 11, 2026 13:07
@youknowone youknowone force-pushed the threading branch 3 times, most recently from 04490a7 to a8747e6 Compare January 11, 2026 23:11
@youknowone youknowone marked this pull request as ready for review January 12, 2026 00:30
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
crates/vm/src/stdlib/thread.rs (3)

1000-1011: Potential issue: the timeout comment does not match the spurious-wakeup handling.

The loop itself handles spurious wakeups correctly: because the guard is result.timed_out() && !*done, a timeout that races with *done becoming true falls through to cleanup rather than returning early, which is the right behavior. However, the comment on lines 1003-1005 reads as if the function returns on timeout even when the thread has already finished, which is misleading.

♻️ Suggested clarification
                 if let Some(timeout) = timeout_duration {
                     let result = cvar.wait_for(&mut done, timeout);
                     if result.timed_out() && !*done {
-                        // Timeout occurred and done is still false
+                        // Timeout occurred and thread is still running - return without cleanup
                         return Ok(());
                     }
+                    // If done became true during wait, continue to cleanup
                 } else {
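
A self-contained sketch of a deadline-based, predicate-checked wait with parking_lot, in the spirit of the clarification above (again a sketch, not the PR's code):

use std::time::{Duration, Instant};
use parking_lot::{Condvar, Mutex};

// Returns true once the flag is set, false if the deadline passes first.
fn wait_done(lock: &Mutex<bool>, cvar: &Condvar, timeout: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    let mut done = lock.lock();
    while !*done {
        if cvar.wait_until(&mut done, deadline).timed_out() {
            // Re-check after the deadline: a racing notify may have set the flag.
            return *done;
        }
    }
    true
}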

1024-1034: Busy-wait loop when another thread is joining.

When inner.joining is true, this code enters a spin loop with yield_now(). While this prevents deadlock, it's inefficient. Consider using a condition variable or event to wait for the join to complete.

♻️ Consider using a condvar instead of spin-wait
                 // If another thread is already joining, wait for them to finish
                 if inner.joining {
                     drop(inner);
-                    loop {
-                        let inner = self.inner.lock();
-                        if inner.joined {
-                            return Ok(());
-                        }
-                        drop(inner);
-                        std::thread::yield_now();
-                    }
+                    // Wait on the done_event - the joining thread will signal it
+                    let (lock, cvar) = &*self.done_event;
+                    let mut done = lock.lock();
+                    while !*done {
+                        cvar.wait(&mut done);
+                    }
+                    return Ok(());
                 }

467-507: _shutdown loop structure is sound but could be simplified.

The shutdown implementation correctly handles waiting for non-daemon threads. The double-lock pattern (first to find a handle, then to wait) avoids holding locks during the wait. However, the _inner variable on line 495 is unused.

♻️ Remove unused variable
             match handle_to_join {
-                Some((_inner, done_event)) => {
+                Some((_, done_event)) => {
                     // Wait for this thread to finish (infinite timeout)
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a8747e6 and dcd4196.

⛔ Files ignored due to path filters (10)
  • Lib/_dummy_thread.py is excluded by !Lib/**
  • Lib/dummy_threading.py is excluded by !Lib/**
  • Lib/test/lock_tests.py is excluded by !Lib/**
  • Lib/test/test_concurrent_futures/test_process_pool.py is excluded by !Lib/**
  • Lib/test/test_concurrent_futures/test_shutdown.py is excluded by !Lib/**
  • Lib/test/test_concurrent_futures/test_thread_pool.py is excluded by !Lib/**
  • Lib/test/test_importlib/test_locks.py is excluded by !Lib/**
  • Lib/test/test_sys.py is excluded by !Lib/**
  • Lib/test/test_threading.py is excluded by !Lib/**
  • Lib/threading.py is excluded by !Lib/**
📒 Files selected for processing (8)
  • .github/workflows/ci.yaml
  • crates/vm/src/builtins/asyncgenerator.rs
  • crates/vm/src/stdlib/posix.rs
  • crates/vm/src/stdlib/sys.rs
  • crates/vm/src/stdlib/thread.rs
  • crates/vm/src/vm/interpreter.rs
  • crates/vm/src/vm/mod.rs
  • crates/vm/src/vm/thread.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • crates/vm/src/vm/interpreter.rs
  • .github/workflows/ci.yaml
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/builtins/asyncgenerator.rs
  • crates/vm/src/stdlib/posix.rs
  • crates/vm/src/stdlib/sys.rs
  • crates/vm/src/vm/thread.rs
  • crates/vm/src/stdlib/thread.rs
  • crates/vm/src/vm/mod.rs
🧬 Code graph analysis (4)
crates/vm/src/stdlib/posix.rs (1)
crates/vm/src/stdlib/thread.rs (3)
  • thread (581-583)
  • after_fork_child (839-890)
  • new (934-952)
crates/vm/src/stdlib/sys.rs (2)
crates/vm/src/stdlib/thread.rs (2)
  • thread (581-583)
  • get_all_current_frames (829-835)
crates/vm/src/coroutine.rs (1)
  • frame (205-207)
crates/vm/src/stdlib/thread.rs (1)
crates/vm/src/vm/thread.rs (2)
  • cleanup_current_thread_frames (79-85)
  • reinit_frame_slot_after_fork (90-105)
crates/vm/src/vm/mod.rs (1)
crates/vm/src/vm/thread.rs (1)
  • update_current_frame (69-75)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run rust tests (windows-2025)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run snippets and cpython tests (windows-2025)
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Check Rust code with clippy
🔇 Additional comments (20)
crates/vm/src/stdlib/thread.rs (4)

290-307: LGTM - Thread ID implementation uses pthread_self for fork compatibility.

The use of pthread_self() on Unix for get_ident() is the correct approach for fork compatibility, matching CPython's behavior. The non-Unix fallback using ThreadId hashing is reasonable.


696-703: Potential deadlock avoided, but consider documenting the lock ordering.

The LocalGuard::drop implementation correctly drops the removed value outside the lock to prevent deadlock if __del__ accesses the _local. However, consider adding a comment about lock ordering expectations for maintainability.


852-889: Fork cleanup uses try_lock appropriately for safety.

The after_fork_child implementation correctly uses try_lock() throughout to avoid deadlocks if locks were held during fork. The cleanup logic properly marks other threads as done and notifies waiters. This aligns well with CPython's _PyThread_AfterFork behavior.


1106-1164: Remove this comment - the code does not have the described race condition.

The join_handle is stored synchronously at line 1164 immediately after spawn() returns. The join() method uses proper synchronization: it waits on done_event (via Condvar) before accessing join_handle at line 1015. Since done_event is only signaled when the spawned thread completes (in the defer block), join_handle is guaranteed to be Some by the time it's accessed. There is no race condition here.
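
A self-contained sketch of the done_event/Condvar ordering that makes the access safe; names are illustrative only:

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

struct DoneEvent {
    done: Mutex<bool>,
    cond: Condvar,
}

impl DoneEvent {
    fn signal(&self) {
        *self.done.lock().unwrap() = true;
        self.cond.notify_all();
    }
    fn wait(&self) {
        let mut done = self.done.lock().unwrap();
        while !*done {
            done = self.cond.wait(done).unwrap();
        }
    }
}

fn main() {
    let event = Arc::new(DoneEvent { done: Mutex::new(false), cond: Condvar::new() });
    let worker_event = Arc::clone(&event);
    let handle = thread::spawn(move || {
        // ... thread body runs here ...
        worker_event.signal(); // signalled only after the body has finished
    });
    // join-style callers wait on the event first, so by the time the
    // JoinHandle is touched the spawned body is guaranteed to be done.
    event.wait();
    handle.join().unwrap();
}
```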

crates/vm/src/builtins/asyncgenerator.rs (1)

60-71: LGTM - Correctly reads async gen hooks from VM-scoped storage.

The change from thread-local storage to VM-managed fields (vm.async_gen_finalizer and vm.async_gen_firstiter) is correct for the threading upgrade. This ensures hooks are properly shared across the VM while maintaining per-VM isolation.
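
A minimal sketch of the VM-scoped read, with a simplified `Vm` struct and `String` standing in for `PyObjectRef`:

```rust
use std::cell::RefCell;

struct Vm {
    async_gen_firstiter: RefCell<Option<String>>,
    async_gen_finalizer: RefCell<Option<String>>,
}

fn firstiter_hook(vm: &Vm) -> Option<String> {
    // Clone the hook out so the RefCell borrow ends before any user code
    // (which might install a new hook) can run.
    vm.async_gen_firstiter.borrow().clone()
}

fn finalizer_hook(vm: &Vm) -> Option<String> {
    vm.async_gen_finalizer.borrow().clone()
}
```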

crates/vm/src/stdlib/posix.rs (2)

663-671: LGTM - after_fork_child called before Python callbacks.

The placement of after_fork_child(vm) before running Python callbacks is correct. This ensures thread handles are properly cleaned up before any Python code runs in the child process.


678-730: Good defensive implementation of multi-threaded fork warning.

The implementation correctly:

  1. Avoids vm.import() by checking sys.modules directly to prevent executing arbitrary code in the fork path
  2. Uses length_opt with proper error handling
  3. Emits a DeprecationWarning with appropriate stacklevel

One minor observation: the count includes both the _active and _limbo dicts, which matches CPython behavior.
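
A schematic outline of that check, with plain maps standing in for sys.modules and threading's _active/_limbo dicts; this is not the PR's code:

```rust
use std::collections::HashMap;

fn should_warn_on_fork(
    sys_modules: &HashMap<String, ()>,
    active: &HashMap<u64, ()>,
    limbo: &HashMap<u64, ()>,
) -> bool {
    // Only consult threading if it is already imported; importing it here
    // would run arbitrary Python code inside the fork path.
    if !sys_modules.contains_key("threading") {
        return false;
    }
    // Count both started (_active) and starting (_limbo) threads, as CPython does.
    active.len() + limbo.len() > 1
}
```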

crates/vm/src/vm/thread.rs (3)

66-75: Hot path update_current_frame looks correct but verify lock contention.

The comment states this is a hot path using only thread-local storage, but it does acquire a mutex lock (s.lock()) on the shared CurrentFrameSlot. While this is necessary for cross-thread visibility (sys._current_frames), it could cause contention under heavy threading.

Consider if this could be optimized with a lock-free approach if profiling shows contention, though the current implementation is correct.
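
A rough sketch of the slot layout this implies, with illustrative names and `String` standing in for a frame reference:

```rust
use std::cell::RefCell;
use std::sync::{Arc, Mutex};

type Frame = String;
type CurrentFrameSlot = Arc<Mutex<Option<Frame>>>;

thread_local! {
    // Each thread caches an Arc to its own slot, so the hot path touches only
    // this per-thread mutex, never the global registry.
    static FRAME_SLOT: RefCell<Option<CurrentFrameSlot>> = RefCell::new(None);
}

fn update_current_frame(frame: Option<Frame>) {
    FRAME_SLOT.with(|slot| {
        if let Some(s) = slot.borrow().as_ref() {
            // The only possible contention is with sys._current_frames()
            // readers; a lock-free swap (e.g. the arc-swap crate) could
            // replace this mutex if profiling ever shows it matters.
            *s.lock().unwrap() = frame;
        }
    });
}
```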


87-105: Fork reinitialization correctly uses try_lock for safety.

The implementation properly handles the case where the parent thread may have been holding the lock during fork by using try_lock(). The thread-local is always updated regardless of whether the global registry could be updated.


214-241: New thread correctly inherits global trace/profile functions.

The new_thread() method properly reads global trace/profile functions and initializes the new thread's use_tracing flag accordingly. The initialization of async_gen_firstiter and async_gen_finalizer to None is correct - each thread starts with no hooks and can set them independently.
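
A condensed sketch of the inheritance step, with simplified stand-in types for the globals and the per-thread state:

```rust
use std::sync::Mutex;

struct Globals {
    global_trace_func: Mutex<Option<String>>,   // PyObjectRef stand-in
    global_profile_func: Mutex<Option<String>>,
}

struct ThreadState {
    trace_func: Option<String>,
    profile_func: Option<String>,
    use_tracing: bool,
    async_gen_firstiter: Option<String>,
    async_gen_finalizer: Option<String>,
}

fn new_thread_state(globals: &Globals) -> ThreadState {
    let trace_func = globals.global_trace_func.lock().unwrap().clone();
    let profile_func = globals.global_profile_func.lock().unwrap().clone();
    ThreadState {
        // use_tracing mirrors whether either inherited hook is present.
        use_tracing: trace_func.is_some() || profile_func.is_some(),
        trace_func,
        profile_func,
        // Async generator hooks are not inherited; each thread starts clean.
        async_gen_firstiter: None,
        async_gen_finalizer: None,
    }
}
```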

crates/vm/src/stdlib/sys.rs (3)

818-842: LGTM - _current_frames implementation is correct.

The threading-enabled version properly delegates to get_all_current_frames and builds a dict mapping thread IDs to frames. The non-threading fallback returns an empty dict which is appropriate.
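
A schematic version of the id-to-frame snapshot, assuming a registry keyed by thread ident; the types are stand-ins:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

type Frame = String;

fn all_current_frames(
    registry: &Mutex<HashMap<u64, Arc<Mutex<Option<Frame>>>>>,
) -> HashMap<u64, Frame> {
    let mut out = HashMap::new();
    for (ident, slot) in registry.lock().unwrap().iter() {
        // Threads that are between frames simply get no entry, like CPython.
        if let Some(frame) = slot.lock().unwrap().clone() {
            out.insert(*ident, frame);
        }
    }
    out
}
```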


1136-1150: _settraceallthreads and _setprofileallthreads correctly set global state.

Both functions properly:

  1. Store None in global state when passed None (using the (!vm.is_none(&func)).then(...) pattern)
  2. Update the current thread's local function
  3. Call update_use_tracing to refresh the tracing flag

Note: These only affect the current thread and newly spawned threads. Threads created afterwards pick up the global function in new_thread(), but threads that are already running are not affected until they check the global state. This matches CPython 3.13 behavior.
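
A condensed sketch of the three steps above, with stand-in types rather than the PR's actual signatures:

```rust
use std::sync::Mutex;

struct Globals {
    global_trace_func: Mutex<Option<String>>, // PyObjectRef stand-in
}

struct ThreadLocalState {
    trace_func: Option<String>,
    use_tracing: bool,
}

fn set_trace_all_threads(globals: &Globals, local: &mut ThreadLocalState, func: Option<String>) {
    // 1. Store the callable (or None) in the VM-wide global slot.
    *globals.global_trace_func.lock().unwrap() = func.clone();
    // 2. Update the calling thread's own trace function immediately.
    local.trace_func = func;
    // 3. Refresh the cached tracing flag so the dispatcher notices the change.
    local.use_tracing = local.trace_func.is_some();
}
```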


1231-1236: Async gen hooks correctly use VM-scoped storage.

The set_asyncgen_hooks and get_asyncgen_hooks functions now correctly read from and write to vm.async_gen_finalizer and vm.async_gen_firstiter, aligning with the VM-scoped storage pattern introduced in this PR.

Also applies to: 1255-1259

crates/vm/src/vm/mod.rs (7)

85-88: LGTM!

The async generator hook fields are appropriately typed with RefCell<Option<PyObjectRef>> for per-thread mutability within the VM. Moving these from thread-local storage to VM-scoped fields is a sensible design choice that aligns with the threading upgrade objectives.


114-129: LGTM!

The global state additions are well-designed:

  • PyMutex<Option<PyObjectRef>> for trace/profile functions enables safe cross-thread access
  • AtomicCell<u64> for main_thread_ident provides lock-free atomic reads/writes
  • parking_lot::Mutex for the registry collections is appropriate for the complexity of these data structures
  • Proper #[cfg(feature = "threading")] guards ensure clean builds without threading support

214-223: LGTM!

Initialization is consistent with the field declarations. The main_thread_ident starting at 0 is appropriate since it gets properly initialized in VirtualMachine::initialize() before any threading operations occur.


227-228: LGTM!

Correct initialization of async generator hooks to None.


334-336: LGTM!

Correctly placed initialization of the main thread identifier. Setting this before builtins::init_module and sys::init_module ensures the thread identity is established before any modules might attempt threading operations.


643-645: LGTM!

The frame tracking update is correctly placed after pushing to self.frames. The clone is necessary since update_current_frame takes ownership, and the current frame in the thread-local slot should reflect the newly pushed frame for sys._current_frames().


655-657: LGTM!

Correctly updates the thread-local frame slot after popping. Using self.frames.borrow().last().cloned() elegantly handles both the empty case (returns None) and the non-empty case (returns the new top frame). The RefCell borrow is safe since the borrow_mut() on line 654 ends before this new borrow().
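
A compressed sketch of the push/pop bookkeeping, with `String` standing in for a frame and a stubbed publish helper in place of the per-thread slot update shown earlier:

```rust
use std::cell::RefCell;

type Frame = String;

struct Vm {
    frames: RefCell<Vec<Frame>>,
}

// Stand-in for publishing the frame to the per-thread shared slot.
fn update_current_frame(_frame: Option<Frame>) {}

impl Vm {
    fn push_frame(&self, frame: Frame) {
        self.frames.borrow_mut().push(frame.clone());
        // After pushing, publish the new top of stack; the clone is needed
        // because the publish helper takes ownership.
        update_current_frame(Some(frame));
    }

    fn pop_frame(&self) -> Option<Frame> {
        let popped = self.frames.borrow_mut().pop();
        // The borrow_mut() above has ended, so a fresh borrow() is safe;
        // last().cloned() covers both the empty and non-empty cases.
        update_current_frame(self.frames.borrow().last().cloned());
        popped
    }
}
```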
