Fix pipeline fetcher deadlock #3704

Merged
r4victor merged 2 commits into master from pr_pipeline_fetcher_deadlock on Mar 27, 2026

@r4victor
Collaborator

This PR fixes a deadlock between child and parent pipelines. For example, JobRunningPipeline skipped jobs whenever RunModel.lock_owner was set, in order to prioritize RunPipeline. If such a job was stale, this caused a deadlock, because RunPipeline cannot recover stale job locks. The fix is to skip locking only for new jobs and to still process stale jobs irrespective of the parent's lock_owner.
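
The fetch rule described above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: the `Run` and `Job` dataclasses, the `lock_expires_at` field, and the `should_fetch` helper are hypothetical names chosen for illustration, and "stale" is assumed here to mean that a job still carries a lock whose expiry time has passed. Only `lock_owner` comes from the description.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Run:
    # Set while a parent pipeline (e.g. RunPipeline) holds the run.
    lock_owner: Optional[str] = None


@dataclass
class Job:
    run: Run
    # Hypothetical field: None means the job has never been locked ("new").
    lock_expires_at: Optional[datetime] = None


def should_fetch(job: Job, now: datetime) -> bool:
    """Decide whether a child pipeline may lock and process this job."""
    if job.lock_expires_at is not None:
        if job.lock_expires_at > now:
            # Another worker holds a live lock on the job.
            return False
        # Stale lock: the parent cannot recover it, so the child pipeline
        # must pick the job up regardless of the parent's lock_owner.
        return True
    # New job: yield to the parent pipeline while it holds the run.
    return job.run.lock_owner is None


now = datetime(2026, 3, 27, 12, 0, 0)
busy_parent = Run(lock_owner="RunPipeline")

new_job = Job(run=busy_parent)
stale_job = Job(run=busy_parent, lock_expires_at=now - timedelta(minutes=1))

print(should_fetch(new_job, now))    # new job is skipped while the parent is locked
print(should_fetch(stale_job, now))  # stale job is still processed
```

Under the old behavior as described, the stale job in this sketch would also have been skipped because its parent's `lock_owner` was set, and nothing would ever have released its lock.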

It also prioritizes JobSubmittedPipeline over RunPipeline to speed up provisioning (especially for large multi-node tasks).

@r4victor merged commit 30e90d3 into master on Mar 27, 2026
28 checks passed
@r4victor deleted the pr_pipeline_fetcher_deadlock branch on March 27, 2026 07:44