enhancement(scheduler): honor QueueOrderFn in preempt action #5142
hajnalmt wants to merge 4 commits into volcano-sh:master
…ption

The "Preemption between Task within Job" loop overwrites preemptorTasks[job.UID] for every starving job in underRequest, which is shared across all queues. In multi-queue scenarios this mutates the same preemptor state used later by the between-jobs preemption phase. Because queue traversal previously depended on Go map iteration order, the behavior becomes non-deterministic.

If queue Q1 (with no relevant preemptors) is visited first, its intra-job pass still iterates shared underRequest entries and can drain/replace Q2's preemptorTasks entry. When Q2 is visited later, the between-jobs loop sees empty preemptor state and skips valid preemption, so starvation can persist.

Use a scoped local queue (intraJobPreemptors) for the intra-job pass instead of reusing preemptorTasks[job.UID]. This preserves the original preemptorTasks map populated during job discovery for between-jobs preemption while keeping intra-job behavior isolated.

Add and document a multi-queue regression test that models this cross-queue interference path and verifies that intra-job processing in one queue does not invalidate between-jobs preemption in another queue. The test also captures the prior flaky characteristic (majority pass, intermittent fail) caused by non-deterministic queue iteration.

Signed-off-by: Hajnal Máté <hajnalmt@gmail.com>
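The interference described in this commit message can be illustrated with a deliberately simplified model. This is only a sketch: `jobID`, `task`, `drainShared`, and `drainScoped` are hypothetical names, and plain slices stand in for the priority queues the scheduler actually stores in preemptorTasks. It shows why consuming the shared map entry during the intra-job pass leaves nothing for the between-jobs phase, while consuming a scoped local copy does not.

```go
package main

import "fmt"

// jobID and task are hypothetical stand-ins for the scheduler's job UID and
// task types; they are not Volcano's real types.
type jobID string
type task string

// drainShared models the old intra-job pass: it consumes the shared map
// entry directly, so the later between-jobs phase finds it empty.
func drainShared(preemptorTasks map[jobID][]task, id jobID) {
	for len(preemptorTasks[id]) > 0 {
		preemptorTasks[id] = preemptorTasks[id][1:]
	}
}

// drainScoped models the fix: the intra-job pass works on a local copy
// (intraJobPreemptors) and leaves the shared entry untouched.
func drainScoped(preemptorTasks map[jobID][]task, id jobID) {
	intraJobPreemptors := append([]task(nil), preemptorTasks[id]...)
	for len(intraJobPreemptors) > 0 {
		intraJobPreemptors = intraJobPreemptors[1:]
	}
}

func main() {
	shared := map[jobID][]task{"q2-job": {"t1", "t2"}}
	drainShared(shared, "q2-job")
	fmt.Println(len(shared["q2-job"])) // prints 0: preemptors lost

	scoped := map[jobID][]task{"q2-job": {"t1", "t2"}}
	drainScoped(scoped, "q2-job")
	fmt.Println(len(scoped["q2-job"])) // prints 2: preemptors preserved
}
```

In this model the between-jobs phase would still see both preemptor tasks of the Q2 job after the scoped pass, regardless of which queue was visited first.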
Use util.NewPriorityQueue(ssn.QueueOrderFn) in preempt so queue processing order is deterministic and aligned with allocate/reclaim behavior.

While preserving Osykov's original queue-order implementation intent, keep the preemptorTasks overwrite fix behavior from the stacked base commit by using scoped intra-job preemptor queues during intra-job preemption.

Also include the topology-aware multi-queue test coverage from the original change to validate queue-ordered preemption behavior.

Signed-off-by: Vitalii Osykov <vitaliyosykov@gmail.com>
Signed-off-by: Hajnal Máté <hajnalmt@gmail.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED
Code Review
This pull request fixes a bug in the preemption action where preemptorTasks were being overwritten during intra-job preemption, which led to lost preemptors in multi-queue scenarios. It also introduces ordered queue processing using a priority queue and adds comprehensive regression tests. A review comment highlights that the intra-job preemption logic is currently redundant because it resides within the queue iteration loop, suggesting it be moved outside to improve performance.
Build underRequest entries per queue and process only the current queue's starving jobs in the intra-job preemption loop. This avoids cross-queue iteration in each queue pass and keeps intra-job processing aligned with the active queue context. Signed-off-by: Hajnal Máté <hajnalmt@gmail.com>
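A minimal sketch of the per-queue grouping this commit describes, assuming a flat list of starving jobs that each know their queue. The `job` type and `groupByQueue` helper are hypothetical and only illustrate the idea; the actual change uses the scheduler's own job and queue types.

```go
package main

import "fmt"

// job is a hypothetical stand-in for a starving job and the queue it belongs to.
type job struct {
	name  string
	queue string
}

// groupByQueue builds a per-queue index of starving jobs so that each queue
// pass only touches its own entries instead of scanning the whole list.
func groupByQueue(underRequest []job) map[string][]job {
	byQueue := make(map[string][]job)
	for _, j := range underRequest {
		byQueue[j.queue] = append(byQueue[j.queue], j)
	}
	return byQueue
}

func main() {
	underRequest := []job{{"a", "q1"}, {"b", "q2"}, {"c", "q1"}}
	underRequestByQueue := groupByQueue(underRequest)

	// The intra-job pass for q1 sees only q1's jobs, in discovery order.
	for _, j := range underRequestByQueue["q1"] {
		fmt.Println(j.name) // prints "a", then "c"
	}
}
```

Grouping once up front also means a queue with no starving jobs has no entry at all, so there is nothing to iterate for it.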
Pull request overview
Updates the scheduler’s preempt action to process queues deterministically using QueueOrderFn (via util.NewPriorityQueue), aligning behavior with other actions and addressing multi-queue preemption correctness.
Changes:
- Replace Go map iteration over queues in preempt with a `PriorityQueue` ordered by `ssn.QueueOrderFn`.
- Fix multi-queue intra-job preemption so per-queue processing doesn’t overwrite/drain `preemptorTasks` for jobs in other queues.
- Extend preempt action tests with multi-queue regression coverage and add a queue-order-focused topology-aware scenario.
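The first change can be sketched with the standard library's `container/heap`, as a rough analogue of a priority queue driven by a comparison function. This is not Volcano's `util.PriorityQueue`; `queueInfo`, `lessFn`, `orderedQueues`, and `visitOrder` are hypothetical names, and the integer `priority` field is an assumed ordering criterion standing in for whatever `QueueOrderFn` compares.

```go
package main

import (
	"container/heap"
	"fmt"
)

// queueInfo is a hypothetical stand-in for a scheduler queue.
type queueInfo struct {
	name     string
	priority int
}

// lessFn plays the role of an ordering function: true if l goes before r.
type lessFn func(l, r queueInfo) bool

// orderedQueues implements heap.Interface using the supplied lessFn.
type orderedQueues struct {
	items []queueInfo
	less  lessFn
}

func (q *orderedQueues) Len() int           { return len(q.items) }
func (q *orderedQueues) Less(i, j int) bool { return q.less(q.items[i], q.items[j]) }
func (q *orderedQueues) Swap(i, j int)      { q.items[i], q.items[j] = q.items[j], q.items[i] }
func (q *orderedQueues) Push(x any)         { q.items = append(q.items, x.(queueInfo)) }
func (q *orderedQueues) Pop() any {
	last := q.items[len(q.items)-1]
	q.items = q.items[:len(q.items)-1]
	return last
}

// visitOrder pushes every queue from the map into the heap and pops them
// back out, so the visit order depends on lessFn, not on map iteration.
func visitOrder(queues map[string]queueInfo, less lessFn) []string {
	pq := &orderedQueues{less: less}
	for _, q := range queues {
		heap.Push(pq, q)
	}
	order := make([]string, 0, len(queues))
	for pq.Len() > 0 {
		order = append(order, heap.Pop(pq).(queueInfo).name)
	}
	return order
}

func main() {
	queues := map[string]queueInfo{
		"q1": {"q1", 1}, "q2": {"q2", 10}, "q3": {"q3", 5},
	}
	byPriority := func(l, r queueInfo) bool { return l.priority > r.priority }
	fmt.Println(visitOrder(queues, byPriority)) // prints [q2 q3 q1]
}
```

The result is stable across runs only because the priorities in this example are distinct; queues that compare equal could still come out in either order.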
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `pkg/scheduler/actions/preempt/preempt.go` | Switch queue traversal to a PriorityQueue (honoring QueueOrderFn) and scope intra-job preemptor queues to avoid cross-queue state corruption. |
| `pkg/scheduler/actions/preempt/preempt_test.go` | Add regression test for multi-queue preemptor task overwrite and add a topology-aware scenario with queue order enabled. |
Iterate only relevant starving-job queues when building the preempt queue priority structure to avoid unnecessary queue scans and extra no-preemptor iterations. Also fix the topology-aware test case name typo. Signed-off-by: Hajnal Máté <hajnalmt@gmail.com>
/area scheduling
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR makes the preempt action honor `QueueOrderFn` via `util.NewPriorityQueue`, aligning preempt queue processing with allocate/reclaim and removing non-deterministic map-order queue traversal.

The PR also fixes the `underRequest` array traversal to honor queue ordering by switching to a mapped structure, `underRequestByQueue`, instead of looping through the whole array for every processed queue (which I think was a bug).

Which issue(s) this PR fixes:
Fixes #5139
Special notes for your reviewer:
Does this PR introduce a user-facing change?