Atomic Decoupling and Physical Isolation for AI Scheduling Primitives #5136
wangyang0616 wants to merge 1 commit into volcano-sh:master
… AI Scheduling Primitives Signed-off-by: wangyang0616 <wangyang8126@gmail.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Code Review
This pull request introduces a design proposal for the atomic decoupling and physical isolation of HyperNode and PodGroup controllers in Volcano to support independent compilation and atomic delivery. The review feedback highlights a significant inconsistency between the design document's claims and the actual file changes, noting that the described code movements to the staging directory are missing from the PR. Additionally, it is suggested to define the acronym 'TTFT' (Time To First Token) to improve the clarity of the documentation.
> ## 1. Summary
> This proposal suggests a structural refactoring of the Volcano controller architecture to achieve physical decoupling of **HyperNode** and **PodGroup**. By leveraging a **Staging Mode** and independent Go modules—while maintaining a **Unified Volcano API Library**—we aim to transform core scheduling units into atomic primitives. This evolution empowers hardware ecosystem co-construction and supports complex AI scenarios such as **PD (Prefill & Decoding) separation** and **multi-dimensional Gang scheduling**.
There are inconsistencies between the PR description, this design document, and the apparent file changes.
- The PR description and design doc state that `HyperNode` and `PodGroup` controllers are moved to the `staging` directory, but these changes are not present in the PR.
- The `volcano.sh/apis` package seems to have been moved to `staging` (based on the `go.mod` file), but this is not mentioned in the PR description, and the design doc states the API library will not be split.
To avoid confusion, please ensure the PR's content, description, and design document are all aligned. If this PR is only for the design document, the description should be updated to reflect that.
> * **Multi-dimensional Gang Scheduling**: In **PD Separation** scenarios, AI workloads consist of multiple logical groups. PodGroup must evolve to provide **multi-layered resource coordination**.
> * **Primitive-level Joint Scheduling**: Enabling dual-axis concurrency of "Topology Awareness + Gang Coordination."
>   * **Training**: Improves resource alignment efficiency in large-scale clusters.
>   * **Inference**: Ensures shards are placed on optimal paths (NVLink/RDMA) to reduce **TTFT** and increase throughput.
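For readers unfamiliar with gang semantics, the "multi-layered resource coordination" in the quoted hunk can be read as an all-or-nothing check applied to every logical group at once. The following is a minimal sketch under that assumption. The `podGroup` type, its `MinMember` field as used here, and both helper functions are hypothetical simplifications for illustration, not the proposal's design or Volcano's scheduler code.

```go
package main

import "fmt"

// podGroup is a hypothetical, simplified stand-in for a PodGroup.
// It carries only what is needed to illustrate gang admission.
type podGroup struct {
	Name      string
	MinMember int // minimum number of pods that must be placeable together
}

// gangReady reports whether enough pods of one group can be placed.
func gangReady(pg podGroup, placeable int) bool {
	return placeable >= pg.MinMember
}

// allGangsReady applies the gang check to every sub-group (for example
// prefill and decode) and admits the workload only if all of them pass.
func allGangsReady(groups []podGroup, placeable map[string]int) bool {
	for _, pg := range groups {
		if !gangReady(pg, placeable[pg.Name]) {
			return false
		}
	}
	return true
}

func main() {
	groups := []podGroup{{Name: "prefill", MinMember: 2}, {Name: "decode", MinMember: 4}}

	// decode is one pod short, so nothing is admitted.
	fmt.Println(allGangsReady(groups, map[string]int{"prefill": 2, "decode": 3})) // false

	// Both groups meet their minimum, so the workload is admitted.
	fmt.Println(allGangsReady(groups, map[string]int{"prefill": 2, "decode": 4})) // true
}
```

The point of the sketch is only that a shortfall in one group blocks the others; how the proposal would actually express nested or layered groups is not specified in the quoted text.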
Pull request overview
Adds a design proposal document describing how to physically decouple the HyperNode and PodGroup controllers into independent staging modules while preserving a unified Volcano API contract.
Changes:
- Introduces a new design doc outlining staging-based physical isolation and “atomic delivery” goals for HyperNode/PodGroup.
- Documents a proposed directory layout and dual-delivery (monolithic + standalone) build approach.
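As a rough illustration of what a dual-delivery build could look like, the sketch below registers the same controller constructors into either a monolithic set or a single-controller set. Everything in it (the `controller` interface, the `registry` type, and the `monolithic` and `standalone` helpers) is a hypothetical stand-in; it does not reproduce Volcano's actual controller framework or registration API.

```go
package main

import (
	"fmt"
	"sort"
)

// controller is a hypothetical minimal interface used only for this sketch.
type controller interface {
	Name() string
}

// named is a trivial controller implementation that only knows its name.
type named string

func (n named) Name() string { return string(n) }

// registry maps controller names to constructors.
type registry map[string]func() controller

// register adds a constructor for a controller with the given name.
func (r registry) register(name string) {
	r[name] = func() controller { return named(name) }
}

// names returns the registered controller names in sorted order.
func (r registry) names() []string {
	out := make([]string, 0, len(r))
	for n := range r {
		out = append(out, n)
	}
	sort.Strings(out)
	return out
}

// monolithic models a controller-manager style binary that ships everything.
func monolithic() registry {
	r := registry{}
	for _, n := range []string{"job", "queue", "hypernode", "podgroup"} {
		r.register(n)
	}
	return r
}

// standalone models a lightweight binary that ships a single primitive.
func standalone(name string) registry {
	r := registry{}
	r.register(name)
	return r
}

func main() {
	fmt.Println(monolithic().names())            // [hypernode job podgroup queue]
	fmt.Println(standalone("hypernode").names()) // [hypernode]
}
```

The design choice being illustrated is that both delivery modes draw on one set of constructors, so the standalone image is a subset of the monolith rather than a fork of it.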
```text
volcano/
├── apis/                       # Unified API Library (Contains all CRDs)
├── pkg/
│   └── controllers/            # Monolithic business logic (Job, Queue, etc.)
├── staging/src/volcano.sh/
│   ├── hypernode/              # HyperNode Independent Module
│   │   ├── go.mod              # Independent Module Definition
│   │   ├── cmd/                # Entry point for vc-hypernode
│   │   └── pkg/controller/     # Core Topology Primitive Logic
│   └── podgroup/               # PodGroup Independent Module
│       ├── go.mod              # Independent Module Definition
│       ├── cmd/                # Entry point for vc-podgroup
│       └── pkg/controller/     # Core Gang Scheduling Logic
├── go.mod                      # Main Repo Entry (replace directives for staging)
```
The repository layout shown here lists a top-level apis/ directory as the unified API library, but in this repo the canonical APIs module is maintained under staging/src/volcano.sh/apis/ (per existing docs like docs/design/adapt-k8s-todo.md). Updating this tree to reflect the current staging-based API layout would prevent readers from looking for (or creating) a non-existent/incorrect top-level apis/ directory.
> # Volcano AI Scheduling Primitives Atomic Decoupling and Physical Isolation
>
> **Authors:** wangyang0616
The filename includes an & character (hypernode&podgroup-controller-independent.md). This commonly causes quoting/URL-encoding issues in shells, static-site generators, and links. Consider renaming the file to use a hyphen (e.g., hypernode-podgroup-controller-independent.md) and updating any references accordingly.
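A small check along these lines could flag such filenames before they land. This is only a sketch: the `shellUnsafe` character set is an illustrative subset chosen for this example, and `needsQuoting` and `safeName` are hypothetical helpers, not part of any existing lint tool in the repository.

```go
package main

import (
	"fmt"
	"strings"
)

// shellUnsafe is an illustrative (not exhaustive) set of characters that
// usually require quoting when a filename is passed to a POSIX shell.
const shellUnsafe = "&;|<>()$ *?"

// needsQuoting reports whether the name contains any character from shellUnsafe.
func needsQuoting(name string) bool {
	return strings.ContainsAny(name, shellUnsafe)
}

// safeName replaces '&' with '-', mirroring the rename suggested in the review.
func safeName(name string) string {
	return strings.ReplaceAll(name, "&", "-")
}

func main() {
	original := "hypernode&podgroup-controller-independent.md"
	fmt.Println(needsQuoting(original))           // true
	fmt.Println(safeName(original))               // hypernode-podgroup-controller-independent.md
	fmt.Println(needsQuoting(safeName(original))) // false
}
```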
> ## 1. Summary
> This proposal suggests a structural refactoring of the Volcano controller architecture to achieve physical decoupling of **HyperNode** and **PodGroup**. By leveraging a **Staging Mode** and independent Go modules—while maintaining a **Unified Volcano API Library**—we aim to transform core scheduling units into atomic primitives. This evolution empowers hardware ecosystem co-construction and supports complex AI scenarios such as **PD (Prefill & Decoding) separation** and **multi-dimensional Gang scheduling**.
>
> ## 2. Goals & Non-Goals
>
> ### 2.1 Goals
> * **Physical Isolation**: Move HyperNode and PodGroup logic into `staging/` with independent `go.mod` files to support standalone compilation.
> * **Atomic Delivery**: Provide the capability to build and deploy lightweight, independent images (`vc-hypernode`, `vc-podgroup`) for inference and edge scenarios.
> * **Ecosystem Empowerment**: Simplify the contribution path for hardware vendors to implement topology discovery within a decoupled HyperNode module.
> * **Maintain Monolithic Compatibility**: Ensure the existing `vc-controller-manager` remains the primary delivery vehicle with zero changes to its default behavior.
PR description says the physical relocation/standalone modules are already implemented and validated (build + unit tests), but the changes in this PR appear to only add a design doc (no staging hypernode/podgroup modules or controller-manager import changes are present in the repo). Either adjust the PR description to reflect that this is a proposal-only change, or include the actual implementation changes in the PR so the description matches the diff.
> # Volcano AI Scheduling Primitives Atomic Decoupling and Physical Isolation
This new design doc starts with leading blank lines (and has trailing whitespace after the title line). Most other files under docs/design/ start directly with the top-level # header; trimming the leading blank lines/extra whitespace will keep formatting consistent and avoid markdown-lint noise.
> # Volcano AI Scheduling Primitives Atomic Decoupling and Physical Isolation
> # Volcano AI Scheduling Primitives Atomic Decoupling and Physical Isolation
> ### 4.3 Engineering Implementation
> * **Independent Modules**: Create `go.mod` files within staging. Use `replace` directives to point to `../../../../apis` locally.
> * **Dependency Cleanup**: Remove strong type references from `HyperNode/PodGroup` to business packages like `pkg/controllers/job`.
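One common way to drop a strong type reference of the kind named in the "Dependency Cleanup" bullet is to have the decoupled module depend on a narrow interface it owns, and let the business package satisfy it. The sketch below assumes that approach. The `ownerInfo` interface, `desiredMinMember`, and `batchJob` are all hypothetical names invented for illustration; the quoted doc does not say which technique the author intends.

```go
package main

import "fmt"

// ownerInfo is a hypothetical narrow interface that a decoupled podgroup
// module could own, instead of importing a concrete job type.
type ownerInfo interface {
	MinAvailable() int32
}

// desiredMinMember depends only on the interface, so the module containing
// it would not need to import the business package.
func desiredMinMember(o ownerInfo) int32 {
	return o.MinAvailable()
}

// batchJob stands in for a concrete type that would live in a business
// package such as pkg/controllers/job and remain in the monolith.
type batchJob struct {
	minAvailable int32
}

// MinAvailable makes batchJob satisfy ownerInfo without the podgroup
// side knowing about batchJob.
func (j batchJob) MinAvailable() int32 { return j.minAvailable }

func main() {
	fmt.Println(desiredMinMember(batchJob{minAvailable: 3})) // 3
}
```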
The document’s proposed module wiring says staging modules should replace the APIs module via a relative path like ../../../../apis, but in this repo the APIs module lives at staging/src/volcano.sh/apis (and main go.mod already uses replace volcano.sh/apis => ./staging/src/volcano.sh/apis). For new staging modules under staging/src/volcano.sh/, the correct relative path to the APIs module would be adjacent (e.g., ../apis) or you should explicitly reference the existing staging location to avoid confusing/incorrect instructions.
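The path arithmetic behind this comment can be checked mechanically. The sketch below uses only the standard `path/filepath` package and the directory names from the proposed layout (`staging/src/volcano.sh/hypernode` for the new module, `staging/src/volcano.sh/apis` for the APIs module as described in the comment). The helper names `resolve` and `relTo` are hypothetical.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolve returns the repo-relative directory that a relative replace
// target points at when written inside moduleDir's go.mod.
func resolve(moduleDir, rel string) string {
	return filepath.ToSlash(filepath.Join(moduleDir, rel))
}

// relTo returns the relative path from moduleDir to target.
func relTo(moduleDir, target string) (string, error) {
	r, err := filepath.Rel(moduleDir, target)
	return filepath.ToSlash(r), err
}

func main() {
	mod := "staging/src/volcano.sh/hypernode"

	// The path in the design doc climbs out of staging to a top-level apis/.
	fmt.Println(resolve(mod, "../../../../apis")) // apis

	// The adjacent path lands on the staging APIs module.
	fmt.Println(resolve(mod, "../apis")) // staging/src/volcano.sh/apis

	rel, err := relTo(mod, "staging/src/volcano.sh/apis")
	if err != nil {
		panic(err)
	}
	fmt.Println(rel) // ../apis
}
```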
@hzxuzhonghu @JesseStutler @hajnalmt Please help review this as well. Thanks.
One scope/traceability concern: this PR currently says I suggest changing the linkage to something like
Yes, it has been updated.
What type of PR is this?
Feature proposal
What this PR does / why we need it:
This PR implements the physical decoupling of HyperNode and PodGroup controllers as proposed in #5133. It transitions these core AI scheduling primitives into a Staging Mode, enabling independent evolution and atomic delivery without fragmenting the unified API library.
Key Changes:
Which issue(s) this PR fixes:
Refs #5133
Special notes for your reviewer: