[Enhancement] Support fast scheduling for AI Agent Workloads #4722
Description
What is the problem you're trying to solve
Agent workloads are latency-sensitive: they require the scheduler to place a large number of Agent tasks very quickly, sustaining high throughput while keeping scheduling latency low for each task. The Volcano scheduler provides batch scheduling and advanced scheduling strategies for big data and AI workloads. However, it lacks sufficient support and optimization for Agent workloads, particularly in fast scheduling and scheduling strategy optimization. The goal is therefore to extend the Volcano scheduler with fast scheduling capabilities and scheduling strategies tailored to the characteristics of Agent workloads.
Describe the solution you'd like
Proposal: Create a fast scheduler tailored for Agent workloads that collaborates with the Volcano scheduler and provides fast scheduling, resource utilization optimization, and scheduling strategy optimization, to better meet the scheduling requirements of Agent workloads. By streamlining the scheduling process and applying Agent-specific scheduling strategies, the Agent scheduler can improve Agent task scheduling efficiency and reduce task startup latency.
When Agent workloads coexist with other workloads, the Agent scheduler can be integrated with the Volcano scheduler, with different workloads scheduled to node-level shards across the cluster, to guarantee scheduling efficiency and improve cluster-wide resource utilization.
Nodes are assigned to shards based on various strategies, and the assignment is adjusted dynamically according to the cluster state, improving how well different types of workloads utilize node resources.
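To make the sharding idea more concrete, here is a minimal sketch of one possible strategy: sizing the Agent shard in proportion to its share of pending demand. Everything in it (the `Node` type, the `assign_shards` function, the shard names, and the proportional rule itself) is an assumption for illustration only and is not taken from the proposal document or from Volcano's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    cpu: int  # allocatable CPU in millicores (illustrative unit)


def assign_shards(nodes, agent_demand, batch_demand):
    """Split nodes into a hypothetical 'agent' shard and 'batch' shard.

    Assumed strategy: the agent shard receives nodes until it holds a
    share of total CPU proportional to its share of pending demand.
    """
    total_demand = agent_demand + batch_demand
    agent_share = agent_demand / total_demand if total_demand else 0.0
    target_cpu = agent_share * sum(n.cpu for n in nodes)

    shards = {"agent": [], "batch": []}
    assigned_cpu = 0
    # Iterate in name order so the same cluster state yields the same shards.
    for node in sorted(nodes, key=lambda n: n.name):
        if assigned_cpu < target_cpu:
            shards["agent"].append(node.name)
            assigned_cpu += node.cpu
        else:
            shards["batch"].append(node.name)
    return shards
```

Calling `assign_shards` again whenever demand changes would correspond to the dynamic adjustment described above. A real implementation would also need to handle pods already running on nodes that move between shards, which this sketch ignores.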
Proposal Details
The full proposal details are at: https://docs.google.com/document/d/1NDTEGulhQ__sdZ_c2kN3ojtJIZie5O7_GyZyKIOLz-g/edit?usp=sharing
Additional context
Please feel free to share your thoughts and ideas.