[Enhancement] Support fast scheduling for AI Agent Workloads #4722

@qi-min

What is the problem you're trying to solve

Agent workloads are latency-sensitive: they require the scheduler to place a large number of Agent tasks very quickly, with high throughput and low scheduling latency for each task. The Volcano scheduler provides batch scheduling and advanced scheduling strategies for big data and AI workloads. However, it lacks sufficient support and optimization for Agent workloads, particularly in fast scheduling and scheduling strategy optimization. The goal is therefore to extend the Volcano scheduler with fast scheduling capabilities and scheduling strategies tailored to the characteristics of Agent workloads.

agentcube subproject

Describe the solution you'd like

Proposal: Create a fast scheduler tailored for Agent workloads that collaborates with the Volcano scheduler to provide fast scheduling, resource utilization optimization, and scheduling strategy optimization, and so better meets the scheduling requirements of Agent workloads. By streamlining the scheduling process and applying Agent-specific scheduling strategies, the Agent scheduler can improve Agent task scheduling efficiency and reduce task startup latency.

When Agent workloads coexist with other workloads, the Agent scheduler can be seamlessly integrated with the Volcano scheduler, with different workloads scheduled to different node-level shards across the cluster. This guarantees scheduling efficiency and improves cluster-wide resource utilization.

Nodes are assigned to shards based on various strategies, and the assignment is adjusted dynamically according to the cluster state, so that different types of workloads make better use of node resources.
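To make the sharding idea more concrete, here is a minimal sketch in Go. It is not taken from the proposal document: the shard names, the `Node` type, `AssignShards`, `AdjustShare`, and the "most free CPU goes to the Agent shard" policy are all hypothetical choices made for illustration. It only shows one way nodes could be split between two shards and the split re-derived from pending demand.

```go
package main

import "sort"

// Hypothetical shard names; the issue does not define them.
const (
	ShardAgent = "agent"
	ShardBatch = "batch"
)

// Node is a simplified stand-in for a cluster node (assumption).
type Node struct {
	Name    string
	FreeCPU int // free CPU in millicores
}

// AssignShards maps each node name to a shard. agentShare is the
// fraction of nodes (clamped to 0..1) given to the Agent shard.
// Assumed policy: nodes with the most free CPU go to the Agent shard.
func AssignShards(nodes []Node, agentShare float64) map[string]string {
	if agentShare < 0 {
		agentShare = 0
	}
	if agentShare > 1 {
		agentShare = 1
	}

	// Sort a copy so the caller's slice is left untouched.
	sorted := make([]Node, len(nodes))
	copy(sorted, nodes)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].FreeCPU > sorted[j].FreeCPU
	})

	// Round to the nearest whole number of nodes.
	agentCount := int(float64(len(sorted))*agentShare + 0.5)

	shards := make(map[string]string, len(sorted))
	for i, n := range sorted {
		if i < agentCount {
			shards[n.Name] = ShardAgent
		} else {
			shards[n.Name] = ShardBatch
		}
	}
	return shards
}

// AdjustShare derives a new Agent share from pending task counts,
// standing in for "dynamic adjustment according to cluster state".
// With no pending tasks it falls back to an even split.
func AdjustShare(pendingAgent, pendingBatch int) float64 {
	total := pendingAgent + pendingBatch
	if total == 0 {
		return 0.5
	}
	return float64(pendingAgent) / float64(total)
}
```

A caller would periodically compute `AdjustShare` from the pending queues and pass the result to `AssignShards`, so the Agent shard grows when Agent tasks dominate the backlog. A real implementation would likely need to account for running pods and avoid moving nodes between shards too often; this sketch ignores both.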

Proposal Details

The full proposal details are at: https://docs.google.com/document/d/1NDTEGulhQ__sdZ_c2kN3ojtJIZie5O7_GyZyKIOLz-g/edit?usp=sharing

Additional context

Please feel free to share your thoughts and ideas.

Labels

kind/feature: Categorizes issue or PR as related to a new feature.

Status

Done
