
Development Roadmap (2025 H1) #4042

Open
@zhyncs

Here is the development roadmap for 2025 H1. Contributions and feedback are welcome (join the bi-weekly development meeting). The previous 2024 Q4 roadmap can be found in #1487.

Focus

  • Throughput-oriented large-scale deployment similar to the DeepSeek inference system
  • Long context optimizations
  • Low-latency speculative decoding
  • Reinforcement learning training framework integration
  • Kernel optimizations

Parallelism

Attention Backend

Caching

Kernel

Quantization

RL Framework integration

Core refactor

Speculative decoding

Multi-LoRA serving

Hardware

Model coverage

Function Calling

Others
