Skip to content

Conversation

@Xunzhuo
Copy link
Member

@Xunzhuo Xunzhuo commented Mar 18, 2025

Pull Request Description

refactor: core cache desgin and impl


Contribution Guidelines (Expand for Details)

We appreciate your contribution to aibrix! To ensure a smooth review process and maintain high code quality, please adhere to the following guidelines:

Pull Request Title Format

Your PR title should start with one of these prefixes to indicate the nature of the change:

  • [Bug]: Corrections to existing functionality
  • [CI]: Changes to build process or CI pipeline
  • [Docs]: Updates or additions to documentation
  • [API]: Modifications to aibrix's API or interface
  • [CLI]: Changes or additions to the Command Line Interface
  • [Misc]: For changes not covered above (use sparingly)

Note: For changes spanning multiple categories, use multiple prefixes in order of importance.

Submission Checklist

  • PR title includes appropriate prefix(es)
  • Changes are clearly explained in the PR description
  • New and existing tests pass successfully
  • Code adheres to project style and best practices
  • Documentation updated to reflect changes (if applicable)
  • Thorough testing completed, no regressions introduced

By submitting this PR, you confirm that you've read these guidelines and your changes align with the project's contribution standards.

@Xunzhuo Xunzhuo changed the title refactor: core cache impl refactor: core cache desgin and impl Mar 18, 2025
@Xunzhuo Xunzhuo force-pushed the refactor-cache branch 3 times, most recently from 7d19bbe to 6b186ea Compare March 18, 2025 13:07
@Xunzhuo Xunzhuo changed the title refactor: core cache desgin and impl refactor: core cache design and impl Mar 18, 2025
@varungup90
Copy link
Collaborator

@zhangjyr is also working on same implementation.

@Xunzhuo
Copy link
Member Author

Xunzhuo commented Mar 19, 2025

@varungup90 is there a PR for it?

@Jeffwan
Copy link
Collaborator

Jeffwan commented Mar 19, 2025

@Xunzhuo I think jingyuan is working on heterogeneous routing which includes refactor related logics. Here's his RFC issue #868

@Xunzhuo
Copy link
Member Author

Xunzhuo commented Mar 19, 2025

This refactor does not break the previous usage for cache pkg, just improve the code maintainability, which I think is not conflict with the #868.

@Jeffwan
Copy link
Collaborator

Jeffwan commented Mar 19, 2025

@Xunzhuo that's even better. I will take a look tomorrow

@Xunzhuo Xunzhuo force-pushed the refactor-cache branch 4 times, most recently from c840175 to 4619847 Compare March 19, 2025 09:18
@Xunzhuo Xunzhuo marked this pull request as draft March 19, 2025 09:48
@Xunzhuo Xunzhuo marked this pull request as ready for review March 19, 2025 12:52
@Xunzhuo
Copy link
Member Author

Xunzhuo commented Mar 19, 2025

CI has passed, plz help review @varungup90 @Jeffwan

@zhangjyr zhangjyr changed the title refactor: core cache design and impl [API] Refactor: core cache design and impl Mar 19, 2025
@Jeffwan
Copy link
Collaborator

Jeffwan commented Mar 19, 2025

@Xunzhuo I reviewed the PR, and the refactor looks straightforward with a few key changes:

  1. Extracting a root-level interface: Introduced aCache interface with PodCache, ModelCache, MetricCache and TraceCache interfaces.
  2. Refactoring cache.go: Split the lengthy cache.go file into multiple, more manageable files
  3. Method renaming: Updated method names for clarity, e.g., GetModelsForPod -> ListModelsByPod etc.

These changes improve maintainability, especially for new contributors. Let me know if there are any specific areas you'd like me to focus on. BTW, Feel free to write key changes in the PR description next time.

Consider it changes a lots of files, I would like to fetch the branch and spend some time to double test the new code structure before merging. I will come back with some results soon

@zhangjyr
Copy link
Collaborator

Thanks for the PR. I have reviewed the new interfaces. Decoupling different cache data providers is necessary. However, PodCache and ModelCache should be unified as one interface, as they solely depend on pod information. We should maintain the atomicity of updating pod metadata, model(adapter) metadata, and their relation.

Another thing I need to clarify is that RequestCount series methods store real-time workload statistics, which are distinguished from cached metrics pulled from other data sources. I would suggest either:

  1. Using cached metrics as a store and implementing real-time RequestCount updates in extended metrics providers.
  2. Wrap RequestCount logic to a new real-time metrics provider.

The cache and routing part will be refactored to improve concurrent access in #884

@Jeffwan
Copy link
Collaborator

Jeffwan commented Mar 20, 2025

However, PodCache and ModelCache should be unified as one interface, as they solely depend on pod information. We should maintain the atomicity of updating pod metadata, model(adapter) metadata, and their relation.

@zhangjyr The informer code remains the same, I think it's still pod change driven. PodCache and ModelCache are two logical interfaces that only provides Get/List capabilities. Atomicity should be good.

@Xunzhuo
Copy link
Member Author

Xunzhuo commented Mar 20, 2025

@Jeffwan Yep, that's what I originally think.

@Xunzhuo
Copy link
Member Author

Xunzhuo commented Mar 20, 2025

@Xunzhuo I reviewed the PR, and the refactor looks straightforward with a few key changes:

  1. Extracting a root-level interface: Introduced aCache interface with PodCache, ModelCache, MetricCache and TraceCache interfaces.

  2. Refactoring cache.go: Split the lengthy cache.go file into multiple, more manageable files

  3. Method renaming: Updated method names for clarity, e.g., GetModelsForPod -> ListModelsByPod etc.

These changes improve maintainability, especially for new contributors. Let me know if there are any specific areas you'd like me to focus on. BTW, Feel free to write key changes in the PR description next time.

Thanks, it's most of the refactor key points.

I will create a clear issue and PR description in later works.

@zhangjyr
Copy link
Collaborator

However, PodCache and ModelCache should be unified as one interface, as they solely depend on pod information. We should maintain the atomicity of updating pod metadata, model(adapter) metadata, and their relation.

@zhangjyr The informer code remains the same, I think it's still pod change driven. PodCache and ModelCache are two logical interfaces that only provides Get/List capabilities. Atomicity should be good.

@Jeffwan I see your point. The proposal interfaces only defined the read operations. The data provisioning currently is still intact. I think we can merge the PR after #884 merged.

@Jeffwan
Copy link
Collaborator

Jeffwan commented Mar 21, 2025

@zhangjyr I feel this one is straightforward, PR #884 has more feature refactors, let's merge this one first? WDYT?

@Jeffwan Jeffwan merged commit 5a1fd3a into vllm-project:main Mar 21, 2025
11 checks passed
gangmuk pushed a commit to gangmuk/aibrix-gangmuk that referenced this pull request Jun 21, 2025
refactor: core cache design and impl

Signed-off-by: bitliu <[email protected]>
Yaegaki1Erika pushed a commit to Yaegaki1Erika/aibrix that referenced this pull request Jul 23, 2025
refactor: core cache design and impl

Signed-off-by: bitliu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants