@Jeffwan Jeffwan commented Jan 20, 2025

Pull Request Description

  1. If the user passes --enable-runtime-sidecar to the controller manager, the controller talks to the runtime sidecar instead of talking to the engine directly. This promotes our runtime model management API work and also builds an abstraction over the different engines.

  2. Add a /v1/models listing API to the runtime, which was missing until now. LoRA adapter handling relies on this API to fetch the existing adapters and the base model.
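
To make the two points above concrete, here is a minimal sketch in Python rather than the controller's own code. The port numbers, the enable_runtime_sidecar field, and the helper names models_url and split_models are assumptions for illustration. The response shape is assumed to follow the OpenAI-style model listing, where an entry with a non-empty "parent" is treated as an adapter; the actual runtime response may differ.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Assumed ports, for illustration only; the PR does not state them.
ENGINE_PORT = 8000
RUNTIME_PORT = 8080


@dataclass
class RuntimeConfig:
    """Hypothetical stand-in for the controller's runtime settings."""
    enable_runtime_sidecar: bool = False


def models_url(pod_ip: str, config: RuntimeConfig) -> str:
    """Return the model-listing URL, targeting the sidecar when enabled."""
    port = RUNTIME_PORT if config.enable_runtime_sidecar else ENGINE_PORT
    return f"http://{pod_ip}:{port}/v1/models"


def split_models(payload: Dict) -> Tuple[List[str], List[str]]:
    """Split a model listing into (base models, adapters).

    Assumes a body shaped like {"data": [{"id": ..., "parent": ...}]}.
    Entries with a non-empty "parent" are counted as adapters.
    """
    base: List[str] = []
    adapters: List[str] = []
    for entry in payload.get("data", []):
        if entry.get("parent"):
            adapters.append(entry["id"])
        else:
            base.append(entry["id"])
    return base, adapters
```

Keeping the endpoint choice in one helper means callers do not need to know whether they are talking to the engine or to the sidecar, which is the abstraction the description is after.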

Related Issues

Resolves: #567 #521 (partial) #49 (partial)


brosoul commented Jan 21, 2025

overall lgtm

1. Hugging Face protocol shadow assignment bug
2. wrong runtime port
3. wrong host used in buildURLs
4. cannot forward the entire set of headers due to a Content-Length mismatch
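
Item 4 is a common pitfall when a request is rebuilt before being forwarded: if the body is re-serialized, the original Content-Length may no longer match. The sketch below shows one way to avoid it, assuming that is the cause here; the PR does not show the actual fix, and the helper name forwardable_headers and the skip list are hypothetical.

```python
from typing import Dict, Mapping

# Headers tied to one specific connection or body. Copying them onto a
# rebuilt request can make the declared length disagree with the new body.
_SKIP = {"content-length", "transfer-encoding", "connection", "host"}


def forwardable_headers(headers: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of headers without the connection- and body-specific ones."""
    return {k: v for k, v in headers.items() if k.lower() not in _SKIP}
```

The HTTP client that sends the forwarded request can then compute Content-Length from the new body itself.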

Signed-off-by: Jiaxin Shan <[email protected]>
@Jeffwan Jeffwan force-pushed the jiaxin/switch-to-runtime-api branch from 5805f1e to 0e8542b January 21, 2025 18:58
@Jeffwan Jeffwan merged commit d6319bb into main Jan 21, 2025
13 checks passed
@Jeffwan Jeffwan deleted the jiaxin/switch-to-runtime-api branch January 21, 2025 19:41
gangmuk pushed a commit that referenced this pull request Jan 25, 2025
* Introduce RuntimeConfig to all controllers

* Refactor the logic to construct URLs based on different envs

* Leverage runtime api to manage lora load & unload

* Fix several bugs

1. Hugging Face protocol shadow assignment bug
2. wrong runtime port
3. wrong host used in buildURLs
4. cannot forward the entire set of headers due to a Content-Length mismatch

* Format files

* Address code review feedback

---------

Signed-off-by: Jiaxin Shan <[email protected]>
Yaegaki1Erika pushed a commit to Yaegaki1Erika/aibrix that referenced this pull request Jul 23, 2025
…t#580)
Development

Successfully merging this pull request may close these issues.

integrate the model registration flow with runtime

4 participants