[enhancement] GPU optimizer accumulated fix #598

zhangjyr · 2025-01-24T02:26:24Z

Pull Request Description

1. This is an accumulated set of fixes that make the GPU optimizer work as designed.
2. It includes changes to how a debug version of a component is deployed: after applying -k config/default, the development version of a component under config/dev can be deployed independently.
2.1 For components created using "kubectl create -k config/default", the older component must be deleted using "kubectl delete -k config/dev/xxxx" and then reapplied.

Related Issues

Resolves: #546


Add a test case verifying that the URL built from metricSource is as expected: the endpoint should include the port; if it does not and a port is specified, the port is appended to the endpoint.
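The port rule described above can be sketched as follows. This is a minimal Python illustration only, not the project's implementation: the function name `build_metric_url` and its parameters are hypothetical, chosen to mirror the wording of the commit note.

```python
from urllib.parse import urlsplit


def build_metric_url(protocol: str, endpoint: str, port: str = "", path: str = "/metrics") -> str:
    """Build protocol://endpoint[:port]/path (hypothetical helper).

    The port is appended only when the endpoint does not already carry one
    and a port is specified.
    """
    # Prefixing "//" makes urlsplit treat the endpoint as a network location,
    # so .port is None when the endpoint has no explicit port.
    has_port = urlsplit(f"//{endpoint}").port is not None
    if not has_port and port:
        endpoint = f"{endpoint}:{port}"
    return f"{protocol}://{endpoint}/{path.lstrip('/')}"
```

A test for this rule would cover three cases: the endpoint already includes a port (the separate port is ignored), the endpoint lacks a port and one is specified (it is appended), and neither carries a port (the URL is left without one).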

Fix a benchmark error that could occur when token_latencies is missing some data.
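For illustration, one way to tolerate missing latency data is to skip absent or empty entries before aggregating. This is a hedged sketch only: the record layout (a dict with an optional "token_latencies" list) and the helper name `mean_token_latency` are assumptions, not the benchmark's actual data model.

```python
from statistics import mean
from typing import Optional, Sequence


def mean_token_latency(results: Sequence[dict]) -> Optional[float]:
    """Average per-token latency over all results (hypothetical helper).

    Results whose "token_latencies" is missing, None, or empty are skipped,
    as are individual None samples, instead of raising an error.
    """
    samples = []
    for record in results:
        latencies = record.get("token_latencies") or []
        samples.extend(x for x in latencies if x is not None)
    # Return None rather than failing when no usable sample exists.
    return mean(samples) if samples else None
```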

Optimize update logic.

Reset benchmark as in main.

Jeffwan · 2025-01-25T02:10:47Z

@nwangfw can you help review this PR?

Jeffwan

I didn't look into the detailed gpu-optimizer logic. Since you have already run this version successfully, I think it should be OK. I'll leave that part to @nwangfw for further review.

Jeffwan · 2025-01-27T18:32:17Z

config/overlays/vke-dev/manager/kustomization.yaml

+images:
+- name: aibrix/controller-manager
+  newName: aibrix-container-registry-cn-beijing.cr.volces.com/aibrix/controller-manager
+  newTag: v0.2.0-rc.1


minor: we can use v0.2.0-rc.2 now; a few issues were fixed.

Jeffwan · 2025-01-27T18:32:29Z

config/overlays/dev/gpu-optimizer/kustomization.yaml

+
 resources:
- ../../../default
+- ../../../gpu-optimizer


So now it installs a single component instead of the default installation?

zhangjyr · 2025-01-28

Yes. Previously, it installed everything plus the specific debug version, which I found inconvenient when debugging more than one component. Now we just deploy the default and apply debug components as needed.

* Bug fix
* Fix configuration for domain podautoscaler
  Add a test case verifying that the URL built from metricSource is as expected: the endpoint should include the port; if it does not and a port is specified, the port is appended to the endpoint.
* Lint fix
* Add license for new files.
* Lint fix on added unit test.
* Add authorization support
* Support parameterized benchmark
* Remove next_in parameter
* Bug fix
* Fix typo
* Bug fix
* Apply stream parameter
* Cleaning up responses.
* Bug fix
* If an error is not reported as a temporary error, we will not retry.
* GPU profile now supports TPAT (time per all token)
  Fix a benchmark error that could occur when token_latencies is missing some data.
* Debug optimizer
* bird prompt dataset generation
* update benchmark to support prompt dataset loading
* Benchmark now supports workload parameter
* Bug fix
* Log control
* Improve stability and lint fix.
* Bug fix
* switch logs for gpu-optimizer to json format
* added BIRD dataset with Azure timestamp script
* add BIRD burst pattern workload generation
* Visualizer now supports workload file
* Print out workload input
* Bug fix
* lint fix
* remove timestamp offset
* Bug fix: calling _parse_profiles without the out_records parameter would not accumulate the returned records.
* Using the current timestamp to load the profile may be too early; revert to using the timestamp from one interval ago.
* Use the larger of average request rate in window and current request rate to get sufficient resources.
* Tuning up request rate temporarily.
* Bug fix
  Fix request rate to 8 temporarily
* Remove fixed rate
* changing load profile back
* Provide compatibility to v3 gateway profiles.
* Adjust development config
* Add config for gateway-plugin development
* delayed scale in deployment added
* Add trace to benchmark
* rollback to old version without delayed scale in
* Disregard pending requests for now.
* Bug fix
* Bug fix
* Adapt to latest profile about pending requests and update unittest.
* Output correct timestamp
* Output pending and total requests from load reader
* Ignore pending for now.
* Add throughput filter.
* bug and lint fix
* Fix a bug that occurs when mat_tputs are 0
* Lint fix
* fix benchmark on count num_requests
* Optimizer now can adopt deployment changes using "kubectl apply"
* Add comments
* bug fix
* Make signature prefer higher index when choosing profiles.
* Bug fix, watch ScalingReplicaSet for label changes
* Bug fix
* Change back SLO preference.
  Optimize update logic.

---------

Signed-off-by: Jingyuan <[email protected]>
Co-authored-by: Jingyuan Zhang <[email protected]>
Co-authored-by: Ning Wang <[email protected]>
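One item in the commit message above, "Use the larger of average request rate in window and current request rate to get sufficient resources", can be sketched as below. This is an assumption-laden illustration rather than the optimizer's code: the function name `effective_request_rate` and the representation of the window as a plain list of rates are hypothetical.

```python
from typing import Sequence


def effective_request_rate(window_rates: Sequence[float], current_rate: float) -> float:
    """Pick the request rate used for sizing resources (hypothetical helper).

    Taking the maximum of the windowed average and the current rate avoids
    under-provisioning both during a sudden burst (current > average) and
    right after one (average > current).
    """
    if not window_rates:
        # No history yet: fall back to the current observation.
        return current_rate
    average = sum(window_rates) / len(window_rates)
    return max(average, current_rate)
```

For example, with window rates of 2, 4, and 6 requests per second and a current rate of 3, the sketch sizes for 4; if the current rate jumps to 8, it sizes for 8.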

Jingyuan Zhang and others added 30 commits December 10, 2024 11:44

Bug fix

5cbd968

Merge commit '0d40fbd19ba01daf1aa6267515814c18f19aaa09' into jingyuan…

11be286

…/gpu_optimizer

Fix configuration for domain podautoscaler

7adb1b7

Add a test case verifying that the URL built from metricSource is as expected: the endpoint should include the port; if it does not and a port is specified, the port is appended to the endpoint.

Lint fix

c159726

Add license for new files.

f9f1d99

Lint fix on added unit test.

6f717c5

Add authorization support

5c5225b

Support parameterized benchmark

6a7584d

Remove next_in parameter

bd46cc3

Bug fix

401be9f

Fix typo

f39e4b4

Bug fix

22e7db5

Apply stream parameter

4296ce1

Cleaning up responses.

2c40b7c

Bug fix

17a0798

If an error is not reported as a temporary error, we will not retry.

4df3b76

GPU profile now supports TPAT (time per all token)

ee494b7

Fix a benchmark error that could occur when token_latencies is missing some data.

Debug optimizer

36cbf87

bird prompt dataset generation

d59f12a

update benchmark to support prompt dataset loading

deee544

Benchmark now supports workload parameter

7da1be8

Bug fix

32e2ba9

Log control

3d3e929

Improve stability and lint fix.

cf47cab

Bug fix

31b4b0e

switch logs for gpu-optimizer to json format

28f2521

added BIRD dataset with Azure timestamp script

2b672cf

add BIRD burst pattern workload generation

e2f58d8

Visualizer now supports workload file

7c2d455

Print out workload input

ab20a58

Jingyuan Zhang added 16 commits January 13, 2025 10:37

Output correct timestamp

81cfb63

Output pending and total requests from load reader

cc2ac94

Ignore pending for now.

4f2b2ee

Add throughput filter.

169047c

bug and lint fix

cd08af9

Fix a bug that occurs when mat_tputs are 0

5dac822

Lint fix

1d3f44d

fix benchmark on count num_requests

6edb872

Optimizer now can adopt deployment changes using "kubectl apply"

8e9d9b0

Add comments

9fda28a

bug fix

921a9fe

Make signature prefer higher index when choosing profiles.

ff22ab9

Bug fix, watch ScalingReplicaSet for label changes

186fcdc

Bug fix

ae27970

Change back SLO preference.

ca0f020

Optimize update logic.

Merge branch 'main' into jingyuan/gpu_optimizer

e697adc

Reset benchmark as in main.

zhangjyr requested review from Jeffwan and nwangfw January 24, 2025 02:27

Merge branch 'main' into jingyuan/gpu_optimizer

e50a343

Signed-off-by: Jingyuan <[email protected]>

nwangfw reviewed Jan 27, 2025

Makefile (resolved)

config/gateway/kustomization.yaml (resolved)

Jeffwan added this to the v0.2.0 milestone Jan 27, 2025

Jeffwan approved these changes Jan 27, 2025

nwangfw approved these changes Jan 27, 2025

Jeffwan changed the title ~~[Bug] GPU optimizer accumulated fix~~ [enhancement] GPU optimizer accumulated fix Jan 27, 2025

Jeffwan merged commit 7a7e60a into main Jan 27, 2025
10 checks passed

Jeffwan deleted the jingyuan/gpu_optimizer branch January 27, 2025 22:04

4 participants