[Fix] Fix multiple issues for benchmark implementation #1049

happyandslow · 2025-05-05T19:06:12Z

Pull Request Description

This PR fixes multiple existing issues with the benchmark implementation.

Related Issues

Resolves: #1029

Signed-off-by: Le Xu <[email protected]>

Jeffwan · 2025-05-07T22:21:35Z

benchmarks/benchmark.sh

        --model \"$TARGET_MODEL\" \
        --api-key \"$API_KEY\" \
        --time-scale \"$TIME_SCALE\" \
+        --routing-strategy \"$ROUTING_STRATEGY\" \


If we plan to run the benchmark against other frameworks, this flag is not needed.
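One possible way to address this comment is to pass the flag only when a routing strategy is configured. The sketch below is a minimal illustration, not the code in `benchmarks/benchmark.sh`: the helper name `client_args` is hypothetical, and it assumes the script can build its argument list through a function. It relies on the standard `${VAR:+word}` expansion, which yields nothing when the variable is unset or empty.

```shell
# Hypothetical helper (not part of benchmarks/benchmark.sh):
# prints the client arguments, one per line.
client_args() {
    # ${ROUTING_STRATEGY:+...} expands to the flag and its value only when
    # ROUTING_STRATEGY is set and non-empty; otherwise it expands to nothing,
    # so runs against other frameworks never see the flag.
    printf '%s\n' \
        --model "$TARGET_MODEL" \
        --api-key "$API_KEY" \
        --time-scale "$TIME_SCALE" \
        ${ROUTING_STRATEGY:+--routing-strategy "$ROUTING_STRATEGY"}
}
```

With `ROUTING_STRATEGY` unset, `client_args` emits only the three existing flag/value pairs. Setting it to some value appends `--routing-strategy` and that value at the end. The expansion is also safe under `set -u`.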

Jeffwan · 2025-05-07T22:23:48Z

Overall looks good to me. @happyandslow, can you or du help with this issue so we can test this code in each CI run? #1050

Signed-off-by: Le Xu <[email protected]>

…1049)

* move prompt decoration to client dispatch function; remove locks
* update multiturn dataset generation by adding shared prefix length
* update README for workload output format
* move predefined synthetic workload files to autoscaling scenarios
* update figure explanation in workload generator README
* using shuffling instead of sampling from request finder
* Adding stats to workload generation
* adding routing strategy knob
* bug fix in client.py
* print fix in workload generator

---------

Signed-off-by: Le Xu <[email protected]>
Co-authored-by: Le Xu <[email protected]>

move prompt decoration to client dispatch function; remove locks

3db7d3b

Signed-off-by: Le Xu <[email protected]>

happyandslow mentioned this pull request May 5, 2025

Tracking multiple benchmark code issues #1029

Closed

happyandslow marked this pull request as draft May 5, 2025 20:56

Le Xu added 8 commits May 5, 2025 15:59

fix README

edc9f60

Signed-off-by: Le Xu <[email protected]>

update multiturn dataset generation by adding shared prefix length

bacbe0d

Signed-off-by: Le Xu <[email protected]>

update README for workload output format

350ca96

Signed-off-by: Le Xu <[email protected]>

move predefined synthetic workload files to autoscaling scenarios

5b75cb9

Signed-off-by: Le Xu <[email protected]>

update figure explanation in workload generator README

d937bc1

Signed-off-by: Le Xu <[email protected]>

using shuffling instead of sampling from request finder

452bcda

Signed-off-by: Le Xu <[email protected]>

improve logging

f661c87

Signed-off-by: Le Xu <[email protected]>

improving README

421a880

Signed-off-by: Le Xu <[email protected]>

happyandslow marked this pull request as ready for review May 7, 2025 17:41

Le Xu added 3 commits May 7, 2025 13:23

Adding stats to workload generation

e2e562a

Signed-off-by: Le Xu <[email protected]>

adding routing strategy knob

ba017b7

Signed-off-by: Le Xu <[email protected]>

bug fix in client.py

f8a8550

Signed-off-by: Le Xu <[email protected]>

Jeffwan reviewed May 7, 2025

print fix in workload generator

34cee99

Signed-off-by: Le Xu <[email protected]>

Jeffwan approved these changes May 8, 2025

Jeffwan merged commit 2b95773 into vllm-project:main May 8, 2025
3 checks passed
