[Misc] Benchmarks for audio models #16505

NickLucche · 2025-04-11T17:14:02Z

Implements feature requested in #16354.

Test with:

# server
vllm serve openai/whisper-large-v3-turbo

# client
python3 benchmarks/benchmark_serving.py \
    --backend openai-audio \
    --dataset-name hf \
    --dataset-path edinburghcstr/ami --hf-subset ihm \
    --model openai/whisper-large-v3-turbo \
    --num-prompts 1000 \
    --endpoint /v1/audio/transcriptions \
    --save-result \
    2>&1 | tee benchmark_whisper.txt

~~It's still a draft because I want to sweep through the datasets first, but~~ reviews are welcome!
cc @DarkLight1337

FIX #16354
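The `/v1/audio/transcriptions` endpoint used in the command above takes an audio file upload, so each dataset sample has to be serialized before it is sent. The thread does not show how the benchmark does this, so the following is only a minimal sketch using the Python standard library. It assumes a sample is a mono sequence of floats in [-1.0, 1.0] plus a sampling rate (the shape Hugging Face audio datasets typically expose); `to_wav_bytes` and the synthetic tone are hypothetical and not taken from the PR.

```python
import io
import math
import wave


def to_wav_bytes(samples, sampling_rate):
    """Encode mono float samples in [-1.0, 1.0] as 16-bit PCM WAV bytes."""
    frames = bytearray()
    for s in samples:
        # Clip to the valid range, then scale to signed 16-bit integers.
        clipped = max(-1.0, min(1.0, s))
        frames += int(clipped * 32767).to_bytes(2, "little", signed=True)

    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)       # mono
        wav.setsampwidth(2)       # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sampling_rate)
        wav.writeframes(bytes(frames))
    # The wave module leaves a caller-supplied buffer open, so it is still readable.
    return buf.getvalue()


# Hypothetical stand-in for a dataset sample: one second of a 440 Hz tone at 16 kHz.
rate = 16000
tone = [0.5 * math.sin(2 * math.pi * 440 * t / rate) for t in range(rate)]
wav_bytes = to_wav_bytes(tone, rate)
```

The resulting bytes could then be attached as the file part of a multipart request to the transcription endpoint. Encoding in memory avoids writing temporary files for every prompt.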

benchmarks/backend_request_func.py

NickLucche · 2025-04-15T12:15:56Z

PS: access to these datasets has to be requested manually on Hugging Face. Posting them here in case we decide to run performance checks on them:
https://huggingface.co/datasets/speechcolab/gigaspeech
https://huggingface.co/datasets/kensho/spgispeech

DarkLight1337 · 2025-04-18T08:37:52Z

benchmarks/benchmark_serving.py

+            "Multi-modal content is only supported on 'openai-chat' and " \
+            "'openai-audio' backend.")


Can we abstract this to a class-level flag on the dataset class?

Maybe also need this for the backends

Actually that may take more effort, let's just merge this PR first

I can make a list like OPENAI_COMPATIBLE_BACKENDS but I wouldn't tie it to the dataset

Actually, maybe not even that, because the current check is on the specific backend key/name rather than on the function it uses

> Actually that may take more effort, let's just merge this PR first

Right, I missed this comment; I was replying to an old version lol
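The idea floated in this thread, a shared list of backends checked before the dataset is built, can be sketched as follows. This is an illustration only, not code from the PR: the constant name comes from the comment above, while `check_multimodal_backend` and its parameters are hypothetical. Only the two backend names and the error message appear in the diff.

```python
# Hypothetical constant, named after the suggestion in the review thread.
OPENAI_COMPATIBLE_BACKENDS = frozenset({"openai-chat", "openai-audio"})


def check_multimodal_backend(backend: str, has_multimodal_data: bool) -> None:
    """Fail fast if multi-modal data is requested on an unsupported backend."""
    if has_multimodal_data and backend not in OPENAI_COMPATIBLE_BACKENDS:
        raise ValueError(
            "Multi-modal content is only supported on 'openai-chat' and "
            "'openai-audio' backend.")


# Passes silently: an audio dataset on a supported backend.
check_multimodal_backend("openai-audio", has_multimodal_data=True)

# An unsupported backend name (used here purely as an example) is rejected.
try:
    check_multimodal_backend("some-other-backend", has_multimodal_data=True)
    rejected = False
except ValueError:
    rejected = True
```

Running the check before dataset instantiation means an unsupported combination is reported before any dataset is downloaded or sampled. As noted in the thread, the check keys on the backend name rather than on the request function the backend uses.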

tests/entrypoints/openai/correctness/test_transcription_api_correctness.py

DarkLight1337 · 2025-04-18T09:40:09Z

Can you merge from main to try to fix the CI errors?

DarkLight1337 reviewed Apr 12, 2025

benchmarks/backend_request_func.py (outdated, resolved)

benchmarks/backend_request_func.py (outdated, resolved)

NickLucche marked this pull request as ready for review April 14, 2025 15:18

NickLucche requested review from robertgshaw2-redhat and simon-mo as code owners April 14, 2025 15:18

NickLucche requested a review from DarkLight1337 April 18, 2025 08:34

DarkLight1337 reviewed Apr 18, 2025

tests/entrypoints/openai/correctness/test_transcription_api_correctness.py (resolved)

DarkLight1337 approved these changes Apr 18, 2025

NickLucche added 6 commits April 18, 2025 09:44

first impl

e5dfd04

Signed-off-by: NickLucche <[email protected]>

todo

9c4376e

Signed-off-by: NickLucche <[email protected]>

address review

397fd66

Signed-off-by: NickLucche <[email protected]>

finish sweep

f1b59b6

Signed-off-by: NickLucche <[email protected]>

cruft

15bddab

Signed-off-by: NickLucche <[email protected]>

check multimodal backend prior to dataset instantiation

9f13aff

Signed-off-by: NickLucche <[email protected]>

NickLucche force-pushed the whisper-benchmark branch from 8f7be52 to 9f13aff April 18, 2025 09:44

DarkLight1337 enabled auto-merge (squash) April 18, 2025 09:49

github-actions bot added the ready label Apr 18, 2025

vllm-bot merged commit 9d4ca19 into vllm-project:main Apr 19, 2025
26 of 30 checks passed

yangw-dev pushed a commit to yangw-dev/vllm that referenced this pull request Apr 21, 2025

[Misc] Benchmarks for audio models (vllm-project#16505)

5faa188

Signed-off-by: NickLucche <[email protected]> Signed-off-by: Yang Wang <[email protected]>

jikunshang pushed a commit to jikunshang/vllm that referenced this pull request Apr 29, 2025

[Misc] Benchmarks for audio models (vllm-project#16505)

f75a111

Signed-off-by: NickLucche <[email protected]>

lk-chen pushed a commit to lk-chen/vllm that referenced this pull request Apr 29, 2025

[Misc] Benchmarks for audio models (vllm-project#16505)

25dee9b

Signed-off-by: NickLucche <[email protected]>

adobrzyn pushed a commit to HabanaAI/vllm-fork that referenced this pull request Apr 30, 2025

[Misc] Benchmarks for audio models (vllm-project#16505)

12a3621

Signed-off-by: NickLucche <[email protected]> Signed-off-by: Agata Dobrzyniewicz <[email protected]>

RichardoMrMu pushed a commit to RichardoMrMu/vllm that referenced this pull request May 12, 2025

[Misc] Benchmarks for audio models (vllm-project#16505)

fa7680e

Signed-off-by: NickLucche <[email protected]> Signed-off-by: Mu Huai <[email protected]>

ckhordiasma mentioned this pull request May 14, 2025

nm vllm ent 0.8.5 sync red-hat-data-services/vllm#139

Merged
