[Model][VLM] Add Qwen2.5-Omni model support (thinker only) #15130
Conversation
Sorry I don't have time to review in detail tonight, but from a quick glance, can you add this model to the following pages?
OK, I will add them tomorrow.
@fyabc Qwen/Qwen2.5-Omni-7B?
Sorry for the delay - going to take a look at this PR tonight!
Thank you for the contribution! I have left some comments!
Hi @ywang96 @DarkLight1337, I updated some other examples here; please check the code.
Can you resolve the failures in the basic models test?
Hi @DarkLight1337, I have fixed the test registry; the API timeout error now seems to be raised outside of this PR.
Very sorry for the long delay - let's get this in!
Official PR: vllm-project#15130

Example:
python examples/offline_inference/audio_language.py --model-type qwen2_5_omni
python examples/offline_inference/vision_language.py --modality image --model-type qwen2_5_omni
python examples/offline_inference/vision_language.py --modality video --model-type qwen2_5_omni

Signed-off-by: Chen, Wenbin <[email protected]>
This PR adds support for the Qwen2.5-Omni model (thinker only).
Requirements
This PR requires the corresponding transformers PR.
Note: you need to install transformers from source from that branch.
Example Usage
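The maintained examples are the offline-inference scripts listed in the commands above. For a quick self-contained sketch, something like the following should work; note that the prompt tokens follow the Qwen2-VL convention and are an assumption here (the model's own chat template should be preferred), and the "Qwen/Qwen2.5-Omni-7B" repo name is taken from the review thread:

```python
# Minimal offline-inference sketch for the thinker with an image input.
# Assumes the transformers branch from the linked PR is installed.
from vllm import LLM, SamplingParams
from PIL import Image

llm = LLM(model="Qwen/Qwen2.5-Omni-7B")

# Qwen2-VL-style prompt tokens (an assumption; use the model's chat template
# in practice).
prompt = (
    "<|im_start|>user\n"
    "<|vision_start|><|image_pad|><|vision_end|>"
    "Describe this image.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

outputs = llm.generate(
    {
        "prompt": prompt,
        "multi_modal_data": {"image": Image.open("example.jpg")},
    },
    SamplingParams(max_tokens=128),
)
print(outputs[0].outputs[0].text)
```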
Notes
The whole Qwen2.5-Omni model consists of three parts:
- thinker: multimodal inputs -> text responses & hidden states
- talker: text responses & hidden states from the thinker -> speech codes
- code2wav (streaming codec decoder): speech codes -> speech

This PR implements only the thinker part for now; it accepts multimodal inputs (images / videos / audio) and generates text responses, similar to other common VLMs. We have also developed an end-to-end implementation (to be released soon), but due to its significant impact on the vLLM framework architecture, we will not open the related pull request for now.
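To make that dataflow concrete, here is a purely illustrative sketch; the function names and signatures are hypothetical placeholders, not vLLM or transformers APIs, and only the thinker stage exists in this PR:

```python
# Hypothetical stubs illustrating the three-stage pipeline; none of these
# names correspond to real APIs. This PR ships only the thinker stage.
def thinker(inputs: dict) -> tuple[str, list]:
    """Multimodal inputs (image/video/audio + text) -> text & hidden states."""
    return "a text response", [0.1, 0.2]  # placeholder hidden states

def talker(text: str, hidden_states: list) -> list[int]:
    """Thinker outputs -> discrete speech codes."""
    return [17, 42, 7]  # placeholder codes

def code2wav(codes: list[int]) -> bytes:
    """Streaming codec decoder: speech codes -> waveform."""
    return bytes(len(codes))  # placeholder audio

text, hidden = thinker({"image": "img.jpg", "prompt": "Describe this."})
waveform = code2wav(talker(text, hidden))
```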
FIX #15563