
[V1] Enable multi-input by default #15799


Merged: 24 commits into vllm-project:main on Apr 12, 2025

Conversation

@DarkLight1337 (Member) commented Mar 31, 2025

This PR enables multiple multi-modal input items for V1 without having to set limit_mm_per_prompt.

Note: This may increase the default memory usage for multi-modal models because max_num_mm_items_decoder_budget no longer limits max_num_mm_items in GPUModelRunner.profile_run. You can explicitly set the limit to one via limit_mm_per_prompt or even disable unused modalities completely by setting the limit of that modality to zero. I have added a section to the Offline Inference docs accordingly.

There is no need to set limits for V1, since the encoder and decoder are profiled separately, which should avoid OOM at inference time. The only hard limit is the context length, which is already checked in Processor._validate_model_inputs.

Note: Users can still use limit_mm_per_prompt to exclude individual modalities from being profiled and used in inference.
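The limit semantics described above can be sketched as follows. This is a hypothetical illustration, not vLLM's actual code: the `validate_mm_counts` helper and the `NO_EXPLICIT_LIMIT` constant are assumptions made for the example. The idea is that a missing entry in `limit_mm_per_prompt` now permits multiple items by default, an explicit positive limit caps the item count, and a limit of 0 disables the modality entirely.

```python
from typing import Mapping

# Hypothetical stand-in for "no explicit limit"; vLLM's real behavior is
# governed by separate encoder/decoder profiling rather than a magic number.
NO_EXPLICIT_LIMIT = 999


def validate_mm_counts(
    mm_counts: Mapping[str, int],
    limit_mm_per_prompt: Mapping[str, int],
) -> None:
    """Raise ValueError if any modality exceeds its configured limit.

    Hypothetical helper illustrating the limit semantics; not vLLM code.
    """
    for modality, count in mm_counts.items():
        limit = limit_mm_per_prompt.get(modality, NO_EXPLICIT_LIMIT)
        if limit == 0 and count > 0:
            # Setting a modality's limit to 0 disables it completely.
            raise ValueError(
                f"Modality {modality!r} is disabled (limit set to 0)")
        if count > limit:
            raise ValueError(
                f"Prompt has {count} {modality} items, but "
                f"limit_mm_per_prompt allows at most {limit}")


# Multiple images are accepted without setting any limit (the new default).
validate_mm_counts({"image": 4}, {})

# An explicit limit of 1 restores the old single-input behavior.
validate_mm_counts({"image": 1}, {"image": 1})
```

Under this sketch, users who want the previous behavior back would set an explicit limit of 1, and users who never send a given modality would set its limit to 0 to skip profiling it.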

This is loosely a follow-up to #15703 which removed the direct dependency of various models on multimodal limits.

Some other changes:

  • To reduce memory cost, unused modalities are now fully disabled in examples and the common model tests, instead of using the default limit of that modality.
  • Fixed incorrect type annotations of the data parsing overrides for MiniCPM-O/V and Qwen2-VL.

Signed-off-by: DarkLight1337 <[email protected]>
@DarkLight1337 DarkLight1337 added the ready ONLY add when PR is ready to merge/full CI is needed label Mar 31, 2025
@DarkLight1337 DarkLight1337 requested a review from ywang96 as a code owner March 31, 2025 07:15

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small and essential subset of tests to quickly catch errors. You can run the remaining CI tests by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

@mergify mergify bot added frontend multi-modality Related to multi-modality (#4194) labels Mar 31, 2025
@mergify mergify bot added the documentation Improvements or additions to documentation label Mar 31, 2025
@DarkLight1337 DarkLight1337 removed the ready ONLY add when PR is ready to merge/full CI is needed label Mar 31, 2025
@DarkLight1337 DarkLight1337 changed the title [V1] Disable multimodal limits [V1] Enable multi-input by default Mar 31, 2025
@DarkLight1337 DarkLight1337 added the ready ONLY add when PR is ready to merge/full CI is needed label Apr 11, 2025
@DarkLight1337 DarkLight1337 requested review from mgoin and Isotr0py April 12, 2025 00:40
@Isotr0py (Collaborator) left a comment:

Overall looks reasonable to me!

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) April 12, 2025 06:51
@DarkLight1337 DarkLight1337 merged commit d9fc8cd into vllm-project:main Apr 12, 2025
49 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in Multi-modality Core Apr 12, 2025
@DarkLight1337 DarkLight1337 deleted the v1-mm-limits branch April 12, 2025 09:36
This was referenced Apr 18, 2025
yangw-dev pushed a commit to yangw-dev/vllm that referenced this pull request Apr 21, 2025
jikunshang pushed a commit to jikunshang/vllm that referenced this pull request Apr 29, 2025
lk-chen pushed a commit to lk-chen/vllm that referenced this pull request Apr 29, 2025
RichardoMrMu pushed a commit to RichardoMrMu/vllm that referenced this pull request May 12, 2025
Labels
  • documentation: Improvements or additions to documentation
  • frontend
  • multi-modality: Related to multi-modality (#4194)
  • ready: ONLY add when PR is ready to merge/full CI is needed
Projects
Status: Done