fix(security): Add VLLM_MAX_N_SEQUENCES environment variable and enforce limit#37952
Conversation
|
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add If you have any questions, please reach out to us on Slack at https://slack.vllm.ai. 🚀 |
|
Documentation preview: https://vllm--37952.org.readthedocs.build/en/37952/ |
There was a problem hiding this comment.
Code Review
This pull request introduces a security enhancement by adding the VLLM_MAX_N_SEQUENCES environment variable to limit the number of sequences per request, preventing potential resource exhaustion attacks. The changes include updates to SamplingParams to enforce this limit, corresponding documentation, and new tests.
My review found an issue in the newly added tests. The tests do not correctly handle the caching mechanism of vllm.envs, which could lead to flaky and unreliable test results. I've provided a suggestion to fix this to ensure test isolation and correctness.
9953ff7 to
81a735e
Compare
Signed-off-by: jperezde <jperezde@redhat.com>
c66562d to
af70c62
Compare
|
Hi @jperezdealgaba, the pre-commit checks have failed. Please run: uv pip install pre-commit>=4.5.1
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
Is the readthedocs error an error on our side?
It really doesn't seem the case |
|
Thanks! Will wait for the merge! |
…rce limit (vllm-project#37952) Signed-off-by: jperezde <jperezde@redhat.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Russell Bryant <rbryant@redhat.com> Signed-off-by: Monishver Chandrasekaran <monishverchandrasekaran@gmail.com>
…rce limit (vllm-project#37952) Signed-off-by: jperezde <jperezde@redhat.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Russell Bryant <rbryant@redhat.com> Signed-off-by: Nithin Chalapathi <nithin.ch10@gmail.com>
…rce limit (vllm-project#37952) Signed-off-by: jperezde <jperezde@redhat.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Russell Bryant <rbryant@redhat.com>
…rce limit (vllm-project#37952) Signed-off-by: jperezde <jperezde@redhat.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Russell Bryant <rbryant@redhat.com> Signed-off-by: Vinay Damodaran <vrdn@hey.com>
…rce limit (vllm-project#37952) Signed-off-by: jperezde <jperezde@redhat.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Russell Bryant <rbryant@redhat.com> Signed-off-by: zhutaoyu <zhutaoyu97@gmail.com>
…rce limit (vllm-project#37952) Signed-off-by: jperezde <jperezde@redhat.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Russell Bryant <rbryant@redhat.com> Signed-off-by: neweyes <328719365@qq.com>
…rce limit (vllm-project#37952) Signed-off-by: jperezde <jperezde@redhat.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Russell Bryant <rbryant@redhat.com> Signed-off-by: EricccYang <yangyang4991@gmail.com>
…rce limit (vllm-project#37952) Signed-off-by: jperezde <jperezde@redhat.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Russell Bryant <rbryant@redhat.com> Signed-off-by: bhargav-patel-29 <bhargav.patel@tihiitb.org>
…rce limit (vllm-project#37952) Signed-off-by: jperezde <jperezde@redhat.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Russell Bryant <rbryant@redhat.com>
…rce limit (vllm-project#37952) Signed-off-by: jperezde <jperezde@redhat.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Russell Bryant <rbryant@redhat.com> Signed-off-by: rishitdholakia13 <rishit+github@cohere.com>
…rce limit (vllm-project#37952) Signed-off-by: jperezde <jperezde@redhat.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Russell Bryant <rbryant@redhat.com> Signed-off-by: Rishi Puri <riship@nvidia.com>
…rce limit (vllm-project#37952) Signed-off-by: jperezde <jperezde@redhat.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Russell Bryant <rbryant@redhat.com>
MR assissted with: claude-4.6-opus.max
This commit introduces a new environment variable, VLLM_MAX_N_SEQUENCES, which sets the maximum allowed number of output sequences per request to prevent excessive resource consumption. The SamplingParams class has been updated to validate the 'n' parameter against this new limit, raising an error if exceeded.
Purpose
The goal of this MR is to introduce the VLLM_MAX_N_SEQUENCES environment variable to prevent highly large n sequences blocking the main thread and causing denial of service attacks.
Test Plan
We added tests for making sure the upper limits works. To test it:
python -m pytest tests/test_envs.py::TestVllmMaxNSequences
Test Result
=============================== warnings summary ===============================
:488
:488: DeprecationWarning: builtin type SwigPyPacked has no module attribute
:488
:488: DeprecationWarning: builtin type SwigPyObject has no module attribute
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
======================== 50 passed, 2 warnings in 4.92s ========================