Add "/server_info" endpoint in api_server to retrieve the vllm_config. #16572
Conversation
Signed-off-by: Xihui Cang <[email protected]>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
For security reasons, this information should only be dev facing. Can you move this endpoint under the dev-mode guard?
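A minimal sketch of what such a guard could look like (the environment flag VLLM_SERVER_DEV_MODE and the registration pattern below are illustrative assumptions, not necessarily the exact vLLM implementation):

import os

from fastapi import APIRouter, Request
from fastapi.responses import JSONResponse

router = APIRouter()

# Assumption: dev-facing endpoints are only registered when an opt-in
# environment flag such as VLLM_SERVER_DEV_MODE=1 is set on the server.
if os.environ.get("VLLM_SERVER_DEV_MODE", "0") == "1":

    @router.get("/server_info")
    async def show_server_info(raw_request: Request):
        # Serialize the config stored on the FastAPI app state.
        server_info = {"vllm_config": str(raw_request.app.state.vllm_config)}
        return JSONResponse(content=server_info)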
…uard Signed-off-by: Xihui Cang <[email protected]>
# Store global states
@dataclasses.dataclass
class _GlobalState:
    vllmconfig: VllmConfig
Suggested change:
-    vllmconfig: VllmConfig
+    vllm_config: VllmConfig
Also, do we really need the whole vLLM config? We can avoid creating a new global state object if we can simply use model_config.
Even if we need the whole vLLM config, we should initialize it in init_app_state.
I think the information provided by model_config is sometimes insufficient. We want to record and display all of the parameters used when starting the vllm serve server. On one hand, this lets users more easily understand the server's full configuration; on the other hand, it facilitates comparisons between different runs and makes it easier to fully reproduce previous experiments from those parameters. Currently, the only way to obtain and record this information is by parsing logs, which has limitations; moreover, if the log format changes, the parsing logic also has to be adjusted accordingly. Thank you for your suggestions, I will try to initialize it in init_app_state.
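For illustration, a rough sketch of storing the full config in init_app_state (the function signature here is simplified and hypothetical; the real init_app_state in vLLM's api_server takes more arguments):

# Hypothetical, simplified signature for illustration only.
def init_app_state(state, vllm_config) -> None:
    # Attach the full VllmConfig to the FastAPI app state so that request
    # handlers can read it via raw_request.app.state.vllm_config, instead of
    # keeping a separate module-level _GlobalState object.
    state.vllm_config = vllm_config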
…E is 1, then add "/server_info" endpoint in api_server. Signed-off-by: Xihui Cang <[email protected]>
Signed-off-by: Xihui Cang <[email protected]>
@router.get("/server_info") | ||
async def show_server_info(raw_request: Request): | ||
server_info = {"vllm_config": str(raw_request.app.state.vllm_config)} | ||
return JSONResponse(content=server_info) |
Place this at the top of the block since it's more "basic"?
…w_server_info, get_vllm_config Signed-off-by: Xihui Cang <[email protected]>
LGTM now, thanks
…. (vllm-project#16572) Signed-off-by: Xihui Cang <[email protected]> Signed-off-by: Yang Wang <[email protected]>
…. (vllm-project#16572) Signed-off-by: Xihui Cang <[email protected]>
…. (vllm-project#16572) Signed-off-by: Xihui Cang <[email protected]> Signed-off-by: Mu Huai <[email protected]>
Add a /server_info endpoint to allow users to directly retrieve the vLLM configuration parameters without needing to parse logs.
Example API: http://localhost:8000/server_info
{"vllm_config":"model='deepseek-ai/DeepSeek-R1-Distill-Qwen-7B', speculative_config=None, tokenizer='deepseek-ai/DeepSeek-R1-Distill-Qwen-7B', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config=None, tokenizer_revision=None, trust_remote_code=True, dtype=torch.bfloat16, max_seq_len=32768, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=1, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=True, kv_cache_dtype=auto, device_config=cuda, decoding_config=DecodingConfig(guided_decoding_backend='xgrammar', reasoning_backend=None), observability_config=ObservabilityConfig(show_hidden_metrics=False, otlp_traces_endpoint=None, collect_model_forward_time=False, collect_model_execute_time=False), seed=None, served_model_name=deepseek-ai/DeepSeek-R1-Distill-Qwen-7B, num_scheduler_steps=1, multi_step_stream_outputs=True, enable_prefix_caching=True, chunked_prefill_enabled=True, use_async_output_proc=False, disable_mm_preprocessor_cache=False, mm_processor_kwargs=None, pooler_config=None, compilation_config={\"splitting_ops\":[],\"compile_sizes\":[],\"cudagraph_capture_sizes\":[],\"max_capture_size\":0}"}