Closed
Description
Is this planned? It seems like a good idea to support full OpenAI behavior, and since vLLM already handles batching well, I would guess it should be relatively easy.
vllm/vllm/entrypoints/openai/api_server.py, lines 396 to 401 in acbed3e